1. Introduction

PDF (Portable Document Format) files are widely used for sharing documents due to their platform independence and formatting capabilities. However, PDF files can sometimes be large, making them difficult to share or store.

In this tutorial, we will explore techniques for reducing PDF file sizes on Linux without noticeably compromising their quality or content.

2. Benefits of PDF Optimization in Linux

Optimizing PDF files has several benefits, including:

  • less storage space, enabling efficient data management
  • faster and more reliable file transfers
  • easier opening and viewing on low-resource devices
  • faster upload and download times
  • lower storage costs, decreased network bandwidth usage, and potentially reduced cloud storage subscription fees

On Linux, we can use the ghostscript, qpdf, and exiftool tools to optimize PDF files.

3. Using ghostscript

ghostscript provides various options to optimize PDF file sizes. Let’s take a closer look.

3.1. Installation

Before installing tools on a Debian-based distribution, we use sudo apt update to refresh the package index.

On Debian-based distributions, we install ghostscript using:

$ sudo apt install ghostscript poppler-utils

On Fedora/CentOS-based distributions, we run:

$ sudo dnf install ghostscript poppler-utils
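
Once the packages are installed, a quick check can confirm that the commands are actually on the PATH. The following is a minimal sketch: missing_tools is a hypothetical helper name, and pdfinfo is used only as a representative command from the poppler-utils package.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not part of any package.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# gs comes from the ghostscript package; pdfinfo ships with poppler-utils.
missing=$(missing_tools gs pdfinfo)
if [ -n "$missing" ]; then
    echo "still missing:" $missing
else
    echo "gs $(gs --version) is ready"
fi
```

The same helper can be reused later with other command names, for example missing_tools qpdf exiftool.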

3.2. Optimization

To optimize a PDF using ghostscript, we use the gs command:

gs [switches] [input]

Here’s a breakdown of the different components of its syntax:

  • [switches] are optional command-line options that modify the behavior of ghostscript. These options are preceded by a hyphen.
  • [input] is the path to the input file(s) we want to optimize.

Let’s check out some switches in an example:

$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

Let’s break down what each of these switches means:

  • -sDEVICE=pdfwrite specifies the output device to be used. In this case, pdfwrite indicates that the output will be in PDF format
  • -dCompatibilityLevel sets the compatibility level for the optimized PDF. In the example, the compatibility level is set to 1.4, which corresponds to the PDF version 1.4
  • -dPDFSETTINGS selects a preset of quality and compression settings for the optimized PDF. Here, /screen produces low-resolution output (images downsampled to about 72 dpi) for on-screen viewing. Other presets include /ebook (medium resolution, about 150 dpi, for e-book readers), /printer (about 300 dpi, for high-quality printing), and /default (general-purpose output, possibly at the cost of a larger file)
  • -dNOPAUSE prevents ghostscript from pausing between pages
  • -dQUIET suppresses informational messages
  • -dBATCH makes ghostscript exit after processing the input file instead of dropping into its interactive prompt
  • -sOutputFile=output.pdf specifies the output file name
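
To avoid retyping these switches, the command can be wrapped in a small shell function. This is only a sketch under stated assumptions: gs_compress and is_valid_preset are hypothetical names, the default preset of ebook is an arbitrary choice, and only the four presets named above are accepted.

```shell
#!/bin/sh
# Accept only the preset names discussed in this section.
# (is_valid_preset is a hypothetical helper.)
is_valid_preset() {
    case $1 in
        screen|ebook|printer|default) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: gs_compress INPUT OUTPUT [PRESET]
# Runs the same gs command as above with the chosen preset.
gs_compress() {
    in=$1
    out=$2
    preset=${3:-ebook}   # assumed default; pick whatever suits the document
    if ! is_valid_preset "$preset"; then
        echo "unknown preset: $preset" >&2
        return 2
    fi
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       "-dPDFSETTINGS=/$preset" \
       -dNOPAUSE -dQUIET -dBATCH \
       "-sOutputFile=$out" "$in"
}
```

For instance, gs_compress input.pdf output.pdf screen reproduces the earlier example, while an unrecognized preset name is rejected before gs is ever started.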

4. Using the qpdf Command

qpdf is another useful tool for optimizing PDF file sizes on Linux. It provides advanced optimization or compression options to further reduce the size of PDF files.

4.1. Installation

After updating our system with sudo apt update, we install qpdf based on our Linux distribution:

On Debian-based distributions, we run:

$ sudo apt install qpdf

On Fedora/CentOS-based distributions, we run:

$ sudo dnf install qpdf

4.2. Optimization

The syntax for optimizing documents is:

qpdf [options] [input] [output]

Here’s a breakdown of the different components of the qpdf syntax:

  • [options] refers to optional command-line options we use to modify the behavior of qpdf. These options are preceded by a hyphen or double hyphen.
  • [input] is a path to the input PDF.
  • [output] is the path where the output PDF will be saved.

The arguments can be in any order, but the input filename must precede the output filename.

For example, let’s compress the document.pdf file:

$ qpdf --compress-streams=y --object-streams=generate document.pdf qpdf_compressed.pdf

Let’s take a closer look at the arguments:

  • --compress-streams=y instructs qpdf to compress stream data that isn’t already compressed. Streams hold the bulk of a PDF document’s data, such as page content and images. This is also the default behavior of qpdf.
  • --object-streams=generate specifies the handling of object streams in the PDF file. The generate option tells qpdf to pack objects into new object streams wherever possible, which further reduces the file size.
  • document.pdf is the input file to be optimized.
  • qpdf_compressed.pdf is the output or optimized file.
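
After a run like this, it helps to check how much space was actually saved, since the result depends heavily on the input file. Below is a minimal sketch in plain shell; file_size, percent_saved, and report_savings are hypothetical helper names.

```shell
#!/bin/sh
# Size of a file in bytes. wc -c is portable; tr strips the padding
# that some wc implementations add around the number.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Integer percentage saved when going from $1 bytes to $2 bytes.
# A negative result means the "optimized" file got bigger.
percent_saved() {
    if [ "$1" -eq 0 ]; then
        echo 0
    else
        echo $(( ($1 - $2) * 100 / $1 ))
    fi
}

# Usage: report_savings ORIGINAL OPTIMIZED
report_savings() {
    before=$(file_size "$1")
    after=$(file_size "$2")
    echo "$1: $before bytes -> $2: $after bytes ($(percent_saved "$before" "$after")% saved)"
}
```

With the files from the example, this would be called as report_savings document.pdf qpdf_compressed.pdf.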

5. Using the exiftool Command

PDF files can contain metadata such as author names, creation dates, and other information that may contribute to the file size. We can remove this metadata to reduce the PDF file size further using exiftool.

5.1. Installation

We first need to install exiftool.

On Debian-based distributions, we install exiftool by running:

$ sudo apt install libimage-exiftool-perl

On Fedora/CentOS-based distributions, we run:

$ sudo dnf install perl-Image-ExifTool

5.2. Optimization

The syntax for optimization is:

exiftool [options] [input]

Here’s a breakdown of the syntax:

  • [options] are optional command-line options that modify the behavior of exiftool, preceded by a hyphen.
  • [input] is the path to the document we want to optimize.

For example, to remove metadata from a PDF file document.pdf using exiftool, we execute:

$ exiftool -all:all= document.pdf

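In this example, -all:all= clears every tag in every metadata group: the first all selects all groups, the second all selects all tags, and the empty value after = deletes them.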

Note that, unlike ghostscript and qpdf, exiftool doesn’t write to a separate output file but modifies document.pdf in place. By default, it keeps a backup of the original file as document.pdf_original.
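
Because the edit happens in place, it can be safer to run exiftool on a copy and leave the source file untouched. The following is a minimal sketch: stripped_name and strip_pdf_metadata are hypothetical helper names, and the _stripped suffix is chosen purely for illustration.

```shell
#!/bin/sh
# Derive a name for the metadata-free copy,
# e.g. report.pdf -> report_stripped.pdf (the suffix is an assumption).
stripped_name() {
    printf '%s\n' "${1%.pdf}_stripped.pdf"
}

# Copy the PDF, then clear the metadata on the copy only.
# -overwrite_original stops exiftool from leaving a *_original backup
# next to the copy; the source file itself is never modified.
strip_pdf_metadata() {
    src=$1
    dst=$(stripped_name "$src")
    cp -- "$src" "$dst" &&
    exiftool -all:all= -overwrite_original "$dst" >/dev/null &&
    printf '%s\n' "$dst"
}
```

Running strip_pdf_metadata document.pdf would then print the name of the new file, and exiftool document_stripped.pdf lists whatever metadata remains.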

6. Differences Between ghostscript, qpdf, and exiftool

Let’s summarize the differences between these three tools:

| Basis | ghostscript | qpdf | exiftool |
| --- | --- | --- | --- |
| Focus | PostScript and PDF files | PDF optimization and manipulation | metadata manipulation |
| Supported formats | PostScript (PS), Encapsulated PostScript (EPS), PDF | PDF | wide range of file formats, including PDF |
| Functionality | interpreter for PostScript and PDF page description languages | PDF file manipulation and optimization | metadata manipulation |
| Output | a new PDF file | a new PDF file | the original PDF file, modified in place |
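
Since the three tools work on different aspects of a file, they can also be chained. The sketch below is one possible ordering, not a recommended recipe: optimize_pdf and run are hypothetical names, the DRY_RUN switch and the .stage.pdf intermediate file are assumptions made for illustration, and qpdf is placed last on the assumption that rewriting the file also discards the data exiftool leaves behind when it edits a PDF in place.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1
# (a hypothetical switch for previewing the pipeline).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: optimize_pdf INPUT OUTPUT
# Stage 1: gs re-encodes the document with the /ebook preset.
# Stage 2: exiftool clears the metadata of the intermediate file in place.
# Stage 3: qpdf rewrites the result with compressed and object streams.
# Stage 4: the intermediate file is removed.
optimize_pdf() {
    in=$1
    out=$2
    tmp="$out.stage.pdf"   # assumed intermediate file name
    run gs -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH \
        "-sOutputFile=$tmp" "$in" &&
    run exiftool -all:all= -overwrite_original "$tmp" &&
    run qpdf --compress-streams=y --object-streams=generate "$tmp" "$out" &&
    run rm -f "$tmp"
}
```

Calling DRY_RUN=1 optimize_pdf input.pdf output.pdf prints the four commands without executing them, which is a convenient way to review the pipeline first. Whether every stage pays off depends on the document, so comparing file sizes after each step is worthwhile.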

7. Conclusion

In this article, we explored three popular tools for reducing PDF file sizes: ghostscript, qpdf, and exiftool. Optimizing the sizes of PDF files has many benefits, such as more efficient sharing and lower storage usage.