1. Introduction

PDF (Portable Document Format) files are widely used for sharing and distributing documents due to their platform-independent nature.

Sometimes, we may need to remove the last page of a PDF file to modify or extract content. In this tutorial, we’ll explore how to remove the last page of a PDF file from the command line in Linux using the pdftk and ghostscript commands.

2. Benefits of Using Command Line to Remove the Last Page of a PDF File

Removing the last page of a PDF file from the command line can be helpful in various scenarios. If the last page contains sensitive information, removing it keeps that information out of the copy we share. Also, if the last page is unnecessary, removing it can reduce the size of the PDF file.

When dealing with many PDF files, we can process this task as a batch, which makes it easier and faster.

3. Using the pdftk Command

pdftk is a valuable tool for manipulating PDF files in the command line.

3.1. Installation

On Ubuntu, we first refresh the package index with sudo apt update. Then, we can install pdftk with:

$ sudo apt-get install pdftk

On Fedora-based distributions, pdftk isn’t available in the official repositories. However, we can still install it from EPEL (Extra Packages for Enterprise Linux), an additional package repository.

We first need to enable the EPEL repository:

$ sudo dnf install epel-release

Then, we’re ready to install pdftk:

$ sudo dnf install pdftk

3.2. Removing the Last Page From a PDF File

The general syntax for removing the last page of a PDF file using pdftk is:

pdftk <inputFile> [options] [outputFile]

Let’s break down this syntax further:

  • inputFile is the name of the file whose last page we want to remove.
  • options specify other commands like cat for concatenating PDF pages.
  • outputFile is the name of the desired PDF without the last page.

For options, we’ll specify:

  • cat as the operation, which assembles the selected pages into a new document
  • 1-r2 as the page range, where 1 refers to the first page, the hyphen (-) specifies a range, and the r prefix counts pages from the end of the document, so r1 is the last page and r2 is the second-to-last page
  • output to set the name of the resulting PDF

So, if we want to remove the last page of main.pdf, we’ll run:

$ pdftk main.pdf cat 1-r2 output mainRemoved.pdf

This generates our desired result in the file mainRemoved.pdf.
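Because r1 is the last page, the same idea extends to dropping more than one trailing page: removing the last N pages means ending the range at r(N+1). The following is a minimal sketch of that arithmetic; pdftk_range_dropping_last is a hypothetical helper name and not part of pdftk:

```shell
# Build a pdftk page range that keeps everything except the last N pages.
# pdftk_range_dropping_last is a hypothetical helper name, not a pdftk feature.
pdftk_range_dropping_last() {
    local n="${1:-1}"
    # Accept only positive integers without leading zeros.
    case "$n" in
        ''|*[!0-9]*|0*)
            echo "expected a positive integer, got '$n'" >&2
            return 1
            ;;
    esac
    # r1 is the last page, so dropping N pages means stopping at r(N+1).
    printf '1-r%d\n' "$((n + 1))"
}
```

With this helper, the command above could be written as pdftk main.pdf cat "$(pdftk_range_dropping_last 1)" output mainRemoved.pdf, and passing 3 instead of 1 would produce the range 1-r4.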

3.3. Removing the Last Pages From Multiple PDF Files

In some cases, we may have several PDF files to process. Let’s say that they’re in the same directory. We can do this using a Bash script we’ll name batch_pdftk.sh:

#!/bin/bash
# Abort early if pdftk isn't available.
command -v pdftk >/dev/null 2>&1 || { echo >&2 "pdftk is required but not installed. Aborting."; exit 1; }

for file in *.pdf; do
    # Skip the literal pattern when no PDF matches.
    [ -e "$file" ] || continue
    echo "Removing last page from $file..."
    pdftk "$file" cat 1-r2 output "${file%.pdf}_output.pdf"
done

echo "All PDF files processed successfully."

The first command checks if pdftk is installed. If not, we exit with an error message. If yes, we iterate through all the PDFs in the current directory and remove their last pages. The [ -e "$file" ] test skips the loop body when the directory contains no PDF files. The output files will be saved with the suffix _output added to their names.

The next step is to make the script batch_pdftk.sh executable by running chmod +x batch_pdftk.sh. To use it, we run the script from the folder containing our PDF files:

$ ./batch_pdftk.sh
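The range 1-r2 needs at least two pages, so a single-page PDF is likely to make pdftk report an error for that file. As a minimal sketch, we could read the page count from the NumberOfPages line that pdftk prints with its dump_data operation and skip such files; pdf_page_count and has_removable_last_page are hypothetical helper names:

```shell
# Print the page count that pdftk reports for a PDF.
# pdf_page_count and has_removable_last_page are hypothetical helper names.
pdf_page_count() {
    # dump_data prints document metadata, including a "NumberOfPages: N" line.
    pdftk "$1" dump_data 2>/dev/null | awk '/^NumberOfPages:/ {print $2; exit}'
}

# Succeed only if the file has at least two pages, so that 1-r2 is a valid range.
has_removable_last_page() {
    local pages
    pages=$(pdf_page_count "$1")
    [ -n "$pages" ] && [ "$pages" -ge 2 ]
}
```

Inside the loop, a line such as has_removable_last_page "$file" || { echo "Skipping $file" >&2; continue; } placed before the pdftk call would leave single-page and unreadable files untouched.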

4. Using the ghostscript Command

ghostscript is a powerful interpreter for the PostScript language and PDF files that allows us to manipulate PDF files from the command line.

4.1. Installation

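We can install ghostscript with the package manager of our Linux distribution. We’ll also install poppler-utils, which provides the pdfinfo command for reading a PDF’s page count.

On Ubuntu-based distributions, we first refresh the package index with sudo apt update and then run:

$ sudo apt install ghostscript poppler-utils

On Fedora-based distributions, we use: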

$ sudo dnf install ghostscript poppler-utils
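After installation, we may want to confirm that the commands these packages provide, gs and pdfinfo, are actually on our PATH. The following is a minimal sketch built on the shell's command -v builtin; require_commands is a hypothetical helper name:

```shell
# Report every missing command and fail if any is absent.
# require_commands is a hypothetical helper name.
require_commands() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd is required but not installed." >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling require_commands gs pdfinfo || exit 1 at the top of a script would stop it before any PDF is touched and list every missing tool at once.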

4.2. Removing the Last Page From a PDF File

To remove the last page of a PDF file, we use the gs command:

gs [switches] [inputFile]

Here are the arguments for its syntax:

  • [switches] are optional command-line options that modify the behavior of ghostscript. These options are preceded by a hyphen.
  • [inputFile] is the path to the input file(s) we want to process.

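We’ll use the following switches to achieve our goal:

  • -sDEVICE=pdfwrite specifies the output device to be used. Here, we specify PDF format.
  • -dNOPAUSE prevents ghostscript from pausing between pages.
  • -dBATCH makes ghostscript exit after processing the input file instead of waiting at its interactive prompt.
  • -dSAFER executes ghostscript in a safer mode to prevent potentially unsafe PostScript operations.
  • -dFirstPage=1 and -dLastPage=N limit processing to pages 1 through N. To exclude the last page, N must be the page count minus one.
  • -sOutputFile sets the name of the resulting PDF.

Since -dLastPage needs an absolute page number, we first read the page count with pdfinfo from poppler-utils. For example, to remove the last page of main.pdf, we run:

$ pages=$(pdfinfo main.pdf | awk '/^Pages:/ {print $2}')
$ gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=$((pages - 1)) -sOutputFile=output.pdf main.pdf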

The resulting PDF, with the last page removed, is output.pdf.
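When the last page to keep is computed rather than typed by hand, it helps to reject files that pdfinfo can't read or that have only one page, since such files leave no page to keep. The following is a minimal sketch, assuming pdfinfo from poppler-utils is installed, that passes the page count minus one as -dLastPage; last_page_to_keep and remove_last_page_gs are hypothetical helper names:

```shell
# Print the number of the last page to keep, or fail if none can be kept.
# last_page_to_keep and remove_last_page_gs are hypothetical helper names.
last_page_to_keep() {
    local pages
    # pdfinfo prints a "Pages: N" line among its metadata.
    pages=$(pdfinfo "$1" 2>/dev/null | awk '/^Pages:/ {print $2; exit}')
    # Fail for unreadable files and for single-page documents.
    [ -n "$pages" ] && [ "$pages" -ge 2 ] || return 1
    echo "$((pages - 1))"
}

# Write a copy of the input PDF without its final page.
remove_last_page_gs() {
    local input="$1"
    local output="$2"
    local last
    last=$(last_page_to_keep "$input") || return 1
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER \
       -dFirstPage=1 -dLastPage="$last" \
       -sOutputFile="$output" "$input"
}
```

For instance, remove_last_page_gs main.pdf output.pdf would run gs only when main.pdf has at least two pages and return a non-zero status otherwise.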

4.3. Removing the Last Pages From Multiple PDF Files

Let’s create a script named batch_gs.sh to iterate through all PDF files in the current directory and remove their last pages:

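#!/bin/bash
# Both gs and pdfinfo (from poppler-utils) are needed.
for tool in gs pdfinfo; do
    command -v "$tool" >/dev/null 2>&1 || { echo >&2 "$tool is required but not installed. Aborting."; exit 1; }
done

for file in *.pdf; do
    # Read the page count so we can stop one page before the end.
    pages=$(pdfinfo "$file" 2>/dev/null | awk '/^Pages:/ {print $2}')
    # Skip unreadable files and PDFs with fewer than two pages.
    [ "${pages:-0}" -ge 2 ] || { echo >&2 "Skipping $file"; continue; }
    echo "Removing last page from $file..."
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=1 -dLastPage=$((pages - 1)) -sOutputFile="${file%.pdf}_output.pdf" "$file"
done

echo "All PDF files processed successfully."

The first loop checks that gs and pdfinfo are installed. For each PDF, we then read the page count with pdfinfo, skip files that have fewer than two pages, and pass the page count minus one as -dLastPage. Otherwise, the logic and usage are the same as for the script we created for pdftk.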
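Both batch scripts write their results into the directory they read from, so a second run would also match the *_output.pdf files and produce copies with two pages removed. As a minimal sketch, a small filter could tell fresh inputs from generated files; is_input_pdf is a hypothetical helper name, and the _output suffix is the one used in the scripts above:

```shell
# Succeed for input PDFs and fail for files that an earlier run produced.
# is_input_pdf is a hypothetical helper name.
is_input_pdf() {
    case "$1" in
        *_output.pdf) return 1 ;;   # generated by a previous run
        *.pdf)        [ -f "$1" ] ;; # also rejects an unmatched literal *.pdf
        *)            return 1 ;;
    esac
}
```

Adding is_input_pdf "$file" || continue as the first line inside either loop would make the scripts safe to run repeatedly in the same directory.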

5. Conclusion

In this article, we explored two methods for removing the last page from a PDF file using ghostscript and pdftk in Linux. Additionally, we learned how to use a Bash script to process multiple PDF files in a batch.