1. Overview
Cropping PDF files can be useful to remove unwanted margins, adjust the layout, or make the content fit a particular format.
In this tutorial, we’ll see how to crop PDF files using the Linux command line. We’ll look at how to crop single and multi-page PDFs, both automatically and manually, and how to set different cropping margins for different pages.
2. Prerequisites
Let’s get the necessary software and three sample PDFs.
2.1. pdfcrop, pdfjam, pdfinfo, pdfunite and datamash
To crop PDF pages, we need pdfcrop and pdfjam, which on Debian-based distributions are included in the extra utilities of the TeX Live distribution.
We also need pdfinfo to get the page size of a PDF and pdfunite to merge PDF files. Both are provided by the poppler-utils package.
Finally, we need datamash to do command-line calculations. We can install all of them with apt:
$ sudo apt install texlive-extra-utils poppler-utils datamash
texlive-extra-utils is a very large package, so it may take some time to download and install.
On Fedora-based distributions, by contrast, pdfcrop and pdfjam come in two lightweight packages, alongside the poppler-utils and datamash packages. In this case, let’s use dnf for the installation:
$ sudo dnf install texlive-pdfjam texlive-pdfcrop poppler-utils datamash
The following examples are based on pdfcrop 1.40, pdfjam 3.03, pdfinfo 22.02.0, and datamash 1.7.
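If we want to confirm that everything is installed before going further, a small helper like the following could list whatever is still missing. It's only a sketch: missing_tools is a hypothetical name of our own, and it relies on nothing but the shell's built-in command -v:

```shell
# Print the name of every listed command that cannot be found in PATH
missing_tools() {
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}

# An empty output means all the tools used in this tutorial are available
missing_tools pdfcrop pdfjam pdfinfo pdfunite datamash
```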
2.2. Sample PDFs
Let’s download these sample PDFs:
$ wget -O 'single-page.pdf' 'https://getsamplefiles.com/download/pdf/sample-2.pdf'
$ wget -O 'multi-page.pdf' 'https://getsamplefiles.com/download/pdf/sample-1.pdf'
$ wget -O 'test.pdf' 'https://raw.githubusercontent.com/Baeldung/posts-resources/main/linux-articles/crop-pdf-files-in-linux-cli/test.pdf'
Let’s see what single-page.pdf and multi-page.pdf look like:
Both documents have wide margins around the text, making them easy to crop.
By contrast, test.pdf contains only one line of text on the first page and three lines of text on the second page:
We’ll see later that automatic cropping can’t distinguish decorative and superfluous images, such as those in multi-page.pdf, from the actual content of the document. That’s why test.pdf is more suitable for automatic cropping than the other two sample PDFs.
2.3. pdfSize.sh
Unfortunately, pdfinfo reports the page size in points (pts) rather than millimeters, and by default only for the first page:
$ pdfinfo multi-page.pdf | grep "Page size"
Page size: 612 x 792 pts (letter)
Also, we need to be aware that a PDF can contain pages of different sizes. To easily check the size in mm of all pages in a PDF, let’s save the following script as pdfSize.sh:
#!/bin/bash
# This script takes a PDF file as a parameter and prints the dimensions in millimeters of each page.
# Check that pdfinfo and bc are available
for cmd in pdfinfo bc; do
    if ! command -v "$cmd" > /dev/null; then
        echo "$cmd is missing (pdfinfo is in the \"poppler-utils\" package, bc in the \"bc\" package)"
        exit 1
    fi
done
# Check if the parameter is a valid PDF file
if [[ -f "$1" && "$1" == *.pdf ]]; then
    # Read the number of pages once
    pages=$(pdfinfo "$1" | awk '/^Pages:/ {print $2}')
    # Loop through each page of the PDF file
    for page in $(seq 1 "$pages"); do
        # Extract the width and height in points (1/72 inch) of the current page
        size=$(pdfinfo -f "$page" -l "$page" "$1" | grep "Page.*size")
        width=$(echo "$size" | awk '{print $4}')
        height=$(echo "$size" | awk '{print $6}')
        # Convert to millimeters (25.4 mm = 1 inch); bc's default scale of 0 truncates to whole millimeters
        width_mm=$(echo "$width * 25.4 / 72" | bc)
        height_mm=$(echo "$height * 25.4 / 72" | bc)
        # Print the dimensions of the current page
        echo "Page $page: $width_mm x $height_mm mm"
    done
else
    # Print an error message if the parameter is not a valid PDF file
    echo "Please provide a valid PDF file as a parameter."
    exit 1
fi
So, let’s check the size of our example PDFs:
$ ./pdfSize.sh single-page.pdf
Page 1: 215 x 279 mm
$ ./pdfSize.sh multi-page.pdf
Page 1: 215 x 279 mm
Page 2: 215 x 279 mm
$ ./pdfSize.sh test.pdf
Page 1: 210 x 297 mm
Page 2: 210 x 297 mm
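Since bc works with a scale of 0 here, pdfSize.sh truncates every value to whole millimeters. As an alternative sketch, awk can do the same conversion while keeping one decimal. The page_sizes_mm name is our own, and the sample lines below only imitate what pdfinfo prints when -f and -l are given, so that the snippet runs without a PDF at hand:

```shell
# Read "Page N size: W x H pts" lines on stdin and print each page size in mm
# (1 pt = 1/72 inch, 1 inch = 25.4 mm)
page_sizes_mm() {
    LC_ALL=C awk '$1 == "Page" && $3 == "size:" {
        printf "Page %d: %.1f x %.1f mm\n", $2, $4 * 25.4 / 72, $6 * 25.4 / 72
    }'
}

# Sample input in the per-page format of pdfinfo (values for letter and A4 paper)
printf '%s\n' \
    'Page    1 size: 612 x 792 pts (letter)' \
    'Page    2 size: 595.276 x 841.89 pts (A4)' |
    page_sizes_mm
```

For the sample input, this prints 215.9 x 279.4 mm for the letter page and 210.0 x 297.0 mm for the A4 page. With a real file, we could feed the function with pdfinfo -f 1 -l 2 multi-page.pdf instead of printf.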
Now we’re ready to crop them.
3. Cropping a Single-Page PDF
Let’s start with the simplest case, a single-page PDF.
3.1. Auto-Cropping
The easiest way to crop a PDF file is to use pdfcrop, which automatically detects the bounding box of the content and removes the surrounding margins:
$ pdfcrop single-page.pdf single-page-autocropped.pdf
[...]
$ ./pdfSize.sh single-page-autocropped.pdf
Page 1: 166 x 209 mm
This only works as expected if we consider the green line above the text to be part of the content:
If we aren’t satisfied with the result, we need to do manual cropping.
3.2. Manual Cropping
To choose the crop margins according to our needs, we can use pdfjam. For example, we can do a crop similar to pdfcrop, but also remove the green line at the top and leave some space between the text and the edge of the paper:
$ pdfjam --keepinfo --trim "20mm 48mm 55mm 25mm" --fitpaper true single-page.pdf --outfile single-page-manualcropped.pdf
[...]
$ ./pdfSize.sh single-page-manualcropped.pdf
Page 1: 140 x 206 mm
Let’s take a closer look at these options:
- --keepinfo → preserves the metadata of the original PDF file, such as title, author and date
- --trim "20mm 48mm 55mm 25mm" → removes the specified amount of space from the left, bottom, right and top edges of each page, respectively
- --fitpaper true → adjusts the paper size according to the trimmed space (without this option, the cropped PDF will be scaled to A4)
Let’s see the result:
This time, the result may be a little more satisfying.
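We can also check the trim arithmetic by hand: the remaining width is the page width minus the left and right trims, and the remaining height is the page height minus the bottom and top trims. Here's a minimal sketch of that calculation, where trimmed_size is a hypothetical helper of our own and 215.9 x 279.4 mm is the exact size of a letter page:

```shell
# Predict the page size (in mm) that remains after a pdfjam-style trim.
# Arguments: page width, page height, then the left, bottom, right and top trims, all in mm.
trimmed_size() {
    LC_ALL=C awk -v w="$1" -v h="$2" -v l="$3" -v b="$4" -v r="$5" -v t="$6" \
        'BEGIN { printf "%.1f x %.1f mm\n", w - l - r, h - b - t }'
}

# Letter page with the trim "20mm 48mm 55mm 25mm"
trimmed_size 215.9 279.4 20 48 55 25
```

This prints 140.9 x 206.4 mm, which agrees with the 140 x 206 mm reported by pdfSize.sh once the decimals are truncated.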
4. Cropping Multi-Page PDFs (Uniform Crop)
Now let’s see how to crop a multi-page PDF file evenly. That is, we want to apply the same crop margins to all pages of the PDF file.
4.1. Auto-Cropping
pdfcrop leaves our multi-page.pdf at its original size because it can’t tell purely decorative images from the actual content, which here is only the text. As a result, the detected bounding box covers the whole page:
$ pdfcrop multi-page.pdf multi-page-autocropped.pdf
[...]
$ ./pdfSize.sh multi-page-autocropped.pdf
Page 1: 215 x 279 mm
Page 2: 215 x 279 mm
However, even for a document containing only text, pdfcrop applies a different crop for each page based on the detected content. Therefore, a more sophisticated solution is needed, which is represented by the following script, a slightly modified version of Jethro Kuan’s pdf_crop.md. Let’s save it as pdfUniformCrop.sh:
#!/bin/bash
# Check for the presence of pdfcrop and datamash
for cmd in pdfcrop datamash; do
    if ! command -v "$cmd" > /dev/null; then
        echo "Please install \"$cmd\""
        exit 1
    fi
done
# Crop every page to the union of the per-page bounding boxes
pdfcrop --bbox "$(
    pdfcrop --verbose "$@" |
    grep '^%%HiResBoundingBox: ' |
    cut -d' ' -f2- |
    LC_ALL=C datamash -t' ' min 1 min 2 max 3 max 4
)" "$@"
The code may seem cryptic, so let’s understand it:
- The inner pdfcrop command prints the bounding box coordinates of each page to stdout
- The --verbose option causes pdfcrop to output the %%HiResBoundingBox line, which has four numbers: left, bottom, right, and top
- grep keeps only the lines that begin with %%HiResBoundingBox:
- cut removes the %%HiResBoundingBox: prefix, leaving only the four numbers separated by spaces
- LC_ALL=C is an environment variable assignment that sets the locale to C for datamash, which ensures that decimal numbers are formatted with a dot, not a comma
- datamash calculates the smallest bounding box that encloses the content of every page, that is, the minimum of the left and bottom coordinates and the maximum of the right and top coordinates
- The outer pdfcrop command takes the four numbers calculated by datamash and uses them as the value of the --bbox option, which specifies the bounding box to crop on all pages
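To see the aggregation step in isolation, here's a sketch that feeds two made-up %%HiResBoundingBox lines through the same grep and cut filters. To keep it runnable without extra packages, awk stands in for datamash, so this is an equivalent illustration rather than the script's actual code. The union_bbox name and the sample coordinates are our own assumptions:

```shell
# Compute the union of several bounding boxes: min left, min bottom, max right, max top.
# Reads pdfcrop-style verbose output on stdin; awk replaces datamash in this sketch.
union_bbox() {
    grep '^%%HiResBoundingBox: ' |
    cut -d' ' -f2- |
    LC_ALL=C awk '
        NR == 1 { l = $1 + 0; b = $2 + 0; r = $3 + 0; t = $4 + 0; next }
        {
            if ($1 + 0 < l) l = $1 + 0
            if ($2 + 0 < b) b = $2 + 0
            if ($3 + 0 > r) r = $3 + 0
            if ($4 + 0 > t) t = $4 + 0
        }
        END { if (NR > 0) printf "%.6f %.6f %.6f %.6f\n", l, b, r, t }'
}

# Hypothetical bounding boxes for a two-page document (values are invented)
printf '%s\n' \
    '%%HiResBoundingBox: 72.000000 700.500000 300.250000 720.000000' \
    '%%HiResBoundingBox: 70.500000 650.000000 310.000000 721.750000' |
    union_bbox
```

The result, 70.500000 650.000000 310.000000 721.750000, takes the left and bottom values from the second page and the right and top values from whichever page extends further, so the box is large enough for both pages.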
Let’s test pdfUniformCrop.sh with test.pdf:
$ ./pdfUniformCrop.sh test.pdf test-cropped.pdf
[...]
$ ./pdfSize.sh test-cropped.pdf
Page 1: 10 x 12 mm
Page 2: 10 x 12 mm
From the last output, we have confirmation that the two pages have been cropped identically. Let’s take a look at the result:
The cropped PDF is as expected. As an addendum, pdfUniformCrop.sh accepts other options that it passes to pdfcrop, such as --margins, but we won’t get into those.
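For longer documents, comparing the sizes by eye gets tedious. As a rough sketch, a helper like the following could check the output of pdfSize.sh for us. The same_page_sizes name is our own, and it assumes the "Page N: W x H mm" line format printed by pdfSize.sh:

```shell
# Succeed only if all "Page N: W x H mm" lines read from stdin report the same size
same_page_sizes() {
    [ "$(sed 's/^Page [0-9]*: //' | sort -u | wc -l)" -eq 1 ]
}

# Sample input that imitates the pdfSize.sh output for a uniformly cropped PDF
if printf '%s\n' 'Page 1: 10 x 12 mm' 'Page 2: 10 x 12 mm' | same_page_sizes; then
    echo "All pages have the same size"
else
    echo "Page sizes differ"
fi
```

With a real file, we could pipe ./pdfSize.sh test-cropped.pdf into same_page_sizes instead of the sample lines.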
4.2. Manual Cropping
To crop a multi-page PDF file manually, we can use pdfjam again:
$ pdfjam --keepinfo --trim "23mm 40mm 25mm 17mm" --fitpaper true multi-page.pdf --outfile multi-page-manualcropped-uniform.pdf
[...]
$ ./pdfSize.sh multi-page-manualcropped-uniform.pdf
Page 1: 167 x 222 mm
Page 2: 167 x 222 mm
Let’s see the result:
This is a case where cropping each page with its own margins would give a more precise result.
5. Cropping Multi-Page PDFs (Different Margins)
Finally, let’s explore how to crop a multi-page PDF file with different margins for different pages.
5.1. Auto-Cropping
As discussed earlier, pdfcrop doesn’t crop our multi-page.pdf. However, we can try again with our test.pdf:
$ pdfcrop test.pdf test-autocropped.pdf
[...]
$ ./pdfSize.sh test-autocropped.pdf
Page 1: 10 x 3 mm
Page 2: 10 x 12 mm
The last output shows that the pages have been cropped differently. Let’s see the result:
Again, the result is as expected.
5.2. Manual Cropping
One way to manually crop the PDF pages differently is to use pdfjam with the desired margins for each page and then merge them into a single file with pdfunite:
$ pdfjam --keepinfo --trim "23mm 40mm 25mm 17mm" --fitpaper true multi-page.pdf 1 --outfile page1.pdf
[...]
$ pdfjam --keepinfo --trim "23mm 150mm 25mm 30mm" --fitpaper true multi-page.pdf 2 --outfile page2.pdf
[...]
$ pdfunite page1.pdf page2.pdf multi-page-manualcropped-diffmargins.pdf
$ ./pdfSize.sh multi-page-manualcropped-diffmargins.pdf
Page 1: 167 x 222 mm
Page 2: 167 x 99 mm
This way each page is cropped as we want it:
Now we can delete the temporary files page1.pdf and page2.pdf if we don’t need them.
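When there are more than a couple of pages, repeating these commands by hand becomes error-prone. The following sketch wraps the same pdfjam and pdfunite calls in a loop, with one trim per page. The crop_per_page function and the DRY_RUN variable are hypothetical names of our own, and the sketch assumes we pass exactly one trim for every page we want to keep, in page order:

```shell
# Sketch: crop each page of a PDF with its own trim, then merge the pages.
# Usage: crop_per_page input.pdf output.pdf "trim for page 1" "trim for page 2" ...
# If DRY_RUN is set to a non-empty value, the commands are printed instead of executed.
crop_per_page() (
    input=$1
    output=$2
    shift 2
    count=$#
    tmpdir=$(mktemp -d) || exit 1

    # Crop page N with the N-th trim into a temporary single-page PDF
    page=1
    for trim in "$@"; do
        ${DRY_RUN:+echo} pdfjam --keepinfo --trim "$trim" --fitpaper true \
            "$input" "$page" --outfile "$tmpdir/page$page.pdf" ||
            { rm -rf "$tmpdir"; exit 1; }
        page=$((page + 1))
    done

    # Rebuild the argument list with the temporary files, in page order
    set --
    page=1
    while [ "$page" -le "$count" ]; do
        set -- "$@" "$tmpdir/page$page.pdf"
        page=$((page + 1))
    done

    # Merge the cropped pages, then remove the temporary files
    ${DRY_RUN:+echo} pdfunite "$@" "$output"
    status=$?
    rm -rf "$tmpdir"
    exit "$status"
)

# Print the commands for our two-page example without running them
DRY_RUN=1 crop_per_page multi-page.pdf multi-page-manualcropped-diffmargins.pdf \
    "23mm 40mm 25mm 17mm" "23mm 150mm 25mm 30mm"
```

The function body runs in a subshell, so its variables don't leak into the calling shell, and the temporary directory is removed whether or not the merge succeeds. Dropping DRY_RUN=1 runs the actual commands.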
6. Conclusion
In this article, we learned how to crop PDF files using the Linux command line.
We saw how to use pdfcrop and pdfjam to crop single and multi-page PDFs, both automatically and manually, and how to set different cropping margins for different pages. We also discussed how to install the packages that contain all the necessary tools.
None of this is trivial, and it all requires some attention. However, these are handy skills for manipulating PDFs with personal scripts without relying on online services or graphics programs that are difficult to automate.