1. Overview

Metadata stored in a file provides information about the file, and the available fields depend on the file type. Typical examples are the file size, the creator or author of the file, and the data quality.

Sometimes, we may want to conceal the metadata in a file from others for security reasons. For example, we may want to remove the location where a photo was taken from an image file.

In this tutorial, we’ll discuss how to delete the metadata of a file. We’ll examine the exiftool, exiv2, ffmpeg, mat2, and cpdf utilities for deleting the metadata in different types of files.

2. Using exiftool

We can use the exiftool utility to display and update the metadata in a file.

We’ll use a PDF file, example.pdf, as an example. Let’s first list the metadata of this file using exiftool:

$ exiftool example.pdf
…
Title                           : Introduction to Linux
Creator                         : .
Subject                         : Linux,Bash
…
$ exiftool example.pdf | wc -l
32

We listed only three of the fields. The wc command shows that there are 32 fields in total.

Instead of displaying all fields, we can display a specific field of metadata, for example, the Title, by explicitly specifying it in the command line:

$ exiftool -title example.pdf
Title                           : Introduction to Linux

We can clear a specific field by overwriting it with empty data:

$ exiftool -title= example.pdf
    1 image files updated

We seem to have updated the file successfully according to the message in the output. Let’s verify it using exiftool:

$ exiftool -title example.pdf

As expected, the Title field doesn’t exist this time.

It’s also possible to clear all metadata information:

$ exiftool -all= example.pdf
Warning: [minor] ExifTool PDF edits are reversible. Deleted tags may be recovered! - example.pdf
    1 image files updated

The PDF file was updated according to the message in the output. The -all= option clears all metadata. Let’s check the metadata of example.pdf once more:

$ exiftool example.pdf
ExifTool Version Number         : 12.60
...
Warning                         : [Minor] Ignored duplicate Info dictionary
$ exiftool example.pdf | wc -l
17

The number of fields is now 17. The Title, Creator, and Subject fields don’t exist.
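Counting lines tells us how many fields disappeared, but not which ones. As a minimal sketch, assuming we saved the output of exiftool example.pdf to two text files before and after clearing the metadata (before.txt and after.txt are hypothetical names), a small shell function can list the removed tag names. The removed_tags function isn’t part of exiftool; it only post-processes the saved listings with awk:

```shell
# removed_tags BEFORE AFTER
# Print the tag names that appear in the BEFORE listing but not in AFTER.
# Both arguments are text files holding saved `exiftool FILE` output,
# where every line has the form "Tag Name<padding>: value".
removed_tags() {
    # AFTER is read first to remember its tag names; then every tag name
    # of BEFORE that was not seen is printed in its original order.
    awk -F' *: ' '
        FILENAME == ARGV[1] { kept[$1] = 1; next }
        !($1 in kept)       { print $1 }
    ' "$2" "$1"
}

# Possible usage (not run here, since it needs exiftool and example.pdf):
#   exiftool example.pdf > before.txt
#   exiftool -all= example.pdf
#   exiftool example.pdf > after.txt
#   removed_tags before.txt after.txt
```

The field separator matches the padding, the colon, and the following space, so colons inside values, such as those in dates, don’t affect the extracted tag name.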

2.1. The Necessity for Using qpdf

There was a warning in the output of exiftool -all= example.pdf reporting that the edits are reversible, and the deleted tags may be recovered. Recovery of the deleted metadata is possible using exiftool again:

$ exiftool -pdf-update:all= example.pdf
    1 image files updated
$ exiftool example.pdf | grep Title
Title                           : Introduction to Linux

The -pdf-update:all= option of exiftool reverts the deletion of tags. As is apparent from the output of exiftool example.pdf | grep Title, the previously deleted Title field now exists. We filtered the output of exiftool using the grep command.

If the recovery of the deleted fields is a problem in our usage scenario, we can use the qpdf command after deleting the tags using exiftool:

$ exiftool -all= example.pdf
Warning: [minor] ExifTool PDF edits are reversible. Deleted tags may be recovered! - example.pdf
    1 image files updated
$ qpdf --linearize example.pdf example_linearized.pdf

The qpdf utility converts a PDF file to another PDF file. It can perform several transformations like linearization, encryption, and decryption.

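Linearization is a method of optimizing PDF files so that they can be efficiently streamed on the Web. The --linearize option of qpdf is for this purpose.

The name of the linearized PDF file in our example is example_linearized.pdf. exiftool edits a PDF file by appending an incremental update, so the old metadata remains in the file. qpdf, on the other hand, writes a completely new file, so we can’t retrieve the previously deleted tags from the linearized file.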

Let’s try to get back the deleted tags in example_linearized.pdf using the -pdf-update:all= option of exiftool:

$ exiftool -pdf-update:all= example_linearized.pdf 
Error: File contains no previous ExifTool update - example_linearized.pdf
    0 image files updated
    1 files weren't updated due to errors
$ exiftool example_linearized.pdf | grep Title

Using exiftool together with the -pdf-update:all= option for example_linearized.pdf gives an error now since deleted tags can’t be recovered for a linearized PDF file. The output of exiftool example_linearized.pdf | grep Title shows that the Title field doesn’t exist.

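We can also use exiftool to remove the metadata of other file types, such as image and video files. For those files, the deletion performed by exiftool is irreversible in the edited file itself. However, by default, exiftool keeps the unmodified file as a backup with the _original suffix, for example, example.jpg_original, so we must delete this backup, or pass the -overwrite_original option, if we don’t want to keep the old metadata around.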
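The exiftool and qpdf steps of this section can be combined in a script. The following is a sketch under a few assumptions: scrub_pdf, run, the DRY_RUN variable, and the scratch file name are illustrative names of our own, and we assume that exiftool returns a zero exit status when it prints only the minor warning. We also pass -overwrite_original, a standard exiftool option that suppresses the _original backup copy:

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
# (Illustrative helper, not part of exiftool or qpdf.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# scrub_pdf IN OUT: leave IN untouched and write a rewritten PDF
# without the old metadata to OUT.
scrub_pdf() {
    work="$2.work.pdf"    # hypothetical scratch file name
    run cp "$1" "$work" &&
    run exiftool -all= -overwrite_original "$work" &&
    run qpdf --linearize "$work" "$2" &&
    run rm -f "$work"
}
```

Setting DRY_RUN=1 before calling scrub_pdf example.pdf clean.pdf prints the four commands without running them, which is handy for checking the command lines first.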

3. Using exiv2

Another option for removing metadata is the exiv2 command. However, exiv2 works only with image files, of which it supports several types. We’ll use a JPEG file, example.jpg, as an example.

The metadata in an image file depends on the application or device that created the file. Here are some examples:

  • The camera or phone model
  • When the photo was taken
  • Technical settings such as focal length

Let’s first list the metadata information in example.jpg using exiftool:

$ exiftool example.jpg
…
Shutter Speed Value             : 1/33
Aperture Value                  : 2.2
Brightness Value                : 1.61
…
$ exiftool example.jpg | wc -l
84

There are 84 fields in the image file. Now, let’s remove the metadata information in example.jpg using exiv2:

$ exiv2 rm example.jpg
$ exiftool example.jpg
ExifTool Version Number         : 12.16
...
Megapixels                      : 6.0
$ exiftool example.jpg | wc -l
19

The rm action in exiv2 rm example.jpg deletes the metadata in example.jpg. The output of exiftool example.jpg now has 19 fields. For example, the Shutter Speed Value, Aperture Value, and Brightness Value fields are missing from the last output.

Therefore, exiv2 is another option for deleting the metadata in image files.

4. Using ffmpeg

ffmpeg is a versatile utility for handling multimedia files. We can also use it for removing the metadata information in multimedia files.

We’ll use a video file in MP4 format, example.mp4, as an example:

$ ffmpeg -i example.mp4 -c:a copy -c:v copy -map_metadata -1 example_modified.mp4 >& /dev/null

The -i option of ffmpeg specifies the input file, which is example.mp4 in our case.

The -c option selects the codec for a specific stream. The -c:a copy part of the command copies the audio stream as-is to the output file without re-encoding.

Similarly, the -c:v copy part copies the video stream as-is to the output file without re-encoding.

The -map_metadata option controls which input’s metadata is copied to the output file. Passing the special value -1 disables this copy, so the output file is written without the metadata of the input.

Finally, the last part of the command, example_modified.mp4, is the name of the output file.

We redirected both the standard output and the standard error of the command to /dev/null using >& to discard the details of the copy operation.

Let’s compare one of the fields of the metadata information in example.mp4 and example_modified.mp4 using exiftool:

$ exiftool example.mp4 | grep Android
Android Version                 : 6.0.1
$ exiftool example_modified.mp4 | grep Android

A metadata field named Android Version exists in the input file, example.mp4. However, this field doesn’t exist in the output file, example_modified.mp4. Therefore, we’ve successfully deleted the metadata in a video file using the -map_metadata option of ffmpeg.
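If we need to process several videos, we can wrap the command in a small function. This is a sketch with hypothetical helper names (modified_name and strip_video_metadata). It assumes that the file name has an extension, and it uses the -loglevel error option of ffmpeg instead of redirecting to /dev/null, so that error messages remain visible:

```shell
# modified_name FILE: derive the output name used in this section,
# e.g. example.mp4 -> example_modified.mp4.
# Assumes FILE has an extension.
modified_name() {
    printf '%s_modified.%s\n' "${1%.*}" "${1##*.}"
}

# strip_video_metadata FILE: copy the audio and video streams unchanged
# and write the result without the input's metadata.
strip_video_metadata() {
    ffmpeg -loglevel error -i "$1" \
        -c:a copy -c:v copy -map_metadata -1 \
        "$(modified_name "$1")"
}

# Possible usage (not run here, since it needs ffmpeg and the video files):
#   for video in *.mp4; do strip_video_metadata "$video"; done
```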

5. Using mat2

Another alternative we can use is mat2, a Python-based tool for deleting metadata for different file types. It works by generating a clean version of the input file without the metadata.

We can use the apt-get package manager to install mat2 on Debian-based systems:

$ sudo apt-get install mat2

Next, let’s display the metadata information for the example file, file.pdf:

$ exiftool file.pdf
...
Create Date                     : 2022:07:22 13:02:01+03:00
Modify Date                     : 2022:07:22 13:02:01+03:00
Document ID                     : uuid:C25DADCE-3776-4DDB-988D-543451DE7AFE
...
$ exiftool file.pdf | wc -l
25

So, the number of fields in the metadata information is 25. Now, let’s utilize the mat2 tool:

$ mat2 file.pdf

Above, mat2 processes file.pdf and creates a cleaned version named file.cleaned.pdf:

$ exiftool file.cleaned.pdf | wc -l
14

In this cleaned version, there are only 14 fields. For instance, the Create Date, Modify Date, and Document ID fields are absent from the metadata of file.cleaned.pdf.
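In a script, we may prefer an automatic check to reading the listing by eye. As a minimal sketch, the hypothetical assert_absent function below reads an exiftool listing from the standard input and fails if any of the given tag names is still present. It assumes that the tag names consist of letters, digits, and spaces, since they’re used inside a grep pattern:

```shell
# assert_absent TAG...
# Read an exiftool listing on stdin; return 1 if any TAG is still listed.
assert_absent() {
    listing=$(cat)
    status=0
    for tag in "$@"; do
        # exiftool prints "Tag Name<padding>: value", so we match the
        # name at the start of the line, followed by spaces and a colon.
        if printf '%s\n' "$listing" | grep -q "^$tag *:"; then
            printf 'still present: %s\n' "$tag" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible usage (not run here, since it needs exiftool and the cleaned file):
#   exiftool file.cleaned.pdf | assert_absent 'Create Date' 'Modify Date' 'Document ID'
```

Because the pattern requires a colon right after the name and its padding, checking for Creator doesn’t accidentally match a longer tag name such as Creator Tool.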

6. Using cpdf

With the help of Coherent PDF Command-Line Tools (cpdf), we can manipulate PDF files, removing their metadata.

Before proceeding, we need to ensure that cpdf is installed on our system. Then, we can display the metadata of the example file, project.pdf:

$ exiftool project.pdf
...
Creator Tool                    : Microsoft® Word 2019
Document ID                     : uuid:273BDBB0-E069-40AA-B5D6-38CECDCEB2A5
Instance ID                     : uuid:273BDBB0-E069-40AA-B5D6-38CECDCEB2A5
...
$ exiftool project.pdf | wc -l
25

At this point, let’s remove the metadata:

$ cpdf -remove-metadata project.pdf -o project_cleaned_version.pdf

Let’s discuss the command above:

  • -remove-metadata – this option removes metadata from project.pdf
  • project.pdf – represents the input file
  • -o project_cleaned_version.pdf – specifies the output PDF file (after metadata removal)

Now, let’s verify metadata has been removed:

$ exiftool project_cleaned_version.pdf | wc -l
21

Above, we see there are now only 21 metadata fields. In this case, fields like Creator Tool and Document ID are missing from the metadata information.

7. Conclusion

In this article, we discussed how to delete the metadata information from a file.

First, we discussed the exiftool utility. We saw that we could clear all metadata in a PDF file using exiftool. However, the deletion of metadata can be reverted. So, we used the qpdf utility together with its --linearize option to prevent the recovery of deleted metadata.

Next, we discussed the exiv2 utility. It’s useful for deleting metadata in image files.

Then, we saw that we could use the ffmpeg utility for deleting metadata in video files. After that, we explored mat2, a Python-based tool useful for deleting metadata in different file types.

Finally, we showed how to use cpdf to delete metadata in PDF files.