1. Overview

In this tutorial, we’ll learn how to search through the Git commit history for occurrences of a certain text pattern. Specifically, we’ll learn how the git log command can filter commits with a text pattern in either the commit message or the commit diff.

2. Searching Commit Message or Diff Using git log

When working with Git, it’s often necessary to search for specific text within a commit message or a commit diff. This can be helpful for finding commits that change a particular line of code or for finding the commit that introduced a bug.

Let’s look at a sample Git commit object:

$ git show 20ec5cf2aaf273eaa941337af1213b8033fab2b5
commit 20ec5cf2aaf273eaa941337af1213b8033fab2b5
Author: Anatoli Babenia <[email protected]>
Date:   Thu Dec 26 13:22:48 2019 +0300

    Simplify nil check for slice

    len() for nil slices is defined as zero (gosimple)

diff --git a/runtime/ui/key/binding.go b/runtime/ui/key/binding.go
index 8697145..5d4605a 100644
--- a/runtime/ui/key/binding.go
+++ b/runtime/ui/key/binding.go
@@ -65,7 +65,7 @@ func NewBindingFromConfig(gui *gocui.Gui, influence string, configKeys []string,
                if err != nil {
                        return nil, err
                }
-               if keys != nil && len(keys) > 0 {
+               if len(keys) > 0 {
                        parsedKeys = keys
                        break
                }

The Git commit object has a header that shows the commit hash, the author, and the date. Then, it has the body of the message that the author entered when the commit was made. Finally, there’s a commit diff at the end of the commit object that shows the changes this commit introduced.

The commit messages and commit diffs can be searched using the git log command.

The git log command shows the commit log in reverse chronological order. It starts from the latest commit on the current branch and walks backward until it reaches the first commit. Additionally, the command accepts options such as --grep, -S, and -G to further filter the commits based on a text pattern.

In the subsequent sections, we’ll see how to use git log to search for text in either the commit messages or the commit diffs.

3. The --grep Option

The git log command takes the --grep option and shows only the commits whose commit message matches the pattern given as the option’s argument.

For instance, we can search for commits with the word “Simplify” in their commit messages using git log --grep Simplify:

$ git log --grep Simplify
commit 20ec5cf2aaf273eaa941337af1213b8033fab2b5
Author: Anatoli Babenia <[email protected]>
Date:   Thu Dec 26 13:22:48 2019 +0300

    Simplify nil check for slice

    len() for nil slices is defined as zero (gosimple)

commit 68acfcdd64131693c198bb6a415c301b04f7d128
Merge: 3752d64 1fa41a3
Author: Alex Goodman <[email protected]>
Date:   Sun Jul 21 15:48:25 2019 -0400

    Merge pull request #206 from muesli/linter-fixes

    Simplify code

Since the --grep option takes a regular expression as its argument, we can use regular expression syntax to do pattern matching.

For example, we can match both “Simplified” and “Simplify” using the \w* expression. We quote the pattern so that the shell passes the backslash to Git unchanged:

$ git log --grep 'Simpl\w*'
commit 20ec5cf2aaf273eaa941337af1213b8033fab2b5
Author: Anatoli Babenia <[email protected]>
Date:   Thu Dec 26 13:22:48 2019 +0300

    Simplify nil check for slice

    len() for nil slices is defined as zero (gosimple)

commit 68acfcdd64131693c198bb6a415c301b04f7d128
Merge: 3752d64 1fa41a3
Author: Alex Goodman <[email protected]>
Date:   Sun Jul 21 15:48:25 2019 -0400

    Merge pull request #206 from muesli/linter-fixes

    Simplify code

commit 4ab1ce983b5c05a2cfe2efc90201a3181f89c808
Merge: 0ca94f2 04d4881
Author: Alex Goodman <[email protected]>
Date:   Sun Jul 21 15:45:53 2019 -0400

    Merge pull request #202 from muesli/simplified-fixes

    Simplified boolean comparisons

The last commit message contains the word “Simplified” which matches the regex pattern Simpl\w*.

3.1. Case-Insensitive Matching

The git log --grep command supports case-insensitive matching through the -i option:

$ git log -i --grep simplified

The command above returns commits whose message contains the word “simplified” in any letter case, such as “Simplified” or “SIMPLIFIED”.
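To try the matching behavior from this section without touching a real project, here’s a minimal sketch that builds a throwaway repository. The commit messages, the Demo User identity, and the variable names are made up for illustration, and --format=%s is used only to print one subject line per matching commit so that the results are easy to count:

```shell
set -eu

# Scratch repository in a temporary directory (hypothetical demo data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Empty commits are enough here because --grep only reads commit messages.
git commit -q --allow-empty -m "Simplify nil check for slice"
git commit -q --allow-empty -m "Simplified boolean comparisons"
git commit -q --allow-empty -m "Update README"

# Case-sensitive match on the literal word.
exact=$(git log --format=%s --grep 'Simplify' | wc -l | tr -d ' ')
# Regex match; the quotes keep the shell from touching the backslash.
regex=$(git log --format=%s --grep 'Simpl\w*' | wc -l | tr -d ' ')
# A lowercase pattern finds nothing unless -i is given.
lower=$(git log --format=%s --grep 'simplified' | wc -l | tr -d ' ')
nocase=$(git log --format=%s -i --grep 'simplified' | wc -l | tr -d ' ')

echo "exact=$exact regex=$regex lower=$lower nocase=$nocase"
```

In this scratch repository, we’d expect one commit for the literal word, two for the regular expression, none for the lowercase pattern, and one once -i is added.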

3.2. Different Flavors of Regular Expression

In Linux, there are generally three popular flavors of regular expressions. By default, the --grep option interprets its argument as a basic regular expression (BRE).

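To use extended regular expressions (ERE), we can use the -E option along with the --grep option in the git log command. We quote the pattern so that the shell doesn’t interpret the parentheses and the pipe character:

$ git log -E --grep '(simplified|simplify)'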

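Additionally, we can use Perl-compatible regular expressions (PCRE) with the -P option, provided that Git was built with PCRE support. For example, a lookahead matches “simplified” only where it’s immediately followed by “code”:

$ git log -P --grep 'simplified(?=code)'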

Finally, to turn off the regular expression interpretation and make the search only consider fixed strings, we can use the -F option:

$ git log -F --grep 'simplif.*'

With the -F option, the command won’t evaluate the argument as a regular expression. Therefore, the above command only matches the string “simplif.*” verbatim.
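The difference between the flavors is easiest to see by running the same pattern under different options. The following is a small sketch under assumed data: the three commit messages and the Demo User identity are invented, and -P is left out because it depends on how Git was built:

```shell
set -eu

# Scratch repository with made-up commit messages.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "simplified code"
git commit -q --allow-empty -m "simplify parser"
git commit -q --allow-empty -m "document the simplif.* pattern"

pattern='(simplified|simplify)'
# BRE (default): parentheses and the pipe are ordinary characters.
bre=$(git log --format=%s --grep "$pattern" | wc -l | tr -d ' ')
# ERE: the same pattern is now an alternation.
ere=$(git log --format=%s -E --grep "$pattern" | wc -l | tr -d ' ')
# As a regex, 'simplif.*' matches every message containing "simplif".
as_regex=$(git log --format=%s --grep 'simplif.*' | wc -l | tr -d ' ')
# As a fixed string, only the literal text "simplif.*" matches.
as_fixed=$(git log --format=%s -F --grep 'simplif.*' | wc -l | tr -d ' ')

echo "bre=$bre ere=$ere as_regex=$as_regex as_fixed=$as_fixed"
```

Here, the alternation only takes effect with -E, and -F narrows the three regex matches down to the single message that contains the pattern text itself.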

3.3. Misconception About the --grep Option

A common misconception is that git log --grep searches everything we can see in the output of git log. Since the -p option makes git log also print the commit diffs, we might wonder whether combining -p with --grep lets us search both the commit messages and the commit diffs.

However, this won’t work, because the --grep option only looks at the commit message for matches and not at the commit diffs, even if they’re printed on the console.

For example, let’s consider a history with only one commit:

$ git log -p
commit 2f477ef3b6db96040e1dd908a958566b4f5f0e77 (HEAD -> master)
Author: mjchi7 <[email protected]>
Date:   Sun Apr 2 09:59:16 2023 +0800

    initialize repository

    Setting up the project

diff --git a/main.py b/main.py
new file mode 100644
index 0000000..0349a44
--- /dev/null
+++ b/main.py
@@ -0,0 +1 @@
+import json

In the command above, we pass the -p option to git log to also print the commit diffs. Let’s try using the --grep option to search for a word that only appears in the commit diff:

$ git log -p --grep json

The command doesn’t return any result because the word “json” doesn’t appear in the commit’s message, even though it appears in the printed diff.
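We can reproduce this single-commit scenario with a short script. This is only a sketch: the Demo User identity and the variable names are assumptions, and the commit message is shortened to its subject line:

```shell
set -eu

# Scratch repository with a single commit that adds main.py.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "import json" > main.py
git add main.py
git commit -q -m "initialize repository"

# "json" appears only in the diff, so --grep finds nothing, even with -p.
diff_word=$(git log -p --format=%s --grep json | wc -l | tr -d ' ')
# "repository" appears in the commit message, so --grep finds the commit.
msg_word=$(git log --format=%s --grep repository | wc -l | tr -d ' ')

echo "diff_word=$diff_word msg_word=$msg_word"
```

The first search prints nothing at all, while the second one reports the commit, which confirms that --grep reads the message and ignores the diff.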

4. The Git Pickaxe Functionality

To search for a string in the commit diffs, we’ll need to use the pickaxe functionality. The pickaxe functionality in Git looks for commits where a given string pattern is added, removed, or moved in the files. In other words, it inspects the diff of a commit and performs the matching.

The git log command supports the pickaxe functionality through the -S and -G options. Let’s look at the nuances of these two options.

4.1. Changes in the Number of Occurrences

The -S option only considers a commit as a match if the number of occurrences of the string is changed in that commit.

To better understand this option, let’s consider a simple Git repository with the following history:

$ git log -p
commit 8fe0d4e37bf00f20798e9db8cb383f76f0b72c6e (HEAD -> master)
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:08:34 2023 +0800

    remove import

diff --git a/main.py b/main.py
index d820b59..e69de29 100644
--- a/main.py
+++ b/main.py
@@ -1 +0,0 @@
-import matplotlib.pyplot as plt

commit 909b0a026e4115d6ee29a9fd9eeeee389806eccc
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:08:10 2023 +0800

    modify import line

diff --git a/main.py b/main.py
index 0e92f4a..d820b59 100644
--- a/main.py
+++ b/main.py
@@ -1 +1 @@
-import matplotlib.pyplot
+import matplotlib.pyplot as plt

commit 60d9e1be2d0607d8443f1481aee0c8ad3cb0dda8
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:07:31 2023 +0800

    add new file

diff --git a/main.py b/main.py
new file mode 100644
index 0000000..0e92f4a
--- /dev/null
+++ b/main.py
@@ -0,0 +1 @@
+import matplotlib.pyplot

The history consists of three commits. The first commit adds the new file main.py, which contains the line import matplotlib.pyplot. Then, the second commit modifies the line. Finally, the third commit removes the import.

Let’s run the git log -S command on the repository to look for commits where the number of occurrences of the string “matplotlib” changed:

$ git log -S matplotlib
commit 8fe0d4e37bf00f20798e9db8cb383f76f0b72c6e (HEAD -> master)
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:08:34 2023 +0800

    remove import

commit 60d9e1be2d0607d8443f1481aee0c8ad3cb0dda8
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:07:31 2023 +0800

    add new file

From the output, we see that the command only returns the first and the last commits. This is because, in the first commit, the number of occurrences of the string changes from zero to one. Similarly, in the last commit, the number of occurrences changes from one to zero.

On the other hand, the second commit isn’t a match because the change in the second commit doesn’t alter the number of occurrences of the word.
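To check this behavior ourselves, we can rebuild the three-commit history in a scratch repository. The sketch below reuses the file content and commit messages from the example, while the Demo User identity and the variable name are assumptions; --format=%s prints only the subject of each matching commit:

```shell
set -eu

# Rebuild the three-commit history in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "import matplotlib.pyplot" > main.py
git add main.py
git commit -q -m "add new file"            # occurrences: 0 -> 1

echo "import matplotlib.pyplot as plt" > main.py
git commit -q -am "modify import line"     # occurrences: 1 -> 1

: > main.py                                # truncate the file
git commit -q -am "remove import"          # occurrences: 1 -> 0

# Only commits that change the occurrence count are listed, newest first.
matches=$(git log --format=%s -S matplotlib)
echo "$matches"
```

The output lists “remove import” and “add new file”, and it skips “modify import line” because the count stays at one in that commit.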

Notably, the -S option treats the argument as a fixed string. To make the option interpret the argument as a regular expression, we can additionally pass the --pickaxe-regex option:

$ git log --pickaxe-regex -S "(matplotlib|mpl)"

The command above returns the commits in which the number of matches of the regular expression changes, that is, the commits that change how many times “matplotlib” or “mpl” occurs.
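As a quick illustration of the difference, the following sketch runs the same argument with and without --pickaxe-regex. The file content, commit messages, Demo User identity, and variable names are all hypothetical:

```shell
set -eu

# Scratch repository with made-up content.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "import matplotlib" > main.py
git add main.py
git commit -q -m "add import"              # regex matches: 0 -> 1

echo "import matplotlib as mpl" > main.py
git commit -q -am "alias import as mpl"    # regex matches: 1 -> 2

echo "import matplotlib as mpl  # charts" > main.py
git commit -q -am "add comment"            # regex matches: 2 -> 2

arg='(matplotlib|mpl)'
# Fixed string: the literal text "(matplotlib|mpl)" never occurs.
literal=$(git log --format=%s -S "$arg" | wc -l | tr -d ' ')
# Regex: commits that change the number of matches are reported.
regex=$(git log --format=%s --pickaxe-regex -S "$arg" | wc -l | tr -d ' ')

echo "literal=$literal regex=$regex"
```

Without --pickaxe-regex, nothing is found, whereas the regular expression reports the first two commits and skips the last one, where the number of matches stays the same.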

If we want to find all the commits where the word appears in the changeset, even when the number of occurrences stays the same, we can use the -G option.

4.2. Appearance of the String in Commit Diff

The -G option is another flavor of the pickaxe functionality. In contrast to the -S option, the -G option returns a commit as long as the pattern we’re searching for appears in an added or removed line of the commit’s changeset.

For the same repository as in the previous subsection, running git log -G gives us the following matches:

$ git log -G matplotlib
commit 8fe0d4e37bf00f20798e9db8cb383f76f0b72c6e (HEAD -> master)
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:08:34 2023 +0800

    remove import

commit 909b0a026e4115d6ee29a9fd9eeeee389806eccc
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:08:10 2023 +0800

    modify import line

commit 60d9e1be2d0607d8443f1481aee0c8ad3cb0dda8
Author: mjchi7 <[email protected]>
Date:   Sat Apr 1 11:07:31 2023 +0800

    add new file

The git log command considers all three commits a match because each of them has the string “matplotlib” in an added or removed line of its commit diff.

Additionally, the -G option readily interprets the argument as a regular expression without any additional flags.
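To compare the two pickaxe options side by side, we can rebuild the same three-commit history and count the matches for each. As before, this is a sketch: the Demo User identity, the variable names, and the extra pattern import .*plt are assumptions made for the demonstration:

```shell
set -eu

# Rebuild the three-commit history in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "import matplotlib.pyplot" > main.py
git add main.py
git commit -q -m "add new file"

echo "import matplotlib.pyplot as plt" > main.py
git commit -q -am "modify import line"

: > main.py
git commit -q -am "remove import"

# -S: only commits that change the number of occurrences.
s_count=$(git log --format=%s -S matplotlib | wc -l | tr -d ' ')
# -G: every commit with a matching added or removed line.
g_count=$(git log --format=%s -G matplotlib | wc -l | tr -d ' ')
# -G takes a regex directly; only lines containing "plt" match here.
g_regex=$(git log --format=%s -G 'import .*plt' | wc -l | tr -d ' ')

echo "s_count=$s_count g_count=$g_count g_regex=$g_regex"
```

In this history, -S reports two commits and -G reports all three, while the narrower regular expression only matches the two commits that add or remove the line ending in “as plt”.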

5. Conclusion

In this article, we’ve looked at the different ways we can search for a string in the Git history.

Specifically, we’ve learned that the git log --grep command searches for a string among the commit messages. Then, we looked at the pickaxe functionality in the git log command to perform searches in the commit changeset. Furthermore, we’ve also learned about the differences between the two pickaxe options, -S and -G.