1. Introduction
Git is a version control system (VCS) that tracks, merges, and restores changes. It does this partly by resolving conflicts between iterations of the same file. For example, one common point of divergence is whitespace. Yet, sometimes, we might not place much importance on this particular difference.
In this tutorial, we explore ways to ignore whitespace when dealing with different versions of the same file in Git. First, we briefly refresh our knowledge about line endings and how Git handles them. After that, we go through one of the main subcommands for comparing file versions. Next, we discuss some non-fundamental operations and their treatment of whitespace. Then, we turn to one of the most important subcommands for synchronizing changes across different paths. Finally, we talk about the current Git status and how to filter it through our file whitespace requirements.
We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. It should work in most POSIX-compliant environments unless otherwise specified.
2. Line Endings and Git
Three main character combinations are usually considered as a possible EOL (End-Of-Line):
- CRLF: carriage return followed by line feed (\r\n)
- CR: only carriage return (\r)
- LF: only line feed (\n)
In fact, it’s fairly well-known that what constitutes a new line changes between platforms, operating systems, and languages:
- Microsoft Windows: CRLF (\r\n)
- Apple macOS: initially CR (\r), now LF (\n)
- Linux: LF (\n)
- Python: \n (not LF)
Notably, the Python \n character doesn’t necessarily map to a literal line feed. Instead, when writing in text mode, the sequence gets replaced by the correct line ending for the platform that the interpreter runs on. This isn’t the case for languages like Perl, where an explicit layer should convert the endings.
Similarly, although Git can recognize line endings, the user often has to pick how to synchronize them across files when using a non-aligning platform such as Microsoft Windows:
- checkout CRLF, commit LF: perform conversion
- checkout as-is, commit LF: potentially perform conversion
- checkout as-is, commit as-is: do not perform conversion
Assuming we choose the last option and don’t bother with the line endings, we might still want to ignore them when differences come up. For example, other contributors to the same repository could be performing a conversion between these formats. In such cases, Git would either directly overwrite or prompt the user about what to do with potential conflicts.
Since trailing whitespace is rarely a deciding factor when doing a merge or similar operations, we might want to ignore it.
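As a rough illustration, these line-ending choices correspond to the values true, input, and false of the core.autocrlf setting. The sketch below cycles through them in a throwaway repository; the temporary path and the loop are only there for demonstration and aren't part of any required setup:

```shell
# Throwaway repository so that no real configuration is touched
repo=$(mktemp -d)
git init -q "$repo"

# true:  convert to CRLF on checkout, to LF on commit
# input: check out as-is, convert to LF on commit
# false: check out as-is, commit as-is
for value in true input false; do
  git -C "$repo" config core.autocrlf "$value"
  current=$(git -C "$repo" config core.autocrlf)
  echo "core.autocrlf is now: $current"
done
```

Since the loop ends on false, the repository is left with the no-conversion behavior that the rest of this section assumes.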
3. When Using git diff or git rebase
Since diff is a fundamental part of the way Git checks for changes, knowing how to direct the subcommand is one way to generally affect comparisons in the system. In fact, the rebase subcommand, which applies certain commits over another base, accepts similar whitespace settings to the ones we talk about below.
For brevity, we only show examples with the more low-level diff subcommand. In particular, diff provides many ways to reduce the effect of whitespace on its results.
3.1. Ignore Line End Changes
To begin with, the --ignore-cr-at-eol flag skips any carriage return (CR, \r) at the end of a line:
$ git diff crlf lf
diff --git a/crlf b/lf
index af8c8f8..090219e 100644
--- a/crlf
+++ b/lf
@@ -1,3 +1,3 @@
-Line 1.
-Line 2.
+Line 1.
+Line 2.
$ git diff --ignore-cr-at-eol crlf lf
$
This way, CRLF is equivalent to LF.
To be even more flexible, we can ignore all whitespace differences at the end of a line via --ignore-space-at-eol:
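To reproduce a comparison like the one above, we can create two small files that differ only in their line endings. This is a minimal sketch: the temporary directory and the variable names are assumptions, and --no-index is used so that the files don't need to be part of a repository:

```shell
dir=$(mktemp -d)
printf 'Line 1.\r\nLine 2.\r\n' > "$dir/crlf"
printf 'Line 1.\nLine 2.\n' > "$dir/lf"

# --no-index compares two plain files; it exits with 1 when they differ,
# hence the "|| true" to keep going
plain=$(git diff --no-index "$dir/crlf" "$dir/lf" || true)
ignored=$(git diff --no-index --ignore-cr-at-eol "$dir/crlf" "$dir/lf" || true)

if [ -n "$plain" ]; then
  echo "plain diff reports changes"
fi
if [ -z "$ignored" ]; then
  echo "diff with --ignore-cr-at-eol reports nothing"
fi
```

Capturing the output in variables makes it easy to check for an empty result, which is the same idea that the status workaround later in this tutorial relies on.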
$ git diff --ignore-space-at-eol crlf lf
$
Here, diff wouldn’t consider any number of spaces before the newline characters a difference.
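As a small sketch of this behavior, the files below differ only by trailing spaces and a trailing tab. The file names and the temporary directory are made up for the example:

```shell
dir=$(mktemp -d)
printf 'Line 1.   \nLine 2.\t\n' > "$dir/trailing"
printf 'Line 1.\nLine 2.\n' > "$dir/clean"

# "|| true" because --no-index exits with 1 when the files differ
plain=$(git diff --no-index "$dir/trailing" "$dir/clean" || true)
ignored=$(git diff --no-index --ignore-space-at-eol "$dir/trailing" "$dir/clean" || true)

if [ -n "$plain" ] && [ -z "$ignored" ]; then
  echo "only end-of-line whitespace differs"
fi
```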
3.2. Ignore Spacing
Of course, we might want to ignore changes in the amount of spacing via --ignore-space-change (-b):
$ git diff --ignore-space-change crlf lf
$
Notably, this option still differentiates between the lack and presence of whitespace.
To completely ignore horizontal whitespace, we use --ignore-all-space (-w):
$ git diff --ignore-all-space crlf lf
$
In this case, there wouldn’t be a difference between a line and the same line with its whitespace removed. Yet, vertical whitespace is still significant.
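To make the distinction between -b and -w more tangible, we can compare three one-line files. This is only a sketch with invented file names; it checks whether each diff comes back empty:

```shell
dir=$(mktemp -d)
printf 'foo bar\n' > "$dir/one_space"
printf 'foo    bar\n' > "$dir/many_spaces"
printf 'foobar\n' > "$dir/no_space"

# -b: a different amount of whitespace is ignored...
b_amount=$(git diff --no-index -b "$dir/one_space" "$dir/many_spaces" || true)
# ...but whitespace versus no whitespace still counts as a change
b_presence=$(git diff --no-index -b "$dir/one_space" "$dir/no_space" || true)
# -w: even the presence of whitespace is ignored
w_presence=$(git diff --no-index -w "$dir/one_space" "$dir/no_space" || true)

[ -z "$b_amount" ] && echo "-b: 'foo bar' equals 'foo    bar'"
[ -n "$b_presence" ] && echo "-b: 'foo bar' differs from 'foobar'"
[ -z "$w_presence" ] && echo "-w: 'foo bar' equals 'foobar'"
```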
3.3. Flexible Ignoring
Finally, we can also use --ignore-matching-lines=<regex> (-I<regex>) to ignore changes whose lines all match a regular expression.
However, this option ignores complete lines, so using a regular expression that matches each line, such as one with EOL characters, would dismiss whole files.
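As a hedged illustration, suppose two files differ only in a generated timestamp comment. The file contents and the regular expression below are invented for the example, and the option requires a reasonably recent Git version:

```shell
dir=$(mktemp -d)
printf '# generated: Monday\nvalue=1\n' > "$dir/old"
printf '# generated: Tuesday\nvalue=1\n' > "$dir/new"

# "|| true" because --no-index exits with 1 when the files differ
plain=$(git diff --no-index "$dir/old" "$dir/new" || true)
filtered=$(git diff --no-index --ignore-matching-lines='^# generated:' \
  "$dir/old" "$dir/new" || true)

# Count the changed timestamp lines in each result
echo "without filter: $(printf '%s\n' "$plain" | grep -c '^[-+]# generated')"
echo "with filter:    $(printf '%s\n' "$filtered" | grep -c '^[-+]# generated')"
```

With the filter in place, the timestamp line should no longer show up as a change, while any hunk that also touches a non-matching line would still be reported.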
4. When Using git blame
The blame subcommand shows revision numbers and authors per line:
$ git blame file
c666beef (root 2024-01-10 10:00:01 -0600 1) repo
Since sometimes one author may only add or remove whitespace on a given line, adding a flag to ignore such changes may be beneficial:
$ git blame file
c666beef (user2 2024-01-10 10:00:01 -0600 1) repo
$ git blame -w file
^0667dea (user1 2024-01-10 10:01:00 -0600 1) repo
In this case, we see user2 as the last author for a given line (1). However, if we ignore the [-w]hitespace changes, user1 turns out to have performed the last substantive modification.
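A scenario like this can be recreated in a throwaway repository. In the sketch below, the author names, email addresses, and file contents are assumptions chosen to mirror the example, and --porcelain is used only to make the author easy to extract:

```shell
repo=$(mktemp -d)
git init -q "$repo"

# user1 writes the line
printf 'repo\n' > "$repo/file"
git -C "$repo" add file
git -C "$repo" -c user.name=user1 -c user.email=user1@example.com \
  commit -q -m 'Add file'

# user2 only changes its indentation
printf '  repo\n' > "$repo/file"
git -C "$repo" -c user.name=user2 -c user.email=user2@example.com \
  commit -q -am 'Indent the line'

plain=$(git -C "$repo" blame --porcelain file | awk '$1 == "author" {print $2}')
ignored=$(git -C "$repo" blame -w --porcelain file | awk '$1 == "author" {print $2}')
echo "blame: $plain, blame -w: $ignored"
```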
5. When Using git apply
The Git apply subcommand applies a patch within a tree.
When doing so, we might want to modify the behavior depending on any whitespace changes the patch introduces or expects. To do that, we use the --whitespace=<action> option with one of several actions:
- nowarn: disables warnings
- warn (default): warns of some issues, applies patch
- fix: warns of some issues, fixes issues, applies patch
- error: warns of some issues, doesn’t apply patch
- error-all: warns of all issues, doesn’t apply patch
What constitutes such errors depends on the core.whitespace configuration. There are several error conditions by default:
- trailing whitespace
- lines that only comprise whitespace
- tab after a space in the indentation
While this doesn’t directly relate to newlines, it does pertain to spacing in general.
Notably, subcommands like rebase pass the --whitespace option down to apply, when needed.
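To see one of these actions at work, we can build a patch that introduces trailing whitespace and apply it with the fix action. This is a sketch under the default core.whitespace settings; the repository, the file contents, and the patch name are made up for the example:

```shell
repo=$(mktemp -d)
git init -q "$repo"
printf 'Line 1.\n' > "$repo/file"
git -C "$repo" add file
git -C "$repo" -c user.name=user1 -c user.email=user1@example.com \
  commit -q -m 'Add file'

# Produce a patch that appends a line with trailing spaces, then undo the edit
printf 'Line 2.   \n' >> "$repo/file"
git -C "$repo" diff > "$repo/trailing.patch"
git -C "$repo" checkout -q -- file

# fix strips the trailing whitespace while applying (warnings go to stderr)
git -C "$repo" apply --whitespace=fix trailing.patch
cat "$repo/file"
```

Swapping fix for error in the same sketch should make apply refuse the patch instead of cleaning it up.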
6. When Using git merge
When using merge, Git applies a more complex algorithm than a plain comparison.
However, the merge subcommand provides the -X option for passing options to the merge strategy, which relies on diff-like comparisons to perform the conflict checks before the merge operation.
So, we can use the same options we already discussed:
- ignore-space-change
- ignore-all-space
- ignore-space-at-eol
- ignore-cr-at-eol
In particular, merge uses so-called merge strategies that dictate how the data should be synchronized. There are default ones, but we can also specify a list via the --strategy=<strategy> (-s) option.
In any case, -X passes parameters to the comparison at the base of the strategy, provided the strategy supports them:
$ git merge -Xignore-cr-at-eol b1
In this example, we pass the ignore-cr-at-eol option while merging branch b1.
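The following sketch sets up a situation where this matters: branch b1 only converts line endings, while the original branch changes actual content. The repository layout, identities, and commit messages are assumptions; without the option, a merge like this would typically end in a conflict:

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name user1
git -C "$repo" config user.email user1@example.com
git -C "$repo" config core.autocrlf false   # keep the bytes exactly as written

printf 'Line 1.\nLine 2.\n' > "$repo/file"
git -C "$repo" add file
git -C "$repo" commit -q -m 'Add file with LF endings'

# b1 only converts the line endings to CRLF
git -C "$repo" checkout -q -b b1
printf 'Line 1.\r\nLine 2.\r\n' > "$repo/file"
git -C "$repo" commit -q -am 'Convert to CRLF'

# Back on the original branch, change the content of a line
git -C "$repo" checkout -q -
printf 'Line one.\nLine 2.\n' > "$repo/file"
git -C "$repo" commit -q -am 'Reword first line'

if git -C "$repo" merge -q -Xignore-cr-at-eol -m 'Merge b1' b1; then
  merge_result=clean
else
  merge_result=conflict
fi
echo "merge: $merge_result"
```

Because b1 contributes nothing but line-ending changes, the merged file should keep the reworded first line from the original branch.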
7. When Using git status and Others
Still, there are commands in Git that don’t have native support for whitespace settings. One example is the status subcommand since it shows the working tree status along with all changes.
Despite that, we can use a workaround that pipes and processes the output of status:
$ for modfile in $(git status | awk '$0 ~ /modified/ {print $2}'); do
    # an empty diff means the file only has whitespace changes
    if test -z "$(git diff --ignore-all-space -- "$modfile")"; then
      echo "$modfile"
    fi
  done
Here, we get the current git status and pass it to an awk script. The latter checks each line ($0) against the modified keyword and prints the second column ($2) in case of a match. In short, the for loop iterates through all modified files ($modfile) as returned by git status.
After that, we test whether the output of a git diff with the --ignore-all-space flag for each modified file is an empty string. This means that the changes the file contains don’t amount to more than whitespace. If so, we print the filename. Of course, we can pick other related diff flags as well.
As a result, we get a list of files whose changes might not be of concern.
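As an alternative sketch of the same idea, we can ask diff itself for the names of modified files instead of parsing the output of status. The helper name whitespace_only_files and the demonstration repository are hypothetical, and the function only considers unstaged changes:

```shell
# Hypothetical helper: list modified files whose unstaged changes are whitespace-only
whitespace_only_files() {
  git -C "$1" diff --name-only | while IFS= read -r name; do
    if [ -z "$(git -C "$1" diff --ignore-all-space -- "$name")" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Demonstration in a throwaway repository
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name user1
git -C "$repo" config user.email user1@example.com
printf 'alpha\n' > "$repo/spacing"
printf 'beta\n' > "$repo/content"
git -C "$repo" add spacing content
git -C "$repo" commit -q -m 'Add files'

printf '  alpha\n' > "$repo/spacing"   # whitespace-only change
printf 'gamma\n' > "$repo/content"     # real change

result=$(whitespace_only_files "$repo")
echo "$result"
```

Reading names line by line keeps paths with spaces intact, which the word-splitting for loop can't guarantee.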
8. Conclusion
In this article, we went over many fundamental Git subcommands and explored how to handle file whitespace changes with each.
In conclusion, Git is a versatile platform that enables the use of different options to tweak even seemingly minor details related to file versioning.