1. Introduction
The ubiquitous Git versioning system can track, merge, and restore changes to files and directories as part of its features. After making any modifications, we commit them to the current tree. Each commit has a unique identifier (ID), so we can restore and merge the changes from different commits. However, this ID can be hard to remember, so there are other methods to indicate a given commit.
In this tutorial, we talk about Git commit identification and tagging. First, we briefly refresh our knowledge about commits. After that, we explore ways to see details about a commit. Next, we look at commit identification. Then, we move on to references. Finally, we explain and show practical examples of commit tagging.
We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.
2. Git Commits
To recap, a commit is a snapshot of the repository contents recorded at a given time.
Essentially, commits form the repository history as a trail or chain of modifications to the structure.
To understand this concept, let’s use git with the log subcommand:
$ git log
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master)
Author: x <[email protected]>
Date:   Tue Feb 13 11:27:34 2024 -0500

    major modifications

commit dbe16c5120caa5a93d5fcbbdeadbeeff7e2b3b59
Author: x <[email protected]>
Date:   Tue Feb 13 11:00:00 2024 -0500

    minor modifications

commit 2dac0d5faf151885a0ed1432196674cacbe88168
Author: x <[email protected]>
Date:   Sat Feb 10 10:00:00 2024 -0500

    init commit
Here, we see three (3) commits: an initial one and two modifications. Each commit has an identifier in the form of a unique 40-character hexadecimal code. In addition, we can see the Author and Date, as well as a commit message.
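To reproduce a similar history without touching a real project, we can use a minimal sketch like the one below. It assumes a throwaway directory from mktemp, a hypothetical user identity, and empty commits via --allow-empty in place of real file changes, so the IDs differ from the ones above:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits without any file changes, just to build a history
for msg in "init commit" "minor modifications" "major modifications"; do
  git commit -q --allow-empty -m "$msg"
done

# One line per commit: full ID, then the subject of the commit message
git log --format='%H %s'

commit_count=$(git rev-list --count HEAD)
newest_id=$(git log -1 --format=%H)
echo "commits: $commit_count, ID length: ${#newest_id}"
```

Here, the %H and %s placeholders of --format select the full ID and the message subject, respectively.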
3. Git Commit Details
Of course, we can check additional data about a commit via the show subcommand:
$ git show e76fd96c911d0ab2d36156066604b691efa86aa1
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master)
Author: x <[email protected]>
Date:   Tue Feb 13 11:27:34 2024 -0500

    major modifications

diff --git a/file b/file
index 8bcebf6..6436b28 100644
--- a/file
+++ b/file
@@ -1,3 +1,7 @@
-initial content
+Major Changes
-minor changes
+-deleted all other content
+-introduced new structure
+-added signature
+
+sig
When checking details, we can see the specific modifications that a given commit introduced in the standard diff --git format.
Notably, we used the commit identifier to select the commit to view.
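When the full diff is too much, show can also limit its output. The sketch below is only an illustration in a throwaway repository with a hypothetical identity and a file named file. It prints the metadata alone, then a per-file summary:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'initial content\n' > file
git add file
git commit -q -m "init commit"
printf 'Major Changes\n' > file
git commit -q -a -m "major modifications"

# Metadata only: --no-patch hides the diff, --format selects the fields
git show --no-patch --format='%h | %an | %s' HEAD

# Summary of the files the commit touched instead of the full diff
git show --stat --format='%s' HEAD

subject=$(git show --no-patch --format=%s HEAD)
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "subject: $subject, changed: $changed"
```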
4. Git Commit Identifier
As we already saw, the Git commit identifier is a SHA-1 string comprising 160 bits or 40 hexadecimal characters:
$ git show e76fd96c911d0ab2d36156066604b691efa86aa1
However, git accepts a shorter unique prefix of the whole ID:
$ git show e76fd96c
According to official documentation, eight (8) or more characters should be sufficient for unique commit identification within a project.
However, the prefix must be at least four (4) characters long, so a shorter one fails even if it’s unique:
$ git show e76
fatal: ambiguous argument 'e76': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Still, huge projects like the Linux kernel can require more than 12 characters to uniquely identify a commit.
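Instead of picking a prefix length by hand, we can let Git compute one. The following sketch, run in a throwaway repository with a hypothetical identity, asks rev-parse for an abbreviation of at least eight characters and then expands it back to the full ID:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"

full_id=$(git rev-parse HEAD)

# At least eight characters; Git lengthens the prefix if that's needed to keep it unique
short_id=$(git rev-parse --short=8 HEAD)

# Expand the prefix again; --verify fails if it's unknown or ambiguous
resolved=$(git rev-parse --verify "$short_id")
echo "$short_id -> $resolved"
```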
However, we might also want to add more data to a commit, either to identify or describe it better.
5. Git Commit Refs (References)
Git commits can also have so-called refs (references). A ref (reference) is a descriptive and usually more human-friendly name for a certain commit.
The link between a given commit and its ref or refs is established in different ways.
5.1. Explore Commit Refs
Actually, we already saw one ref in the output of git log:
$ git log
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master)
Author: x <[email protected]>
Date:   Tue Feb 13 11:27:34 2024 -0500

    major modifications

[...]
As we can see, the latest commit of our repository has an ID, but also two alternative names:
- HEAD: special ref that points to the currently checked-out commit, here the last commit of the branch
- master: default branch name which can also be main or another custom string
Effectively, we can use either of the above to refer to the last commit:
$ git show HEAD
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master)
[...]
$ git show master
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master)
[...]
This way, we can quickly walk through the main points of a repository.
Further, the rev-parse subcommand returns the ID of a commit from its ref:
$ git rev-parse master
e76fd96c911d0ab2d36156066604b691efa86aa1
Naturally, HEAD moves with each commit, while master moves as long as we work on that branch.
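To watch both names move together, we can try a small experiment. This sketch assumes a throwaway repository and a hypothetical identity, and it reads the branch name from Git instead of assuming master:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"

# Name of the checked-out branch: master, main, or a custom default
branch=$(git symbolic-ref --short HEAD)

head_id=$(git rev-parse HEAD)
branch_id=$(git rev-parse "$branch")
echo "before: HEAD=$head_id $branch=$branch_id"

# A new commit moves the branch, and HEAD follows because it's attached to it
git commit -q --allow-empty -m "minor modifications"
new_head_id=$(git rev-parse HEAD)
new_branch_id=$(git rev-parse "$branch")
echo "after:  HEAD=$new_head_id $branch=$new_branch_id"
```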
5.2. Special Refs
There are several special built-in refs:
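- HEAD: currently checked-out commit, usually via the current branch
- ORIG_HEAD: previous position of HEAD, saved by commands that move it drastically, such as reset, merge, and rebase
- FETCH_HEAD: tip of the branch most recently fetched from a remote repository
- MERGE_HEAD: commit or commits being merged into the current branch
- CHERRY_PICK_HEAD: commit being cherry-picked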
Notably, cherry-picking is the act of selecting and transferring a single commit instead of working with branches and requests.
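As an illustration of one of these special refs, the sketch below uses ORIG_HEAD to undo a reset. It assumes a throwaway repository with a hypothetical identity and two empty commits:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"

before=$(git rev-parse HEAD)

# Step back one commit; reset stores the old position in ORIG_HEAD
git reset -q --hard HEAD~1
orig=$(git rev-parse ORIG_HEAD)

# Undo the reset by jumping back to the saved position
git reset -q --hard ORIG_HEAD
after=$(git rev-parse HEAD)
echo "before=$before ORIG_HEAD=$orig after=$after"
```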
In addition, we can use relative refs via a suffix:
- ~ with a number n (or n repetitions of ~) refers to the n-th generation ancestor, following only first parents
- ^ with a number n refers to the n-th parent of a commit, while repeating ^ (as in ^^) steps up through first parents, just like ~
In essence, ~ only follows first parents, while ^ provides a way to pick a specific parent of a merge commit, i.e., one of the commits at the same horizontal level:
C1  C2  C3
 \  |  /
  \ | /
   \|/
    C4  C5
     \  /
      \/
      C6
Now, let’s see some examples:
+--------+--------+--------+------+-------+
| Commit | ^      | ^ alt  | ~    | ~ alt |
+--------+--------+--------+------+-------+
| C6     | C6^0   |        |      |       |
| C4     | C6^1   | C6^    | C6~  | C6~1  |
| C5     | C6^2   |        |      |       |
| C1     | C6^1^1 | C6^^   | C6~~ | C6~2  |
| C2     | C4^2   | C6^^2  |      |       |
| C3     | C4^3   | C6^^3  |      |       |
+--------+--------+--------+------+-------+
Notably, we can use an ID, partial ID, or ref in all cases.
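To check the table, we can rebuild the same graph. This is only a sketch: it assumes a throwaway repository with a hypothetical identity, and it uses the low-level mktree and commit-tree commands to set the parents explicitly instead of performing real merges. The subject helper function is also just an illustration:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Empty tree shared by all commits; only the parent links matter here
tree=$(git mktree </dev/null)

# Commits without parents
c1=$(git commit-tree "$tree" -m C1)
c2=$(git commit-tree "$tree" -m C2)
c3=$(git commit-tree "$tree" -m C3)
c5=$(git commit-tree "$tree" -m C5)

# C4 merges C1, C2, and C3 (in that parent order); C6 merges C4 and C5
c4=$(git commit-tree "$tree" -p "$c1" -p "$c2" -p "$c3" -m C4)
c6=$(git commit-tree "$tree" -p "$c4" -p "$c5" -m C6)

# Print the message subject of whatever commit a revision expression selects
subject() { git log -1 --format=%s "$1"; }

first_parent=$(subject "${c6}^")       # C4
second_parent=$(subject "${c6}^2")     # C5
grandparent=$(subject "${c6}~2")       # C1
third_of_c4=$(subject "${c6}^^3")      # C3
echo "$first_parent $second_parent $grandparent $third_of_c4"
```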
5.3. Create Commit Ref
To create a commit ref, we can just create a new branch:
$ git branch branch1
At this point, branch1 is also just a name for the last commit:
$ git show branch1
commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master, branch1)
[...]
In fact, any Git branch is just a pointer to a commit. Contrary to what the term might seem to indicate, a branch doesn’t refer to more than one commit, just the act of branching out.
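A short experiment can illustrate the pointer behavior. The sketch below assumes a throwaway repository with a hypothetical identity. It creates branch1 and then commits on the original branch, so we can compare where each name points:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"

# New branch: just another name for the current commit
git branch branch1
branch1_before=$(git rev-parse branch1)
head_before=$(git rev-parse HEAD)

# We're still on the original branch, so only that pointer advances
git commit -q --allow-empty -m "more work"
branch1_after=$(git rev-parse branch1)
head_after=$(git rev-parse HEAD)

echo "branch1: $branch1_before -> $branch1_after"
echo "HEAD:    $head_before -> $head_after"
```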
5.4. Reflog
Notably, based on the refs, Git maintains a reflog, i.e., a backup-like record of where HEAD and the branch tips have pointed over time:
$ git reflog
e76fd96 (HEAD -> master, branch1) HEAD@{0}: commit: major modifications
dbe16c5 HEAD@{1}: commit: minor modifications
2dac0d5 HEAD@{2}: commit (initial): init commit
The reflog command displays a brief overview of the entire log of such modifications. In fact, even commits that no branch or tag can reach anymore, e.g., after a reset, remain accessible through this log until its entries expire.
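As an example of the reflog acting as a safety net, the following sketch drops a commit from a branch and finds it again. It assumes a throwaway repository with a hypothetical identity and the default setting that records ref updates in non-bare repositories:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"

dropped=$(git rev-parse HEAD)

# After this reset, no branch reaches the second commit anymore
git reset -q --hard HEAD~1

# The reflog still lists every position HEAD has had
git reflog

# HEAD@{1} is where HEAD pointed one move ago, i.e., the dropped commit
recovered=$(git rev-parse 'HEAD@{1}')
echo "dropped=$dropped recovered=$recovered"
```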
5.5. Refs Storage
The refs tree is located in the .git/refs repository subdirectory:
$ tree .git/refs/
.git/refs/
├── heads
│ ├── branch1
│ └── master
└── tags
3 directories, 2 files
Here, we use the tree command to check the current repository refs:
- heads/branch1
- heads/master
We can see the same with more detail via the show-ref subcommand:
$ git show-ref
e76fd96c911d0ab2d36156066604b691efa86aa1 refs/heads/branch1
e76fd96c911d0ab2d36156066604b691efa86aa1 refs/heads/master
Both refs we see are a product of branches. Yet, we also see another directory with the name tags, which is currently empty. Let’s explore that further.
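To see that a ref is just a small file with a commit ID, we can read it directly. This sketch assumes a throwaway repository with a hypothetical identity, the default file-based ref storage, and refs that haven't been packed into .git/packed-refs:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git branch branch1

# All refs with their commit IDs, similar to show-ref
git for-each-ref --format='%(objectname) %(refname)'

# Name of the checked-out branch: master, main, or a custom default
branch=$(git symbolic-ref --short HEAD)

# The ref file holds the same ID that Git resolves for the branch name
from_file=$(cat ".git/refs/heads/$branch")
from_git=$(git rev-parse "$branch")
echo "file: $from_file"
echo "git:  $from_git"
```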
6. Tagging Git Commits
In addition to branches, Git commits can have tags. A tag is a special name for a commit that can then be used as a reference to it. Its name can comprise any number of characters, as long as it’s a valid ref name, i.e., it follows rules similar to those for path names and contains no spaces or special characters like ~, ^, and :.
6.1. Tag Types and Data
Tags can be of two varieties:
- lightweight
- annotated
Lightweight tags are only pointers to specific commits.
On the other hand, annotated tags are entire objects with data and metadata:
- tag
- checksum
- tagger name
- tagger email
- date
- tag message
This also enables signing and verifying tags.
Importantly, the date of an annotated tag is the one in GIT_COMMITTER_DATE:
$ export GIT_COMMITTER_DATE="$(git log -1 --format=%aI <COMMIT>)"
In this case, we set the variable via command substitution. Within, we use the log subcommand with -1 to limit the output to a single commit, COMMIT itself, and --format=%aI to print its author date in strict ISO 8601 format.
If GIT_COMMITTER_DATE is unset, git uses the current date and time. This can be misleading when we tag older commits, since the tag date then reflects the moment of tagging instead of the commit date.
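The sketch below shows the effect of the variable on an annotated tag. It assumes a throwaway repository with a hypothetical identity, a made-up commit date, and a hypothetical tag name v0.1. Unlike the export above, it sets the variable for a single command only, so later commands aren’t affected:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Older commit with a fixed, made-up date, followed by a newer one
GIT_AUTHOR_DATE='2024-02-10T10:00:00-05:00' \
GIT_COMMITTER_DATE='2024-02-10T10:00:00-05:00' \
  git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "later work"

old_commit=$(git rev-parse HEAD~1)
commit_date=$(git log -1 --format=%aI "$old_commit")

# Annotated tag on the older commit, dated like the commit itself
GIT_COMMITTER_DATE="$commit_date" \
  git tag --annotate v0.1 --message="first import" "$old_commit"

# Read the tagger date back in the same strict ISO 8601 format
tag_date=$(git for-each-ref --format='%(taggerdate:iso-strict)' refs/tags/v0.1)
echo "commit: $commit_date, tag: $tag_date"
```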
6.2. Create and Delete Tag
Tag creation happens via the tag subcommand.
Lightweight tags are fairly simple to create:
$ git tag <LWTAG>
At this point, the current HEAD commit can also be referred to by LWTAG. Of course, even if HEAD moves, LWTAG remains the same.
To create annotated tags of the current HEAD, we use --annotate:
$ git tag --annotate <TAG> --message=<TAG_MESSAGE>
Notably, an annotated tag requires a message: if we omit --message, git opens an editor to ask for one.
Of course, we can also place tags on earlier commits:
$ git tag <TAG> <COMMIT>
Here, we place a lightweight TAG on COMMIT. As expected, COMMIT can be a ref, an ID, or even another tag. Alternatively, we can check out a commit and then tag it. In both cases, when creating annotated tags, it’s a good practice to set and export the GIT_COMMITTER_DATE variable with the date of the commit via $(git log -1 --format=%aI <COMMIT>).
Of course, we can delete a tag via the --delete (-d) flag:
$ git tag --delete <TAG>
Notably, the tag is only removed locally.
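To tie these commands together, we can run them in sequence. The sketch below assumes a throwaway repository with a hypothetical identity and the tag names minor and antag. It uses cat-file to reveal the difference between the two tag types:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"

# Lightweight tag on an earlier commit, annotated tag on HEAD
git tag minor HEAD~1
git tag --annotate antag --message="annotated tag"

# A lightweight tag resolves straight to a commit, an annotated one to a tag object
minor_type=$(git cat-file -t minor)    # commit
antag_type=$(git cat-file -t antag)    # tag

# ^{commit} follows the tag object down to the commit it marks
antag_target=$(git rev-parse 'antag^{commit}')
echo "minor: $minor_type, antag: $antag_type -> $antag_target"

# Local deletion leaves only the annotated tag
git tag --delete minor
remaining=$(git tag --list)
echo "remaining: $remaining"
```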
6.3. Tag Checkout
Interestingly, we can check out tags. However, unlike branches, this leads to a detached state:
$ git checkout <TAG>
In particular, the state is called detached HEAD and comes with a few warnings:
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
Thus, caution is advised when checking out tags: commits made in this state belong to no branch and can get lost unless we create one.
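In scripts, it can help to detect the detached state before committing anything. The following sketch assumes a throwaway repository with a hypothetical identity and a lightweight tag named minor. It relies on symbolic-ref failing when HEAD isn’t attached to a branch:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"
git tag minor HEAD~1

# Remember the branch so we can return to it
branch=$(git symbolic-ref --short HEAD)

# Checking out a tag detaches HEAD; -q hides the advice text
git checkout -q minor
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "on tag: HEAD is $state"

# Checking out the branch attaches HEAD again
git checkout -q "$branch"
if git symbolic-ref -q HEAD >/dev/null; then state_after=attached; else state_after=detached; fi
echo "on $branch: HEAD is $state_after"
```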
6.4. Show Tags
To view tags, we can use several commands.
Perhaps the best way to check tags is the Git log subcommand:
$ git log --all --decorate --oneline --graph
* e76fd96 (HEAD -> master, tag: antag) major modifications
* dbe16c5 (tag: minor) minor modifications
* 2dac0d5 init commit
Another simple way to see all tags is the --list (-l) flag of the tag subcommand:
$ git tag --list
antag
minor
Let’s use tree with the refs subdirectory to check the tags under .git:
$ tree .git/refs
.git/refs
├── heads
│ ├── branch1
│ └── master
└── tags
├── antag
└── minor
3 directories, 4 files
Now, we can check further details for each one with the show subcommand:
$ git show antag
tag antag
Tagger: x <[email protected]>
Date:   Thu Feb 15 10:15:00 2024 -0500

annotated tag

commit e76fd96c911d0ab2d36156066604b691efa86aa1 (HEAD -> master, tag: antag, branch1)
[...]
$ git show minor
commit dbe16c5120caa5a93d5fcbbdeadbeeff7e2b3b59 (tag: minor)
[...]
Thus, we can see the additional information for the annotated tag antag when compared with the lightweight minor tag.
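The tag subcommand can also filter its listing. As a sketch, again in a throwaway repository with a hypothetical identity and the same two tag names, we can list tags with their first message line, by pattern, and by the commit they point at:

```shell
# Throwaway repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"
git tag minor HEAD~1
git tag --annotate antag --message="annotated tag"

# Each tag with the first line of its annotation
# (for a lightweight tag, the commit subject is shown instead)
git tag --list -n1

# Only tags that match a shell-style pattern
matching=$(git tag --list 'an*')

# Only tags that point at a given commit
on_head=$(git tag --points-at HEAD)

# Tag names together with the type of object they refer to
git for-each-ref --format='%(refname:short) %(objecttype)' refs/tags

echo "matching: $matching, on HEAD: $on_head"
```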
6.5. Transfer Tags
Importantly, push doesn’t transfer tags to remote servers by default.
However, we can explicitly request this feature via the --tags flag:
$ git push origin --tags
Similarly, we can delete a remote tag after removing it locally:
$ git push origin --delete <TAG>
This way, we ensure the same tags are available both locally and on the remote server.
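To try the whole round trip without a real server, we can use a local bare repository as the remote. This sketch assumes two throwaway directories, a hypothetical identity, and the tag names minor and antag. It checks the remote side with ls-remote:

```shell
# Bare repository that plays the role of the remote server
remote=$(mktemp -d)
git init -q --bare "$remote"

# Throwaway working repository with a hypothetical identity
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "init commit"
git commit -q --allow-empty -m "major modifications"
git tag minor HEAD~1
git tag --annotate antag --message="annotated tag"
git remote add origin "$remote"

# Transfer all local tags, then list the tags the remote knows
git push -q origin --tags
remote_tags=$(git ls-remote --tags origin)
echo "$remote_tags"

# Remove one tag locally and on the remote
git tag --delete minor
git push -q origin --delete minor
remote_tags_after=$(git ls-remote --tags origin)
echo "$remote_tags_after"
```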
7. Summary
In this article, we talked about Git commits, as well as ways to identify, reference, and tag them.
In conclusion, since commits are one of the central Git objects, knowing how to manage and organize them can be crucial to productivity and stability.