1. Introduction
Git provides ways to specify whether and which local files and directories are part of a given repository. Still, Git itself tracks files, not directories. Thus, it’s non-trivial to add a directory without any files at all in it to a repository.
In this tutorial, we explore Git objects to come up with ways to add an empty or almost empty directory to a Git repository. First, we set up a sample repository. After that, we delve into the main Git objects. Next, we briefly mention the link between Git and filesystem objects. Finally, we go through different ways of creating or preserving an empty directory in Git.
We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.
2. Sample Repository
For demonstration purposes, let’s clone and list a sample Git repository:
$ git clone https://github.com/f/awesome-chatgpt-prompts && cd awesome-chatgpt-prompts
[...]
$ tree
.
├── CNAME
├── _config.yml
├── CONTRIBUTING.md
├── _layouts
│ └── default.html
├── LICENSE
├── prompts.csv
├── README.md
└── scripts
    └── find-prompt
3 directories, 8 files
Here, the tree command shows several files and two (2) subdirectories: _layouts and scripts. The reported count of three directories includes the repository root itself. Importantly, we don’t see the .git subdirectory or other hidden objects.
3. Git Objects
Git uses three main kinds of objects to construct, organize, and maintain data:
- blob: binary data storage
- tree: low-level index tree of blobs, not commit tree
- commit: a snapshot of potentially multilevel trees
These objects usually live under .git/objects/.
Although annotated tags are also objects, we won’t go into them.
Let’s briefly understand each of the others.
3.1. Git Blob
The main function of blobs is to store data and reduce redundancy while increasing flexibility.
New blobs appear in several situations:
- staging a new file for tracking
- staging changes to an already tracked file
- synchronizing changes with another repository
Blobs have a fairly basic format:
blob <SIZE_BYTES>\0<CONTENT_BINARY>
As an object, a blob usually remains until no other object references it and garbage collection removes it. For example, a rebase with a drop or squash might leave a blob unreferenced.
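To make the format concrete, the following sketch rebuilds a blob identifier by hand and compares it with the one Git computes. It assumes a SHA-1 object format and the availability of the sha1sum tool; the sample content and variable names are only illustrative:

```shell
# Illustrative content; any ASCII text works here.
content='hello'
# Size in bytes of the content plus its trailing newline.
size=$(( ${#content} + 1 ))

# Hash the header and the content exactly as the blob format describes.
manual=$(printf 'blob %d\0%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)

# Ask Git for the identifier of the same content without storing it.
git_hash=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual: $manual"
echo "git:    $git_hash"
```

If both lines print the same value, the identifier depends only on the header and the content, not on any filename.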
3.2. Git Tree
Although users rarely need to understand or directly use them, Git is based around tree structures. In this case, we talk about the low-level trees that describe data in a repository.
Such trees in Git associate blobs with file paths and permissions that belong to them. Effectively, this is what a directory does in most conventional filesystems.
Tree formats build on top of blob definitions by adding a tree header:
tree <SIZE_BYTES>\0
<FILE_1_MODE> <1_PATH>\0<1_BLOB_HASH>
<FILE_2_MODE> <2_PATH>\0<2_BLOB_HASH>
...
<FILE_N_MODE> <N_PATH>\0<N_BLOB_HASH>
Here, we can already understand why an empty, unreferenced directory won’t serve a purpose in Git.
In fact, a commit operation creates a new tree only from content that is already staged, i.e., from files added to the index. The top-level tree is called a root tree and serves as a snapshot of the tracked content:
$ git ls-tree 9f94573322353b1f1ccb298c7f8383fc64a589e8
040000 tree 112461b5254d5c2929e158e20f396e8594095ab2 .github
100644 blob 3571f7ca907e841f7aa19052d8ca842175ee8f50 CNAME
100644 blob 157836ef5c6f9f77330c4b775317b290985176d4 CONTRIBUTING.md
100644 blob 0e259d42c996742e9e3cba14c677129b2c1b6311 LICENSE
100644 blob 12ad110a86ef5d9d6195d251d1bf74a8c7d848ed README.md
100644 blob 37c2dff3edf8dda54bbc5affaaaa39e72c0e772f _config.yml
040000 tree ef29873b74edc934cf13b5cff2d498d6c1fbc5eb _layouts
100644 blob afbe51859a3a341210262125756d603f444990b2 prompts.csv
040000 tree 285f36b3cb794bedc3ee98bea91455ee7deca681 scripts
Here, the ls-tree subcommand lists the root tree, which includes two object types:
- tree, i.e., subdirectory definition
- blob, i.e., file (change) definition
From left to right, each object has several fields:
- permissions
- type specifier
- hash identifier
- filesystem name
Similar to other objects, we can use the show subcommand on tree hashes:
$ git show 285f36b3cb794bedc3ee98bea91455ee7deca681
find-prompt
In this case, we see that the find-prompt file is part of the given subdirectory tree. In this output, subdirectories have a / forward slash suffix.
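As a rough sketch of how such trees come into existence, we can stage a file in a throwaway repository and write the index out as a tree ourselves. The repository location, the scripts/run.sh path, and the variable names below are assumptions made only for this illustration:

```shell
# Scratch repository in a temporary directory (hypothetical layout).
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir scripts
echo 'echo hi' > scripts/run.sh
git add scripts/run.sh

# Write the staged content as a tree object and inspect it.
root=$(git write-tree)
git cat-file -t "$root"              # object type of the root
git ls-tree "$root"                  # single entry: the scripts subtree
git ls-tree -r --name-only "$root"   # recurse down to the file paths
```

The scripts subtree exists here only because a staged file lives under it.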
3.3. Git Commits
Effectively, a commit just points to the root tree object of a given snapshot. From there, Git orients itself around the structure to handle the data.
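As we already saw, we can pass a commit identifier (ID) to ls-tree to see the snapshot tree. That’s because Git dereferences a commit to its root tree; the commit ID and the tree ID are separate hashes. The ^{tree} suffix requests this dereferencing explicitly, and applying it to an object that is already a tree, such as the scripts subtree, simply returns the same identifier:
$ git rev-parse 285f36b3cb794bedc3ee98bea91455ee7deca681^{tree}
285f36b3cb794bedc3ee98bea91455ee7deca681
As the rev-parse subcommand shows, the identifiers are identical for a tree. For a commit ID, the same syntax prints the hash of the root tree that the commit points to.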
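To check how a commit identifier relates to its root tree, a minimal sketch in a throwaway repository can print both so they can be compared. The identity settings and file name are placeholders chosen for this example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Throwaway identity so that the commit succeeds anywhere.
git config user.name 'Example User'
git config user.email 'user@example.com'

echo 'sample' > file.txt
git add file.txt
git commit -q -m 'first commit'

commit_id=$(git rev-parse HEAD)
tree_id=$(git rev-parse 'HEAD^{tree}')

echo "commit: $commit_id"
echo "tree:   $tree_id"
# The commit object itself records the root tree on its first line.
git cat-file -p HEAD | head -n 1
```

The first line of the commit object names the root tree, which is how Git gets from a commit to the snapshot.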
4. Git Files and Directories
In general, filesystem objects are of two main types:
- file: non-directory objects with data
- directory: contains file lists as metadata
Even though both types are seen as files by an operating system (OS) like Linux, directories are actually file containers.
From the viewpoint of Git, repositories are local directories. However, the root directory of a repository isn’t of consequence.
What usually matters to Git is the file tree.
5. Attempt to Directly Add Empty Directory
Let’s attempt to add an empty subdirectory named void to the repository we have:
$ mkdir void
$ git add void
No issues so far. Also, we can see the directory in the local filesystem:
$ tree
.
├── CNAME
├── _config.yml
├── CONTRIBUTING.md
├── _layouts
│ └── default.html
├── LICENSE
├── prompts.csv
├── README.md
├── scripts
│ └── find-prompt
└── void
4 directories, 8 files
Now, we commit the changes:
$ git commit --all --message 'empty directory'
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
As expected, Git ignores void and refuses to add and track it as a new object in the repository since it’s not related to any files – the items that Git is supposed to track.
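We can reproduce this behavior in isolation. The sketch below uses a fresh scratch repository instead of the sample one, which is an assumption made only to keep the example self-contained:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir void
git add void

# Nothing is staged, and status reports no changes either.
staged=$(git ls-files)
status=$(git status --porcelain)
echo "staged: '${staged}' status: '${status}'"
```

Both values come back empty, which confirms that the add operation had nothing to record.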
There are some situations in which this might become a problem. Depending on the context and restrictions, we can work around it in different ways.
6. Force-Add Empty Directory as Tree
Although not a standard or established workflow, we can add a completely empty directory to a Git repository via the mktree and commit-tree plumbing subcommands.
Again, we use void as the name of our empty directory:
$ EMPTYDIR=void
To begin with, we use the mktree subcommand to create an empty tree object and a parent tree that contains it:
$ emptytree=$(cat /dev/null | git mktree)
$ roottree=$(printf '040000 tree %s\t%s\n' "$emptytree" "$EMPTYDIR" | git mktree)
In both cases, we just pipe the correct syntax to git mktree.
Next, we commit the tree and check the structure:
$ initcommit=$(git commit-tree $roottree -m 'tree with empty directory')
$ git branch branch1 "$initcommit"
$ git checkout branch1
Let’s confirm the existence of the empty directory tree:
$ git ls-tree HEAD
040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904 void
At this point, we have an empty directory in the Git metadata but no reflection of that in the filesystem. This is rarely useful, but plumbing subcommands like mktree and commit-tree are the only way to even have the directory object in Git.
All other methods involve at least one file.
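The hash in the ls-tree listing is the identifier of a tree with no entries, so it’s the same in every repository that uses SHA-1. As a small sketch, we can compute it without writing any object and compare it with what mktree stores; the scratch repository and variable names are assumptions for this example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Identifier of a tree with no entries, computed without writing anything.
computed=$(git hash-object -t tree /dev/null)

# The same object as created by mktree from empty input.
written=$(git mktree < /dev/null)

echo "$computed"
echo "$written"
```

With the SHA-1 object format, both lines should match the 4b825dc642cb6eb9a060e54bf8d69288fbee4904 value from the listing.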
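7. Using the Conventional .gitkeep Method
When we need to preserve a directory regardless of its contents as part of a repository structure, a common convention is to place a .gitkeep file in it. Git itself assigns no special meaning to this filename; the method works simply because the directory now contains a tracked file.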
To clarify, let’s see how we go about doing that for the void empty directory:
$ mkdir void
$ touch void/.gitkeep
There are several benefits to using .gitkeep:
- widely recognized convention
- hidden in UNIX-like environments
- empty file
- filename relates to Git
- filename declares purpose
Still, there are times when we might want to ensure no other files are present in the directory.
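For completeness, here is a sketch of the whole .gitkeep flow, including the commit. It runs in a scratch repository with a placeholder identity, both of which are assumptions for this example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name 'Example User'
git config user.email 'user@example.com'

mkdir void
touch void/.gitkeep
git add void
git commit -q -m 'keep void directory'

# The directory appears only through the file it contains.
tracked=$(git ls-tree -r --name-only HEAD)
echo "$tracked"
```

The listing contains the void/.gitkeep path, so a fresh clone recreates the void directory.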
8. Adding a .gitignore
Continuing the trend of placeholders, we can go a step further and use another Git-specific metafile:
$ mkdir void
$ touch void/.gitignore
While the ignore suffix can be confusing, there are several benefits to this approach as well:
- standard Git file
- hidden in UNIX-like environments
- small file
- filename relates to Git
- ability to control directory contents with regard to Git
In particular, we can populate the .gitignore file with specific contents, assuming the empty directory is void:
$ cat void/.gitignore
# ignore all directory files
*
# except .gitignore
!.gitignore
This way, we ensure that Git ignores any other files in the directory we want empty.
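To see the rules in action, the following sketch writes the same two patterns and then drops a stray file into the directory. The scratch repository and the scratch.tmp filename are hypothetical:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir void
printf '%s\n' '*' '!.gitignore' > void/.gitignore
# Hypothetical stray file that should never be tracked.
touch void/scratch.tmp

git add void
tracked=$(git ls-files)
echo "$tracked"

# check-ignore exits with 0 when a path matches an ignore rule.
if git check-ignore -q void/scratch.tmp; then
  ignored=yes
else
  ignored=no
fi
echo "scratch.tmp ignored: $ignored"
```

Only void/.gitignore ends up staged, while the stray file stays out of the index.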
9. Using a Placeholder
If a given directory has a special function apart from being needed by other parts of the repository, we can also add a placeholder file.
Let’s see an example:
$ mkdir void
$ touch void/placeholder
Of course, we can keep it hidden in UNIX-like environments by prefixing it with a period. Such a file would be akin to the .placeholder files in the standard cron directories.
In this case, we use the placeholder file to ensure Git tracks the directory that contains it:
$ git add void
$ git commit --all --message 'almost empty directory'
[main b506662] almost empty directory
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 void/placeholder
Even at this point, Git doesn’t separately refer to void, only to the file within.
One variation of this method is a README file with a description of the need for this directory.
This solution works when we don’t mind having a custom object within the empty directory.
10. Summary
In this tutorial, we explored the concepts of Git objects with the main example of creating and maintaining an empty directory within a repository.
In conclusion, although Git doesn’t track directories, we have ways to work around this limitation and add an empty or almost empty directory to a Git repository.