Streamlining Git Sparse Checkout with git-sparse

Sparse What?

Sparse checkout is probably not a widely used git feature. In a nutshell, it lets you check out only certain subdirectories of a git repository. I don't imagine many people need it; the only reason I use it is work.

At work, we use Subversion as the VCS (imagine my horror) [1]. The SVN repository follows the standard recommended layout (trunk, branches, and tags). There are more than 30 codebases under trunk, and most of the time I only work with a handful of them. Because it has been a long time since I last used Subversion, I decided to use git svn rather than vanilla Subversion.

Normal Sparse Checkout Workflow

Sparse checkout is not enabled by default, so you need to enable it first with the following command:

>$ git config core.sparsecheckout true (1)

1	Only enable sparse checkout in the current git repository

Once it is enabled, create a file called sparse-checkout in $GITDIR/info. For the most part, $GITDIR is .git, but that is not always the case, especially if you use the git worktree feature. The sparse-checkout file lists the subdirectories to check out in the working tree, one pattern per line. Then use the following command to update the working tree, which removes the subdirectories you do not want.

>$ git read-tree -mu HEAD

To summarize the steps:

1. Enable sparse checkout
2. Create a file, sparse-checkout, inside $GITDIR/info
3. Edit sparse-checkout and list the subdirectories you want
4. Run read-tree to update the working tree
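Put together, the manual steps can be sketched as a single shell function. This is only an illustration: the function name manual_sparse_checkout is made up, it overwrites any existing sparse-checkout file, and it assumes a git new enough to support `git rev-parse --git-path` (which also resolves the right location inside a linked worktree).

```shell
# Sketch of the manual workflow; manual_sparse_checkout is a hypothetical name.
# Usage: manual_sparse_checkout PATTERN...
manual_sparse_checkout() {
    # Step 1: enable sparse checkout for the current repository only.
    git config core.sparseCheckout true || return 1

    # Step 2: ask git where the sparse-checkout file lives instead of
    # assuming .git/info, so this also works in a linked worktree.
    sparse_file=$(git rev-parse --git-path info/sparse-checkout) || return 1
    mkdir -p "$(dirname "$sparse_file")" || return 1

    # Step 3: write one pattern per line (this replaces the old contents).
    printf '%s\n' "$@" > "$sparse_file" || return 1

    # Step 4: update the working tree to match the patterns.
    git read-tree -mu HEAD
}
```

For example, `manual_sparse_checkout project-a/ project-b/` (placeholder directory names) would leave only those two directories in the working tree.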

As you can see, the workflow is quite clunky, especially when I want to check out a new subdirectory or remove one from my current working tree: I have to navigate to $GITDIR/info/sparse-checkout and edit its contents. That is inconvenient because I use git worktree a lot, and $GITDIR is then somewhere else altogether.

Streamlining the Workflow

As outlined above, getting sparse checkout working is clunky but very mechanical. As most developers do when facing mundane tasks, I decided to automate the steps by creating a custom git command.

Creating a custom git command is very simple: place an executable named git-<command> in a directory on your $PATH, and git will run it when you type git <command>.
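To make the mechanism concrete, here is a throwaway example. The command name git-hello and the helper install_git_hello are both invented for illustration; neither has anything to do with git-sparse.

```shell
# Hypothetical helper: writes a tiny custom git command into the given directory.
# Usage: install_git_hello DIR   (DIR should be on your $PATH)
install_git_hello() {
    bin_dir=$1
    mkdir -p "$bin_dir" || return 1

    # The file name must be git-<command>; here <command> is "hello".
    cat > "$bin_dir/git-hello" <<'EOF'
#!/bin/sh
# git runs this script for `git hello`; any extra arguments arrive in "$@".
echo "hello from a custom git command: $*"
EOF

    # git only picks up the file if it is executable.
    chmod +x "$bin_dir/git-hello"
}
```

After `install_git_hello "$HOME/.local/bin"` (assuming that directory is on your $PATH), `git hello world` runs the script with world as its argument.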

The custom command I created is called git-sparse. It performs steps 1-3 outlined in the previous section.

Using git-sparse is simple. Navigate to the git repository, then run git sparse in the terminal. It first turns on the sparse checkout feature, then asks git for the correct location of $GITDIR, so git-sparse works with git worktree. Finally, it opens an editor so you can immediately start editing the sparse-checkout file. The one step git-sparse does not do for you is running the read-tree command. You can see git-sparse in action in the following animated GIF.
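The real git-sparse is written in Haskell, so the following is not its implementation, just a rough shell sketch of the same three steps. The function name git_sparse and the choice of `$EDITOR` (falling back to vi) are my assumptions; the actual tool may pick its editor differently.

```shell
# Shell sketch of what git-sparse does (not the actual Haskell implementation).
git_sparse() {
    # 1. Turn on sparse checkout for this repository.
    git config core.sparseCheckout true || return 1

    # 2. Let git work out where the sparse-checkout file belongs,
    #    which keeps this correct inside a linked worktree.
    sparse_file=$(git rev-parse --git-path info/sparse-checkout) || return 1
    mkdir -p "$(dirname "$sparse_file")" || return 1

    # 3. Open the file in an editor (assumption: $EDITOR, else vi).
    #    Left unquoted on purpose so an editor with flags still works.
    ${EDITOR:-vi} "$sparse_file"
}
```

As in the real tool, running `git read-tree -mu HEAD` afterwards is still up to you.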

Subcommand

git-sparse has one subcommand, add, which is a convenient way to add subdirectories or files to the sparse-checkout file.

>$ git sparse add $subdir1 $subdir2 $file1
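Conceptually, add only has to append lines to the sparse-checkout file. A minimal shell sketch of that idea follows; the function name sparse_add is hypothetical, and skipping entries that are already listed is my assumption rather than documented behaviour of git-sparse.

```shell
# Hypothetical sketch of an "add" operation.
# Usage: sparse_add SPARSE_FILE ENTRY...
sparse_add() {
    sparse_file=$1
    shift
    # Make sure the file exists so grep below does not complain.
    touch "$sparse_file" || return 1
    for entry in "$@"; do
        # -x whole-line match, -F literal string, -q no output.
        # Append only when the exact line is not already present.
        grep -qxF -e "$entry" "$sparse_file" ||
            printf '%s\n' "$entry" >> "$sparse_file"
    done
}
```

Called as `sparse_add "$(git rev-parse --git-path info/sparse-checkout)" dir1/ dir2/ file.txt` (placeholder names), it would add the three entries once each, in the order given.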

Implementation

I do not want to dive into the implementation details in this post, because I plan to write a few technical posts about how git-sparse is implemented. In short, git-sparse is written in Haskell. It could easily be implemented as a shell script or in any other scripting language, but writing it in Haskell gives me a chance to build more software in a functional programming language and to get some Haskell practice in. I thought about using Clojure, but the startup time would be too slow.

Building and installing

Haskell stack is required to build git-sparse. Once stack is installed, clone the git-sparse repository. At the root of the repository, run stack install. Stack will build and install an executable called git-sparse in $HOME/.local/bin. Finally, make sure to add $HOME/.local/bin to your $PATH environment variable.
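For the last step, something along these lines in your shell profile should do. The helper name ensure_on_path is made up for this sketch; a plain export line works just as well if you do not mind duplicate entries.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
ensure_on_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already on PATH, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;  # otherwise put it at the front
    esac
}
```

With that defined, `ensure_on_path "$HOME/.local/bin"` makes the git-sparse executable installed by stack install visible to git.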

1. SVN is not the right tool when coupled with the company's development culture