August 28, 2023

GitLab Gitaly project now supports the SHA-256 hashing algorithm

Gitaly now supports SHA-256 repositories. Here's why it matters.

We've taken a huge step toward SHA-256 support in GitLab: The Gitaly project now fully supports SHA-256 repositories. Other parts of the GitLab application still need work before SHA-256 repositories can be used, but this is an important milestone.

What is SHA-256?

SHA-256 is a hashing algorithm. Given any input data, it produces a fixed-length 256-bit hash, usually written as 64 hexadecimal digits. Git uses hashing algorithms to generate IDs for commits and other Git objects such as blobs, trees, and tags.

Git uses the SHA-1 algorithm by default. If you've ever used Git, you know that commit IDs are a string of hexadecimal digits. A git log command yields something like the following:

commit bcd64dba39c90daee2e1e8d9015809b992174e34 (HEAD -> main, origin/main, origin/HEAD)
Author: John Cai <[email protected]>
Date:   Wed Jul 26 13:41:34 2023 -0400

    Fix README.md

The bcd64dba39c90daee2e1e8d9015809b992174e34 is the ID of the commit and is a 40-character hash generated by using the SHA-1 hashing algorithm.

In SHA-256 repositories, everything looks the same except that the ID is 64 characters long instead of 40:

commit e60501431d52f6d06b4749cf205b0dd09141ea0b3155a45b9246df24eee9b97b (HEAD -> master)
Author: John Cai <[email protected]>
Date:   Fri Jul 7 12:56:52 2023 -0400

    Fix README.md
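
The difference in length follows from the digest sizes: SHA-1 produces 160 bits and SHA-256 produces 256 bits, and each hexadecimal digit encodes 4 bits. As a small sketch, assuming the GNU coreutils tools sha1sum and sha256sum are installed, the same input can be hashed with both algorithms and the digits counted:

```shell
# Hash the same input with both algorithms and count the hex digits.
# Assumes sha1sum and sha256sum (GNU coreutils) are available.
input="please hash this data"

sha1_hex=$(printf '%s' "$input" | sha1sum | cut -d' ' -f1)
sha256_hex=$(printf '%s' "$input" | sha256sum | cut -d' ' -f1)

# Each hex digit encodes 4 bits: 160 / 4 = 40 and 256 / 4 = 64.
echo "SHA-1:   ${#sha1_hex} hex digits"
echo "SHA-256: ${#sha256_hex} hex digits"
```

The two counts printed should be 40 and 64, matching the commit IDs shown above.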

Why SHA-256?

SHA-1, the algorithm Git has used until now, is no longer considered secure. In 2017, Google produced a practical SHA-1 hash collision. While the Git project is not yet affected by this kind of attack because of the way it stores objects, it is only a matter of time until new attacks on SHA-1 are found that also affect Git.

Guidance from NIST and CISA, which FedRAMP enforces, sets a 2030 deadline to stop using SHA-1 and encourages agencies to move away from it sooner if possible.

In addition, SHA-256 support was labeled experimental in the Git project for a long time, but as of Git 2.42.0 the project has removed the experimental label.

What does this mean for developers?

From a usability perspective, there is no significant difference between SHA-256 and SHA-1 repositories. For personal projects, SHA-1 is probably fine. However, companies and organizations are likely to switch to SHA-256 repositories for security reasons.

See SHA-256 in action

If you have sha256sum(1) installed, you can generate such a hash on the command line:

> printf '%s' "please hash this data" | sha256sum
62f73749b40cc70f453320e1ffc37e405ba50474b5db68ad436e64b61fbb8cf0  -

We can also see this in action in a Git repository. Let's create a repository, add an initial commit, and inspect the contents of the commit object. Note: If you try this yourself, the commit IDs will be different because the author, committer, and commit date are all part of the hash calculation.

> git init test-repo
> cd test-repo
> echo "This is a README" >README.md
> git add .
> git commit -m "README"
[main (root-commit) 328b61f] README
 1 file changed, 1 insertion(+)
 create mode 100644 README.md
> zlib-flate -uncompress < .git/objects/32/8b61f2449205870f69b5981f58bd8cdbb22f95
commit 159tree 09303be712bd8e923f9b227c8522257fa32ca7dc
author John Cai <[email protected]> 1688748132 -0400
committer John Cai <[email protected]> 1688748132 -0400

README

In the last step, we uncompress the actual commit file on disk. Git compresses object files with zlib before storing them on disk.

zlib-flate(1) is a utility packaged with qpdf that uncompresses zlib-compressed files.

Now, if we feed this data back into the SHA-1 algorithm, we get a predictable result:

> zlib-flate -uncompress < .git/objects/32/8b61f2449205870f69b5981f58bd8cdbb22f95 | sha1sum
328b61f2449205870f69b5981f58bd8cdbb22f95  -

As we can see, the result of this is the commit ID.
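
The uncompressed object begins with a header, "commit 159" followed by a NUL byte that the terminal does not display, and the ID is the hash of that header plus the body. Blobs follow the same layout with the type "blob". The sketch below rebuilds a blob ID by hand; blob_id_sha1 is a hypothetical helper name used only for illustration, and the script assumes sha1sum is installed:

```shell
# blob_id_sha1 is a hypothetical helper, not a Git command.
# It hashes "blob <size>\0<content>", the layout Git uses for blob objects.
blob_id_sha1() {
    file=$1
    # Byte count of the content; tr strips padding some wc versions add.
    size=$(wc -c < "$file" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}

# Recreate the README from the walkthrough in a scratch file.
tmp=$(mktemp)
echo "This is a README" > "$tmp"
blob_id_sha1 "$tmp"
```

In a SHA-1 repository, running git hash-object on the same file should print the same ID.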

NIST recommends replacing SHA-1 with SHA-2 or SHA-3. The Git project has undertaken this effort, and the feature is now fully usable in Git and no longer deemed experimental.

In fact, you can create and use repositories with SHA-256 as the hashing algorithm to see it in action on your local machine:

> git init --object-format=sha256 test-repo
> cd test-repo
> echo "This is a README" >README.md
> git add .
> git commit -m "README"
[main (root-commit) e605014] README
 1 file changed, 1 insertion(+)
 create mode 100644 README.md
> git log
commit e60501431d52f6d06b4749cf205b0dd09141ea0b3155a45b9246df24eee9b97b (HEAD -> master)
Author: John Cai <[email protected]>
Date:   Fri Jul 7 12:56:52 2023 -0400

    README
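
In a SHA-256 repository, the object layout stays the same and only the hash function changes. As a rough sketch, the helper below computes a blob ID for either object format; git_blob_id is a hypothetical name, not part of Git, and the script assumes sha1sum and sha256sum are installed:

```shell
# git_blob_id is a hypothetical helper, not part of Git.
# Usage: git_blob_id <sha1|sha256> <file>
git_blob_id() {
    case $1 in
        sha1)   hash_cmd=sha1sum ;;
        sha256) hash_cmd=sha256sum ;;
        *)      echo "unknown object format: $1" >&2; return 1 ;;
    esac
    size=$(wc -c < "$2" | tr -d ' ')
    # Same "blob <size>\0<content>" layout in both formats; only the hash differs.
    { printf 'blob %s\0' "$size"; cat "$2"; } | "$hash_cmd" | cut -d' ' -f1
}

readme=$(mktemp)
echo "This is a README" > "$readme"
git_blob_id sha1 "$readme"     # 40 hex digits
git_blob_id sha256 "$readme"   # 64 hex digits
```

Inside the SHA-256 test-repo created above, git hash-object README.md should match the second ID.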
