GitBot – automating boring Git operations with CI

Git is super useful for anyone doing a bit of development work or just trying to keep track of a bunch of text files. However, as your project grows you might find yourself doing lots of boring repetitive work just around Git itself. At least that’s what happened to me and so I automated some boring Git stuff using our continuous integration (CI) system.

There are probably all sorts of use cases for automating various Git operations but I’ll talk about a few that I’ve encountered. We’re using GitLab and GitLab CI so that’s what my examples will include, but most of the concepts should apply to other systems as well.

Automatic rebase

We have some Git repos with source code that we receive from vendors, who we can think of as our upstream. We don’t actually share a Git repo with the vendor but rather we get a tarball every now and then. The tarball is extracted into a Git repository, on the master branch, which thus tracks the software as it is received from upstream. In a perfect world the software we receive would be feature complete and bug free and so we would be done, but that’s usually not the case. We do find bugs and if they are blocking we might decide to implement a patch to fix them ourselves. The same is true for new features where we might not want to wait for the vendor to implement them.

The result is that we have some local patches to apply. We commit such patches to a separate branch, commonly named ts (for TeraStream), to keep them separate from the official software. Whenever a new software version is released, we extract its content to master and then rebase our ts branch onto master so we get all the new official features together with our patches. Once we’ve implemented something we usually send it upstream to the vendor for inclusion. Sometimes they include our patches verbatim so that the next version of the code will include our exact patch, in which case a rebase will simply skip our patch. Other times there are slight or major (it might be a completely different design) changes to the patch and then someone typically needs to sort out the patches manually. Mostly though, rebasing works just fine and we don’t end up with conflicts.
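
To make the master/ts arrangement concrete, here is a minimal sketch in a throwaway repository. The file names, commit messages and identity are made up for illustration, and the vendor release is simulated with plain commits rather than an extracted tarball. The second vendor release contains one of the two local patches verbatim, so after the rebase only the other patch should remain on top of master.

```shell
#!/bin/sh
# Sketch: vendor releases on master, local patches on ts, rebase after a new release.
# All file names, messages and identities below are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "gitbot@example.invalid"
git config user.name "GitBot sketch"

# Vendor release 1 lands on master.
printf 'feature A\n' > vendor.txt
git add vendor.txt
git commit -q -m "vendor release 1"

# Two local patches live on ts.
git checkout -q -b ts
printf 'bugfix\n' > fix.txt
git add fix.txt
git commit -q -m "local patch: bugfix"
printf 'extra\n' > extra.txt
git add extra.txt
git commit -q -m "local patch: extra feature"

# Vendor release 2 includes our bugfix verbatim plus a new feature.
git checkout -q master
printf 'feature A\nfeature B\n' > vendor.txt
printf 'bugfix\n' > fix.txt
git add vendor.txt fix.txt
git commit -q -m "vendor release 2"

# Rebase our patches onto the new release.
git checkout -q ts
git rebase -q master

# Count the local patches that are still needed on top of master.
patches_left=$(git rev-list --count master..ts)
top_subject=$(git log -1 --format=%s ts)
echo "patches still on top of master: $patches_left ($top_subject)"
```

The bugfix commit becomes empty once master already contains the same change, and git drops it during the rebase, which is the "simply skip our patch" case described above.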

Now, this whole rebasing process gets a tad boring and repetitive after a while, especially considering we have a dozen repositories with the setup described above. What I recently did was to automate this using our CI system.

The workflow thus looks like:

- a human extracts the received archive, then runs git add + git commit on master + git push
- CI runs for the master branch
  - clones a copy of the repository into a new working directory
  - checks out the ts branch (the one with our patches) in the working directory
  - rebases ts onto master
  - pushes ts back to origin
- this push triggers a CI build for the ts branch
- when CI runs for the ts branch, it compiles, tests and saves the binary output as “build artifacts”, which can be included in other repositories
- GitLab CI, which is what we use, has a CI_PIPELINE_ID that we use to version built container images or artifacts

To do this, all you need is a few lines in a .gitlab-ci.yml file, essentially:

stages:
  - build
  - git-robot

... build jobs ...

git-rebase-ts:
  stage: git-robot
  only:
    - master
  allow_failure: true
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - ssh-add <(echo "$GIT_SSH_PRIV_KEY")
    - git config --global user.email "[email protected]"
    - git config --global user.name "Mr. Robot"
    - mkdir -p ~/.ssh
    - cat gitlab-known-hosts >> ~/.ssh/known_hosts
  script:
    - git clone [email protected]:${CI_PROJECT_PATH}.git
    - cd ${CI_PROJECT_NAME}
    - git checkout ts
    - git rebase master
    - git push --force origin ts
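
The commands in the script section can be tried out without GitLab. The sketch below runs the same clone, checkout, rebase and push sequence against a local bare repository that stands in for the remote. The paths, file names and the seed repository are assumptions made for the example, and no SSH setup is involved.

```shell
#!/bin/sh
# Sketch: run the robot's clone / checkout / rebase / push steps against a
# local bare repository standing in for the GitLab remote (hypothetical paths).
set -e

work=$(mktemp -d)

# Build a small repository with a master branch and a ts branch.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.email "human@example.invalid"
git config user.name "A Human"
printf 'v1\n' > vendor.txt
git add vendor.txt
git commit -q -m "vendor release 1"
git checkout -q -b ts
printf 'patch\n' > patch.txt
git add patch.txt
git commit -q -m "local patch"
git checkout -q master
printf 'v2\n' > vendor.txt
git commit -q -am "vendor release 2"

# Publish both branches to the stand-in remote.
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git push -q "$work/origin.git" master ts

# The robot's steps, mirroring the script section of the CI job.
cd "$work"
git clone -q "$work/origin.git" robot
cd robot
git config user.email "robot@example.invalid"
git config user.name "Mr. Robot"
git checkout -q ts
git rebase -q master
git push -q --force origin ts

# On the remote, ts should now sit directly on top of master.
origin_master=$(git --git-dir="$work/origin.git" rev-parse master)
origin_base=$(git --git-dir="$work/origin.git" merge-base master ts)
echo "origin master:        $origin_master"
echo "merge-base master/ts: $origin_base"
```

The forced push is needed because a rebase rewrites the ts commits, so the new ts is not a fast-forward of the old one.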

We’ll go through the YAML file a few lines at a time. Some basic knowledge about GitLab CI is assumed.

This first part lists the stages of our pipeline.

stages:
  - build
  - git-robot

We have two stages, first the build stage, which does whatever you want it to do (ours compiles stuff, runs a few unit tests and packages it all up), then the git-robot stage which is where we perform the rebase.

Then there’s:

git-rebase-ts:
  stage: git-robot
  only:
    - master
  allow_failure: true

We define the stage in which the job runs, followed by the only statement, which limits the CI job to run only on the specified branch(es), in this case master.

allow_failure simply lets the CI job fail while the pipeline as a whole still passes.
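
One situation where the job can be expected to fail is a rebase that runs into a conflict, since git rebase then exits with a non-zero status. The sketch below reproduces that in a throwaway repository with made-up file contents; connecting it to allow_failure is my reading of the setup, not something the job definition states.

```shell
#!/bin/sh
# Sketch: a rebase that conflicts exits non-zero; aborting restores ts.
# Repository contents and identity are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.email "gitbot@example.invalid"
git config user.name "GitBot sketch"

printf 'original\n' > file.txt
git add file.txt
git commit -q -m "vendor release 1"

# Our local patch changes the line one way...
git checkout -q -b ts
printf 'our change\n' > file.txt
git commit -q -am "local patch"
ts_before=$(git rev-parse ts)

# ...and the next vendor release changes the same line another way.
git checkout -q master
printf 'vendor change\n' > file.txt
git commit -q -am "vendor release 2"

git checkout -q ts
if git rebase master >/dev/null 2>&1; then
    rebase_status=ok
else
    rebase_status=conflict
    git rebase --abort   # put ts back where it was before the attempt
fi
echo "rebase: $rebase_status"
```

In CI the failed command fails the job, and a human then has to sort out the patches manually.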

Since we are going to clone a copy of ourselves (the repository checked out in CI) we need SSH and SSH keys set up. We’ll use ssh-agent with a key that has no passphrase to authenticate. Generate a key using ssh-keygen, for example:

kll@machine ~ $ ssh-keygen -f foo
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in foo.
Your public key has been saved in foo.pub.
The key fingerprint is:
SHA256:6s15MZJ1/kUsDU/PF2WwRGA963m6ZSwHvEJJdsRzmaA kll@machine
The key's randomart image is:
+
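
The same kind of key can also be generated without the interactive prompts. This is a minimal sketch: the directory, the file name gitbot and the comment are placeholders, and the ed25519 key type is my choice rather than what the session above used. The private half is presumably what ends up in the GIT_SSH_PRIV_KEY variable that before_script hands to ssh-add.

```shell
#!/bin/sh
# Sketch: create a key pair with an empty passphrase and no prompts.
# Directory, file name and comment are hypothetical.
set -e

keydir=$(mktemp -d)

# -N "" sets an empty passphrase, -f names the output file, -q keeps it quiet.
ssh-keygen -q -t ed25519 -N "" -C "gitbot" -f "$keydir/gitbot"

# Lists the private key (gitbot) and the public key (gitbot.pub).
ls "$keydir"
```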
