Deploying a change to gitlab.rb

From time to time we deploy changes that result in modifications to gitlab.rb. These are distilled notes from one such occurrence, which may provide context and confidence to future SREs. They are by no means comprehensive, but should help you work out which questions still need to be asked.

Lay of the land

Omnibus Installations

/etc/gitlab/gitlab.rb is the source of truth which Omnibus uses to generate the configuration of the various GitLab components. Chef is involved twice, in different contexts (see Procedure), and it is important to keep the two apart.

Kubernetes Installations

The end result is a set of YAML files used by the workloads (equivalent to what gitlab-ctl reconfigure would generate on VMs). These are managed via configuration changes in the k8s-workloads/gitlab-com repository.

Procedure

GitLab’s infrastructure-level Chef installation runs on nodes roughly every 30 minutes. This:
1. Deploys a change to gitlab.rb, typically because a cookbook changed or a secret was changed in GKMS encrypted vaults or similar.
2. Triggers a gitlab-ctl reconfigure (because gitlab.rb changed).
The reconfigure runs a local Chef, deployed by the Omnibus package, with the Omnibus recipes. This:
1. Updates the configuration files of various components (commonly YAML) in their canonical locations, based on the contents of gitlab.rb.
2. Restarts/reloads components if necessary.
For Kubernetes infrastructure, follow the guide in the repository’s README.
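
The second stage above (render config from gitlab.rb, then restart only what changed) can be modelled roughly as follows. This is a minimal sketch, not how Omnibus is implemented: the function names, the JSON output format, and the settings (db_pool, concurrency) are all hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def render(component: str, settings: dict) -> str:
    """Render one component's settings as deterministic text (stand-in for a template)."""
    return json.dumps({component: settings}, indent=2, sort_keys=True) + "\n"


def reconfigure(gitlab_rb: dict, config_dir: Path) -> list[str]:
    """Write each component's config file only if its content changed.

    Returns the components whose files changed, i.e. the ones that
    would need a restart or reload.
    """
    changed = []
    for component, settings in sorted(gitlab_rb.items()):
        target = config_dir / f"{component}.json"
        new_text = render(component, settings)
        old_text = target.read_text() if target.exists() else None
        if new_text != old_text:
            target.write_text(new_text)
            changed.append(component)
    return changed


def demo() -> tuple[list[str], list[str], list[str]]:
    """Run three reconfigures: initial, no-op, and one with a single changed setting."""
    with tempfile.TemporaryDirectory() as tmp:
        config_dir = Path(tmp)
        # Hypothetical settings standing in for the contents of gitlab.rb.
        desired = {"gitlab_rails": {"db_pool": 10}, "gitaly": {"concurrency": 4}}
        first = reconfigure(desired, config_dir)
        second = reconfigure(desired, config_dir)
        desired["gitlab_rails"]["db_pool"] = 20
        third = reconfigure(desired, config_dir)
    return first, second, third
```

In this model, the first run writes every file, a repeat run with an unchanged input touches nothing, and changing one component's settings touches only that component's file.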

Implications

For GitLab.com, we mostly deploy individual components to distinct sets of machines (e.g. gitaly, postgres), controlled by various ‘enabled’ flags in gitlab.rb. The corollary of this is that gitlab-ctl reconfigure will only touch the configuration files of components that are affected by the change to gitlab.rb. So if, for example, your change only affects the gitlab-rails component on the frontend web machines, it’s quite safe to manually shepherd the change on that class of machines only, and let it be deployed naturally on all the others. Of course, determining the class of machines affected can still be challenging; you may have to ask around, or go spelunking through omnibus-gitlab to confirm.
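
The reasoning above amounts to intersecting the components a change affects with the components enabled on each class of machine. A minimal sketch, assuming a hypothetical fleet layout (the class names and component sets below are illustrative, not the real GitLab.com inventory):

```python
# Hypothetical machine classes and the components enabled on each.
FLEET = {
    "web": {"gitlab_rails", "nginx"},
    "gitaly": {"gitaly"},
    "postgres": {"postgresql"},
}


def classes_to_shepherd(affected: set[str], fleet: dict[str, set[str]]) -> list[str]:
    """Return the machine classes that run at least one affected component."""
    return sorted(name for name, enabled in fleet.items() if enabled & affected)
```

For example, classes_to_shepherd({"gitlab_rails"}, FLEET) picks out only the "web" class here; every other class can be left to pick up the change on its next regular Chef run.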

A non-obvious detail: some components, notably gitlab-rails (running under puma) and possibly also gitaly (TBC), have a safe and clean restart operation. It is actually a HUP, and can be done without any extra effort (e.g. no need to drain from the load balancer while the restart occurs).

Known process

As almost all our Rails applications now run in Kubernetes, you will need to make a change to the k8s-workloads/gitlab-com repository, following the documentation in that repo.

Rolling out changes that require restarts

Refer to ../uncategorized/deploycmd.md

Where can I find out more?

In Slack:

#g_distribution are experts in Omnibus packaging.
#infrastructure-lounge (just barely) contains the SREs who, between them, have a fair amount of experience doing these things for real and are happy to help.

Omnibus source code: The chef content is in files/gitlab-cookbooks/
GitLab Helm Chart