
Running Post-deployment Migrations Manually through a Change Request

This runbook describes how to manually execute post-deployment migrations (PDMs) for GitLab.com that have failed repeatedly in the PDM pipeline. Repeated failures are usually caused by an inability to acquire the lock that the PDM requires while user traffic is high.

Such PDMs should be executed manually through a Change Request during the lowest-traffic window, usually early on Monday morning in APAC.

The PDM is executed in a Rails console inside the Toolbox pod, in the main stage of the gprd environment. Before executing the PDM, protect the Toolbox pod from voluntary eviction by adding a Pod Disruption Budget (PDB) Kubernetes resource. This resource is temporary and must be removed once the PDM execution is complete.

To follow this runbook, you need kubectl access to the main stage of the gprd environment.

Follow the guide to get Kubernetes API access and authenticate with the GKE cluster using kubectl.

You will also need access to run kubectl apply in gprd.

2. Connect to the Regional cluster in gprd


Connect to the regional cluster in the gprd environment: gke_gitlab-production_us-east1_gprd-gitlab-gke
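
Confirming the active context before running anything in production can help avoid applying the PDB to the wrong cluster. The following is a minimal sketch, assuming your kubeconfig already contains a context named after the cluster as shown above; the variable and function names are hypothetical and not part of the official tooling.

```shell
# Hypothetical helper: switch to the gprd regional cluster context and verify it.
# Assumes the kubeconfig context name matches the cluster name used in this runbook.
GPRD_CONTEXT="gke_gitlab-production_us-east1_gprd-gitlab-gke"

use_gprd_context() {
  # Switch context; stop if kubectl reports an error (for example, unknown context).
  kubectl config use-context "$GPRD_CONTEXT" >/dev/null || return 1

  # Read the context back and compare it with the expected one.
  active_context="$(kubectl config current-context)"
  if [ "$active_context" != "$GPRD_CONTEXT" ]; then
    echo "unexpected context: $active_context" >&2
    return 1
  fi
  echo "connected to $active_context"
}

# Usage: use_gprd_context
```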

3. Prepare Pod Disruption Budget resource manifest


Add a PDB targeting the gitlab-migrations-toolbox Deployment:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gitlab-migrations-toolbox-pdb
  namespace: gitlab
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: toolbox

Save this manifest as /tmp/pdb.yaml.

This manifest will work as-is. You can confirm that gitlab-migrations-toolbox is the only Deployment matched by the app=toolbox selector by running this command locally:

$ kubectl get deploy --selector='app=toolbox' --no-headers --namespace gitlab
gitlab-migrations-toolbox 1/1 1 1 77d
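
A PDB selects pods by label rather than Deployments, so it can also be useful to list the pods that carry the label. The following is a minimal sketch, assuming the Toolbox pods are labelled app=toolbox as the manifest's selector implies; the function name is hypothetical.

```shell
# Hypothetical helper: list the pods that the PDB selector (app=toolbox) matches
# in the gitlab namespace. Prints one "pod/<name>" entry per line.
list_pdb_target_pods() {
  kubectl get pods --selector 'app=toolbox' --namespace gitlab --output name
}

# Usage: list_pdb_target_pods
```

If pods other than the Toolbox pod appear in the output, the PDB would protect them as well, which is worth resolving before applying the manifest.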

Validate the above manifest with a server-side dry run, then apply it using kubectl apply:

# PDB manifest saved to /tmp/pdb.yaml
$ kubectl apply --dry-run='server' -f /tmp/pdb.yaml
poddisruptionbudget.policy/gitlab-migrations-toolbox-pdb created (server dry run)
$ kubectl apply -f /tmp/pdb.yaml
poddisruptionbudget.policy/gitlab-migrations-toolbox-pdb created
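
After applying the manifest, it may be worth checking that the PDB is actually in effect. The sketch below reads the `status.disruptionsAllowed` field of the PDB, which should be 0 when `maxUnavailable: 0` is enforced. The function name is hypothetical, and the check assumes the PDB controller has already populated the status.

```shell
# Hypothetical helper: succeed only if the PDB currently allows zero
# voluntary disruptions of the pods it selects.
pdb_blocks_evictions() {
  allowed="$(kubectl get pdb gitlab-migrations-toolbox-pdb \
    --namespace gitlab \
    --output jsonpath='{.status.disruptionsAllowed}')" || return 1

  # An empty value means the status is not populated yet; treat it as "not protected".
  [ "$allowed" = "0" ]
}

# Usage: pdb_blocks_evictions && echo "Toolbox pod is protected"
```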

This is the process for opening a Rails console in the Toolbox pod:

$ kubectl exec -it deploy/gitlab-migrations-toolbox --container toolbox --namespace gitlab -- /bin/bash
git@gitlab-migrations-toolbox-58d74dbc76-6868v:/$ gitlab-rails console
--------------------------------------------------------------------------------
Ruby: ruby 3.2.8 (2025-03-26 revision 13f495dc2c) [x86_64-linux]
GitLab: 18.5.0-rc43-ee (6ca6c213697) EE
GitLab Shell: 14.45.3
PostgreSQL: 16.9
------------------------------------------------------------[ booted in 36.25s ]
Loading production environment (Rails 7.1.5.2)
irb(main):001>

Now that the PDB has been created, the Toolbox pod is protected from voluntary evictions.

Within the Rails console, run the following commands, replacing the example file name and class name with those of the PDM being executed:

require Rails.root.join('db/post_migrate/20250101001122_post_deployment_migration.rb')
PostDeploymentMigration.new.up

Once the PDM has executed successfully, proceed to the next step.

A successful run outputs something like:

== 20250101001122 PostDeploymentMigration: migrating ==============================
...
== 20250101001122 PostDeploymentMigration: migrated (X.XXXs) ================

If the migration raises an exception, do not delete the PDB yet. Escalate to the team that authored the migration to investigate. Only delete the PDB once the situation is resolved.

While it exists, the PDB also blocks legitimate evictions of the Toolbox pod, such as a move to a different node during node autoscaling.

Therefore, delete the PDB resource from the cluster as soon as the PDM has completed:

$ kubectl delete -f /tmp/pdb.yaml
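
Because a leftover PDB would keep blocking evictions, confirming the deletion is a reasonable final step. The following is a minimal sketch using `kubectl get` with `--ignore-not-found`, which prints nothing when the resource is absent; the function name is hypothetical.

```shell
# Hypothetical helper: succeed only if the temporary PDB no longer exists.
pdb_removed() {
  remaining="$(kubectl get pdb gitlab-migrations-toolbox-pdb \
    --namespace gitlab \
    --ignore-not-found \
    --output name)" || return 1

  # Empty output means the PDB was not found.
  [ -z "$remaining" ]
}

# Usage: pdb_removed && echo "PDB has been removed"
```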