Cells and Auto-Deploy
This document describes how Auto-Deploy works for Cells and the measures we can take to pause or stop operations if the need arises. All content in this document is specific to Cells and does not impact the Auto-Deploy process in its entirety.
Workflow
The current workflow is not optimal, but there is work in progress to improve it. Currently there are 4 child pipelines in total, each of which has a distinct role.
sequenceDiagram
participant a as release-tools
participant c as cells-tissue
critical Trigger ring deploy
a ->>a: deploy:ringY: upgrade
a ->>c: deploy:ringY
end
critical Generate upgrade ring Y
c ->>c: generate_child_pipeline
c ->>c: execute_child_pipeline
end
critical Generate cell X upgrade pipeline
c ->>c: cell X ring Y
end
critical Upgrade cell X
c ->>c: authorize
c ->>c: auto-deploy
c ->>c: PDM
end
- Trigger ring deploy updates the pre_release version in the cells/tissue repo and generates a given ring deployment.
  - Note this does not perform the upgrade, but only commits the desired version. No pipelines are triggered at this point as the commit is performed with a skip-ci annotation to prevent any pipeline from starting.
- Generate upgrade ring Y generates and triggers the pipelines required for all Cells of a given Ring.
- Generate cell X upgrade pipeline then generates a child pipeline with the actual work to be done.
- Upgrade cell X contains the actual deployment jobs.
  - We first check if a deploy shall continue (more on this below).
  - We then perform the upgrade operation on the Cell itself. The upgrade operation calls a special command upgrade-gitlab.
  - After this, the Post Deployment Migrations (PDM) are run per Cell. The PDM calls a special command post-deploy-db-migration.
Post Deployment Migrations
Post Deployment Migrations (PDM) currently run automatically on a Cell after its Deployment job completes.
The behavior can be modified by setting the Environment Variable PDM_AUTO_EXECUTE to a value of false.
:warning: This must be done prior to the creation of the Upgrade cell X pipeline! :warning:
This is configured inside of cells/tissue: Cells/Tissue CI Variables
There is no automated procedure to set this value back to true, so please ensure any followup item resets this value as desired.
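The timing constraint above can be illustrated with a minimal sketch. It assumes the variable is read once, while the Upgrade cell X pipeline is being generated, and that an unset variable means the default behavior (run automatically). The function names are hypothetical, and the sketch simply omits the PDM job when disabled; the real pipeline may handle it differently.

```python
def pdm_enabled(env):
    # Assumption: unset means the documented default (run automatically);
    # only the value "false" turns automatic execution off.
    return env.get("PDM_AUTO_EXECUTE", "true").strip().lower() != "false"


def generate_upgrade_jobs(env):
    # Hypothetical generator: the variable is read once, at generation time.
    jobs = ["authorize", "auto-deploy"]
    if pdm_enabled(env):
        jobs.append("PDM")
    return jobs


# The job list is a snapshot taken when the pipeline is generated.
env = {"PDM_AUTO_EXECUTE": "false"}
jobs = generate_upgrade_jobs(env)
env["PDM_AUTO_EXECUTE"] = "true"  # too late for the pipeline generated above
print(jobs)  # ['authorize', 'auto-deploy']
```

Under these assumptions, flipping the variable after generation has no effect on a pipeline that already exists, which is why the value must be set beforehand.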
Graduation
Currently, auto-deployments are only triggered for Rings 0 and 1. Later rings do not receive an auto-deploy package automatically. Instead, packages accumulate, and an automated procedure creates a merge request for review that performs two operations:
- it selects the latest package and removes a restriction, enabling it to be promoted to further rings
- it cleans up the remaining auto-deploy package patch files, as they are no longer needed
This procedure runs nightly, so a new MR should exist each day. Here’s an example Merge Request. These MRs require approval based on the standard set of CODEOWNER rules for this project.
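The two operations of the graduation MR can be sketched roughly as follows. This is not the actual automation: the version layout is inferred from the example version that appears in the Rollbacks section, "latest" is assumed to mean the highest major, minor, and timestamp, and the function names are hypothetical.

```python
import re

# Assumed layout, inferred from the example version in this document:
# <major>.<minor>.<12-digit timestamp>-<sha>.<sha>
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d{12})-[0-9a-f]+\.[0-9a-f]+$")


def sort_key(version):
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised auto-deploy version: {version}")
    major, minor, timestamp = match.groups()
    # Fixed-width timestamps compare correctly as strings.
    return (int(major), int(minor), timestamp)


def plan_graduation(packages):
    """Return (package to promote, packages whose patch files get cleaned up)."""
    if not packages:
        raise ValueError("no auto-deploy packages have collected")
    ordered = sorted(packages, key=sort_key)
    return ordered[-1], ordered[:-1]
```

For example, given the version from this document plus an older, made-up package, `plan_graduation` would return the newer one for promotion and list the older one for cleanup.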
Pausing Auto-Deploy
Auto-Deploy can only be paused before the Trigger ring deploy pipeline has been triggered.
If a pipeline has already begun, it would be unwise to stop it mid-flight as the state of the Cell would need to be heavily investigated to determine what measures are needed to bring the Cell back into a sane state.
How-To
We use the environment variable SKIP_RING_AUTO_DEPLOYMENT at the pipeline level to prevent a change from being committed to the cells/tissue repo.
Blocking this commit is crucial: it keeps a Cell from running a version that differs from the one we have defined, and it keeps any future configure job from performing an unintended upgrade.
- Release Tools CI Variables
- Look for SKIP_RING_AUTO_DEPLOYMENT and set it to a value of true
There is no automated procedure to set this value back to false, so please ensure any followup item resets this value as desired.
This prevents the authorize-ring:X job from succeeding.
If later it is deemed safe to perform the deploy, simply reset the environment variable and you can either wait for the next Auto-Deploy, or retry the failed authorize-ring:X job.
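The gate described above can be sketched as a small check. This is a minimal illustration, not the release-tools implementation: the function and exception names are hypothetical, and it assumes that only the value true pauses the deployment.

```python
import os


class DeploymentSkipped(Exception):
    """Raised when ring auto-deployment is paused."""


def authorize_ring(ring, env=None):
    # Hypothetical stand-in for the authorize-ring:X job.
    env = os.environ if env is None else env
    # Assumption: unset or any value other than "true" lets the deploy proceed.
    paused = env.get("SKIP_RING_AUTO_DEPLOYMENT", "false").strip().lower() == "true"
    if paused:
        raise DeploymentSkipped(
            f"authorize-ring:{ring} blocked by SKIP_RING_AUTO_DEPLOYMENT"
        )
    return True
```

Because the check reads the variable each time it runs, retrying the failed job after the variable has been reset would succeed under these assumptions.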
Rollbacks
Rollbacks are currently initiated manually with the ringctl tool.
It is advised to work with a Release Manager to determine which version to roll back to and whether doing so is safe.
After this is established, we can initiate a rollback by following these steps:
- Change to the directory where your cells/tissue project resides
- Ensure you are on the latest - git pull main
- Ensure ringctl is updated - mise install
- ringctl rollback <VERSION> --ring <INT> [--pause-after-ring <INT>]
  - Example: ringctl rollback 17.7.202412021200-1921419d268.2414fa154a7 --ring 0
  - This rolls the version of GitLab back to 17.7.202412021200-1921419d268.2414fa154a7 starting at Ring 0, asynchronously targeting each ring until all rings (except for quarantine) have completed.
  - Leverage the --pause-after-ring option to prevent further rings from being patched unnecessarily.
- This will automatically create a Merge Request
- Seek approvals and merge as standard procedure
- Watch the associated pipelines on the Ops instance as the rollback proceeds on the desired rings.
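As a convenience, the rollback invocation from the steps above can be assembled programmatically. The helper below is a hypothetical sketch: it uses only the arguments shown in this document (the version, --ring, and --pause-after-ring) and prints the command for review rather than running it.

```python
import shlex


def rollback_command(version, ring, pause_after_ring=None):
    """Build the argument list for a ringctl rollback (hypothetical helper)."""
    if not version:
        raise ValueError("a version to roll back to is required")
    argv = ["ringctl", "rollback", version, "--ring", str(ring)]
    if pause_after_ring is not None:
        # Optional: stop once the given ring has been rolled back.
        argv += ["--pause-after-ring", str(pause_after_ring)]
    return argv


command = rollback_command("17.7.202412021200-1921419d268.2414fa154a7", 0)
print(shlex.join(command))
# ringctl rollback 17.7.202412021200-1921419d268.2414fa154a7 --ring 0
```

Keeping the command as a list of arguments avoids shell quoting mistakes if it is later handed to a process runner.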
:warning: This procedure currently targets an entire ring; it cannot yet target a specific Cell. :warning:
Also, consider that unless Auto-Deploy is paused, we may upgrade to a package which may not have an appropriate fix.
Thus one should consider adding the SKIP_RING_AUTO_DEPLOYMENT environment variable described above until a fix has made it into a package.