# OpenBao for GitLab Secrets Manager Service
- Service Overview
- Alerts: https://alerts.gitlab.net/#/alerts?filter=%7Btype%3D%22secrets-manager%22%2C%20tier%3D%22sv%22%7D
- Label: gitlab-com/gl-infra/production~"Service::RunwayOpenBao"
## Logging

We suggest the following filters to focus on relevant project audit logs:

```
resource.labels.service_name="secrets-manager"
jsonPayload.request.path != ""
jsonPayload.type = "response"
```
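These filters can also be applied from the command line; a minimal sketch, assuming access to the Runway GCP project (the `--project` value is a placeholder):

```shell
# Read recent secrets-manager audit responses from Cloud Logging.
# <runway-project-id> is a placeholder. In the Cloud Logging filter
# language, whitespace-separated comparisons are implicitly ANDed.
gcloud logging read '
  resource.labels.service_name="secrets-manager"
  jsonPayload.request.path != ""
  jsonPayload.type = "response"
' --project=<runway-project-id> --limit=20 --format=json
```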
## Summary

GitLab Secrets Manager is a built-in secrets management solution for CI pipelines. Secrets are created and managed in the GitLab UI and consumed by CI jobs.
GitLab Secrets Manager relies on the secrets-manager Runway service.
The service is configured and deployed using the
gitlab-secrets-manager-container project.
secrets-manager runs OpenBao, which is a fork of HashiCorp Vault.
The source code of OpenBao lives in
openbao-internal,
a build project that is intended to modify the upstream OpenBao releases.
## Architecture

The Rails backend and runners connect to the secrets-manager service (running OpenBao)
through the CloudFlare WAF and Runway.
OpenBao stores data in the Cloud SQL instance provided by Runway, and fetches the unseal key from Google KMS.
OpenBao posts audit logs to the Rails backend.
The GitLab Secrets Manager design docs provide request flow diagrams.
```mermaid
flowchart TB
    CloudFlare(CloudFlare: secrets.gitlab.com)
    KMS[GCP KMS]
    PostgreSQL[GCP CloudSQL from Runway]
    Rails-- Manage OpenBao -->CloudFlare
    CloudFlare-- https://secrets-manager.production.runway.gitlab.net -->Runway
    Runway-->OpenBao
    Runner-- Fetch Pipeline Secrets -->CloudFlare
    OpenBao-- Decrypt Unseal Key -->KMS
    OpenBao-- Storage -->PostgreSQL
    OpenBao-->Runway
```
The service runs multiple OpenBao nodes:
- a single active node
- multiple standby nodes
Nodes connect to the PostgreSQL backend to store data and to acquire a lock.
```mermaid
flowchart TD
    Ingress
    Service_OB([HTTP API])
    subgraph OpenBao
        OB_1[Primary]
        OB_2[Standby A]
        OB_3[Standby B]
        Service_Primary([Primary gRPC])
    end
    Ingress --> Service_OB
    Service_OB --> OB_1
    Service_OB --> OB_2
    Service_OB --> OB_3
    OB_2 -. forward .-> Service_Primary
    OB_3 -. forward .-> Service_Primary
    Service_Primary --> OB_1
    OB_1 -->Service_DB
    OB_1 -. lock maintenance .->Service_DB
    OB_2 -. lock monitor .->Service_DB
    OB_3 -. lock monitor .->Service_DB
    Service_DB([PostgreSQL]) --> DB[(PostgreSQL)]
    OB_1 -- auto-unseal --> KMS
    OB_2 -- auto-unseal --> KMS
    OB_3 -- auto-unseal --> KMS
```
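A node's active/standby role can be inspected with the OpenBao CLI; a minimal sketch, assuming the `bao` binary and direct network access to an individual node (the address below is illustrative, and SREs normally do not access OpenBao directly — see Direct OpenBao Access):

```shell
# Query a single OpenBao node's seal and HA status (sketch only;
# the address is illustrative and normally fronted by CloudFlare/Runway).
export BAO_ADDR="https://secrets-manager.production.runway.gitlab.net"

# is_self=true means this node holds the lock and is the active node;
# standby nodes instead report the leader's address.
bao status -format=json | jq '{sealed, ha_enabled, is_self, leader_address}'
```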
## Performance

Benchmarking and sizing recommendations are covered by gitlab#568356.
## Scalability

The service is deployed using Runway and its scaling is handled by Cloud Run.
Scalability is configured in runway.yml.
## Availability

GitLab Secrets Manager is limited to the Ultimate tier, and the feature must be enabled per project.
The service is configured to be deployed to the us-east1 region.
## Durability

Runway performs backup and backup restore validation as configured
for the secrets-manager service.
On Runway, backups are always on.

Backup procedure:

- Back up the Cloud SQL PostgreSQL database.
- Back up the unseal key material stored in Google Cloud KMS. See the runbooks for our internal Vault service, which similarly relies on Google Cloud KMS.

For restore, we suggest the following steps:

- Stop OpenBao.
- Perform the PostgreSQL restore.
- Start OpenBao.
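The restore steps might look like the sketch below. This is illustrative only: `<instance-name>` and `<backup-id>` are placeholders, and the exact way to stop and start the service depends on the Runway/Cloud Run setup.

```shell
# 1. Stop OpenBao. The exact mechanism depends on Runway; scaling the
#    Cloud Run service down is shown here as a placeholder approach.
gcloud run services update secrets-manager --region=us-east1 --min-instances=0

# 2. List available Cloud SQL backups and restore the chosen one.
gcloud sql backups list --instance=<instance-name>
gcloud sql backups restore <backup-id> --restore-instance=<instance-name>

# 3. Start OpenBao again; it auto-unseals via Google KMS on boot.
gcloud run services update secrets-manager --region=us-east1 --min-instances=1
```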
## Security/Compliance

The Cloud SQL PostgreSQL database only contains encrypted data, and the unseal key is stored in Google KMS.
## Monitoring/Alerting

The service comes with built-in Runway observability:

- runway-service dashboard filtered on secrets-manager
- secrets-manager dashboard
## Direct OpenBao Access

Unlike Vault, we do not intend for OpenBao to be directly modified or accessed by SREs, in order to protect customer secrets.
This should only be necessary in the event of an OIDC issuer reconfiguration of the global auth/gitlab_rails_jwt authentication mount.
In that case, use Terraform in config-mgmt with the recovery keys to create a highly privileged root token, and remediate problems via Terraform for auditability.
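Under the hood, producing a root token from recovery keys follows the standard generate-root flow; a minimal sketch, assuming the `bao` CLI and a quorum of recovery key holders (in practice this is driven through Terraform in config-mgmt rather than run by hand, and `<encoded-token>`, `<otp>`, `<root-token>` are placeholders):

```shell
# Start a root token generation attempt; prints a one-time password (OTP).
bao operator generate-root -init

# Each recovery key holder supplies their key share; repeat until the
# quorum is reached, after which an encoded root token is printed.
bao operator generate-root

# Decode the encoded token using the OTP from the -init step.
bao operator generate-root -decode=<encoded-token> -otp=<otp>

# Revoke the root token as soon as remediation is complete.
bao token revoke <root-token>
```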