Cloud Connector - Authentication
Ownership: Authentication Group.
Cloud Connector uses JSON Web Tokens (JWTs) to authenticate requests in its backends. Our primary backend is the AI Gateway, so we use it in the examples and links below.
For multi-tenant customers on gitlab.com, the Rails application issues and signs these tokens. For single-tenant customers (SM/Dedicated), CustomersDot issues and signs these tokens.
To validate tokens, Cloud Connector backends fetch the corresponding keys from the gitlab.com and CustomersDot Rails applications respectively.
When the customer request reaches our backend (routed from Cloudflare), we need to authenticate it.
The code that performs authentication is stored in a separate gitlab-cloud-connector repo and injected into “cloud connected” backends via the gitlab-cloud-connector package.
This library supports Python backends only. However, other Cloud Connector backends, such as the SAST Scanner Service, perform similar tasks.
Refer to the separate page for an in-depth overview of the JWKS fetch mechanism and its potential failure modes.
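As a rough illustration of the validation side, the sketch below shows how a backend could pick the verification key out of a fetched JWKS document by matching the token's key ID. It assumes tokens carry a `kid` header as defined in the JWS and JWK specifications. The function names (`select_verification_key`, `make_unsigned_token`) and the sample key IDs are hypothetical and are not taken from the gitlab-cloud-connector library. Signature verification itself is left out because it needs a cryptography library.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url string, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def select_verification_key(token: str, jwks: dict) -> dict:
    """Return the JWKS entry whose `kid` matches the token header.

    This only selects the key; it does not verify the signature.
    """
    header_segment = token.split(".")[0]
    header = json.loads(b64url_decode(header_segment))
    kid = header.get("kid")
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    # A miss usually means the validator's cached key set is stale.
    raise LookupError(f"no key with kid={kid!r} in JWKS")


def make_unsigned_token(kid: str) -> str:
    """Build a placeholder token with a real header and dummy payload/signature."""
    header = json.dumps({"alg": "RS256", "kid": kid}).encode()
    segment = base64.urlsafe_b64encode(header).rstrip(b"=").decode()
    return f"{segment}.payload.signature"


# During a rotation the key set holds two keys (hypothetical IDs).
jwks = {"keys": [{"kid": "key-a", "kty": "RSA"}, {"kid": "key-b", "kty": "RSA"}]}
selected = select_verification_key(make_unsigned_token("key-b"), jwks)
```

A validator that still holds a cached key set without the new key would hit the `LookupError` branch, which is why the rotation procedure waits for caches to refresh before switching the signing key.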
Auth-related alerts and troubleshooting
Refer to the dedicated page to understand when the alert is sent and how to troubleshoot it.
Key rotation
Keys should be rotated on a 6-month schedule in both staging and production.
Rotating keys for gitlab.com
Do not start key rotation if there is an active JWKS-related incident.
Keys must be rotated in staging and production. The general steps in both environments are:
- Run sudo gitlab-rake cloud_connector:keys:list to verify there is exactly one key.
- Run sudo gitlab-rake cloud_connector:keys:create to add a new key to rotate to.
- Run sudo gitlab-rake cloud_connector:keys:list to verify there are exactly two keys.
- Ensure validators have fetched the new key via OIDC Discovery. Since keys are cached both in HTTP caches and application-specific caches, this may require waiting at least 24 hours for these caches to expire. This process can be expedited by:
  - Restarting/redeploying backend services to evict their in-memory caches.
  - Purging HTTP caches in Cloudflare for the /oauth/discovery/keys endpoint.
- For the AI Gateway only, ensure this dashboard shows no events.
- Run sudo gitlab-rake cloud_connector:keys:rotate to swap the current key with the new key, enacting the rotation.
- Monitor affected systems:
  - Ensure Puma and Sidekiq processes have swapped to the new key. This may take some time due to keys being cached in process memory.
  - Ensure all Puma and Sidekiq workers are now using the new key to sign requests.
- Do not proceed with the process until:
  - Keys in use to sign requests have converged fully to the new key.
  - Backends do not see elevated rates of 401 Unauthorized responses.
- Run sudo gitlab-rake cloud_connector:keys:trim to remove the now unused key.
- Monitor affected systems as before to ensure the rotation was successful.
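The rotation steps above can be modelled as a small state machine, which may help when reasoning about which key signs requests at each stage. This is a toy in-memory sketch, not the implementation of the cloud_connector:keys rake tasks: the `KeySet` class, its method names, the `converged` helper, and the key IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class KeySet:
    """Hypothetical model of the signing keys; keys[0] is the one that signs."""

    keys: List[str] = field(default_factory=list)

    @property
    def signing_key(self) -> str:
        return self.keys[0]

    def _expect(self, count: int, step: str) -> None:
        # Plays the role of the keys:list checks between steps.
        if len(self.keys) != count:
            raise RuntimeError(
                f"{step}: expected {count} key(s), found {len(self.keys)}"
            )

    def create(self, kid: str) -> None:
        """Add a new key to rotate to; the current key keeps signing."""
        self._expect(1, "create")
        self.keys.append(kid)

    def rotate(self) -> None:
        """Swap the current key with the new key."""
        self._expect(2, "rotate")
        self.keys.reverse()

    def trim(self) -> None:
        """Remove the now unused (previous) key."""
        self._expect(2, "trim")
        self.keys.pop()


def converged(observed_kids: Iterable[str], new_kid: str) -> bool:
    """True once requests were observed and all were signed with new_kid."""
    observed = list(observed_kids)
    return bool(observed) and all(kid == new_kid for kid in observed)


key_set = KeySet(["key-a"])
key_set.create("key-b")  # two keys published, key-a still signs
key_set.rotate()         # key-b now signs
if converged(["key-b", "key-b"], "key-b"):
    key_set.trim()       # only key-b remains
```

The model makes the ordering constraint visible: trimming before signers have converged would remove a key that some processes may still be using to sign requests.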
Rotating keys in staging
- Run /change declare in Slack and create a C3 Change Request.
- Teleport to console-01-sv-gstg.
- Run steps outlined above.
- Close the CR issue.
Rotating keys in production
- Run /change declare in Slack and create a C2 Change Request.
- Teleport to console-01-sv-gprd.
- Run steps outlined above.
- Close the CR issue.
- Create a Slack reminder in #g_cloud_connector set to 6 months from now with a link to this runbook.
Rotating keys for customers.gitlab.com
Follow instructions here.