Cloud Connector - Authentication

Ownership: Authentication Group.

Cloud Connector uses JSON Web Tokens (JWTs) to authenticate requests in backends. Our primary backend is the AI Gateway, so we use it in the examples and links below.

For multi-tenant customers on gitlab.com, the Rails application issues and signs these tokens. For single-tenant customers (SM/Dedicated), CustomersDot issues and signs these tokens.

To validate tokens, Cloud Connector backends fetch the corresponding keys from the gitlab.com and CustomersDot Rails applications respectively.
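
As a rough illustration of the validation side, the sketch below shows how a backend might pick the right key out of a fetched JWKS document by matching the `kid` in the token header. It is not the gitlab-cloud-connector implementation: the helper names, the key IDs, and the assumption that tokens always carry a `kid` header are hypothetical, and the actual signature check (which a JWT library would perform with the selected key) is left out.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def token_key_id(token: str) -> str:
    """Return the `kid` from the JWT header (read before any verification)."""
    header_segment = token.split(".")[0]
    header = json.loads(b64url_decode(header_segment))
    return header["kid"]


def select_key(token: str, jwks: dict) -> dict:
    """Pick the JWK whose `kid` matches the token header.

    Raises LookupError when no key matches; a backend would typically
    turn that into a 401 Unauthorized response.
    """
    kid = token_key_id(token)
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise LookupError(f"no key with kid={kid!r} in JWKS")


# Hypothetical JWKS as it might look mid-rotation, with two published keys.
jwks = {"keys": [{"kid": "old-key", "kty": "RSA"},
                 {"kid": "new-key", "kty": "RSA"}]}

# Hypothetical unsigned token; the third segment is only a placeholder.
token = ".".join([
    b64url_encode({"alg": "RS256", "kid": "new-key"}),
    b64url_encode({"sub": "example"}),
    "signature-placeholder",
])

selected = select_key(token, jwks)
```

A validator holding a stale JWKS that lacks the token's `kid` would hit the `LookupError` branch, which is why the rotation procedure waits for validators to fetch the new key before it is used for signing.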

When a customer request reaches our backend (routed from Cloudflare), we need to authenticate it. The code that performs authentication lives in a separate gitlab-cloud-connector repo and is injected into “cloud connected” backends via the gitlab-cloud-connector package. This library only supports Python backends; other Cloud Connector backends, such as the SAST Scanner Service, perform similar tasks.

Refer to the separate page for an in-depth overview of the JWKS fetch mechanism and its potential failure modes.

Refer to the dedicated page to understand when the alert is sent and how to troubleshoot it.

Keys should be rotated on a 6-month schedule in both staging and production.

Do not start key rotation if there is an active JWKS-related incident.

Keys must be rotated in staging and production. The general steps in both environments are:

  1. Run sudo gitlab-rake cloud_connector:keys:list to verify there is exactly one key.
  2. Run sudo gitlab-rake cloud_connector:keys:create to add a new key to rotate to.
  3. Run sudo gitlab-rake cloud_connector:keys:list to verify there are exactly two keys.
  4. Ensure validators have fetched the new key via OIDC Discovery. Since keys are cached both in HTTP caches and application-specific caches, this may require waiting at least 24 hours for these caches to expire. This process can be expedited by:
  5. For the AI Gateway only, ensure this dashboard shows no events.
  6. Run sudo gitlab-rake cloud_connector:keys:rotate to swap the current key with the new key, enacting the rotation.
  7. Monitor affected systems:
    • Ensure Puma and Sidekiq processes have swapped to the new key. This may take some time due to keys being cached in process memory.
    • Ensure all Puma and Sidekiq workers are now using the new key to sign requests.
    • Do not proceed with the process until:
      1. Keys in use to sign requests have converged fully to the new key.
      2. Backends are not seeing elevated rates of 401 Unauthorized responses.
  8. Run sudo gitlab-rake cloud_connector:keys:trim to remove the now unused key.
  9. Monitor affected systems as before to ensure the rotation was successful.
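
The gate in step 7 can be sketched as a small check that must pass before running keys:trim. This is only an illustration of the decision logic: the function names, the key IDs, the way signing key IDs and 401 rates are collected, and the tolerance value are all assumptions, not part of any existing tooling.

```python
from collections import Counter
from typing import Iterable


def converged(signing_key_ids: Iterable[str], new_key_id: str) -> bool:
    """True when at least one sample exists and every sample is the new key."""
    counts = Counter(signing_key_ids)
    return bool(counts) and set(counts) == {new_key_id}


def safe_to_trim(
    signing_key_ids: Iterable[str],
    new_key_id: str,
    rate_401: float,
    baseline_401: float,
    tolerance: float = 0.01,  # hypothetical allowance above the baseline
) -> bool:
    """Both conditions from step 7 must hold before the old key is removed."""
    no_401_spike = rate_401 <= baseline_401 + tolerance
    return converged(signing_key_ids, new_key_id) and no_401_spike


# Hypothetical samples of the key ID used by recent signing operations.
still_mixed = safe_to_trim(["old-key", "new-key", "new-key"], "new-key", 0.002, 0.001)
all_new = safe_to_trim(["new-key"] * 3, "new-key", 0.002, 0.001)
spiking = safe_to_trim(["new-key"] * 3, "new-key", 0.20, 0.001)
```

Here `still_mixed` and `spiking` are both false, so trimming would wait; only `all_new` passes. Treating an empty sample as "not converged" is a deliberate choice in this sketch, so that missing monitoring data blocks the trim instead of allowing it.
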

In staging:

  1. Run /change declare in Slack and create a C3 Change Request.
  2. Teleport to console-01-sv-gstg.
  3. Run steps outlined above.
  4. Close the CR issue.

In production:

  1. Run /change declare in Slack and create a C2 Change Request.
  2. Teleport to console-01-sv-gprd.
  3. Run steps outlined above.
  4. Close the CR issue.
  5. Create a Slack reminder in #g_cloud_connector set to 6 months from now with a link to this runbook.

Follow instructions here.