Atlantis Setup Guide for Infrastructure Deployments
This guide outlines the steps to set up a dedicated Atlantis instance for managing Terraform deployments. It assumes the project is hosted on ops.gitlab.net.
Prerequisites
- Access to GitLab infrastructure repositories
- Vault access for secret management
- Kubernetes cluster access for Atlantis deployment
- GCP project(s) for infrastructure resources
- Appropriate permissions for creating service accounts in the Google Cloud project
- Access to the config-mgmt repository for Atlantis service account setup
- Access to the gitlab-helmfiles repository for Atlantis workload configuration
- Access to the infra-mgmt repository for target project service accounts
Step 1: Configure Atlantis Workload and Secrets
1.1 Create Atlantis Configuration File
Create a configuration file for your Atlantis instance in the gitlab-helmfiles repository (e.g., ops-[service-name].yaml.gotmpl):
    ---
    atlantisUrl: https://atlantis-ops-[service-name].{{ .Environment.Name }}.gke.gitlab.net
    apiSecretName: atlantis-api-ops-[service-name]
    vcsSecretName: ops-gitlab-net-[service-name]
    orgAllowlist: [RepositoryURL eg. ops.gitlab.net/gitlab-com/gl-infra/cells/topology-service-deployer]

    resources:
      requests:
        cpu: 4000m
        memory: 2Gi
      limits:
        cpu: 8000m
        memory: 4Gi

    volumeClaim:
      dataStorage: 10Gi

    ingress:
      enabled: false

    serviceAccount:
      annotations:
        iam.gke.io/gcp-service-account: atlantis-ops-[service-name]@gitlab-ops.iam.gserviceaccount.com

    podTemplate:
      labels:
        deployment: atlantis-ops-[service-name]

    statefulSet:
      annotations:
        secret.reloader.stakater.com/reload: ops-gitlab-net-[service-name],terraformrc
      labels:
        deployment: atlantis-ops-[service-name]
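The [service-name] placeholder feeds several derived names that have to agree across repositories (the URL, secret names, the Google service account, the Vault role). As a rough aid, the following Python sketch derives them from one service name so they can be compared against what you typed. The function name `atlantis_names` and the dictionary keys `webhookUrl`, `gcpServiceAccount`, and `vaultAuthRole` are hypothetical; the name patterns themselves are taken from the snippets in this guide.

```python
def atlantis_names(service_name: str, environment: str = "ops") -> dict:
    """Derive the names this guide builds from [service-name].

    Hypothetical helper: only the string patterns come from the guide.
    """
    instance = f"atlantis-ops-{service_name}"
    host = f"{instance}.{environment}.gke.gitlab.net"
    return {
        "atlantisUrl": f"https://{host}",
        "webhookUrl": f"https://{host}/events",
        "apiSecretName": f"atlantis-api-ops-{service_name}",
        "vcsSecretName": f"ops-gitlab-net-{service_name}",
        "gcpServiceAccount": f"{instance}@gitlab-ops.iam.gserviceaccount.com",
        "vaultAuthRole": instance,
    }


if __name__ == "__main__":
    # "topo-svc" is only an example service name.
    for key, value in atlantis_names("topo-svc").items():
        print(f"{key}: {value}")
```

Checking generated values against the Helm values, Vault role, and webhook URL can help catch the mismatches listed under Troubleshooting.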
1.2 Configure Repository Workflow
Update repo-configs/ops.yaml to add the service repository configuration:

    repos:
      - id: ops.gitlab.net/gitlab-com/gl-infra/cells/[service-name]-deployer
        allowed_overrides: [delete_source_branch_on_merge]
        apply_requirements: [approved, mergeable]
        delete_source_branch_on_merge: true
        policy_check: true
        repo_locks:
          mode: on_apply
        workflow: [service-name]

    workflows:
      [service-name]:
        plan:
          steps:
            - *env-terraform
            - *env-tf-comment-args
            - *env-tf-in-automation
            - *env-tf-input
            - *env-tf-plugin-cache-dir
            - &env-tf-var-vault-secrets-path-[service-name]
              env:
                name: TF_VAR_vault_secrets_path
                command: echo "ops-gitlab-net/gitlab-com/gl-infra/cells/[service-name]-deployer/${PROJECT_NAME}"
            - &env-tf-var-google-impersonated-account-[service-name]
              env:
                name: TF_VAR_google_impersonated_account
                value: atlantis-ops-[service-name]@gitlab-ops.iam.gserviceaccount.com
            - *env-vault-addr
            - *env-vault-auth-path
            - &env-vault-auth-role-[service-name]
              env:
                name: VAULT_AUTH_ROLE
                value: atlantis-ops-[service-name]
            - *env-vault-token
            - *cleanup-plugin-cache
            - *terraform-init
            - *terraform-plan
            - *tf-summarize
            - *terraform-show
            - *terraform-validate
        apply:
          steps:
            - *env-tf-in-automation
            - *env-tf-input
            - *env-tf-plugin-cache-dir
            - *env-tf-var-vault-secrets-path-[service-name]
            - *env-tf-var-google-impersonated-account-[service-name]
            - *env-vault-addr
            - *env-vault-auth-path
            - *env-vault-auth-role-[service-name]
            - *env-vault-token
            - apply:
                extra_args: ["-parallelism=20"]
1.3 Add Helm Release Configuration
Update helmfile.yaml.gotmpl to include your new Atlantis instance:

    releases:
      - name: atlantis-ops-[service-name]
        chart: atlantis/atlantis
        namespace: atlantis
        version: {{ .Values | get "atlantis.chart_version" nil }}
        installed: {{ .Values | get "atlantis.installed" false }}
        labels:
          tier: inf
          app: atlantis
        values:
          - values.yaml.gotmpl
          - ops-gitlab-net.yaml.gotmpl
          - ops-[service-name].yaml.gotmpl
1.4 Configure Ingress and Certificates
Update the ingress configuration to include your new Atlantis instance:

    ingress:
      hosts:
        - host: atlantis-ops-[service-name].{{ .Environment.Name }}.gke.gitlab.net
          paths: ["/*"]
          service: atlantis-ops-[service-name]

    # Add to managed certificates
    google:
      managedCertificate:
        domains:
          - atlantis-ops-[service-name].{{ .Environment.Name }}.gke.gitlab.net

    # Add to RBAC
    rbac:
      serviceAccountNames:
        - atlantis-ops-[service-name]
1.5 Configure External Secrets
Update the external secret configurations in values-secrets.yaml.gotmpl:

    externalSecrets:
      ops-gitlab-net-[service-name]:
        refreshInterval: 1h
        secretStoreName: atlantis-secrets
        target:
          creationPolicy: Owner
          deletionPolicy: Delete
        data:
          - remoteRef:
              key: "env/{{ .Values | get \"env_prefix\" .Environment.Name }}/ns/atlantis/ops-gitlab-net"
              property: api_token
            secretKey: gitlab_token
          - remoteRef:
              key: "env/{{ .Values | get \"env_prefix\" .Environment.Name }}/ns/atlantis/webhooks/ops-[service-name]"
              property: secret
            secretKey: gitlab_secret

      atlantis-api-ops-[service-name]:
        refreshInterval: 1h
        secretStoreName: atlantis-shared-secrets
        target:
          creationPolicy: Owner
          deletionPolicy: Delete
        data:
          - remoteRef:
              key: "atlantis/ops-[service-name]/api"
              property: secret
            secretKey: apisecret
Step 2: Create Atlantis Service Account and Permissions (via config-mgmt)
2.1 First MR: Configure Base Atlantis Service Account
Create the first merge request in the config-mgmt repository to set up the Atlantis service account and Vault permissions.
Create the Google Cloud service account with Kubernetes workload identity binding in environments/ops/iam.tf:

    module "atlantis-ops-[service-name]-sa" {
      source  = "terraform-google-modules/kubernetes-engine/google//modules/workload-identity"
      version = "37.0.0"

      project_id  = var.project
      name        = "atlantis-ops-[service-name]"
      namespace   = "atlantis"
      k8s_sa_name = "atlantis-ops-[service-name]"

      use_existing_k8s_sa = true
      annotate_k8s_sa     = false
    }
Configure Vault authentication and policies in environments/vault-production/atlantis.tf so that Atlantis can read project secrets and write deployment outputs:

    locals {
      # ... existing paths ...
      atlantis_ops_[service_name]_ro_paths = [
        "ci/ops-gitlab-net/gitlab-com/gl-infra/cells/[service-name]-deployer/*",
      ]
      atlantis_ops_[service_name]_rw_paths = [
        "ci/ops-gitlab-net/gitlab-com/gl-infra/cells/[service-name]-deployer/outputs/*",
        "ci/ops-gitlab-net/gitlab-com/gl-infra/cells/[service-name]-deployer/+/outputs/*",
      ]
    }

    # Kubernetes auth backend role
    resource "vault_kubernetes_auth_backend_role" "atlantis-ops-[service-name]" {
      backend   = "kubernetes/ops-gitlab-gke"
      role_name = "atlantis-ops-[service-name]"

      bound_service_account_names      = ["atlantis-ops-[service-name]"]
      bound_service_account_namespaces = ["atlantis"]

      token_ttl     = 3600
      token_max_ttl = 7200

      token_policies = [
        vault_policy.atlantis-ops-[service-name].name,
      ]

      depends_on = [module.vault-config]
    }

    # Vault policy document
    data "vault_policy_document" "atlantis-ops-[service-name]" {
      # Child token creation by Terraform
      rule {
        path         = "auth/token/create"
        capabilities = ["update"]
      }

      # Allow to self lookup token
      rule {
        path         = "auth/token/lookup-self"
        capabilities = ["read"]
      }

      # Read-only access
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_ro_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/data/")
          capabilities = ["list", "read"]
        }
      }
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_ro_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/metadata/")
          capabilities = ["list", "read"]
        }
      }

      # Read-write access
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_rw_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/data/")
          capabilities = ["list", "read", "create", "patch", "update", "delete"]
        }
      }
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_rw_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/metadata/")
          capabilities = ["list", "read", "create", "patch", "update", "delete"]
        }
      }
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_rw_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/delete/")
          capabilities = ["update"]
        }
      }
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_rw_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/undelete/")
          capabilities = ["update"]
        }
      }
      dynamic "rule" {
        for_each = local.atlantis_ops_[service_name]_rw_paths
        content {
          path         = replace(rule.value, local.vault_kv_v2_expand_regex, "$1/destroy/")
          capabilities = ["update"]
        }
      }
    }

    # Vault policy
    resource "vault_policy" "atlantis-ops-[service-name]" {
      name   = "atlantis-ops-[service-name]"
      policy = data.vault_policy_document.atlantis-ops-[service-name].hcl
    }
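The policy document above turns each logical path into several KV v2 API paths by inserting a segment (data, metadata, delete, undelete, destroy) after the mount name. The Python sketch below illustrates that expansion. The value of local.vault_kv_v2_expand_regex is not shown in this guide, so the pattern here (capture the first path segment as the mount) is an assumption, and the function names are hypothetical.

```python
import re

# ASSUMPTION: stand-in for local.vault_kv_v2_expand_regex, whose real value
# is not shown in this guide. It captures the mount (first path segment).
KV_V2_MOUNT = re.compile(r"^([^/]+)/")

RO_CAPS = ["list", "read"]
RW_CAPS = ["list", "read", "create", "patch", "update", "delete"]


def expand(path: str, segment: str) -> str:
    """Insert a KV v2 segment after the mount, like replace(..., "$1/<segment>/")."""
    return KV_V2_MOUNT.sub(rf"\g<1>/{segment}/", path, count=1)


def policy_rules(ro_paths, rw_paths):
    """Return (path, capabilities) pairs in the same order as the dynamic rules."""
    rules = []
    for segment in ("data", "metadata"):
        rules += [(expand(p, segment), RO_CAPS) for p in ro_paths]
    for segment in ("data", "metadata"):
        rules += [(expand(p, segment), RW_CAPS) for p in rw_paths]
    for segment in ("delete", "undelete", "destroy"):
        rules += [(expand(p, segment), ["update"]) for p in rw_paths]
    return rules


if __name__ == "__main__":
    for path, caps in policy_rules(["ci/example/*"], ["ci/example/outputs/*"]):
        print(path, caps)
```

Printing the expanded paths for your own service can make it easier to compare the policy against a "permission denied" path reported by Vault.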
Register the service environments in atlantis.yaml so that Atlantis can create the Terraform state buckets:

    projects:
      # ... existing projects ...

      # [Service Name]
      - name: [service-name]-dev
        dir: environments/[service-name]-dev
        autoplan:
          enabled: false
      - name: [service-name]-prod
        dir: environments/[service-name]-prod
        autoplan:
          enabled: false
2.2 Second MR: Configure Environment Project Permissions
After the first MR is merged and applied, create a second merge request to add the storage and KMS permissions.
Grant the service account permissions to manage Terraform state files and encryption keys in environments/env-projects/atlantis.tf:

    locals {
      atlantis_service_accounts = {
        # ... existing accounts ...
        [service-name] = {
          member       = "serviceAccount:atlantis-ops-[service-name]@gitlab-ops.iam.gserviceaccount.com"
          environments = toset(["[service-name]-dev", "[service-name]-prod"])
        }
      }
    }

    # Storage bucket permissions for Terraform state
    resource "google_storage_bucket_iam_member" "terraform-state-object-admin-atlantis-[service-name]" {
      for_each = local.atlantis_service_accounts["[service-name]"].environments

      bucket = google_storage_bucket.infra-terraform[each.value].name
      role   = "roles/storage.objectAdmin"
      member = local.atlantis_service_accounts["[service-name]"].member

      depends_on = [module.gitlab-infra-terraform]
    }

    # KMS permissions for Terraform state encryption
    resource "google_kms_crypto_key_iam_member" "terraform-state-encrypter-decrypter-atlantis-[service-name]" {
      for_each = local.atlantis_service_accounts["[service-name]"].environments

      crypto_key_id = google_kms_crypto_key.terraform-state-encryption[each.value].id
      role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
      member        = local.atlantis_service_accounts["[service-name]"].member

      depends_on = [module.gitlab-infra-terraform]
    }
Step 3: Configure Repository Access
Examples: secret MR; webhook and user MR.
3.1 Add Atlantis User to Repository
In the infra-mgmt repository, add the Atlantis user as a maintainer of your target repository:

    # In your GitLab project configuration
    members = {
      (local.users.atlantis.id) = {
        access_level = "maintainer"
      }
    }
3.2 Configure Webhook and Secrets
Create a project webhook and generate secrets for Atlantis in the infra-mgmt repository (assuming your project is on ops.gitlab.net):
    resource "random_password" "atlantis-ops-[service-name]-webhook-secret" {
      length  = 32
      special = false
    }

    resource "random_password" "atlantis-ops-[service-name]-api-secret" {
      length  = 32
      special = false
    }

    # Secret for project to send events to atlantis webhook
    resource "vault_kv_secret_v2" "atlantis-ops-[service-name]-webhook" {
      mount = "k8s"
      name  = "env/ops/ns/atlantis/webhooks/ops-[service-name]"

      data_json = jsonencode({
        secret = random_password.atlantis-ops-[service-name]-webhook-secret.result
      })
      delete_all_versions = true
    }

    # Secret for making API requests to atlantis server
    resource "vault_kv_secret_v2" "atlantis-ops-[service-name]-api" {
      mount = "shared"
      name  = "data/atlantis/ops-[service-name]/api"

      data_json = jsonencode({
        secret = random_password.atlantis-ops-[service-name]-api-secret.result
      })
      delete_all_versions = true
    }

    resource "gitlab_project_hook" "[service-name]-atlantis" {
      project = module.project_canonical-[service-name]-deployer.id
      url     = "https://atlantis-ops-[service-name].ops.gke.gitlab.net/events"
      token   = random_password.atlantis-ops-[service-name]-webhook-secret.result

      note_events           = true
      merge_requests_events = true
      push_events           = true

      enable_ssl_verification = true
    }
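The webhook secret works as a shared token: GitLab sends it with each event in the X-Gitlab-Token header, and the receiver compares it with the value it loaded from Vault. The Python sketch below shows the idea only. It is not Atlantis's implementation, and the function names are hypothetical. `generate_secret` mimics the shape of the random_password resources above (32 characters, no special characters).

```python
import hmac
import secrets
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def generate_secret(length: int = 32) -> str:
    """Random alphanumeric secret, similar in shape to random_password with special = false."""
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


def token_matches(received_token: str, expected_secret: str) -> bool:
    """Constant-time comparison of the X-Gitlab-Token header value with the stored secret."""
    return hmac.compare_digest(received_token.encode(), expected_secret.encode())


if __name__ == "__main__":
    secret = generate_secret()
    print(token_matches(secret, secret))         # same token on both sides
    print(token_matches("wrong-token", secret))  # mismatched token
```

If the token configured on the GitLab hook and the secret mounted into the Atlantis pod differ, events are rejected, which is one cause of the "Webhook not triggering" issue under Troubleshooting.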
Step 4: Create Infrastructure Service Accounts
4.1 Create Target Project Service Accounts
First, create the service accounts that Atlantis will impersonate in your target GCP projects. These service accounts need access to the resources that Terraform will manage.
Create a service account configuration file for each environment (terraform/[ENV]/[SERVICE_NAME]-service-accounts.tf):
    # Service accounts for the [Service Name]
    module "[service_name]_service_accounts" {
      source  = "ops.gitlab.net/gitlab-com/service-account/google"
      version = "1.0.0"

      for_each = {
        readwrite = [
          "roles/compute.admin",
          "roles/storage.admin",
          "roles/logging.admin",
          "roles/monitoring.admin"
        ]
        readonly = [
          "roles/compute.viewer",
          "roles/storage.objectViewer",
          "roles/iam.serviceAccountViewer",
          "roles/logging.viewer",
          "roles/monitoring.viewer"
        ]
      }

      project_id                          = "[Google project ID]"
      service_account_prefix              = "[service-name]"
      service_account_display_name_prefix = "[Service Name]"
      suffix                              = each.key
      roles                               = each.value

      # Allow the Atlantis service account to impersonate these service accounts
      impersonation_members = [
        "serviceAccount:atlantis-ops-[service-name]@gitlab-ops.iam.gserviceaccount.com"
      ]
    }

    # Outputs
    output "[service-name]_readwrite_service_account_email" {
      description = "The email of the [service-name] readwrite service account"
      value       = module.[service_name]_service_accounts["readwrite"].service_account_email
    }

    output "[service-name]_readonly_service_account_email" {
      description = "The email of the [service-name] readonly service account"
      value       = module.[service_name]_service_accounts["readonly"].service_account_email
    }
Step 5: Configure Target Repository
5.1 Add atlantis.yaml Configuration
Create atlantis.yaml in the repository root:

    ---
    version: 3
    automerge: true
    delete_source_branch_on_merge: true
    parallel_plan: true
    parallel_apply: true
    abort_on_execution_order_fail: true

    projects:
      - name: dev
        dir: terraform/dev
        execution_order_group: 1
      - name: prod
        dir: terraform/prod
        execution_order_group: 2
Configuration details can be found at https://www.runatlantis.io/docs/repo-level-atlantis-yaml
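In the configuration above, execution_order_group sorts projects so that dev runs before prod, and abort_on_execution_order_fail stops later groups when an earlier one fails. The Python sketch below models that ordering as a simplified illustration. It is not Atlantis's code, the `run_in_order` function and its `run` callback are hypothetical, and treating a missing group as 0 is an assumption.

```python
from itertools import groupby


def _group(project: dict) -> int:
    # ASSUMPTION: projects without an explicit group are treated as group 0.
    return project.get("execution_order_group", 0)


def run_in_order(projects, run, abort_on_fail=True):
    """Run projects group by group, lowest execution_order_group first.

    `run` is a callable taking a project name and returning True on success.
    Returns a dict of project name to result for every project that ran.
    """
    results = {}
    for _, members in groupby(sorted(projects, key=_group), key=_group):
        group_ok = True
        for project in members:
            ok = run(project["name"])
            results[project["name"]] = ok
            group_ok = group_ok and ok
        if abort_on_fail and not group_ok:
            break  # skip every later group
    return results


if __name__ == "__main__":
    projects = [
        {"name": "prod", "execution_order_group": 2},
        {"name": "dev", "execution_order_group": 1},
    ]
    # Simulate a failing dev run: prod is skipped.
    print(run_in_order(projects, lambda name: name != "dev"))
```

With this layout a failed dev plan or apply keeps the same change from reaching prod in the same run.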
5.2 Add Terraform Configuration in the Directories from the Previous Step
Ensure your Terraform configuration includes:
    # Configure Terraform backend
    terraform {
      backend "gcs" {
        bucket = "[terraform-state-bucket-created-in-step2]"
        prefix = "[service-name]/[project-name]"
      }
    }

    ## Google
    provider "google" {
      credentials = var.google_application_credentials_path

      impersonate_service_account = var.google_application_credentials_path == null ? var.google_impersonated_account : null

      # Explicitly set the access_token to null to ensure we don't use
      # GOOGLE_OAUTH_ACCESS_TOKEN if it is in our environment
      # kics-scan ignore-line
      access_token = null
    }

    provider "google-beta" {
      credentials = var.google_application_credentials_path

      impersonate_service_account = var.google_application_credentials_path == null ? var.google_impersonated_account : null

      # Explicitly set the access_token to null to ensure we don't use
      # GOOGLE_OAUTH_ACCESS_TOKEN if it is in our environment
      # kics-scan ignore-line
      access_token = null
    }

    # Variables
    # Assumed declaration: the providers above reference this variable, but the
    # original snippet does not show it. Skip it if it is already declared elsewhere.
    variable "google_application_credentials_path" {
      type    = string
      default = null
    }

    variable "google_impersonated_account" {
      type        = string
      description = "Email of the service account to impersonate (mainly by Atlantis) if google_application_credentials_path is not set"
      default     = "atlantis-ops-[service-name]@gitlab-ops.iam.gserviceaccount.com" # Needs to be updated to the account created in Step 2
    }
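The provider blocks above choose between two authentication modes: a credentials file when google_application_credentials_path is set, and service account impersonation otherwise (the Atlantis case, where the workflow exports TF_VAR_google_impersonated_account). The Python sketch below mirrors that conditional to make the two outcomes explicit. The function name `provider_auth` is hypothetical and the code only models the expression, not the Google provider itself.

```python
from typing import Optional


def provider_auth(credentials_path: Optional[str], impersonated_account: str) -> dict:
    """Model of the provider arguments produced by the conditional expression."""
    return {
        "credentials": credentials_path,
        # Impersonate only when no credentials file is configured.
        "impersonate_service_account": (
            impersonated_account if credentials_path is None else None
        ),
        # Always null, so GOOGLE_OAUTH_ACCESS_TOKEN is never picked up.
        "access_token": None,
    }


if __name__ == "__main__":
    account = "atlantis-ops-example@gitlab-ops.iam.gserviceaccount.com"
    print(provider_auth(None, account))              # impersonation path
    print(provider_auth("/path/to/key.json", account))  # credentials file path
```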
Step 6: Deployment and Validation
6.1 Test Atlantis Integration
- Create a test Terraform change in your target repository
- Open a merge request
- Verify that Atlantis automatically runs terraform plan
- Add approval to the merge request
- Comment atlantis apply to test the apply workflow
- Verify that the infrastructure changes are applied successfully
Troubleshooting
Common Issues
- “This repo is not allowlisted for Atlantis”
  - Ensure the repository is added to the orgAllowlist configuration
  - Verify the repository path is correct
- Missing secrets errors
  - Check that all required secrets are created in Vault
  - Verify the external secrets configuration is correct
  - Ensure secret paths match between configuration and Vault
- Permission denied errors
  - Verify service account permissions
  - Check that the Atlantis service account can impersonate the target GCP service accounts
  - Review IAM bindings and roles
- Webhook not triggering
  - Verify webhook URL and token configuration
  - Check that the webhook is enabled for the correct events
  - Review GitLab project webhook settings
- Service account impersonation errors
  - Ensure the variable google_impersonated_account is properly set in the service account configuration
  - Verify that the hardcoded service account email matches the actual Atlantis service account
  - Check that the service account exists and has proper permissions
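For the first issue above, it can help to check a repository path against the orgAllowlist value before redeploying. The Python sketch below approximates allowlist matching with `*` as a wildcard. It is a simplified illustration, not Atlantis's matching code, and the function names are hypothetical.

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern":
    """Turn an allowlist rule into a regex, treating '*' as 'any characters'."""
    escaped_parts = (re.escape(part) for part in rule.split("*"))
    return re.compile("^" + ".*".join(escaped_parts) + "$")


def is_allowlisted(repo: str, rules) -> bool:
    """Return True if the repository path matches at least one rule."""
    return any(rule_to_regex(rule).match(repo) for rule in rules)


if __name__ == "__main__":
    repo = "ops.gitlab.net/gitlab-com/gl-infra/cells/topology-service-deployer"
    print(is_allowlisted(repo, ["ops.gitlab.net/gitlab-com/gl-infra/cells/*"]))
    print(is_allowlisted(repo, ["ops.gitlab.net/some-other-group/*"]))
```

A path that fails this check usually differs in hostname, group, or a trailing segment from the value in orgAllowlist.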