
ArgoCD Service

Reference links:

  • ArgoCD UI
  • Dashboards: ArgoCD Overview
  • Logs: Elastic Cloud
  • Application Configuration: gl-infra/argocd
  • Cluster Configuration: config-mgmt
  • GKE Cluster Bootstrap: Terraform module

ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes.

The Production Engineering team uses ArgoCD to manage Kubernetes workloads on multiple GKE clusters of the GitLab SaaS infrastructure platform.

It can be accessed at https://argocd.gitlab.net/.


To obtain access to ArgoCD for the first time, it must be requested via Lumos and then approved by the requester’s manager and the service owners.

Different roles are available:

  • app.argocd.admins: full access to ArgoCD, reserved for SREs owning the service
  • app.argocd.oncall: near full access to ArgoCD without delete permissions for critical top-level applications, projects, and repository credentials, reserved for SREs in the oncall rotation
  • app.argocd.viewer: view only access to all projects in ArgoCD
  • app.argocd.members: for application-specific permissions defined by RBAC rules, most users will use this role
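Application-specific permissions for the members role are expressed as ArgoCD RBAC rules (the `policy.csv` format). A minimal, hypothetical sketch — the team, project, and application names below are placeholders, not actual entries from our configuration:

```
# Grant a hypothetical team read and sync access to the apps in its own project
p, role:my-team, applications, get, my-project/*, allow
p, role:my-team, applications, sync, my-project/*, allow

# Map the Okta-provided group to that role
g, app.argocd.members, role:my-team
```

Each `p` line is `policy, role, resource, action, object, effect`; `g` lines bind groups or users to roles.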

Once access has been granted, the user can log into ArgoCD via Okta from the homepage.

ArgoCD is deployed in the argocd namespace of the ops-gitlab-gke GKE cluster in the gitlab-ops GCP project.

ArgoCD deploys itself using its official Helm chart, for which the configuration can be found here.

Its ingress is proxied through Cloudflare, then goes through an Istio gateway and is protected by OAuth2-Proxy.

User authentication is managed via Okta only.

ArgoCD is entirely managed via two GitLab projects: the application configuration project (gl-infra/argocd) and the cluster configuration project (config-mgmt).

How to add or update a new service to ArgoCD


See documentation here.

How to onboard a GKE cluster into ArgoCD (via Terraform)


This can be done mainly via the GKE ArgoCD Bootstrap Terraform module.

  1. First, ArgoCD needs to be given the permission to view and connect to GKE clusters in the target GCP project:

    locals {
      argocd_service_account_email = "[email protected]"
    }

    resource "google_project_iam_member" "argocd-cluster-viewer" {
      project = var.project
      role    = "roles/container.clusterViewer"
      member  = "serviceAccount:${local.argocd_service_account_email}"
    }
  2. Then the gke-argocd-bootstrap module must be instantiated for each targeted cluster:

    # data.tf
    data "vault_kv_secret_v2" "gitlab-token-argocd" {
      mount = "ci"
      name  = "access_tokens/gitlab-com/gitlab-com/gl-infra/argocd/config/cluster-provisioner"
    }

    data "google_client_config" "provider" {}

    # providers.tf
    provider "gitlab" {
      alias    = "argocd"
      base_url = "https://gitlab.com"
      token    = data.vault_kv_secret_v2.gitlab-token-argocd.data.token
    }

    provider "kubernetes" {
      alias                  = "my-gke-cluster"
      cluster_ca_certificate = base64decode(module.my-gke-cluster.cluster_ca_certificate)
      host                   = "https://${module.my-gke-cluster.cluster_endpoint}"
      token                  = data.google_client_config.provider.access_token
    }

    # gke.tf
    module "my-gke-cluster-argocd-bootstrap" {
      source      = "gitlab.com/gitlab-com/gke-argocd-bootstrap/google"
      version     = "1.1.0"
      environment = var.environment

      cluster = {
        name           = module.my-gke-cluster.cluster_name
        endpoint       = "https://${module.my-gke-cluster.cluster_endpoint}"
        ca_certificate = base64decode(module.my-gke-cluster.cluster_ca_certificate)
        location       = var.region
        project        = var.project
      }

      cluster_role_binding = {
        service_account_email = local.argocd_service_account_email
      }

      providers = {
        gitlab     = gitlab.argocd
        kubernetes = kubernetes.my-gke-cluster
      }
    }
  3. When applied, Terraform will commit a new cluster secret manifest under the clusters/ directory in the argocd/config repository which will be synced automatically by ArgoCD shortly after.
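For reference, the committed manifest follows ArgoCD's declarative cluster secret format. The shape below is an illustrative sketch based on that format — the exact field values (name, auth config) are generated by the bootstrap module and may differ:

```yaml
# Illustrative only: an ArgoCD cluster secret as committed under clusters/
apiVersion: v1
kind: Secret
metadata:
  name: my-gke-cluster
  labels:
    # This label is what tells ArgoCD to treat the secret as a cluster definition
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: my-gke-cluster
  server: https://<cluster-endpoint>
  config: |
    {
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded CA certificate>"
      }
    }
```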

  4. The cluster is now ready to use in ArgoCD! :tada:
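Once the sync has completed, registration can be confirmed from the argocd CLI (assuming you are logged in to argocd.gitlab.net; this requires access to the live service, so it is shown for illustration only):

```
# List registered clusters; the new cluster should appear with its endpoint
argocd cluster list
```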

If you received a page for ArgoCD, the first step is to determine which category the problem falls into:

  • Application sync failures
  • ArgoCD controller/server issues
  • Repository connectivity problems
  • Resource deployment issues

We have some useful dashboards for a quick view of system health; see the ArgoCD Overview dashboard linked at the top of this page.

When investigating issues, key questions to ask:

  • Are applications failing to sync?
    • Check the ArgoCD UI for specific error messages
    • Verify repository connectivity and credentials
    • Check for resource conflicts or pending finalizers
  • Is the ArgoCD API responding?
    • Check server pod health and logs
    • Verify ingress/load balancer configuration
  • Are there performance issues?
    • Monitor memory and CPU usage of ArgoCD components
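The checks above can be run with kubectl against the argocd namespace and with the argocd CLI. These commands require access to the ops-gitlab-gke cluster and assume the component names used by the official Helm chart (argocd-server, argocd-application-controller, argocd-repo-server); adjust if our release names differ:

```
# Component health: all pods in the argocd namespace should be Running/Ready
kubectl -n argocd get pods

# Recent server and controller logs for API or sync errors
kubectl -n argocd logs deploy/argocd-server --tail=100
kubectl -n argocd logs statefulset/argocd-application-controller --tail=100

# For a failing application: sync status, conditions, and recent sync history
argocd app get <app-name>
argocd app history <app-name>
```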