ArgoCD Service
- Service Overview
- Alerts: https://alerts.gitlab.net/#/alerts?filter=%7Btype%3D%22argocd%22%2C%20tier%3D%22inf%22%7D
- Label: gitlab-com/gl-infra/production~"Service::ArgoCD"
Logging
Quick Links
Reference | Link |
---|---|
ArgoCD | UI |
Dashboards | ArgoCD Overview |
Logs | Elastic Cloud |
Application Configuration | gl-infra/argocd |
Cluster Configuration | config-mgmt |
GKE Cluster Bootstrap | Terraform module |
Summary
ArgoCD is a declarative, GitOps continuous delivery tool for Kubernetes.
The Production Engineering team uses ArgoCD to manage Kubernetes workloads on multiple GKE clusters of the GitLab SaaS infrastructure platform.
It can be accessed at https://argocd.gitlab.net/.
Getting access to ArgoCD
To obtain access to ArgoCD for the first time, request it via Lumos. The request must then be approved by the requester's manager and the service owners.
Different roles are available:
- `app.argocd.admins`: full access to ArgoCD, reserved for SREs owning the service
- `app.argocd.oncall`: near-full access to ArgoCD, without delete permissions for critical top-level applications, projects, and repository credentials; reserved for SREs in the on-call rotation
- `app.argocd.viewer`: view-only access to all projects in ArgoCD
- `app.argocd.members`: application-specific permissions defined by RBAC rules; most users will use this role
Once access has been granted, the user can log into ArgoCD via Okta from the homepage.
Architecture
ArgoCD is deployed in the `argocd` namespace of the `ops-gitlab-gke` GKE cluster in the `gitlab-ops` GCP project.
ArgoCD deploys itself using its official Helm chart, for which the configuration can be found here.
Its ingress is proxied through Cloudflare, then goes through an Istio gateway and is protected by OAuth2-Proxy.
User authentication is managed via Okta only.
Service Management
ArgoCD is entirely managed via two GitLab projects:
- `gitlab-com/gl-infra/argocd/apps`: contains the definitions of all ArgoCD Applications and ApplicationSets, including ArgoCD itself. Each one defines which Kubernetes workloads are deployed to which cluster(s), and with what configuration.
- `gitlab-com/gl-infra/argocd/config`: contains the basic resources configuring the ArgoCD service, including:
  - `argocd-config`: top-level application deploying all other configuration resources and the applications below
  - `argocd-clusters`: Kubernetes cluster definitions and credentials
  - `argocd-apps`: App of Apps managing all Applications and ApplicationSets defined in `gitlab-com/gl-infra/argocd/apps`
How to add or update a new service to ArgoCD
See documentation here.
How to onboard a GKE cluster into ArgoCD (via Terraform)
This is done mainly via the GKE ArgoCD Bootstrap Terraform module.
- First, ArgoCD needs to be given the permission to view and connect to GKE clusters in the target GCP project:

  ```terraform
  locals {}

  resource "google_project_iam_member" "argocd-cluster-viewer" {
    project = var.project
    role    = "roles/container.clusterViewer"
    member  = "serviceAccount:${local.argocd_service_account_email}"
  }
  ```

- Then the `gke-argocd-bootstrap` module must be instantiated for each targeted cluster:

  ```terraform
  # data.tf
  data "vault_kv_secret_v2" "gitlab-token-argocd" {
    mount = "ci"
    name  = "access_tokens/gitlab-com/gitlab-com/gl-infra/argocd/config/cluster-provisioner"
  }

  data "google_client_config" "provider" {}

  # providers.tf
  provider "gitlab" {
    alias    = "argocd"
    base_url = "https://gitlab.com"
    token    = data.vault_kv_secret_v2.gitlab-token-argocd.data.token
  }

  provider "kubernetes" {
    alias                  = "my-gke-cluster"
    cluster_ca_certificate = base64decode(module.my-gke-cluster.cluster_ca_certificate)
    host                   = "https://${module.my-gke-cluster.cluster_endpoint}"
    token                  = data.google_client_config.provider.access_token
  }

  # gke.tf
  module "my-gke-cluster-argocd-bootstrap" {
    source  = "gitlab.com/gitlab-com/gke-argocd-bootstrap/google"
    version = "1.1.0"

    environment = var.environment

    cluster = {
      name           = module.my-gke-cluster.cluster_name
      endpoint       = "https://${module.my-gke-cluster.cluster_endpoint}"
      ca_certificate = base64decode(module.my-gke-cluster.cluster_ca_certificate)
      location       = var.region
      project        = var.project
    }

    cluster_role_binding = {
      service_account_email = local.argocd_service_account_email
    }

    providers = {
      gitlab     = gitlab.argocd
      kubernetes = kubernetes.my-gke-cluster
    }
  }
  ```

- When applied, Terraform will commit a new cluster secret manifest under the `clusters/` directory in the `argocd/config` repository, which will be synced automatically by ArgoCD shortly after.
- The cluster is now ready to use in ArgoCD! :tada:
Troubleshooting
If you receive a page for ArgoCD, first determine which kind of problem it is:
- Application sync failures
- ArgoCD controller/server issues
- Repository connectivity problems
- Resource deployment issues
The dashboards listed under Quick Links give a quick view of system health.
When investigating issues, key questions to ask:
- Are applications failing to sync?
  - Check the ArgoCD UI for specific error messages
  - Verify repository connectivity and credentials
  - Check for resource conflicts or pending finalizers
- Is the ArgoCD API responding?
  - Check server pod health and logs
  - Verify ingress/load balancer configuration
- Are there performance issues?
  - Monitor memory and CPU usage of ArgoCD components
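
The first question above (failing syncs and pending finalizers) can also be checked from the command line. The following is a minimal Python sketch, not an official tool. It assumes the application list comes from `argocd app list -o json` and the resource list from the `items` array of a `kubectl get ... -o json` call. The function names `failing_applications` and `stuck_on_finalizers` are hypothetical helpers introduced here for illustration; the fields they read (`status.sync.status`, `status.health.status`, `metadata.deletionTimestamp`, `metadata.finalizers`) are standard ArgoCD Application and Kubernetes object fields.

```python
from typing import Any


def failing_applications(apps: list[dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (name, sync status, health status) for every Application
    that is not both Synced and Healthy. Missing fields count as Unknown."""
    failing = []
    for app in apps:
        status = app.get("status", {})
        sync = status.get("sync", {}).get("status", "Unknown")
        health = status.get("health", {}).get("status", "Unknown")
        if sync != "Synced" or health != "Healthy":
            failing.append((app["metadata"]["name"], sync, health))
    return failing


def stuck_on_finalizers(resources: list[dict[str, Any]]) -> list[str]:
    """Return names of resources that are marked for deletion
    (deletionTimestamp set) but still carry at least one finalizer."""
    stuck = []
    for resource in resources:
        meta = resource.get("metadata", {})
        if meta.get("deletionTimestamp") and meta.get("finalizers"):
            stuck.append(meta["name"])
    return stuck
```

To use it, parse the CLI output with `json.loads` and pass the resulting list to the matching function. Any application it returns is worth opening in the ArgoCD UI to read the specific error message.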