# `kas` Basic Troubleshooting
## `kas` deployment manifest location

`kas` runs inside our regional GKE cluster, in the `gitlab` namespace. It is deployed via the GitLab Helm chart through CI jobs in the k8s-workloads/gitlab-com repository.
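To confirm what is actually running in the cluster, you can list the `kas` workload directly (the `app=kas` label is the same one used by the restart and log commands later in this page):

```shell
# Show the kas deployment and its pods in the gitlab namespace
kubectl -n gitlab get deployments,pods -l app=kas
```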
## Changing the number of running pods

As `kas` is deployed as part of the GitLab Helm chart, you change the minimum and maximum number of running pods by modifying the Helm values passed to it. The values in question are `gitlab.kas.minReplicas` and `gitlab.kas.maxReplicas`.
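These values typically end up on a HorizontalPodAutoscaler rendered by the chart, so a quick way to check the bounds currently in effect (a sketch; it assumes the HPA object's name contains `kas`) is:

```shell
# Show the min/max replica bounds and current replica count for kas
kubectl -n gitlab get hpa | grep kas
```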
## Restarting

Log onto a console server, get access to the cluster as documented here, and run the following command:

```shell
kubectl -n gitlab delete pod -l app=kas
```
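After deleting the pods, it is worth watching the replacements come back up before closing the issue:

```shell
# Watch the replacement kas pods until they are all Running and Ready
kubectl -n gitlab get pods -l app=kas -w
```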
## Tail the logs

As `kas` is a standard pod in our GitLab Helm chart, logs are sent to Kibana/Elasticsearch at https://log.gprd.gitlab.net/goto/b8204a41999cc1a136fa12c885ce8d22

If you need to get the logs from Kubernetes directly, log onto a console server, get access to the cluster as documented here, and run the following command:

```shell
kubectl -n gitlab logs -f -l app=kas
```
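`kas` logs are JSON lines (see the example under “Specific Issues/Errors” below), so you can filter for problems on the fly; a sketch, assuming `jq` is available on the console server:

```shell
# Show only warn/error level log lines from the kas pods
# (fromjson? silently skips any lines that are not valid JSON)
kubectl -n gitlab logs -l app=kas --tail=200 \
  | jq -R 'fromjson? | select(.level == "warn" or .level == "error")'
```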
## Debugging ingress

THIS SECTION IS OUT OF DATE.

As `kas` uses a GCP Ingress and Google-managed certificates, it differs from other services: there is no HAProxy nor Cloudflare involved. The GCP Ingress object is defined in the k8s-workloads/gitlab-com repository, in a specific Helm release called `gitlab-extras`. The definition can be seen here.
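To see the Ingress object as it exists in the cluster (a sketch; it assumes the object lives in the `gitlab` namespace and its name contains `kas`):

```shell
# Show the kas Ingress and the address GCP has assigned to it
kubectl -n gitlab get ingress | grep kas
```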
GCP Ingress objects are implemented by a GCP external HTTPS load balancer, and you can find the exact load balancer in use by `kas` with the following command:

```shell
gcloud --project gitlab-production compute forwarding-rules list | grep gitlab-gitlab-kas
```
To see the URL map, use:

```shell
gcloud --project gitlab-production compute url-maps list | grep gitlab-gitlab-kas
```
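Once you have the URL map's name from the listing above, you can inspect its host and path rules (the placeholder below must be replaced with the actual name returned by the list command):

```shell
# Inspect host/path routing rules of the URL map found above
# URL_MAP_NAME is a placeholder: substitute the name from the previous command
gcloud --project gitlab-production compute url-maps describe URL_MAP_NAME
```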
Note that if you look closely at the load balancer, you can see we rely on container-native load balancing, which means the load balancer uses Network Endpoint Groups (NEGs) to add pod IPs directly as backends. So while a Kubernetes `Service` object is still needed to determine which pods to use as backends, traffic flows from the internet to the load balancer and then directly to one of the pods, not to any `NodePort` nor through any `kube-proxy` iptables rules.
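You can also see the NEGs GKE has created from the cluster side; a sketch, assuming the `Service` object is named `gitlab-kas`:

```shell
# The neg-status annotation lists the NEG names and the zones they live in
kubectl -n gitlab get service gitlab-kas \
  -o jsonpath='{.metadata.annotations.cloud\.google\.com/neg-status}'
```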
To see the status of the Network Endpoint Groups, and how many backends (pods) each one has behind it, use the following command:

```shell
gcloud --project gitlab-production compute network-endpoint-groups list | grep gitlab-gitlab-kas
```
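To drill down into which pod IPs are actually registered in one of those NEGs (both values below are placeholders taken from the listing above):

```shell
# List the individual pod endpoints registered in a NEG
# NEG_NAME and ZONE are placeholders: take them from the list command above
gcloud --project gitlab-production compute network-endpoint-groups list-network-endpoints NEG_NAME \
  --zone ZONE
```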
The easiest way to view all of this information, however, is through the web UI. Go to the Load Balancers page in the appropriate Google project and filter by `gitlab-gitlab-kas` to find the load balancer. From there you can see the health check configuration, the backend mapping by URL, and the number of backends (pods) that are healthy.
## Specific Issues/Errors

### Kubernetes Agent reports unauthenticated

If you get reports that the agent is not working, and you see the following error in the Kubernetes Agent logs:

```json
{"level":"warn","time":"2020-11-26T09:44:47.943+1100","msg":"GetConfiguration.Recv failed","error":"rpc error: code = Unauthenticated desc = unauthenticated"}
```
It means that the Kubernetes Agent pod(s) are failing to authenticate against the GitLab internal API. Ensure that the contents of the Kubernetes Agent container secret (`gitlab-kas-credential-v1`) match the value of the API `.gitlab_kas_secret`.
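To compare the two sides, you can decode the secret in the cluster and check it against the value the Rails API is using; a sketch that simply dumps every data key in the secret:

```shell
# Decode all data keys in the kas credential secret so they can be compared
# against the API's .gitlab_kas_secret value
kubectl -n gitlab get secret gitlab-kas-credential-v1 \
  -o go-template='{{range $k, $v := .data}}{{$k}}: {{$v | base64decode}}{{"\n"}}{{end}}'
```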