A S2 level incident lasting 3 days, led to disruption to git clones, in particular for the www-gitlab-com, gitlab-ee, gitlab-ce
although many others were affected to.
Diagnosis went around in circles:
Initially targeted abuse
High CI activity rates
Workhorse throughput
Network issues
Gitaly concurrency limits (which had contributed)
Smoking gun: not only git clones which were slow, artifact downloads against S3 had also sky-rocket in latency
Testing git clones and artifact downloads via https://gitlab.com, then against front-end load balancers, then again Workhorse helps us pinpoint the issue with the HAProxy front-end fleet.