A S2 level incident lasting 3 days, led to disruption to git clones, in particular for the www-gitlab-com, gitlab-ee, gitlab-ce
although many others were affected to.
Diagnosis went around in circles:
Initially targeted abuse
High CI activity rates
Workhorse throughput
Network issues
Gitaly concurrency limits (which had contributed)
Issue had two major components. HAProxy was thottling all traffic, particularly outbound git traffic. This
led to longer clone times and a surge in the number of active concurrent sessions.
The Gitaly rate-limiter then kicked in and was causing major tailbacks.