Gitaly Queuing

Gitaly Queuing Graph

  • Gitaly queuing alerts
  • High (possibly extremely high) latencies on certain requests while load on the Gitaly servers remains low
  • Surges in the number of active HTTP or SSH sessions

2018-11-06: Up to 15-minute delays on clones from GitLab repositories, including www-gitlab-com, gitlab-ee, and gitlab-ce

  • An S2-level incident lasting 3 days disrupted git clones, in particular for www-gitlab-com, gitlab-ee, and gitlab-ce, although many other repositories were affected too.
  • Diagnosis went around in circles:
    • Initially targeted abuse
    • High CI activity rates
    • Workhorse throughput
    • Network issues
    • Gitaly concurrency limits (which did contribute)
  • The issue had two major components. First, HAProxy was throttling all traffic, particularly outbound git traffic. This led to longer clone times and a surge in the number of active concurrent sessions.
  • Second, the Gitaly rate-limiter then kicked in and caused major tailbacks.
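
The interaction between slower clones and a concurrency limit can be sketched with Little's law (concurrent sessions = arrival rate × mean session duration). This is a simplified model for intuition only, not how Gitaly's limiter is implemented. The function names and all numbers (2 clones per second, 10 s versus 60 s clone times, a limit of 40 slots) are hypothetical and were not measured during the incident.

```python
def concurrent_sessions(arrival_rate_per_s: float, mean_duration_s: float) -> float:
    """Little's law: average sessions in flight = arrival rate * mean duration."""
    return arrival_rate_per_s * mean_duration_s


def queue_growth_per_s(
    arrival_rate_per_s: float, mean_duration_s: float, concurrency_limit: int
) -> float:
    """Rate at which requests pile up behind a concurrency limit.

    With N slots and a mean duration of W seconds, at most N / W requests
    can complete per second. Any arrivals above that rate accumulate.
    """
    max_throughput = concurrency_limit / mean_duration_s
    return max(0.0, arrival_rate_per_s - max_throughput)


if __name__ == "__main__":
    ARRIVALS = 2.0  # hypothetical clones per second
    LIMIT = 40      # hypothetical concurrency limit (slots)

    # Hypothetical clone durations: normal vs. throttled upstream.
    for label, duration in (("normal", 10.0), ("throttled", 60.0)):
        in_flight = concurrent_sessions(ARRIVALS, duration)
        growth = queue_growth_per_s(ARRIVALS, duration, LIMIT)
        print(f"{label}: demand={in_flight:.0f} sessions, "
              f"queue grows by {growth:.2f} req/s")
```

Under these assumed numbers, the arrival rate never changes, yet the longer session duration alone pushes demand past the limit and makes the queue grow without bound. That matches the symptom pattern above: high latency and queuing alerts while load on the Gitaly servers stays low.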