
Gitaly is down

  • Message in prometheus-alerts: "Gitaly is down on [hostname]"
  • Is the NFS file server running and accessible? Can you access it via a shell session?
  • Try to find the reason for the reboot.
    • Have a look at the Stackdriver GCE VM instance logs for cloudaudit system events and serial console output.
  • Check for zero-size object files.
    • This is necessary until this gets fixed.
    • Otherwise there will be errors with pushing, cloning, the web UI…
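cd /var/opt/gitlab/git-data/repositories/@hashed
# List zero-size object files older than one day, at reduced I/O priority
ionice -n 5 find . -regextype sed -regex ".*/objects/.*" -size 0 -mtime +1 > /var/tmp/zerofiles.txt
# For each affected repository, run the checks as the git user
cd <repo>
sudo -u git git fsck
# Delete every invalid ref reported by git fsck, then verify again
sudo -u git git update-ref -d <invalid_ref_found_by_git_fsck>
sudo -u git git fsck --full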
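The list in /var/tmp/zerofiles.txt contains object paths, not repositories, so it still has to be reduced to the set of repositories to visit. A minimal sketch of that step follows. It assumes every line has the form ./<dir>/<dir>/<hash>.git/objects/..., which is what the find command above should produce; the function names repos_with_zero_files and fsck_listed_repos are hypothetical helpers, not part of GitLab or Gitaly.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print the unique repository directories named in a list of object paths.
# $1: file produced by the find command (one path per line).
repos_with_zero_files() {
  # Strip everything from the first "/objects/" onward, then de-duplicate.
  sed -n 's|/objects/.*$||p' "$1" | LC_ALL=C sort -u
}

# Run git fsck as the git user in every listed repository.
# Must be called from the directory the paths are relative to
# (here: /var/opt/gitlab/git-data/repositories/@hashed).
fsck_listed_repos() {
  repos_with_zero_files "$1" | while IFS= read -r repo; do
    echo "== $repo"
    sudo -u git git -C "$repo" fsck || echo "fsck reported problems in $repo" >&2
  done
}
```

Calling fsck_listed_repos /var/tmp/zerofiles.txt from the @hashed directory would then check each affected repository once. Deleting invalid refs is left as a manual step, since it depends on what git fsck reports.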
  • Check Sentry for unusual errors
  • Check Kibana for increased error rates
  • Check the Gitaly service logs on the affected host
    • grep for SIGSEGV or SIGILL in /var/log/gitlab/gitaly/
  • Check the Grafana dashboards for a possible cause of this outage
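The log check for SIGSEGV or SIGILL can be sketched as a small helper. This is an illustration under assumptions: the function name scan_gitaly_crashes is hypothetical, and it searches plain-text files only, so compressed rotated logs would need zgrep instead.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.

# Print every log line mentioning SIGSEGV or SIGILL, with file name and line number.
# $1: log directory (defaults to the path named in the runbook).
# Exit status is 0 if at least one match was found, non-zero otherwise.
scan_gitaly_crashes() {
  grep -rEn 'SIGSEGV|SIGILL' "${1:-/var/log/gitlab/gitaly/}"
}
```

Running scan_gitaly_crashes with no argument on the affected host searches /var/log/gitlab/gitaly/; no output and a non-zero exit status mean that neither signal appears in the uncompressed logs.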

3. Ensure that the Gitaly server process is running

  • Can you see the process in ps aux | grep gitaly?
  • Is the Prometheus port responding? Does curl https://localhost:9236/metrics return a response?
  • Attempt to restart the Gitaly service: sudo gitlab-ctl restart gitaly
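The process and metrics checks above can be combined into two small probes. This is a sketch under assumptions: the function names are hypothetical, the process is assumed to be named gitaly, and the default URL is the one given in this runbook, so adjust it if the metrics endpoint on your host is served differently.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Succeeds if a process with exactly this name is running.
# $1: process name (assumed to be "gitaly" by default).
gitaly_process_running() {
  pgrep -x "${1:-gitaly}" >/dev/null 2>&1
}

# Succeeds if the metrics endpoint answers with a successful HTTP status
# within five seconds.
# $1: metrics URL (defaults to the URL from the runbook).
gitaly_metrics_responding() {
  curl --silent --fail --max-time 5 --output /dev/null \
    "${1:-https://localhost:9236/metrics}"
}
```

If either probe fails on the affected host, the next step in the list applies: restart the service with sudo gitlab-ctl restart gitaly and run both probes again.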