GitalyFileServerDown
Overview
This alert indicates that the Gitaly file server is down. It’s considered a high-severity issue that requires immediate attention. Every user with a project on the Gitaly server may be unable to use GitLab.com.
Services
- Service Overview
- Owner: Gitaly Team
- Label: gitlab-com/gl-infra/production~"Service::Gitaly"
Metrics
The GitalyFileServerDown alert monitors the status of the Gitaly service on a node and triggers if the service has been down for more than 15 minutes.
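The "down for more than 15 minutes" condition can be illustrated with a minimal sketch. It is not the actual alert rule, which is not shown on this page; the function name `should_fire` and the `(timestamp, is_up)` sample format are hypothetical and chosen only for illustration. The idea is that any healthy sample resets the timer, so only an uninterrupted outage longer than the threshold fires.

```python
from datetime import datetime, timedelta

# Threshold taken from the alert description above.
FOR_DURATION = timedelta(minutes=15)


def should_fire(samples, now, for_duration=FOR_DURATION):
    """Decide whether the alert condition holds.

    samples: list of (timestamp, is_up) tuples, sorted oldest first.
    Returns True only when the service has been continuously down
    for longer than for_duration as of `now`.
    """
    down_since = None
    for ts, is_up in samples:
        if is_up:
            down_since = None      # any healthy sample resets the timer
        elif down_since is None:
            down_since = ts        # first sample of the current outage
    return down_since is not None and now - down_since > for_duration
```

A short blip that recovers within the window therefore never fires, which is consistent with the alert being rare.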
Alert Behavior
- This alert should be rare, but if it is triggered, it needs to be investigated immediately.
Severities
- This alert might create S1 incidents.
- Some GitLab.com users might be impacted.
- Review the Incident Severity Handbook page to identify the required Severity Level.
Verification
Recent changes
Troubleshooting
- Basic troubleshooting steps
- Additional logs to check
- Check if the Gitaly process is running and Prometheus is responding to requests
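The last check above can be sketched as a small script run on the affected node. This is a rough illustration under assumptions, not an official procedure: the helper names `process_running` and `endpoint_responds` are hypothetical, the process name `gitaly` and any metrics URL you pass in must be confirmed against the node's real configuration, and the process check reads `/proc`, so it only works on Linux.

```python
import http.client
import os
import urllib.request


def process_running(name, proc_root="/proc"):
    """Return True if any process under proc_root reports `name` as its command name."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return False
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited or is not readable
    return False


def endpoint_responds(url, timeout=3.0):
    """Return True if an HTTP GET to url returns a 2xx status within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

For example, `process_running("gitaly")` followed by `endpoint_responds("http://localhost:9236/metrics")` would cover both halves of the check, where the address `localhost:9236` is an assumed metrics listen address and may differ on your nodes.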
Possible Resolutions
Dependencies
There is no external dependency for this alert.
For escalation contact the following channels:
Alternative slack channels: