GitLab Runbooks
Orbit Service
Alerts: https://alerts.gitlab.net/#/alerts?filter=%7Btype%3D%22orbit%22%2C%20tier%3D%22inf%22%7D
Label: gitlab-com/gl-infra/production~"Service::Orbit"
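The filter in the Alerts link is a percent-encoded label selector; decoded, it reads {type="orbit", tier="inf"}. The following is a minimal sketch of how such a link could be rebuilt or adapted for other label values. The helper name alerts_url and its matchers parameter are hypothetical and not part of any GitLab tooling; only the base URL and the two label values come from this page.

```python
from urllib.parse import quote, unquote

# Base of the alerts dashboard link shown on this page.
BASE = "https://alerts.gitlab.net/#/alerts"


def alerts_url(matchers: dict) -> str:
    """Hypothetical helper: build a dashboard link from label matchers."""
    # Assemble a selector such as {type="orbit", tier="inf"}.
    selector = "{" + ", ".join(f'{k}="{v}"' for k, v in matchers.items()) + "}"
    # Percent-encode every reserved character, including braces and quotes.
    return f"{BASE}?filter={quote(selector, safe='')}"


url = alerts_url({"type": "orbit", "tier": "inf"})
decoded_filter = unquote(url.split("filter=", 1)[1])

print(url)
print(decoded_filter)
```

Decoding the filter back with unquote is a quick way to check which labels a shared link actually selects before relying on it during an incident.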