component_saturation_slo_out_of_bounds:pgbouncer_single_core
Overview
The pgbouncer_single_core metric measures the total CPU usage of the PgBouncer process. PgBouncer is a single-threaded application, so once the process fully consumes one CPU core it is saturated, regardless of how many cores the node has. This can cause slowness or timeouts on database requests, and additional pgbouncer nodes may need to be provisioned.
This alert can lead to a high-severity incident and needs to be investigated.
Services
- Service Overview
- Team that owns the service: Core Platform:Database Group
- Label: gitlab-com/gl-infra/production~"Service::Pgbouncer"
Metrics
- The alert template expression measures the total CPU usage of the PgBouncer process. This template is used to auto-generate alerts for different environments and service types (Patroni or Pgbouncer)
- The generated alert measures the CPU usage rate of pgbouncer processes, sums this usage across CPUs and modes, and clamps the result between 0 and 1 (0-100% CPU usage). Examples of generated alerts for Patroni and for Pgbouncer
- The alert has both a soft and a hard SLO defined
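The calculation described above (rate of CPU usage, summed across CPUs and modes, clamped to the 0-1 range) can be sketched in Python. This is only an illustration of the arithmetic, not the actual alert expression: the `CpuSample` shape, the function name, and the sample values are assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical sample shape (an assumption for this sketch): cumulative CPU
# seconds consumed by the process, keyed by (cpu, mode).
CpuSample = Dict[Tuple[str, str], float]


def single_core_saturation(
    earlier: CpuSample, later: CpuSample, interval_seconds: float
) -> float:
    """Return the CPU saturation of one single-threaded process, in [0, 1]."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    total_rate = 0.0
    for key, later_value in later.items():
        # A series missing from the earlier sample contributes no delta.
        earlier_value = earlier.get(key, later_value)
        delta = later_value - earlier_value
        if delta < 0:
            # Counter went backwards (e.g. process restart): count from zero.
            delta = later_value
        # Per-series rate in CPU-seconds per second, summed over cpus and modes.
        total_rate += delta / interval_seconds

    # Clamp to [0, 1], i.e. 0-100% of a single core.
    return min(1.0, max(0.0, total_rate))


# Illustrative values only: two samples taken 60 seconds apart.
earlier = {("0", "user"): 100.0, ("0", "system"): 50.0}
later = {("0", "user"): 130.0, ("0", "system"): 65.0}
saturation = single_core_saturation(earlier, later, 60.0)
```

Because the process is single-threaded, the summed rate should not meaningfully exceed one core; the clamp keeps measurement jitter from reporting more than 100%.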
Alert Behavior
- This alert should be rare, but if it is triggered, it needs to be investigated immediately
Severities
- This alert might create S2 incidents
- Many gitlab.com users might be affected, as requests to the backend database may be delayed
- Review the Incident Severity Handbook page to identify the required severity level
Verification
Under normal conditions, saturation is expected to be as low as possible. The closer the saturation is to 1, the higher the load and the more likely it is that additional nodes need to be provisioned.
- Grafana Mimir Explore query for “patroni” service type in “gprd” environment
- Grafana Mimir Explore query for “pgbouncer” service type in “gprd” environment
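When reading a saturation value from one of these queries, it can help to compare it against the soft and hard SLO thresholds. The sketch below shows one way to do that in Python. The function name, the returned labels, the use of `>=` for a breach, and the threshold values in the usage example are all assumptions for illustration; the real thresholds are defined in the alert configuration.

```python
def classify_saturation(saturation: float, soft_slo: float, hard_slo: float) -> str:
    """Classify a saturation value (0-1) against soft and hard SLO thresholds."""
    if not 0.0 <= saturation <= 1.0:
        raise ValueError("saturation must be between 0 and 1")
    if not 0.0 < soft_slo <= hard_slo <= 1.0:
        raise ValueError("expected 0 < soft_slo <= hard_slo <= 1")

    # Assumption: a threshold counts as breached when the value reaches it.
    if saturation >= hard_slo:
        return "hard_slo_breached"
    if saturation >= soft_slo:
        return "soft_slo_breached"
    return "ok"


# Hypothetical thresholds, chosen only for this example.
SOFT_SLO = 0.8
HARD_SLO = 0.9

status = classify_saturation(0.85, SOFT_SLO, HARD_SLO)
```

A value in the soft range suggests planning for more capacity, while a value in the hard range matches the situation this alert describes, where the process is close to consuming its single core.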
Recent changes
Troubleshooting
- Pgbouncer Service Overview Dashboard
- Pgbouncer Runbook docs
- Adding a new Pgbouncer instance
- Pgbouncer connection management and troubleshooting
Possible Resolutions
Dependencies
- There are no external or internal dependencies for this alert
For escalation contact the following channels:
Alternative Slack channels: