Pulp SLIs

Pulp uses Service Level Indicators (SLIs) to monitor the health and performance of its components. The SLIs are defined in the metrics catalog and measure the availability and latency of the service.

Overview

The Pulp service has the following SLIs:

SLI	Description	Type
`pulp_app_api`	Pulp application API service requests	Apdex + Error Rate
`pulp_app_content`	Pulp application content API requests	Apdex + Error Rate
`pulp_nginx`	Nginx ingress controller load balancer requests	Apdex + Error Rate
`pulp_cloudsql`	GCP CloudSQL PostgreSQL database operations	Request Rate
`pulp_gcs`	GCS bucket storage operations	Request Rate
`pulp_redis`	GCP Redis Memorystore caching operations	Request Rate

Observability

The SLIs appear in the Pulp Overview dashboard.

Application SLIs

pulp_app_api

The pulp_app_api SLI monitors the Pulp API service, which handles administrative operations such as repository management, content synchronization, and user management.

Metrics:

api_request_duration_milliseconds_bucket - Request latency histogram
api_request_duration_milliseconds_count - Total request count

Apdex Thresholds:

Significant Labels:

http_method - HTTP request method (GET, POST, PUT, DELETE, etc.)
http_target - API endpoint path
http_status_code - HTTP response status code

Error Rate: Tracks 5xx HTTP status codes as errors.
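
As a rough illustration of how a 5xx-based error rate can be derived, the sketch below computes the share of requests whose http_status_code falls in the 500 to 599 range. The function name and the sample counts are hypothetical; the real SLI is evaluated from the metrics catalog definition, not by this code.

```python
def error_ratio(requests_by_status: dict[str, float]) -> float:
    """Return the share of requests that ended in a 5xx status code."""
    total = sum(requests_by_status.values())
    if total == 0:
        return 0.0  # no traffic: report no errors rather than dividing by zero
    errors = sum(
        count
        for status, count in requests_by_status.items()
        if status.isdigit() and 500 <= int(status) <= 599
    )
    return errors / total


# Hypothetical request counts keyed by the http_status_code label.
sample = {"200": 940.0, "404": 50.0, "500": 7.0, "503": 3.0}
print(error_ratio(sample))
```

Client errors such as 404 count toward the total but not toward the error numerator, which matches the "5xx only" rule stated above.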

pulp_app_content

The pulp_app_content SLI monitors the Pulp Content API service, which handles package downloads and content delivery to clients (e.g., yum/dnf clients fetching packages).

Metrics:

content_request_duration_milliseconds_bucket - Request latency histogram
content_request_duration_milliseconds_count - Total request count

Apdex Thresholds:

Satisfied: <= 10s

Significant Labels:

http_method - HTTP request method
http_route - Content API route
http_status_code - HTTP response status code

Error Rate: Tracks 5xx HTTP status codes as errors.
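
To make the Apdex threshold concrete, here is a minimal sketch of a satisfied-only score computed from cumulative histogram buckets. It assumes the histogram exposes a bucket whose upper bound is 10000 ms (10 s in the metric's unit) and that no separate "tolerated" threshold applies, since none is listed above. The function name and bucket counts are hypothetical.

```python
SATISFIED_MS = 10_000  # 10 s expressed in milliseconds, matching the metric's unit


def apdex_satisfied_only(cumulative_buckets: dict[float, float], total: float) -> float:
    """Satisfied requests divided by all requests.

    cumulative_buckets maps each histogram upper bound (the `le` label, in ms)
    to the cumulative number of requests at or below that bound.
    """
    if total == 0:
        return 1.0  # assumption: no traffic is treated as fully satisfied
    if SATISFIED_MS not in cumulative_buckets:
        raise KeyError(f"no bucket with upper bound {SATISFIED_MS} ms")
    return cumulative_buckets[SATISFIED_MS] / total


# Hypothetical cumulative bucket counts for content_request_duration_milliseconds_bucket.
buckets = {1_000: 800.0, 5_000: 950.0, 10_000: 990.0, float("inf"): 1000.0}
score = apdex_satisfied_only(buckets, total=1000.0)
print(score)
```

Because histogram buckets are cumulative, the count in the 10000 ms bucket already includes every faster request, so no summation across buckets is needed.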

Other Metrics

In addition to the SLIs above, the following metrics are available in the Pulp Overview dashboard.

Task Queue Metrics

Longest Unblocked Task Wait Time

Tracks how long the oldest unblocked task has been waiting in the queue. Lower values are better.

Metric:

tasks_longest_unblocked_time_seconds{namespace="pulp"}

Unblocked Task Queue Length

Tracks the number of unblocked tasks waiting to be processed. Lower values are better.

Metric:

tasks_unblocked_queue{namespace="pulp"}
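
If these metrics are served by a Prometheus-compatible server, they can be fetched outside the dashboard through the instant-query HTTP endpoint (/api/v1/query). The sketch below only builds the request URLs; the base URL and the helper name are placeholders, not values taken from the Pulp deployment.

```python
from urllib.parse import urlencode

# Hypothetical Prometheus base URL; replace with the real endpoint.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"

# Selectors copied from the task queue metrics described above.
TASK_QUEUE_QUERIES = {
    "longest_wait_seconds": 'tasks_longest_unblocked_time_seconds{namespace="pulp"}',
    "queue_length": 'tasks_unblocked_queue{namespace="pulp"}',
}


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


for name, promql in TASK_QUEUE_QUERIES.items():
    print(name, instant_query_url(PROMETHEUS_URL, promql))
```

The selector is URL-encoded because characters such as braces and double quotes are not safe in a query string.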

Infrastructure SLIs

pulp_nginx

The pulp_nginx SLI monitors the Nginx ingress controller that load balances traffic to Pulp services.

Metrics:

nginx_ingress_controller_request_duration_seconds_bucket - Request latency histogram
nginx_ingress_controller_requests - Total request count

Apdex Thresholds:

Satisfied: <= 10s

Significant Labels:

method - HTTP method
path - Request path
status - HTTP status code
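
Note that this histogram is recorded in seconds, while the application histograms are recorded in milliseconds, so the same 10 s threshold corresponds to a different `le` bucket boundary. The sketch below assembles a PromQL expression for the satisfied ratio; the 5m window, the exact `le` label formatting, and the helper name are assumptions, not the catalog's definition.

```python
def satisfied_ratio_promql(
    bucket_metric: str, count_metric: str, le: str, window: str = "5m"
) -> str:
    """Build a PromQL expression: requests within the threshold / all requests."""
    satisfied = f'sum(rate({bucket_metric}{{le="{le}"}}[{window}]))'
    total = f"sum(rate({count_metric}[{window}]))"
    return f"{satisfied} / {total}"


# The nginx histogram is in seconds, so the 10 s threshold is le="10".
nginx_query = satisfied_ratio_promql(
    bucket_metric="nginx_ingress_controller_request_duration_seconds_bucket",
    count_metric="nginx_ingress_controller_requests",
    le="10",
)
print(nginx_query)
```

For a millisecond histogram such as api_request_duration_milliseconds_bucket, the equivalent call would pass le="10000", assuming a bucket with that boundary exists.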

pulp_cloudsql

The pulp_cloudsql SLI monitors the GCP CloudSQL PostgreSQL instance used by Pulp.

Metrics:

stackdriver_cloudsql_database_cloudsql_googleapis_com_database_postgresql_statements_executed_count

Significant Labels:

database_id - Cloud SQL database identifier
database - Database name
operation_type - Type of SQL operation

pulp_gcs

The pulp_gcs SLI monitors the GCS bucket used for package storage.

Metrics:

stackdriver_gcs_bucket_storage_googleapis_com_api_request_count

Significant Labels:

bucket_name - GCS bucket name
method - API method

pulp_redis

The pulp_redis SLI monitors the GCP Redis Memorystore instance used for caching and session management.

Metrics:

stackdriver_redis_instance_redis_googleapis_com_commands_calls

Significant Labels:

instance_id - Redis instance identifier
