Pulp Backup and Restore
Overview
This runbook covers backup and restore procedures for the Pulp service. Pulp’s backup strategy consists of two main components:
- Cloud SQL Database Backups - Automated backups of the PostgreSQL database
- Object Storage - Artifacts stored in GCS buckets with built-in redundancy
Important Notes
We do not use the native Pulp Operator for backup and restoration. Instead, we rely solely on the mechanisms provided by our cloud provider. Refer to the Pulp Operator documentation for additional details on why we are not using the Pulp Operator for backups.
Backup Configuration
Backups should be configured via Terraform.
Database Restore Procedure
Prerequisites
- Access to GCP Cloud SQL Console
- Appropriate IAM permissions for database restoration
- Pulp CLI configured and authenticated
- Understanding that the application will be degraded during restore
Restore Steps
Follow GCP’s documentation for restoring from backup.
1. Document Current State (Recommended for Validation)
Pod Counts
Record the current Pod counts so that the Deployments can be scaled down and then scaled back up to the same values after the restore.
    kubectl get deploy -n pulp
Document the desired Pod counts. Note that these Deployments do NOT use Horizontal Pod Autoscalers.
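To make the documented counts easy to reuse when scaling back up, they can be written to a file. The following is a minimal sketch, assuming `kubectl` is already pointed at the right cluster; the function name `save_replica_counts` and the output file name are hypothetical and not part of this runbook.

```shell
# Sketch: record the desired replica count of every Deployment in a namespace.
# save_replica_counts is a hypothetical helper, not an existing tool.
save_replica_counts() {
  # $1: namespace, $2: output file
  # Writes one "name<TAB>replicas" line per Deployment, read from .spec.replicas.
  kubectl get deploy -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.replicas}{"\n"}{end}' \
    > "$2"
}

# Example (file name is an assumption):
#   save_replica_counts pulp pre-restore-replicas.tsv
```

Because the Deployments do not use Horizontal Pod Autoscalers, `.spec.replicas` is the value to restore after the database restore completes.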
Database Pre-Restore Analysis
The right checks depend on the scenario, so the commands below are only an example. Decide how you will validate that the restore succeeded, then document the current state before restoring so it can be compared afterwards:
    # List current users (example verification)
    pulp user list | jq '.[].username' 2>/dev/null || pulp user list
    # Example output:
    # "user1"
    # "user2"
    # "admin"
    # Check system status
    pulp status
Save this output for comparison after the restore completes.
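One way to keep this output in a form that is easy to compare later is to store a sorted list of usernames in a file. This is a sketch only: it assumes, as the example above does, that `pulp user list` prints a JSON array of objects with a `username` field, and the helper name `snapshot_users` and the file name are hypothetical.

```shell
# Sketch: save a sorted list of usernames, one per line, for later comparison.
# snapshot_users is a hypothetical helper, not part of pulp-cli.
snapshot_users() {
  # $1: output file
  # jq -r prints raw strings (no surrounding quotes); sort makes the file
  # stable so that two snapshots can be compared line by line.
  pulp user list | jq -r '.[].username' | sort > "$1"
}

# Example (file name is an assumption):
#   snapshot_users pre-restore-users.txt
```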
2. Scale down Pulp
Scale down the Pods to prevent any interference while the database is being restored.
    kubectl patch pulp pulp -n pulp --type='merge' -p='
    {"spec":{
      "api":{"replicas":0},
      "content":{"replicas":0},
      "web":{"replicas":0},
      "worker":{"replicas":0}
    }}'
3. Perform the Restore
- In the GCP Console, navigate to your Cloud SQL instance
- Click on “Backups” in the left sidebar
- Select the backup you want to restore from (verify the timestamp)
- Click “Restore”
- Confirm the restoration
Note: Restoration time varies with database size. For small databases (under roughly 1 GB), expect approximately 10 minutes. Larger databases may take significantly longer.
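The same restore can also be driven from the `gcloud` CLI instead of the Console. The following is a hedged sketch rather than the procedure this runbook prescribes: the instance name and backup ID are placeholders to fill in, and the wrapper function names are hypothetical.

```shell
# Sketch: list backups for a Cloud SQL instance and restore one of them.
# list_backups and restore_backup are hypothetical wrappers around gcloud.
list_backups() {
  # $1: Cloud SQL instance name
  gcloud sql backups list --instance="$1"
}

restore_backup() {
  # $1: backup ID (taken from the list output), $2: instance to restore into
  # gcloud asks for confirmation because the restore overwrites current data.
  gcloud sql backups restore "$1" --restore-instance="$2"
}

# Example (both values are placeholders):
#   list_backups INSTANCE_NAME
#   restore_backup BACKUP_ID INSTANCE_NAME
```

As with the Console flow, verify the backup timestamp in the list output before restoring.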
4. Scale up Pulp
Scale Pulp back up, replacing the replica counts below with the values documented earlier:
    kubectl patch pulp pulp -n pulp --type='merge' -p='
    {"spec":{
      "api":{"replicas":1},
      "content":{"replicas":1},
      "web":{"replicas":1},
      "worker":{"replicas":1}
    }}'
5. Verify the Restore
Once the restore completes and pods are stable:
- Wait for all pods to reach Ready state:
    kubectl get pods -n pulp -w
- Verify database connectivity using the pulp-cli:
    pulp status
- Verify data integrity by comparing with the pre-restore state (the below is an example only):
    # Check users match the backup timestamp
    pulp user list | jq '.[].username'
- Confirm the data matches the backup timestamp (data created after the backup should not exist)
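The comparison can be made mechanical if the pre-restore and post-restore user lists are saved as sorted files with one username per line. This is a minimal sketch under that assumption; the function names and file names are hypothetical.

```shell
# Sketch: compare two sorted username lists (one name per line).
# Both helpers are hypothetical; comm requires sorted input.
users_removed_by_restore() {
  # $1: pre-restore file, $2: post-restore file
  # Present before the restore but absent after it. Expected only for
  # users created after the backup timestamp.
  comm -23 "$1" "$2"
}

users_returned_by_restore() {
  # $1: pre-restore file, $2: post-restore file
  # Absent before the restore but present after it, for example users
  # deleted after the backup was taken.
  comm -13 "$1" "$2"
}

# Example (file names are assumptions):
#   users_removed_by_restore pre-restore-users.txt post-restore-users.txt
```

A name reported by the first function that was created before the backup timestamp suggests the restore did not use the intended backup.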
Post-Restore Actions
- Monitor application logs for any persistent errors
- Verify that all Pulp services are functioning correctly
- Test critical workflows (e.g., package uploads, downloads)
- Document the restore in an incident issue
Object Storage Restore Procedure
GCS buckets used by Pulp benefit from GCP’s built-in redundancy features. In the event of storage issues:
- Verify bucket configuration and replication settings
- Check GCS availability and durability documentation
- Review the Terraform configuration to identify backup bucket settings and replication configuration. If data loss is confirmed, coordinate with the infrastructure team to restore from replicated buckets.
- Contact GCP support if data loss is suspected
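For the first step, the bucket's current settings can be inspected from the command line as well as in Terraform. The sketch below assumes the `gcloud` CLI is installed and authenticated; the bucket name is a placeholder and the helper name is hypothetical.

```shell
# Sketch: print the configuration of a GCS bucket (location, storage class,
# and other settings) for review against the Terraform configuration.
# describe_bucket is a hypothetical wrapper around gcloud.
describe_bucket() {
  # $1: bucket name without the gs:// prefix
  gcloud storage buckets describe "gs://$1"
}

# Example (bucket name is a placeholder):
#   describe_bucket BUCKET_NAME
```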
References
- Pulp Operator Backup Documentation
- GCP CloudSQL Backup and Recovery
- GCS Availability and Durability
- Pulp Terraform Module
- Pulp Helm Chart
- Pulp Helmfile