Usage Billing Enrichment & Consumption - Production Runbook
Overview
This runbook provides a quick reference for debugging and resolving production issues in the usage billing enrichment & consumption pipeline. The system processes raw usage events through enrichment and consumption stages to calculate and deduct GitLab credits from customer wallets.
Pipeline Overview
The usage billing pipeline consists of three main stages:

```
Raw Events (ClickHouse) → Enrichment (every 1 hour) → Consumption (wallet deduction every 1 hour, 10 mins after enrichment)
```

Pipeline Stages
- Raw Events: Usage events are ingested into ClickHouse from the Data Insights Platform (DIP)
- Enrichment: Events are enriched with subscription context and GitLab credit calculations
- Consumption: Enriched events trigger wallet deductions for customers
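The stage cadence above can be sketched in plain Ruby to estimate when a given raw event should have been enriched and consumed. This is only a rough model: it assumes enrichment fires at the top of each hour (UTC), which this runbook does not state, and the helper names are hypothetical.

```ruby
require 'time'

ENRICHMENT_INTERVAL = 3600 # seconds; enrichment runs every hour
CONSUMPTION_OFFSET  = 600  # seconds; consumption runs 10 minutes after enrichment

# Next enrichment run at or after the event's ingestion time.
# Assumption: runs are aligned to the top of the hour (UTC).
def expected_enrichment_at(ingested_at)
  seconds = ingested_at.to_i
  Time.at(seconds + (-seconds % ENRICHMENT_INTERVAL)).utc
end

# Consumption is expected 10 minutes after the matching enrichment run.
def expected_consumption_at(ingested_at)
  expected_enrichment_at(ingested_at) + CONSUMPTION_OFFSET
end

event_time = Time.parse('2025-06-19 10:17:00 UTC')
puts expected_enrichment_at(event_time).iso8601
puts expected_consumption_at(event_time).iso8601
```

If an event is still missing from a stage well after the estimated time, the matching troubleshooting scenario is the place to start.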
Diagram
```mermaid
graph TD
    A["GitLab Services<br/>(emit usage events)"] --> AIGW["AIGW<br/>(Gateway)"]
    AIGW --> B["Data Insights Platform"]
    B --> C["ClickHouse<br/>Raw Events"]
    C -->|Enrich<br/>hourly| D["Add subscription<br/>& consumer context"]
    D -->|Enriched Events| E["ClickHouse<br/>Enriched Data"]
    E -->|Consume<br/>hourly + 10min| F["Deduct credits<br/>from wallets"]
    F -->|Update| G["PostgreSQL<br/>Wallet Balances"]
    G -->|Check balance| H["Access Control"]
    H -->|allow/deny| I["GitLab.com<br/>Feature Access"]
    F -->|Real-time<br/>on transaction| J["Check usage<br/>thresholds"]
    J -->|50%/80%/100%| K["Send threshold<br/>notifications"]
    K -->|Email| L["Customer<br/>Notification"]
    M["Monthly<br/>trigger"] -->|Send| N["Generate monthly<br/>usage report"]
    N -->|Email| L
    G -->|Monthly| O["Query overages<br/>for billing"]
    O -->|Submit| P["Zuora<br/>Usage Import"]
    style A fill:#e1f5ff
    style AIGW fill:#e1f5ff
    style C fill:#fff3e0
    style E fill:#fff3e0
    style G fill:#f3e5f5
    style H fill:#e8f5e9
    style I fill:#e8f5e9
    style L fill:#fce4ec
    style P fill:#ede7f6
```
Key Tables & Their Purpose
| Table/View | Location | Purpose |
|---|---|---|
| `raw_billing_usage_messages` | ClickHouse | Raw message blobs from DIP |
| `raw_billing_usage_events` | ClickHouse | Extracted structured events (populated via materialized view) |
| `usage_billing_enriched` | ClickHouse | Events + subscription context + calculated GitLab credits |
| `consumer_wallets` | PostgreSQL | Current balance for consumer wallets |
| `subscription_wallets` | PostgreSQL | Current balance for subscription wallets |
| `wallet_transactions` | PostgreSQL | Recent transactions |
| `zuora_overage_submissions` | PostgreSQL | Zuora submission uploads |
| `zuora_overage_records` | PostgreSQL | Overage submissions corresponding to individual subscriptions |
Monitoring & Alerts
Dashboards
- DIP Dashboard
- ClickHouse query performance dashboard (TODO: Add link)
- Job execution monitoring (TODO: Add link)
Staging Kibana Links
Production Kibana Links
Alert Thresholds
TODO: Add alert thresholds and escalation paths once monitoring is configured
Troubleshooting Scenarios
- Events Not Arriving in ClickHouse
- Events Not Enriched
- Events Not Consumed
- Incorrect Wallet Balance
- Performance Degradation
- Subscription Not Submitted via Zuora Overage Submission
- Zuora Overage Submission Failed
- Access Cutoff Flow Issues
- Usage Notifications Not Sent
Events Not Arriving in ClickHouse
Symptoms:
- No records in `raw_billing_usage_events`
- Stale dashboard data
Debug Commands:
Check how many records are in the table:
```sql
SELECT count(*), max(IngestionTimestamp)
FROM raw_billing_usage_events
WHERE toDate(IngestionTimestamp) = today()
```

Resolution:
Identify where the data flow is breaking by checking each integration point:
- GitLab to AIGW: Verify source services are emitting usage events, then review AIGW Logs for transmission errors
- AIGW to DIP: Refer to DIP Runbook for pipeline issues
- DIP to ClickHouse: Verify ClickHouse connectivity and check DIP dashboard
- Escalate if needed: Contact the Analytics team
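The count and latest timestamp returned by the debug query can be read with a small helper like the one below. It is a sketch only: the two-hour staleness threshold and the method name are assumptions, not values taken from the pipeline configuration.

```ruby
require 'time'

# Hypothetical threshold: treat ingestion as stale when nothing has
# arrived for more than two hours. Tune this to real traffic.
STALE_AFTER_SECONDS = 2 * 3600

# count:            the count(*) value from the debug query
# latest_ingestion: the max(IngestionTimestamp) value as a Time
def ingestion_status(count, latest_ingestion, now: Time.now.utc)
  return :no_events_today if count.zero?

  (now - latest_ingestion) > STALE_AFTER_SECONDS ? :stale : :healthy
end

now = Time.parse('2025-06-19 13:00:00 UTC')
puts ingestion_status(1200, Time.parse('2025-06-19 12:55:00 UTC'), now: now)
puts ingestion_status(1200, Time.parse('2025-06-19 09:30:00 UTC'), now: now)
```

A `:stale` or `:no_events_today` result is the cue to walk the integration points listed in the resolution steps.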
Events Not Enriched
Symptoms:
- Records exist in `raw_billing_usage_events`
- Missing from `usage_billing_enriched`
- No subscription context
Debug Commands:
```sql
SELECT *
FROM usage_billing_enriched
WHERE toDate(EnrichedAt) = today()
```

Check Enrichment Logs in Kibana:
Search for these error patterns:
- “Failed to enrich event”
- “Event missing required identifier”
- “Missing subscription for event_id”
- “Failed to find or create a consumer for event”
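When a Kibana search returns many lines, tallying them by the patterns above shows which failure dominates. The sketch below works on plain strings exported from the logs. The sample messages are made up for illustration, and the real log format may differ.

```ruby
ENRICHMENT_ERROR_PATTERNS = [
  'Failed to enrich event',
  'Event missing required identifier',
  'Missing subscription for event_id',
  'Failed to find or create a consumer for event'
].freeze

# Counts how many messages contain each known error pattern.
# Messages matching none of the patterns are ignored.
def tally_enrichment_errors(messages)
  counts = Hash.new(0)
  messages.each do |message|
    pattern = ENRICHMENT_ERROR_PATTERNS.find { |p| message.include?(p) }
    counts[pattern] += 1 if pattern
  end
  counts
end

# Hypothetical log lines, for illustration only
sample = [
  'Missing subscription for event_id abc-1',
  'Missing subscription for event_id abc-2',
  'Event missing required identifier',
  'Enrichment finished'
]
tally_enrichment_errors(sample).each { |pattern, count| puts "#{count}  #{pattern}" }
```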
Kibana Links (Staging):
Kibana Links (Production):
Resolution:
Note: These steps can be done with read-only Rails console access to CDot production.
Step through the EnrichmentService steps manually in a Rails console to identify root cause if not clear from logs.
Re-run enrichment for all un-enriched events in a time period:
```ruby
Billing::Usage::EnrichmentCoordinatorJob.perform_now(
  start_time: '2025-06-19 10:00:00 UTC',
  end_time: '2025-06-19 13:00:00 UTC'
)
```

Re-run enrichment for specific event(s) in a time period:

```ruby
# Fetch events (more custom querying required to find individual/specific events)
# start_time and end_time must be set to the window of interest first
events = Billing::Usage::RawEvents.new.fetch_batch_for_enrichment(
  start_time: start_time,
  end_time: end_time
)

# Execute enrichment
Billing::Usage::EnrichmentService.new(events).execute
```

Events Not Consumed
Symptoms:
- Records exist in `usage_billing_enriched`
- No wallet transactions
- Dashboard shows 0 or inaccurate usage
Debug Steps:
1. Find the Consumer/Subscription Wallets
```ruby
# Find consumer by subscription name...
consumer = Consumer.for_subscription('subscription_name').first
# ...or by user ID
consumer = Consumer.with_user_id(entity_id).first

# Find consumer wallet
consumer_wallet = consumer.wallet

# Find subscription wallets
monthly_commitment_wallet = SubscriptionWallet.find_by(subscription_name: '<subscription_name>', category: 'monthly_commitment')
otc_wallet = SubscriptionWallet.find_by(subscription_name: '<subscription_name>', category: 'otc')
overage_wallet = SubscriptionWallet.find_by(subscription_name: '<subscription_name>', category: 'overage')
```

2. Check Balance and Transactions

```ruby
# Check wallet balance (wallet is any of the wallets found in step 1)
wallet.balance

# Review all transactions
wallet.transactions
```

3. Check Consumption Logs in Kibana
Search for these error patterns:
- “No consumers found for consumption event”
- “No subscription found for consumption event”
Kibana Links (Staging):
Kibana Links (Production):
Resolution:
Note: These steps can be done with read-only Rails console access to CDot production.
```ruby
Billing::Usage::ConsumptionCoordinatorJob.perform_now(
  start_time: '2025-06-19 10:00:00 UTC',
  end_time: '2025-06-19 13:00:00 UTC'
)
```

Incorrect Wallet Balance
Symptoms:
- Transactions exist
- Balance doesn’t match expected
- Customer reports wrong totals
Debug Commands:
```ruby
# Find consumer by subscription name...
consumer = Consumer.for_subscription('subscription_name').first
# ...or by user ID
consumer = Consumer.with_user_id(entity_id).first

# Check wallet balance
consumer_wallet = consumer.wallet
consumer_wallet.balance

# Review all transactions
consumer_wallet.transactions

# Check credits added vs deducted
credits_added = Wallets::Transaction.where(wallet_id: consumer_wallet.id).credits_added.sum(:amount)
credits_deducted = Wallets::Transaction.where(wallet_id: consumer_wallet.id).credits_deducted.sum(:amount)

# Check for expired credits affecting balance
expired = Wallets::Transaction.where(wallet_id: consumer_wallet.id).where('expires_at < ?', Time.current)
expired.sum(:amount)

# Find transactions by date range
Wallets::Transaction.where(wallet_id: consumer_wallet.id).between_created_dates(Date.today - 30.days, Date.today)
```

Resolution:
- Check for missing allocations or expired credits
- Verify transaction calculations match expected consumption
- Investigate `Billing::Usage::ConsumptionService` or `Billing::Usage::ConsumptionProcessingService` if discrepancies are found
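The added, deducted, and expired totals from the debug commands can be compared against the reported balance with a small calculation. The sketch below uses a plain `Struct` in place of the real transaction model, so every name in it is hypothetical. How expiry actually feeds into the stored balance is not described in this runbook, so the expired total is reported separately rather than subtracted.

```ruby
# Stand-in for a wallet transaction (hypothetical shape)
Txn = Struct.new(:kind, :amount, :expires_at)

# Compares the net of all transactions with the balance the wallet reports.
def reconcile(transactions, reported_balance, now: Time.now)
  added    = transactions.select { |t| t.kind == :added }
  deducted = transactions.select { |t| t.kind == :deducted }
  expired  = added.select { |t| t.expires_at && t.expires_at < now }

  net = added.sum(&:amount) - deducted.sum(&:amount)
  {
    net_of_transactions: net,
    expired_credits: expired.sum(&:amount),
    unexplained_gap: net - reported_balance
  }
end

now = Time.utc(2025, 6, 19)
transactions = [
  Txn.new(:added, 100, Time.utc(2025, 5, 31)), # already expired
  Txn.new(:added, 50, nil),                    # never expires
  Txn.new(:deducted, 30, nil)
]
p reconcile(transactions, 20, now: now)
```

In this example the gap between the transaction net (120) and the reported balance (20) equals the expired credits (100), which points at expiry rather than a consumption bug.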
Performance Degradation
Symptoms:
- Jobs timing out
- Queue backlog
- Processing drift
Debug Steps:
- Check job duration metrics
- Monitor ClickHouse query performance (via CH Monitoring dashboard)
- Check batch sizes
Resolution:
- Short-term: Increase parallelization
- Long-term: Optimize queries, consider hourly aggregation
Subscription Not Submitted via Zuora Overage Submission
Symptoms:
- `SubmitOverageUsageJob` ran successfully
- Don’t see the subscription overage submitted in Zuora
Debug Steps:
1. Verify Subscription exists in Overage Records
If it doesn’t exist, the subscription was never picked for upload:
```ruby
Zuora::OverageRecord.where(
  subscription_name: subscription_name
).group(:zuora_overage_submission_id).count
```

2. Check Subscription Flags and Charge
Verify subscription flags:
- `usage_overage_billing_allowed__c` should be `nil` or `true`
- `usage_overage_terms_accepted__c` should be `true`
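The two flag rules can be written as a single predicate, which makes the `nil` case explicit. This is an illustrative sketch over a plain hash. The hash shape and method name are assumptions and do not reflect how the application reads these fields.

```ruby
# flags: hash keyed by the Zuora custom field names (hypothetical shape)
def overage_flags_ok?(flags)
  billing_allowed = flags['usage_overage_billing_allowed__c']
  terms_accepted  = flags['usage_overage_terms_accepted__c']

  # Billing allowed may be unset (nil) or true; terms must be explicitly true.
  (billing_allowed.nil? || billing_allowed == true) && terms_accepted == true
end

puts overage_flags_ok?('usage_overage_billing_allowed__c' => nil,
                       'usage_overage_terms_accepted__c' => true)
puts overage_flags_ok?('usage_overage_billing_allowed__c' => false,
                       'usage_overage_terms_accepted__c' => true)
```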
Verify Usage Charge exists on the Subscription (Monthly Commitment purchased):
```ruby
subscription.rate_plan_charges.where(charge_type: 'Usage').exists?
```

3. Check if Subscription Has Overage
The subscription might not have been picked up by the overage query. Check the overage query:
```ruby
billing_month = Time.current.beginning_of_month
billing_month_end = billing_month.end_of_month

overaged_subscriptions = SubscriptionWallet
  .overage
  .joins(
    <<~SQL.squish
      INNER JOIN (#{Wallets::Transaction.total_credits_deducted(billing_month, billing_month_end).to_sql}) credits_used
        ON credits_used.wallet_id = subscription_wallets.id
    SQL
  )
  .joins(
    <<~SQL.squish
      LEFT JOIN (#{Wallets::Transaction.total_overage_offset_credits_added(billing_month, billing_month_end).to_sql}) credits_added
        ON credits_added.wallet_id = subscription_wallets.id
    SQL
  )
  .where('COALESCE(credits_used.total, 0) > COALESCE(credits_added.total, 0)')
  .where(subscription_name: '<your subscription name>')
  .select(
    'subscription_wallets.id',
    'subscription_wallets.subscription_name',
    'GREATEST(COALESCE(credits_used.total, 0) - COALESCE(credits_added.total, 0), 0) AS overage'
  )
```

Resolution:
Based on findings:
- Set the subscription overage flags to the expected values (via Zuora)
- Add/purchase usage charge
- Create valid overage data for the subscription
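The arithmetic in the overage query from step 3 can be checked by hand with the totals for one wallet. The sketch below mirrors the `COALESCE` and `GREATEST` expressions in plain Ruby. The method names are hypothetical, and `nil` stands in for a SQL `NULL` total.

```ruby
# Mirrors: GREATEST(COALESCE(used, 0) - COALESCE(added, 0), 0)
def overage(credits_used, credits_added)
  [(credits_used || 0) - (credits_added || 0), 0].max
end

# Mirrors the WHERE clause: COALESCE(used, 0) > COALESCE(added, 0)
def picked_up?(credits_used, credits_added)
  (credits_used || 0) > (credits_added || 0)
end

puts overage(120, 100)    # deducted 120, offset credits 100
puts overage(80, nil)     # no offset credits were added
puts picked_up?(50, 70)   # offsets exceed usage
```

Because the query uses an `INNER JOIN` on the deducted totals, an overage wallet with no deductions in the billing month is not returned at all, regardless of this comparison.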
Zuora Overage Submission Failed
Symptoms:
- `SubmitOverageUsageJob` ran but the status is `Failed`
Debug Commands:
```ruby
sub = Zuora::OverageSubmission.last

# Verify created_at to ensure a submission was created
# sub.created_at should be close to Time.now
sub.created_at

# Verify your subscription was present in the submission
sub.zuora_overage_records

# Check the failure reason and Zuora response if any
sub.error_message
sub.zuora_response
sub.zuora_response.import_status
```

Check Zuora Import Status:
- If `import_status` is `Failed` (Zuora returned a failure):
  - Check https://test.zuora.com/platform/apps/com_zuora/usage?~(clearFilter~true)
  - Click on `Import Failed Records` to see the specific failure reason
  - Note: Sometimes it takes ~5-10 minutes for the failure to appear on the UI
Known Zuora Failure Reasons:
- Overage was once submitted and then was also “billed” for the same billing month (invoice created)
- Subscription data sent to Zuora is incorrect:
  - Start-date does not match the expected start date
  - Usage charge (on-demand GitLab credit SKU) does not exist for the subscription on Zuora
  - Subscription does not exist on Zuora
- Note: Zuora failure fails the entire batch even if one record has an issue
If check_import_status_url is empty:
- Upload failed before hitting Zuora
- Could be due to network issue or exception within the job
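The two cases above (Zuora rejected the import, or the upload never reached Zuora) can be told apart from the values read off the submission. The helper below is a sketch over plain values. The method name and the returned symbols are hypothetical, and it assumes `import_status` is the string `Failed` as displayed in this section.

```ruby
# import_status:           value read from the Zuora response, or nil
# check_import_status_url: URL stored for the import, or nil/empty
def classify_submission_failure(import_status:, check_import_status_url:)
  if check_import_status_url.to_s.strip.empty?
    :failed_before_reaching_zuora # network issue or exception within the job
  elsif import_status == 'Failed'
    :rejected_by_zuora            # inspect "Import Failed Records" in Zuora
  else
    :unknown
  end
end

puts classify_submission_failure(import_status: nil, check_import_status_url: nil)
puts classify_submission_failure(import_status: 'Failed',
                                 check_import_status_url: 'https://example.invalid/status')
```

The first outcome leads to the network-issue resolution, the second to the invalid-data resolution.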
Resolution:
For Network Issues:
```ruby
# Mark the submission as pending or delete it
sub.update!(status: 0)

# Re-run the job - it will delete this pending submission and create new ones
Billing::SubmitOverageUsageJob.perform_now(date: billing_month)
```

For Invalid Data:
- Identifying which subscription caused the issue is difficult and manual
- If an overage submission for a subscription already succeeded and was billed once (for example, the submission records were then deleted to test again), it cannot be resubmitted in the same billing month. Create a new subscription and test with that.
- For any other data issues, reach out to the Utilization team for assistance. The overage submission job verifies the date and usage charge of the subscription before submitting, so the submitted data should normally be valid.
Dry Run: Check Subscriptions to be Submitted to Zuora
Use Case: Preview which subscriptions would be submitted without actually submitting
Commands:
```ruby
# Dry run to check counts
Billing::Usage::Zuora::OverageSubmissionService.new(
  date: Date.today,
  dry_run: true
).execute

# Get subscription names
Billing::Usage::Zuora::OverageSubmissionService.new(
  date: Date.today,
  dry_run: true
).send(:overage_wallets).map(&:subscription_name)
```

Access Cutoff Flow Issues
Symptoms:
- Users losing access unexpectedly
- Access cutoff not enforced across subscriptions
- Overage wallet not working correctly
Possible Issues:
- Incorrect subscription flags - `usage_overage_billing_allowed__c` or `usage_overage_terms_accepted__c` misconfigured
- Missing wallets - Overage, monthly_commitment, or OTC wallets don’t exist
- Consumption job failures - Wallets not being deducted despite enriched events
- Feature flag disabled - Usage billing or cutoff feature flags turned off
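The possible issues above can be turned into a quick checklist once the values have been collected with the debug commands in this section. The sketch below works on a plain hash. The key names are hypothetical, and not every subscription necessarily needs all three wallet categories, so treat a missing-wallet finding as a prompt to investigate rather than a confirmed fault.

```ruby
EXPECTED_WALLET_CATEGORIES = %w[monthly_commitment otc overage].freeze

# snapshot: values copied by hand from the Rails console (hypothetical keys)
def access_cutoff_findings(snapshot)
  findings = []
  findings << 'usage billing feature flag is off' unless snapshot[:usage_billing_enabled]
  findings << 'usage cutoff feature flag is off' unless snapshot[:usage_cutoff_enabled]
  findings << 'overage terms not accepted' unless snapshot[:overage_terms_accepted]

  missing = EXPECTED_WALLET_CATEGORIES - snapshot.fetch(:wallet_categories, [])
  findings << "missing wallets: #{missing.join(', ')}" unless missing.empty?
  findings
end

snapshot = {
  usage_billing_enabled: true,
  usage_cutoff_enabled: false,
  overage_terms_accepted: true,
  wallet_categories: %w[monthly_commitment otc]
}
puts access_cutoff_findings(snapshot)
```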
Debug Commands:
```ruby
subscription = Subscription.current_subscription('subscription_name')

# Check subscription flags
puts "Usage permitted: #{subscription.usage_permitted?}"
puts "Overage allowed: #{subscription.allow_billing_for_usage_overage?}"
puts "Overage terms accepted: #{subscription.overage_terms_accepted?}"

# Check wallets exist
subscription.wallets.map { |w| "#{w.category}: #{w.balance}" }

# Check feature flags
puts "Usage billing enabled: #{Billing::Usage::Feature.enabled?}"
puts "Usage cutoff enabled: #{Billing::Usage::Feature.usage_cutoff_enabled?}"
```

Resolution:
- Verify subscription flags in Zuora and sync: `subscription.reload`
- Create missing wallets: `subscription.find_or_create_wallet(:overage)`
- Re-run consumption job to deduct credits
- Check Kibana logs for “No consumers found” or “No subscription found” errors
Usage Notifications Not Sent
Symptoms:
- Customers not receiving usage threshold emails (50%, 80%, 100%)
- Monthly usage reports not sent
- Notification records not created
Possible Issues:
- Feature flag disabled - Notifications feature flag turned off
- No monthly commitment - Subscription doesn’t have purchased monthly commitment
- Missing billing account - Subscription not linked to billing account or customers
- Notification already sent - Debounce or duplicate prevention blocking resend
- Wallet missing - Subscription wallet doesn’t exist
Debug Commands:
```ruby
subscription = Subscription.current_subscription('subscription_name')

# Check feature flag
puts "Notifications enabled: #{Billing::Usage::Feature.notifications_enabled?}"

# Check monthly commitment
puts "Has commitment: #{subscription.purchased_monthly_commitment?}"

# Check billing account and customers
billing_account = BillingAccount.find_by(zuora_account_id: subscription.account_id)
puts "Billing account: #{billing_account.present?}"
puts "Customers: #{billing_account&.customers&.count}"

# Check wallet
puts "Wallet exists: #{subscription.wallet.present?}"

# Check sent notifications
Billing::Usage::GitlabUnitsNotification.where(
  billing_account_id: billing_account.id
).order(created_at: :desc).limit(5)
```

Resolution:
- Check that the notifications feature flag is enabled, and enable it if not: `Unleash.enabled?(:usage_billing_notifications)`
- Verify subscription has monthly commitment purchased
- Ensure billing account and customers are linked
- Manually trigger notification job: `Subscriptions::TriggerGitlabUnitsEmailsJob.perform_now(subscription.zuora_id, monthly_notification: false)`
- Check Kibana logs for “GitlabUnitsNotifications” errors
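To reason about whether a 50%, 80%, or 100% threshold email should have fired for a given transaction, the crossing check can be sketched in plain Ruby. This assumes the percentage is measured against the monthly commitment and that a notification fires when a transaction moves usage across a mark. Neither detail is confirmed by this runbook, and the method name is hypothetical.

```ruby
THRESHOLDS = [50, 80, 100].freeze

# Returns the thresholds crossed when usage moves from used_before to
# used_after, both measured in credits against the given commitment.
def thresholds_crossed(used_before, used_after, commitment)
  return [] unless commitment.positive?

  THRESHOLDS.select do |pct|
    # Integer arithmetic: used / commitment >= pct / 100
    used_before * 100 < commitment * pct && used_after * 100 >= commitment * pct
  end
end

p thresholds_crossed(40, 85, 100)
p thresholds_crossed(85, 90, 100)
p thresholds_crossed(90, 120, 100)
```

If a transaction crosses a mark here but no notification record exists, the possible issues listed for this scenario are the next things to rule out.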
Common Resolution Commands
Note: These steps can be done with read-only Rails console access to CDot production.
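The re-run commands in this section take a `start_time` and an `end_time`. When a re-run has to cover a long period, one option is to split it into hourly windows and run them one at a time, so a failure can be traced to a single hour. The helper below is a plain-Ruby sketch with a hypothetical name. Whether the coordinator jobs already batch internally is not covered here.

```ruby
require 'time'

# Split [start_time, end_time) into consecutive windows of at most one hour.
def hourly_windows(start_time, end_time)
  windows = []
  cursor = start_time
  while cursor < end_time
    window_end = [cursor + 3600, end_time].min
    windows << [cursor, window_end]
    cursor = window_end
  end
  windows
end

windows = hourly_windows(
  Time.parse('2025-06-19 10:00:00 UTC'),
  Time.parse('2025-06-19 13:00:00 UTC')
)
windows.each { |from, to| puts "#{from.utc.iso8601} -> #{to.utc.iso8601}" }
```

Each pair can then be passed as the `start_time` and `end_time` of a single re-run.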
Re-run Enrichment
```ruby
Billing::Usage::EnrichmentCoordinatorJob.perform_now(
  start_time: Time.parse('2025-06-19 10:00:00 UTC'),
  end_time: Time.parse('2025-06-19 13:00:00 UTC')
)
```

Re-run Consumption

```ruby
Billing::Usage::ConsumptionCoordinatorJob.perform_now(
  start_time: Time.parse('2025-06-19 10:00:00 UTC'),
  end_time: Time.parse('2025-06-19 13:00:00 UTC')
)
```

Re-run Zuora Overage Submission

```ruby
billing_month = Time.current.beginning_of_month
Billing::SubmitOverageUsageJob.perform_now(date: billing_month)
```

Feature Flags
The usage billing system uses granular feature flags for different components. The system uses a global enable, selective disable pattern.
For detailed information on available components and how to enable/disable them, see the available components documentation.
Related Documentation
- CustomersDot Overview
- CustomersDot Architecture
- Usage Billing Design Doc
- Work Item #14569 - Original runbook issue
Escalation
For issues that cannot be resolved using this runbook:
- Check the Fulfillment team escalation process
- Contact the Utilization team for assistance with data issues
- Escalate to Analytics team for ClickHouse connectivity or DIP issues