
Data Insights Platform - Usage Billing

These Data Insights Platform (DIP) instances ingest consumption-based billable_usage events from internal consumption services, e.g. AI-Gateway (AIGW), and export them to CustomerDot's ClickHouse Cloud instances.

The following is a brief overview of the setup in these environments:

DIP Overview

  • production: billing.prdsub.gitlab.net on prdsub GKE environment.
  • staging: billing.stgsub.gitlab.net on stgsub GKE environment.
  • All currently provisioned DIP instances should be accessible to on-call engineers. Access to the underlying GKE clusters follows our standard, separately documented VPN-based tool chain.
  • Anyone else who needs access to these environments must create an access request.

Within these environments, we set up DIP in single-deployment mode, which runs all DIP components within the same statefulset. This deployment mode keeps our topology simple to begin with.

Note that, if needed, each component can instead run in its own deployment/statefulset to scale cluster throughput further.

For example, on the stgsub GKE cluster:

➜ ~ kubectl -n data-insights-platform get statefulset
NAME                            READY   AGE
data-insights-platform-single   3/3     70d
➜ ~ kubectl -n data-insights-platform get pods
NAME                                                              READY   STATUS    RESTARTS   AGE
data-insights-platform-ingress-nginx-controller-596f47b544vdgr6   1/1     Running   0          11d
data-insights-platform-single-0                                   1/1     Running   0          20h
data-insights-platform-single-1                                   1/1     Running   0          20h
data-insights-platform-single-2                                   1/1     Running   0          20h

The statefulset can, of course, be scaled to multiple replicas; in this case it runs 3.
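
As an illustration of what changing the replica count could look like, here is a minimal sketch using the namespace and statefulset names from the listing above. It assumes you already have kubectl access to the cluster. The DRY_RUN switch and the run/scale_dip helpers are hypothetical names introduced only for this example, and replica counts are presumably managed through the deployment configs, so a manual change may well be reverted on the next deploy.

```shell
#!/bin/sh
# Sketch only: scale the single-mode DIP statefulset, then check the rollout.
# Namespace and statefulset names come from the kubectl listing above.
# DRY_RUN is a hypothetical safety switch introduced for this example.
set -eu

NAMESPACE="data-insights-platform"
STATEFULSET="data-insights-platform-single"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Scale to the requested replica count, then wait for the statefulset
# to report its rollout as complete.
scale_dip() {
  run kubectl -n "$NAMESPACE" scale statefulset "$STATEFULSET" --replicas="$1"
  run kubectl -n "$NAMESPACE" rollout status "statefulset/$STATEFULSET"
}

scale_dip "${1:-3}"
```

By default the script only prints the two kubectl commands it would run; setting DRY_RUN=0 executes them against whichever cluster the current kubectl context points at.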

SLIs for Data Insights Platform within the CustomerDot environment are defined in its specific metrics-catalog file and are available on Grafana.

(work in progress)

Entity               Details
Provider             GCP/GKE
GCP Project          gitlab-subscriptions-prod
Region               us-east1
Networks
DNS Names            billing.prdsub.gitlab.net
Deployment configs   In config-mgmt repository
                     In gitlab-helmfiles repository
Cloudflare settings  In config-mgmt repository
                     Zone: billing.prdsub.gitlab.net
                     Host: billing.prdsub.gitlab.net

Entity               Details
Provider             GCP/GKE
GCP Project          gitlab-subscriptions-staging
Region               us-east1
Networks
DNS Names            billing.stgsub.gitlab.net
Deployment configs   In config-mgmt repository
                     In gitlab-helmfiles repository
Cloudflare settings  In config-mgmt repository
                     Zone: billing.stgsub.gitlab.net
                     Host: billing.stgsub.gitlab.net
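
When scripting against these environments, it can help to keep the per-environment details listed above in one place. The following is a small sketch of such a lookup. dip_env_info is a hypothetical helper, not part of any existing tooling, and the gcloud command is only printed as a suggestion, not executed.

```shell
#!/bin/sh
# Sketch only: map an environment name to the details listed above.
# dip_env_info is a hypothetical helper, not part of any existing tooling.
set -eu

# Prints "<GCP project> <DNS name>" for the given environment.
dip_env_info() {
  case "$1" in
    production) echo "gitlab-subscriptions-prod billing.prdsub.gitlab.net" ;;
    staging)    echo "gitlab-subscriptions-staging billing.stgsub.gitlab.net" ;;
    *)          echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

info="$(dip_env_info "${1:-staging}")"
project="${info% *}"   # text before the space
host="${info#* }"      # text after the space

echo "project=$project region=us-east1 host=$host"
# With the right access, the GKE clusters in that project could be listed with:
echo "gcloud container clusters list --project $project"
```

Run without arguments, the script defaults to staging; passing production selects the production values instead.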

For Usage Billing deployments, the following shows how usage-billing data flows across the various systems and network boundaries.

Usage Billing data flows

  1. Duo Agent Platform requests and/or workflows are sent to be processed on AIGW.
  2. AIGW first calls CustomerDot to check entitlements & allocated GitLab credits to enforce cutoff and/or overages.
  3. Once the request is cleared to proceed and has been processed, a corresponding billable_usage event is emitted to billing.prdsub.gitlab.net, which is backed by Data Insights Platform (DIP).
  4. DIP ingests & parses these events internally, eventually exporting them to the CustomerDot-owned ClickHouse Cloud instance.
  5. CustomerDot then fetches these parsed events from its ClickHouse Cloud instance, extracts the necessary information from the payload, enriches it as needed, and runs all usage-based billing logic.
  6. Once processed, fully-enriched events are bulk-uploaded to Zuora via its API for eventual revenue recognition with users/customers.
  7. Subsequently, fully-enriched events are also bulk-uploaded to Snowflake for internal analytical purposes.