
Data Insights Platform - Product Usage Data via Snowplow

Note: these environments are not receiving production traffic yet. We are currently planning to migrate our existing Snowplow data streams to these Data Insights Platform-backed environments. See work epic for details.

These Data Insights Platform (DIP) instances ingest product usage data (Snowplow) from GitLab instances across multiple environments (.com, Dedicated, and self-managed) and export it to S3/Snowflake to serve our internal data and analytics teams.

Snowflake sits outside the scope of these environments. We only land data in S3, which is then ingested into Snowflake asynchronously.

The following is a brief overview of the setup in these environments:

DIP Overview

  • production: usagestats.gitlab.com
  • staging: usagestats.staging.gitlab.com
  • All currently provisioned DIP instances should be accessible to on-call engineers. Access to the underlying GKE clusters follows our standard VPN-based toolchain, as documented here.
  • If anyone else needs access to these environments, an access request must be created - for example.

SLIs for Data Insights Platform within analytics environments are defined in a separate metrics catalog file and are available on Grafana here.

(work in progress)

Entity              Details
Provider            GCP
GCP Project         analytics-eventsdot-prod
Region              us-east1
Networks            eventsdot-prod-vpc / eventsdot-prod-subnet
DNS Names           usagestats.gitlab.com
Deployment configs  In config-mgmt repository; in gitlab-helmfiles repository

Entity              Details
Provider            GCP
GCP Project         gl-analy-evtsdot-stg-0d67dbbc
Region              us-east1
Networks            events-dot-stg-vpc / events-dot-stg-subnet
GKE cluster         events-dot-stg
DNS Names           usagestats.staging.gitlab.com
Deployment configs  In config-mgmt repository; in gitlab-helmfiles repository
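
The environment details above can also be kept as a small machine-readable inventory, which makes it easier to script against them. The following is a minimal sketch rather than part of the documented toolchain: the `DipEnvironment` class and the `get_credentials_command` helper are hypothetical names introduced for illustration, and the use of `--region` assumes the GKE cluster is regional, which this page does not state. The production GKE cluster name is not listed here, so it is left unset instead of guessed. Actual cluster access still follows the standard VPN-based toolchain mentioned earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DipEnvironment:
    """Hypothetical record for one DIP environment, using values listed on this page."""
    name: str
    gcp_project: str
    region: str
    dns_name: str
    gke_cluster: Optional[str] = None  # None where this page lists no cluster name


ENVIRONMENTS = {
    "production": DipEnvironment(
        name="production",
        gcp_project="analytics-eventsdot-prod",
        region="us-east1",
        dns_name="usagestats.gitlab.com",
    ),
    "staging": DipEnvironment(
        name="staging",
        gcp_project="gl-analy-evtsdot-stg-0d67dbbc",
        region="us-east1",
        dns_name="usagestats.staging.gitlab.com",
        gke_cluster="events-dot-stg",
    ),
}


def get_credentials_command(env: DipEnvironment) -> List[str]:
    """Build (but do not run) a gcloud command that fetches kubeconfig credentials.

    Assumption: the cluster is regional, so --region is used; a zonal cluster
    would need --zone instead.
    """
    if env.gke_cluster is None:
        raise ValueError(f"No GKE cluster name is recorded for {env.name}")
    return [
        "gcloud", "container", "clusters", "get-credentials", env.gke_cluster,
        "--region", env.region,
        "--project", env.gcp_project,
    ]


if __name__ == "__main__":
    # Print the command for review; running it is left to the operator.
    print(" ".join(get_credentials_command(ENVIRONMENTS["staging"])))
```

The helper only builds the argument list and never runs it, so an on-call engineer can review the command before executing it from within the VPN-based access path.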