AI-Assisted Service

Service Overview
Alerts: https://alerts.gitlab.net/#/alerts?filter=%7Btype%3D%22ai-assisted%22%2C%20tier%3D%22sv%22%7D
Label: gitlab-com/gl-infra/production~"Service::AI-Assisted"
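
The filter in the Alerts link is percent-encoded, which makes it hard to read or adapt. The minimal sketch below decodes it with the Python standard library. The helper name `decode_alert_filter` is illustrative and not part of any GitLab tooling.

```python
from urllib.parse import urlsplit, unquote

# Alerts link copied verbatim from the Service Overview above.
ALERTS_URL = (
    "https://alerts.gitlab.net/#/alerts"
    "?filter=%7Btype%3D%22ai-assisted%22%2C%20tier%3D%22sv%22%7D"
)


def decode_alert_filter(url: str) -> str:
    """Return the human-readable `filter` expression from an alerts URL.

    The query string sits inside the URL fragment (after `#`), so it is
    parsed out of the fragment rather than the regular query component.
    """
    fragment = urlsplit(url).fragment  # "/alerts?filter=%7B..."
    _, _, query = fragment.partition("?")
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == "filter":
            return unquote(value)
    raise ValueError("no filter parameter found in URL fragment")


if __name__ == "__main__":
    print(decode_alert_filter(ALERTS_URL))
```

The decoded expression shows the two label matchers the dashboard is filtered on: `type` and `tier`.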

Logging

ai-assisted

Summary

AI-Assisted is a dedicated Rails fleet that provides an AI Abstraction Layer in front of the AI Gateway. It handles AI-specific requests that are often long-running and depend on external services (such as Vertex AI and Anthropic). If these requests were served by the main fleet, they would occupy its Puma workers and could degrade performance. AI-Assisted therefore runs on an isolated fleet, which serves these endpoints without affecting core GitLab traffic.
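
To illustrate why long-running requests are worth isolating, the sketch below applies Little's law: the average number of requests in flight, and so the number of busy workers, is the arrival rate multiplied by the mean request duration. The traffic figures are hypothetical and chosen only for illustration. They are not measurements from this service.

```python
def expected_busy_workers(requests_per_second: float,
                          mean_duration_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate * mean duration."""
    return requests_per_second * mean_duration_seconds


if __name__ == "__main__":
    # Hypothetical numbers: the same request rate at two different durations.
    rate = 20.0  # requests per second (assumed)
    short_requests = expected_busy_workers(rate, 0.25)  # typical short API call
    long_requests = expected_busy_workers(rate, 5.0)    # slow call to an external AI provider
    print(f"short requests keep ~{short_requests:.0f} workers busy")
    print(f"long requests keep ~{long_requests:.0f} workers busy")
```

At the same request rate, a twentyfold increase in duration ties up twenty times as many workers. That is the pressure a separate fleet keeps away from core traffic.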

Architecture

For a diagram, refer to the architecture blueprint.

Operational Roles and Responsibilities

Currently, all traffic served by the AI-Assisted service supports Code Suggestions features, such as Code Generation and Code Completion, which are owned by the Code Creation Group. If other feature teams begin contributing endpoints under /api/v4/ai_assisted, service ownership may need to be reassessed.
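
The path-based split described above can be sketched as a small routing function. This is an illustration of the idea only: the real routing lives in the HAProxy configuration, and the backend names `ai_assisted` and `main` are hypothetical placeholders.

```python
AI_ASSISTED_PREFIX = "/api/v4/ai_assisted"


def select_backend(request_target: str) -> str:
    """Pick a backend pool name for a request target such as '/api/v4/x?y=1'.

    Backend names are placeholders, not the actual HAProxy backend names.
    """
    # Ignore the query string; only the path decides the route.
    path = request_target.split("?", 1)[0]
    # Match the prefix itself or anything nested under it, but not
    # look-alike paths such as '/api/v4/ai_assisted_other'.
    if path == AI_ASSISTED_PREFIX or path.startswith(AI_ASSISTED_PREFIX + "/"):
        return "ai_assisted"
    return "main"


if __name__ == "__main__":
    # The sub-path below is hypothetical; the remaining targets are for contrast.
    for target in (
        "/api/v4/ai_assisted/example",
        "/api/v4/projects",
        "/api/v4/ai_assisted_other",
    ):
        print(target, "->", select_backend(target))
```

Matching on the prefix plus a trailing slash is a deliberate choice in this sketch: any endpoint added under /api/v4/ai_assisted is routed to the isolated fleet automatically, while unrelated paths that merely share the leading characters stay on the main fleet.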

The service is implemented across the following projects:

Gitlab.com: Kubernetes deployments, services and ingress rules
gitlab-haproxy: HAProxy routing configuration and backend pools
Chef Repository: Chef roles and backend IP configuration
gitlab-com/runbooks: Service catalog, metrics catalog, and monitoring configuration

Deployment

AI-Assisted Service is deployed in:

Staging
Production

Regions

AI-Assisted Service is currently deployed across 3 pods in the following zones of the us-east1 region:

us-east1-b
us-east1-c
us-east1-d

Performance

AI-Assisted Service includes the following SLIs/SLOs:

Links to further Documentation

https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/24005