GitLab Code Suggestion Failover Solution
This page provides instructions for switching the LLM provider during an outage of the primary provider. It is intended for product and support engineers troubleshooting LLM provider outages affecting GitLab.com users.
How to switch to backup for code generation
We use a feature flag to switch which model and model provider are active for code generation.
If the primary model provider experiences an outage, enable the feature flag by running the following command in the #production Slack channel:
```
/chatops run feature set incident_fail_over_generation_provider true
```
After the primary LLM provider is back online, we can switch back to the primary model by disabling the feature flag. Run this command in the #production Slack channel:
```
/chatops run feature set incident_fail_over_generation_provider false
```
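Conceptually, the flag just gates which model configuration the AI gateway resolves for generation requests. Below is a minimal sketch of that pattern; `feature_enabled` and the model constants are hypothetical placeholders, not the actual ai-assist implementation:

```python
# Illustrative sketch of a feature-flag-gated provider switch.
# `feature_enabled` and both model constants are hypothetical placeholders.

PRIMARY = {"provider": "vertex_ai", "model": "primary-generation-model"}
FAILOVER = {"provider": "backup-provider", "model": "failover-generation-model"}


def generation_model(feature_enabled) -> dict:
    """Return the model configuration to use for code generation requests."""
    if feature_enabled("incident_fail_over_generation_provider"):
        return FAILOVER  # primary provider is down: route to the backup
    return PRIMARY
```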
How to switch to backup for code completion
We use a feature flag to switch which model and model provider are active for code completion.
If the primary LLM provider is down, enable the failover solution in production by running this command in the #production Slack channel:
```
/chatops run feature set incident_fail_over_completion_provider true
```
Note: in failover mode, direct access is automatically forbidden and all Code Suggestion requests become indirect access. For more details about direct versus indirect access, see the documentation.
After the primary LLM provider is back online, we can disable the feature flag to switch back to the primary LLM provider:
```
/chatops run feature set incident_fail_over_completion_provider false
```
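The completion flag works the same way, with the extra effect described in the note above: while failover is active, direct access is forbidden. A hypothetical sketch of that coupling (all names are placeholders, not the real ai-assist code):

```python
# Illustrative sketch: completion failover also forces indirect access.
def completion_routing(feature_enabled) -> dict:
    failover = feature_enabled("incident_fail_over_completion_provider")
    return {
        # Direct access is automatically forbidden in failover mode, so all
        # Code Suggestion requests become indirect.
        "direct_access_allowed": not failover,
        "model_provider": "anthropic" if failover else "primary-provider",
    }
```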
How to verify
1. Go to Kibana Analytics -> Discover.
2. Select `pubsub-mlops-inf-gprd-*` as the data view from the top left.
3. For code generation, search for `json.jsonPayload.message: "Returning prompt from the registry"`:
   - You should see `json.jsonPayload.prompt_id: code_suggestions/generations/base` and `json.jsonPayload.prompt_version: <version>`.
   - You can also find the template file in this folder. For example, if the version is 2.0.1, then the template file is `ai-assist/ai_gateway/prompts/definitions/code_suggestions/generations/base/2.0.1.yml`.
   - In this file we can find the current model and model provider. For example, here we are using `claude-3-5-sonnet@20241022` provided by `vertex_ai` (see the YAML-parsing sketch after this list):

     ```yaml
     model:
       name: claude-3-5-sonnet@20241022
       params:
         model_class_provider: litellm
         custom_llm_provider: vertex_ai
         temperature: 0.0
         max_tokens: 2_048
         max_retries: 1
     ```
4. For code completion, search for `json.jsonPayload.message: "code completion input:"`, and then check which provider is currently in use (see the log-query sketch after this list):
   - If you see `json.jsonPayload.model_provider: anthropic`, then we are using the failover model `claude-3-5-sonnet-20240620` provided by `anthropic`.
   - If you see another value for `json.jsonPayload.model_provider`, then we are using a non-failover model.
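If you prefer not to eyeball the template file in step 3, the prompt definition can be parsed directly. A minimal sketch, assuming a local checkout of the ai-assist repository and the file layout shown above (the path and version are illustrative):

```python
# Minimal sketch: print the model and provider pinned by a prompt definition.
# Assumes a local ai-assist checkout; the path/version below are examples.
import yaml

definition_path = (
    "ai_gateway/prompts/definitions/"
    "code_suggestions/generations/base/2.0.1.yml"
)

with open(definition_path) as f:
    definition = yaml.safe_load(f)

model = definition["model"]
print("model:   ", model["name"])                           # e.g. claude-3-5-sonnet@20241022
print("provider:", model["params"]["custom_llm_provider"])  # e.g. vertex_ai
```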
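Step 4 can likewise be scripted against the logging cluster instead of the Kibana UI. A sketch assuming you have an endpoint and API key for the cluster behind the `pubsub-mlops-inf-gprd-*` data view; the URL and the `_source` field nesting are assumptions:

```python
# Sketch: fetch recent code completion log entries and print the provider.
# The endpoint and credentials are placeholders; the index pattern and
# field names come from the verification steps above.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://logging.example.gitlab.net:9243", api_key="<API_KEY>")

resp = es.search(
    index="pubsub-mlops-inf-gprd-*",
    query={"match_phrase": {"json.jsonPayload.message": "code completion input:"}},
    sort=[{"@timestamp": "desc"}],
    size=10,
)

for hit in resp["hits"]["hits"]:
    # Nesting is an assumption; adjust if "json.jsonPayload.model_provider"
    # is stored as a flat key in _source instead.
    provider = hit["_source"]["json"]["jsonPayload"].get("model_provider")
    print(provider)  # "anthropic" means the failover model is active
```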