Duo Workflow Service

NOTE: Do NOT expose a customer’s RED data in public issues. Redact it, or create a confidential issue if you’re unsure.

Before starting the investigation, please collect the following information:

  • GitLab username for the user that encountered the bug (e.g. @johndoe)
  • What happened (e.g. User asked a question in the Flows tab of the VS Code extension and the agent platform did not respond)
  • When it happened (e.g. Around 2024/09/16 01:00 UTC)
  • Is it happening on .com, a self-managed instance, or a dedicated instance? If self-managed or dedicated, which GitLab version are they using?
  • GitLab Workflow VS Code extension version (e.g. v.6.26.1) if applicable.
  • If they use the VS Code extension, ask whether it communicates over a gRPC connection or webhooks.
  • Are there executor logs?
    • For VS Code -> Command + P -> Show Extension Logs -> Choose GitLab Language Server from the dropdown
    • For flows running in CI -> CI job logs
  • What is the flow type? (e.g. chat for agentic chat, software_development for the Flows tab, issue to MR, or a custom flow)
  • How often it happens (e.g. It happens every time)
  • Steps to reproduce (e.g. 1. Ask a question “xxx” 2. Click …)
  • Whether they use the AI Gateway or a self-managed AI Gateway (if they use custom models, it’s likely the latter)
  • A link to a Slack discussion, if any.
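
The checklist above can be captured as a small structured record so that missing details are easy to spot before the investigation starts. The following is a minimal sketch only and is not part of any GitLab tooling: the class name BugReport, its field names, and the missing_fields helper are all hypothetical, and only a subset of the checklist items is modelled.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BugReport:
    """Hypothetical intake record mirroring part of the checklist above."""
    username: Optional[str] = None            # e.g. "@johndoe"
    what_happened: Optional[str] = None
    when_utc: Optional[str] = None            # e.g. "2024/09/16 01:00 UTC"
    instance_type: Optional[str] = None       # ".com", "self-managed", or "dedicated"
    gitlab_version: Optional[str] = None      # only needed off .com
    flow_type: Optional[str] = None           # e.g. "chat", "software_development"
    frequency: Optional[str] = None
    steps_to_reproduce: Optional[str] = None


def missing_fields(report: BugReport) -> List[str]:
    """Return the names of checklist items that are still unfilled."""
    missing = [f.name for f in fields(report) if getattr(report, f.name) is None]
    # The GitLab version only matters for self-managed or dedicated instances.
    if report.instance_type == ".com" and "gitlab_version" in missing:
        missing.remove("gitlab_version")
    return missing


report = BugReport(username="@johndoe", instance_type=".com", flow_type="chat")
print(missing_fields(report))
```

Running the example lists the items that still need to be requested from the reporter; the GitLab version is skipped because the report is for .com.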

The Duo Workflow Service is a Python service that manages and executes Duo Agent Platform sessions using LangGraph. Within the AI Gateway, it handles communication between the user interface, the LLM provider, and the executors, while maintaining workflow state through periodic checkpoints saved to GitLab. This service provides the intelligence layer that interprets user goals, plans execution steps, processes LLM responses, and orchestrates the commands needed to complete tasks, all while maintaining a secure boundary between untrusted code execution and the core GitLab infrastructure.
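
To illustrate the idea of maintaining workflow state through checkpoints, here is a minimal sketch of a step loop that saves its state after every step and can resume from a saved checkpoint. It is an illustration under stated assumptions, not the service's implementation: it uses neither LangGraph nor the GitLab API, checkpoints are kept in an in-memory list, and run_workflow, plan, execute, and the next_step key are hypothetical names.

```python
import json
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def run_workflow(
    state: State,
    steps: List[Callable[[State], State]],
    save_checkpoint: Callable[[str], None],
) -> State:
    """Run steps in order, saving a serialized checkpoint after each one."""
    # Resume from the step recorded in the state, or start at the beginning.
    start = int(state.get("next_step", 0))
    for index in range(start, len(steps)):
        state = steps[index](state)
        state["next_step"] = index + 1
        save_checkpoint(json.dumps(state, sort_keys=True))
    return state


def plan(state: State) -> State:
    # Hypothetical planning step: derive a list of actions from the goal.
    return {**state, "plan": ["read file", "edit file"]}


def execute(state: State) -> State:
    # Hypothetical execution step: mark every planned action as done.
    return {**state, "done": list(state["plan"])}


checkpoints: List[str] = []  # stand-in for checkpoints saved to GitLab
final = run_workflow({"goal": "fix bug"}, [plan, execute], checkpoints.append)

# Simulate an interruption after the first step by resuming from its checkpoint.
resumed = run_workflow(json.loads(checkpoints[0]), [plan, execute], checkpoints.append)
print(resumed == final)
```

Because each checkpoint records which step comes next, a session restored from the first checkpoint skips planning and runs only the remaining step, ending in the same state as the uninterrupted run.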

See design document at https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/duo_workflow/

Duo Workflow Service autoscales with traffic. To scale it manually, update runway-production.yml as described in the documentation.

It is also possible to directly edit the tunables for the duo-workflow-svc service via the Cloud Run console’s Edit YAML interface. This takes effect faster, but be sure to make the equivalent updates to the runway-production.yml as described above; otherwise the next deploy will revert your manual changes to the service YAML.

Duo Workflow Service uses both custom metrics scraped from the application and default metrics provided by Runway. Alerts based on these metrics are routed to g_duo_agent_platform_prometheus_alerts in Slack. To route them to a different channel, refer to the documentation.

Currently, error logs from Sentry also trigger alerts. These alerts are directed to g_duo_workflow_alerts in Slack.