DuoWorkflowSvcServiceServerApdexSLOViolation

  • This alert fires when the apdex (a latency-based score) of gRPC requests to the Duo Workflow Service server component drops below the SLO threshold.
  • The alert indicates that the service is experiencing higher-than-acceptable latency, which impacts user experience.
  • Apdex measures the proportion of requests that complete within acceptable time thresholds (satisfied: < 3s, tolerated: < 5s).
  • Possible user impacts
    • Users will see delayed responses from Duo Agent Platform.
  • The metrics used are gitlab_component_apdex:confidence:ratio_1h and gitlab_component_apdex:confidence:ratio_6h for the server component of duo-workflow-svc.
  • These metrics report the apdex score (0-1 scale, where 1.0 is perfect).
  • Satisfied threshold: time to first response < 3 seconds
  • Tolerated threshold: time to first response < 5 seconds
  • The SLO threshold is 95% apdex, meaning the alert fires when apdex drops below this threshold.
  • Link to metric catalogue
  • To silence the alert, visit the Alert Manager Dashboard.
  • This alert is expected to be rare under normal conditions. If it fires frequently, that indicates performance degradation.
  • This alert creates S2 incidents (High severity, pages on-call).
  • All GitLab.com, Self-Managed, and Dedicated customers (other than those using self-hosted DAP) who use Duo Workflow features are potentially impacted.
  • Review the Incident Severity Handbook page to identify the required severity level.
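To make the apdex score and the 95% SLO described above concrete, here is a minimal sketch in Python. It assumes the standard Apdex formula (satisfied requests count fully, tolerated requests count half); the metrics catalogue may derive the score differently, for example from histogram buckets. The function names and sample latencies are hypothetical and only illustrate the 3s and 5s thresholds from this page.

```python
from typing import Iterable

# Thresholds taken from this runbook.
SATISFIED_SECONDS = 3.0   # time to first response < 3s counts as satisfied
TOLERATED_SECONDS = 5.0   # time to first response < 5s counts as tolerated
SLO_THRESHOLD = 0.95      # alert fires when apdex drops below 95%


def apdex(latencies_seconds: Iterable[float]) -> float:
    """Standard Apdex: (satisfied + tolerated / 2) / total.

    Assumption: this is the textbook formula, not necessarily the exact
    computation behind gitlab_component_apdex.
    """
    satisfied = tolerated = total = 0
    for latency in latencies_seconds:
        total += 1
        if latency < SATISFIED_SECONDS:
            satisfied += 1
        elif latency < TOLERATED_SECONDS:
            tolerated += 1
    if total == 0:
        # No traffic: treat as a perfect score rather than dividing by zero.
        return 1.0
    return (satisfied + tolerated / 2) / total


def violates_slo(latencies_seconds: Iterable[float]) -> bool:
    """True when the apdex score is below the SLO threshold."""
    return apdex(latencies_seconds) < SLO_THRESHOLD


if __name__ == "__main__":
    # Hypothetical sample: 19 fast requests and 1 request in the tolerated band.
    sample = [1.0] * 19 + [4.0]
    print(apdex(sample), violates_slo(sample))
```

With this formula, a single slow request out of twenty still keeps the score above the SLO if it lands in the tolerated band, while a request over 5 seconds contributes nothing and pulls the score down faster.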
  1. Check Duo Workflow Service logs:

  2. Check for recent changes:

    • Review the changes listed under the Recent changes section.
    • Check if a recent deployment affected server latency.
    • If a recent change caused the issue, consider rolling back.
  • N/A. We don’t have historical data on how this alert has been resolved.
  • AI Gateway / Duo Workflow Service
  • For investigation and resolution assistance, reach out to #g_agent_foundations on Slack.