Debugging Service Desk
The Monitor:Respond group is responsible for Service Desk product development.
Additionally, the Scalability group has been doing some infrastructure work around Mailroom on GitLab.com.
Assessing impact
Although Service Desk is a free feature, its usage is relatively low, and traffic naturally spikes up and down (e.g. around weekends and holidays). Zoom out to a few days (3 or 7) to get a feel for the impact.
- Is traffic completely flat?
  - There could be a problem with Sidekiq, Mailroom, or email ingestion as a whole. See Determine root cause.
  - There may be a recent change merged to `Gitlab::Email::Handler`.
  - There may be a problem with GitLab DNS.
- Is traffic lower than normal?
  - There may be a recent breaking change to regular incoming email (for example, `Gitlab::Email::Receiver`) or Service Desk email ingestion (for example, `Gitlab::Email::ServiceDeskReceiver`).
  - There could be a problem with a third-party service customers use for redirection, such as Gmail or Google Groups. See Find where the emails go.
- No detectable change?
  - Customers may be using an uncommon service for redirection that has changed its headers.
  - The customer's email may be marked as spam in the incoming mail inbox.
Check the Respond Grafana charts:

- Is there a noticeable impact?

There are helpful links to the side of the Respond charts (e.g. Kibana and Sentry links).
Determine root cause
Try to reproduce via a known Service Desk - e.g. this sandbox.
A good place to start is to get two emails sent to a known Service Desk experiencing issues - one that works (or worked), one that doesn’t.
Get them forwarded as `.eml` files to ensure headers are intact.
Ask someone with a service desk setup to send you the emails they received.
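Once you have both messages, a quick way to spot what changed is to diff their headers. Below is a minimal sketch using the `mail` gem (the same gem GitLab's email handling is built on); the file names are placeholders for the two `.eml` files you were forwarded.

```ruby
require 'mail'

# Placeholder file names - the two .eml files you were forwarded.
working = Mail.read('working.eml')
broken  = Mail.read('broken.eml')

# Collapse repeated fields (e.g. multiple Received headers) into one entry per name.
def headers(message)
  message.header.fields
         .group_by { |field| field.name.downcase }
         .transform_values { |fields| fields.map { |field| field.value.to_s } }
end

working_headers = headers(working)
broken_headers  = headers(broken)

(working_headers.keys | broken_headers.keys).sort.each do |name|
  next if working_headers[name] == broken_headers[name]

  puts "#{name}:"
  puts "  working: #{working_headers[name].inspect}"
  puts "  broken:  #{broken_headers[name].inspect}"
end
```

Pay particular attention to the recipient-related headers (`To`, `Delivered-To`, `Envelope-To`, `X-Envelope-To`), since those are what ingestion uses to route the email (see the following sections).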
Email ingestion
Docs: Incoming email and Configuring Service Desk
Since our email ingestion (and eventually Service Desk) uses header content to determine where an email is going, compare the headers to see if anything has changed.
- The headers we accept
- Header comparison source code is in `lib/gitlab/email/receiver.rb`

Production issue 6419 was due to a change in headers, specifically `Delivered-To` no longer being added to Google Group emails.
The project key should be visible in the headers, and that’s how Service Desk knows which project to create the new issue in.
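As a quick check on a problem email, you can print the candidate recipient headers and pull out the key suffix. This is a sketch, assuming the common `address+key@domain` incoming address format; the file name is a placeholder.

```ruby
require 'mail'

mail = Mail.read('broken.eml') # placeholder file name

# Recipient-related headers that can carry the Service Desk address.
%w[To Delivered-To Envelope-To X-Envelope-To].each do |name|
  values = mail.header.fields
               .select { |field| field.name.casecmp?(name) }
               .map { |field| field.value.to_s }

  # Assumes an incoming address shaped like something+<key>@domain;
  # the +<key> part is what identifies the target project.
  keys = values.flat_map { |value| value.scan(/\+([^@\s>]+)@/) }.flatten

  puts "#{name}: #{values.inspect} (project keys: #{keys.inspect})"
end
```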
Find where the emails go
At the time of writing, most Service Desk setups use a redirection mechanism (e.g. through a third-party Google Group) or forwarding, since this lets the user distribute a fully customized email address, and reduces the chance of abuse by obscuring the Service Desk email address and allowing it to be changed.
- Did the issue get created in that project, or another project (was the project key correct)?
- Did the sender get a "thank you" email (either a thank-you email for that Service Desk, a different Service Desk, or an "I don't know where that email should go" email)?
- If there was no thank-you email, did the email wind up as a note somewhere (i.e. ingested into a different part of the GitLab instance)?
In the past we’ve had:
- redirection of emails (third-party intermediary dropped headers, etc.) causing Production issue 6419
- incompatibility of JSON and non-UTF-8 encoding causing Production issue 7029, etc.
Trace a specific email
For gitlab.com, SREs have access to the [email protected] mailbox, which can be checked to see if an email was received at all.
You can look up what happened to a specific email by matching its SMTP `Message-Id` header to the `json.mail_uid` field.

In Kibana, find the logs via the search `json.mail_uid: <Message-Id>` and either `json.class: EmailReceiverWorker` or `json.class: ServiceDeskEmailReceiverWorker` (Service Desk emails may be serviced by either worker class, so it's ideal to check both).
Here’s an example.
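For instance, a combined search covering both worker classes might look like this (a sketch; substitute the email's actual Message-Id value):

```
json.mail_uid: <Message-Id> AND (json.class: EmailReceiverWorker OR json.class: ServiceDeskEmailReceiverWorker)
```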
The headers we log are in `lib/gitlab/email/receiver.rb`.
| SMTP header | Log field |
| --- | --- |
| `Message-Id` | `json.mail_uid` |
| `From` | `json.from_address` |
| `To` | `json.to_address` |
| `Delivered-To` | `json.delivered_to` |
| `Envelope-To` | `json.envelope_to` |
| `X-Envelope-To` | `json.x_envelope_to` |
A full list of the headers we accept can be found in the incoming email documentation.
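If you have the raw `.eml`, a small sketch like the following (again using the `mail` gem; the file name is a placeholder) prints the values that correspond to those log fields, ready to paste into the Kibana searches above.

```ruby
require 'mail'

mail = Mail.read('broken.eml') # placeholder file name

# Single-valued fields that map directly onto the log fields in the table above.
puts "json.mail_uid     => #{mail.message_id}"
puts "json.from_address => #{Array(mail.from).inspect}"
puts "json.to_address   => #{Array(mail.to).inspect}"

# Fields that may appear more than once (e.g. several Delivered-To headers).
%w[Delivered-To Envelope-To X-Envelope-To].each do |name|
  values = mail.header.fields
               .select { |field| field.name.casecmp?(name) }
               .map { |field| field.value.to_s }
  puts "json.#{name.downcase.tr('-', '_')} => #{values.inspect}"
end
```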
Code flow
Emails go through the following to get to Service Desk:
- Mailroom
  - See mail-room runbooks for detailed debugging.
  - Mailroom is a separate process outside of Rails. It ingests emails and determines whether to send them to different destinations (e.g. a Sidekiq queue, the API, etc.):
    - Reply to a note
    - Service Desk
    - etc.
  - Mailroom interacts with Rails using Redis (adding a job to a Sidekiq queue directly). This might be changed to an API call that enqueues the job instead:
    - Infra epic &644 and Scalability epic 1462
    - `POST /api/:version/internal/mail_room/*mailbox_type`
  - For source and Omnibus installs, we use `config/mail_room.yml` (via `files/gitlab-cookbooks/gitlab/recipes/mailroom.rb` for Omnibus).
  - For charts, we use `config/mail_room.yml`.
- Rails - either Mailroom-direct-to-Sidekiq (old method) or API-call-to-Sidekiq (new method); see the configuration sketch after this list.
  - If a Mailroom-initiated Sidekiq job:
    - In Kibana, make the following query to `pubsub-sidekiq-inf-gprd`:
      - `json.class: EmailReceiverWorker`
      - `json.delivered_to: exists`
    - Code path:
  - If an API call-initiated job, we make a postback POST request to our internal API, which enqueues the job via Sidekiq:
    - In Kibana, make the following query to `pubsub-rails-inf-gprd`:
      - `json.route: /api/:version/internal/mail_room/*mailbox_type`
      - `json.method: POST`
      - (if needed) `json.project_id: <the project ID>`
    - Code path:
      - `lib/api/internal/base.rb`
      - `lib/api/internal/mail_room.rb`
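To make the two delivery paths concrete, here is a rough, illustrative sketch of the relevant parts of `config/mail_room.yml`. It is not the exact generated file (key names and nesting can vary by GitLab version, and the real file is produced by the install method), and the host, email address, and delivery URL below are placeholder assumptions; check the `config/mail_room.yml` shipped with your version.

```yaml
# Illustrative sketch only - not the generated config. Shows the two
# delivery paths described above for an incoming email mailbox.
:mailboxes:
  - :host: imap.example.com            # placeholder IMAP settings
    :email: incoming@example.com
    :name: inbox

    # Old method: Mailroom pushes the job straight onto a Sidekiq queue via Redis.
    :delivery_method: sidekiq
    :delivery_options:
      :redis_url: redis://localhost:6379
      :worker: EmailReceiverWorker

    # New method (instead of the above): Mailroom POSTs the raw email to the
    # internal API, which enqueues the Sidekiq job itself.
    # :delivery_method: postback
    # :delivery_options:
    #   :delivery_url: https://gitlab.example.com/api/v4/internal/mail_room/incoming_email
```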