Drop in Ad Serving in SIN3 Datacenter
Incident Report for Xandr
Postmortem

Incident Summary

From approximately 14:46 to 17:23 UTC on Thursday, March 25, 2021, Xandr experienced a high percentage of timeouts from the SIN3 Impression Bus to external bidders located in Hong Kong.

Scope of Impact

During the incident window, external bidders using the SIN3 datacenter experienced a high percentage of timeouts, which may have resulted in reduced ad serving for some clients.

Timeline (UTC)

2021-03-25 14:46: Incident started and engineering notified
2021-03-25 15:05: Mitigation efforts attempted and failed
2021-03-25 15:15: External carrier contacted; carrier notified Xandr of scheduled maintenance on some of its hardware
2021-03-25 15:58: Incident ticket escalated
2021-03-25 17:07: External carrier initiated mitigation efforts
2021-03-25 17:15: Latency began improving
2021-03-25 17:23: Xandr external bidder latency below 10% and incident considered resolved

Cause Analysis

The root cause of the incident was determined to be a submarine cable fault between Singapore and Hong Kong, which caused packet loss and high latency.

Resolution Steps

The external carrier rerouted traffic between Singapore and Hong Kong to a different, non-optimal path as a temporary fix. The carrier plans to begin repairs on the submarine cable on March 31st.

Posted Apr 01, 2021 - 13:25 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Mar 25, 2021 - 23:11 UTC

Monitoring

We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.

Posted Mar 25, 2021 - 22:14 UTC

Identified

We have identified the following issue:

  • Component(s): Bidding, Ad Serving
  • Impact(s):
    • Decrease in requests sent to Bidder customers
    • Drop in delivery on external supply
  • Severity: Major Outage
  • Datacenter(s): SIN3

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted Mar 25, 2021 - 16:43 UTC