Spreedly’s edge network partner, which routes all traffic to Spreedly from around the globe, had a major service outage lasting approximately 49 minutes, immediately followed by a partial service outage lasting approximately 24 minutes, for a total outage time of up to 73 minutes. During this outage, most web requests to Spreedly applications, including transaction processing requests, were served error pages. Requests that received these error pages did not reach Spreedly’s applications.
Our edge network partner reports that, beginning at 09:47 UTC (5:47 AM EDT) on Tuesday, June 8, 2021, it served errors for approximately 85% of all requests across its network. Spreedly’s automated systems detected this outage, then alerted and mobilized the appropriate response teams. The outage prevented most requests to the Spreedly platform from succeeding. Spreedly attempted to route around the edge network partner, but the partner’s systems recovered before the attempt completed, so we canceled it. The edge network partner began to recover at 10:36 UTC (6:36 AM EDT), and normal service was restored by 11:00 UTC (7:00 AM EDT). The partner also reports that a permanent fix for the defect that originally caused the outage began rolling out at 17:25 UTC.
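The durations quoted above follow directly from the timestamps in this timeline. As a minimal sketch of that arithmetic (the variable names are illustrative and not taken from any Spreedly system), the phases can be recomputed like this:

```python
from datetime import datetime, timezone

def utc(hour, minute):
    """Build a UTC timestamp on the incident date, June 8, 2021."""
    return datetime(2021, 6, 8, hour, minute, tzinfo=timezone.utc)

# Timestamps as stated in the timeline (illustrative names).
outage_start = utc(9, 47)      # partner began serving errors
recovery_start = utc(10, 36)   # partner began to recover
service_restored = utc(11, 0)  # normal service restored

# Length of each phase in whole minutes.
major_minutes = int((recovery_start - outage_start).total_seconds() // 60)
partial_minutes = int((service_restored - recovery_start).total_seconds() // 60)
total_minutes = major_minutes + partial_minutes

print(major_minutes, partial_minutes, total_minutes)  # prints "49 24 73"
```

The total is an upper bound on impact, since some requests succeeded during the partial outage phase.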
Spreedly is investigating approaches to improve its network resiliency at the edge and to enhance its downtime monitoring and incident resolution procedures.