From 2021-11-05 1:00 AM UTC until 1:10 AM UTC, requests to the Spreedly Core API intermittently returned 500 error responses across all API request types. Approximately 10,000 requests received a 500 error response during this time.
On 2021-11-05 at 1:02 AM UTC, Spreedly’s internal monitoring detected an elevated number of error responses from the Spreedly Core API. Engineers were paged and began investigating. They identified an internal system dependency as partially unavailable beginning at 1:00 AM UTC. An automated antivirus scan running on that system constrained resources on a subset of its hosts. An automated health check process marked those hosts as “unhealthy,” and they were subsequently removed from service. New hosts were automatically brought into service, and normal operations resumed at approximately 1:08 AM UTC on 2021-11-05.
Spreedly engineers have made changes to mitigate the effects of the automated antivirus scan so that it should no longer cause the system to become unresponsive.