IMPACTED DATACENTER LOCATIONS: ALL-PARTIAL
TIME: 12:48 PM UTC to 2:53 PM UTC
ISSUE SUMMARY: Starting at 12:48 PM UTC, a newly released feature in SGN Connect 3.6 related to LAN ZeroTrust generated high volumes of errors across the agent and portal API infrastructure. These errors disrupted access to the portal and prevented agents from connecting to the SGN, impacting an estimated 35% of devices.
Devices that were connected to the SGN before 12:48 PM UTC would have remained connected. However, any policy changes made while portal access was intermittent would not have been reflected across the SGN.
At 2:45 PM UTC, the team released a fix for the LAN ZeroTrust feature that resolved the errors and restored full service.
ADDITIONAL INFORMATION: Reliability is a top priority for us at Todyl as we continue to rapidly scale. Over the past few weeks, the team has been hard at work implementing major improvements to our Portal & API infrastructure, along with new versions of the agent, to ensure stability during periods of high traffic. The errors that led to today's outage were unrelated to traffic levels. The team is implementing additional controls to prevent a recurrence, including additional database monitoring, further segmentation of the API layer, and more aggressive error handling to isolate and contain issues within the platform.