Increased Portal & SGN Errors
Incident Report for Todyl
Postmortem

IMPACTED DATACENTER LOCATIONS: ALL-PARTIAL

TIME: 12:48 PM UTC to 2:53 PM UTC

ISSUE SUMMARY: Starting at 12:48 PM UTC, a newly released LAN ZeroTrust feature in SGN Connect 3.6 generated a high volume of errors across the agent and portal API infrastructure. These errors prevented access to the portal and prevented agents from connecting to the SGN, affecting an estimated 35% of devices.

Devices that were connected to the SGN before 12:48 PM UTC would have remained connected. However, any policy changes made while portal access was intermittent would not have been reflected across the SGN.

At 2:45 PM UTC, the team released a fix for the LAN ZeroTrust feature that resolved the errors and restored full service.

ADDITIONAL INFORMATION: Reliability is a top priority for us at Todyl as we continue to scale rapidly. Over the past few weeks, the team has been implementing major improvements to our Portal & API infrastructure, along with new versions of the agent, to ensure stability during periods of high traffic. The errors that led to today's outage were unrelated to traffic levels. To prevent a recurrence, the team is implementing additional controls, including additional database monitoring, further segmentation of the API layer, and more aggressive error handling to isolate and contain issues within the platform.

Posted Sep 07, 2021 - 18:55 UTC

Resolved
This incident has been resolved.
Posted Sep 07, 2021 - 15:25 UTC
Update
We are continuing to investigate this issue.
Posted Sep 07, 2021 - 14:27 UTC
Update
We are continuing to investigate this issue.
Posted Sep 07, 2021 - 14:21 UTC
Update
New connections may experience a delay in connecting. Devices currently connected to the SGN will remain connected.
Posted Sep 07, 2021 - 13:30 UTC
Update
We are continuing to investigate this issue.
Posted Sep 07, 2021 - 13:09 UTC
Investigating
We are currently investigating this issue.
Posted Sep 07, 2021 - 12:59 UTC
This incident affected: Secure Global Network (SASE) (Secure Global Network) and Todyl.com & Management Portal.