Assets Disconnecting

Incident Report for Zero Networks

Resolved

The Issue

During this timeframe, customers may have experienced:
-MFA latency or drops for some assets.
-Periodic unavailability of some Portal URLs (e.g., monitored assets, segmented assets).

Root Cause

The issue occurred because a backend service experienced a resource limitation, triggering a mechanism that should mitigate the issue, release load on the service and allow it to function, but created a degradation in performance. During that time, a few customers experienced the following symptoms:
views in the Portal were inaccessible intermittently, agents began running reconnect flows, which contributed to MFA latency or drops.

How We Identified the Issue

Our monitoring system quickly detected increased latency within the Portal as soon as the issue began, triggering an internal investigation process, allowing us to quickly mitigate the problem.

Mitigation Steps

We scaled the service resources to address increased demand and restore normal operations, ensuring better stability moving forward.
We also enhanced our monitoring capabilities to detect and address similar issues more proactively, reducing the likelihood of recurrence.
Posted 6 months ago. Dec 03, 2024 - 13:00 UTC