Discovered: Jan 18, 2022 20:36 - UTC
Resolved: Jan 18, 2022 23:15 - UTC
A planned tenant migration from the US1 cluster to the US4 cluster caused an
unforeseen destabilization of the US4 cluster due to resource constraints.
The resource overload caused several key services that handle communication with collectors to fail and disconnect in the US4 cluster of the Auvik platform. It also prevented a number of users on the US4 cluster from logging in.
01/17/2022
13:00 - UTC Throughout the day, ~3450 tenants are migrated from the US1 cluster to the US4 cluster.
17:30 - UTC The US4 cluster begins to show early signs of destabilization, but these are not caught by proactive monitoring.
01/18/2022
20:36 - UTC Engineering begins investigating a higher-than-normal rate of collector failures in the US4 cluster.
20:57 - UTC Resource consumption continues to increase, forcing more services offline in the US4 cluster. The engineering team continues to investigate.
21:18 - UTC Engineering determines the root cause of the resource issue.
21:24 - UTC Engineering outlines a plan to address the resource issues on the US4 cluster.
21:37 - UTC The US1 cluster is reviewed by engineering for any possible issues from the migration of tenants. No adverse issues are discovered.
22:11 - UTC Additional resources are provisioned to the US4 cluster to address the collector disconnect issues.
22:37 - UTC Engineering begins to apply a fix to address the inability of users to log into their tenants on the US4 cluster.
23:07 - UTC Engineering completes the rollout of the fix, allowing affected users to log into their tenants on the US4 cluster.
23:15 - UTC Engineering declares the incident has been resolved.