On September 23, 2022, from 6:08 PM until 8:22 PM UTC, customers accessing UiPath Automation Cloud, Automation Hub, and all other UiPath cloud services experienced an outage affecting some functionality across all regions. The outage prevented UiPath services from interacting with one another.
Document Understanding
Document Understanding was almost completely down. Tagging inside an already open document session and calls to OCR or already published extractor endpoints were not impacted.
Automation Hub
Automation Hub was completely down for all FPS tenants (those that log in through Automation Cloud). Automation Hub Classic users were not impacted.
Data Service
Data Service was completely down for all tenants' requests from activities. Tenants' requests from the cloud were not impacted.
Notification Service
We observed partial failures during this time frame. We identified eleven accounts that may have experienced undelivered notifications.
Connections Service
Connection Service UI was completely down for any users who logged in during this time span, as well as any users redirected to maintain or create new connections/triggers (e.g., from Studio, Apps, etc.). Trigger dispatch requests to Orchestrator were delayed.
Hypervisor
No impact on the ability to run jobs on existing pools or Elastic Robot Orchestration workflows. Limited impact on users’ ability to create new Automation Cloud Robot VMs.
There were multiple overlapping issues. One of our Azure Active Directory applications is only used to run non-critical operations in deployment pipelines. During a deployment, the secret of this application expired. This combination uncovered a bug that accidentally wiped out the secret hashes that some of our services use to authenticate to one another.
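The report does not describe how the deployment pipeline is implemented, so the following is only a minimal sketch of two generic safeguards against this kind of failure, under stated assumptions: checking that a credential is still valid before a deployment starts, and refusing to overwrite stored secret hashes with empty values. The names `AppSecret`, `secret_is_usable`, and `apply_hash_update` are hypothetical and do not come from UiPath's code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AppSecret:
    """Hypothetical record describing a credential used by a pipeline."""
    name: str
    expires_at: datetime


def secret_is_usable(secret, now, min_remaining=timedelta(hours=1)):
    """Return True if the secret stays valid for at least `min_remaining`.

    A deployment that starts with too little validity left risks having
    the credential expire partway through.
    """
    return secret.expires_at - now >= min_remaining


def apply_hash_update(current_hashes, new_hashes):
    """Merge new secret hashes into the current set.

    Raises ValueError instead of writing an empty or missing hash, so a
    failed upstream lookup cannot wipe out existing values.
    """
    empty = [key for key, value in new_hashes.items() if not value]
    if empty:
        raise ValueError(f"refusing to write empty hashes for: {empty}")
    merged = dict(current_hashes)
    merged.update(new_hashes)
    return merged


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    secret = AppSecret("pipeline-app", now + timedelta(minutes=5))
    if not secret_is_usable(secret, now):
        print(f"Abort deployment: {secret.name} expires too soon")
```

In a sketch like this, the expiry check would run as a gate before the deployment begins, and the update guard would sit in front of whatever store holds the hashes, so either safeguard alone could stop the chain of events described above.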
After identifying the root cause, we restored the secret hashes by renewing the secret for the Azure AD application and re-deploying. After the mitigation, we validated that all functionality was completely restored.
Within minutes, multiple alerts fired across the stack for failures observed in multiple UiPath products.
Engineers across multiple services engaged quickly to investigate the issue. Because the authentication failures were caused by missing secret hashes, we had to trace through the affected secrets to identify the problem and then work out a reliable process to restore them and mitigate the incident.
We understand how incredibly impactful and unacceptable this incident is and apologize deeply. We are continuously taking steps to improve the UiPath cloud platform and our processes to help ensure such incidents do not occur in the future. In this case, this includes (but is not limited to):