Friday September 1st 2022, 16:00 - 17:57 UTC
New test requests in all regions began to queue up in “new job” status waiting for the system to allocate them to test resources. The spike in queued test requests was detected by our internal alerting system.
An application deployment introduced a bug that caused an unrecoverable error in the resource allocation process.
We reverted the deployment to the previous version.
We are evaluating ways to better detect an issue like this through better test coverage in our development process.