We would like to provide additional detail about the downtime that occurred on 10/12/2021.
What happened?
At 8:05 am PST, the NA1 server environment experienced downtime, during which QLess applications hosted on that environment were temporarily inaccessible.
Duration: 5 minutes.
Cause
The outage was caused by a race condition that was triggered on one of our backend servers.
This race condition is a known issue and can be triggered by heavy transactional load or other rare circumstances.
Remediation
Upon receiving a monitoring alert, QLess engineers restarted the affected service.
Prevention
QLess has initiated a complete refactoring of the affected application. The new service application will be more fault-tolerant, more available, and self-healing.