What happened:
At 10:31 Pacific Time on April 29, 2021, we ran a routine data maintenance procedure for the bulk deletion of deactivated user accounts. The procedure involved a large number of concurrent operations, which generated unexpectedly high load on our main database servers and caused the primary to become unresponsive while processing them. As a result, the MURAL application was unavailable to all users.
Details and corrective actions:
At 11:06 Pacific Time, we initiated a manual failover to a new primary database server. The failover completed at 11:17 Pacific Time, restoring full service without any loss of data.
What we’ve done to avoid this happening again:
We have temporarily restricted the procedure that triggered this incident while we review and upgrade how it works. This work is also aimed at improving how we handle very large bulk data operations, so that they do not impact service availability.
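As an illustration only, one common way to keep a bulk deletion from overwhelming a database is to process it in small sequential batches with a pause between them, rather than as many concurrent operations. The sketch below is a generic example of that pattern, not a description of MURAL's actual procedure or of the fix being developed. The names `chunked`, `delete_in_batches`, and `delete_batch`, along with the default batch size and pause, are all hypothetical.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size ids."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


def delete_in_batches(
    ids: Sequence[str],
    delete_batch: Callable[[List[str]], None],
    batch_size: int = 500,        # hypothetical default, tune per workload
    pause_seconds: float = 1.0,   # hypothetical default, tune per workload
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Delete ids one batch at a time, pausing between batches.

    delete_batch is a caller-supplied function (an assumption of this
    sketch) that removes one batch of accounts from the data store.
    Batches run sequentially, so at most one batch is in flight at once.
    Returns the number of ids passed to delete_batch.
    """
    deleted = 0
    for index, batch in enumerate(chunked(ids, batch_size)):
        if index > 0:
            # Give the database time to recover before the next batch.
            sleep(pause_seconds)
        delete_batch(batch)
        deleted += len(batch)
    return deleted
```

In this sketch the total load is bounded by the batch size and the pause, at the cost of a longer overall run time. A production version would typically also watch database health between batches and stop if load rises.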
MURAL administrators can continue to disable users on an individual or pick-list basis, via the admin interface.
We apologize for any inconvenience caused by this incident and thank you for your understanding.