On Jan 11, 2021, at 20:45 UTC, messages published in our US-East PoP stopped being written to Storage, so those messages were not available in history during the incident. The delay was caused by slow database writes, which resulted from a problem with the way the distributed data store handles deleted records in some scenarios. We restarted the process, and the issue was resolved at 22:23 UTC. During the incident, Storage reads succeeded for any data that had been persisted before the incident. No data was lost: all messages published during the incident were queued until the writers were able to catch up.
We have updated the code to prevent the errors caused by deleted records in the distributed data store.