Southern Asia PoP may have experienced delays with messages using Storage
Incident Report for PubNub
Postmortem

Problem Description, Impact, and Resolution 

The incident started at about 21:29 UTC (13:29 PST) on 2021-01-05. Because of extremely high CPU usage, messages published in our Mumbai PoP were not being written to Storage. Storage reads still succeeded, with some added latency, for any data persisted before the incident. No data was lost; pending writes queued up until the writers were able to catch up.
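
For illustration only, the queue-and-catch-up behavior described above can be sketched as follows. This is a minimal sketch, not PubNub's implementation: the class `BufferedStorageWriter`, the `flaky_write` function, and the `state` flag are all hypothetical names introduced here. The assumption is that each publish is appended to an in-order backlog and that the backlog is drained only once the writer accepts writes again.

```python
from collections import deque


class BufferedStorageWriter:
    """Hypothetical sketch: queue messages while the writer is stalled,
    then drain them in publish order once writes succeed again."""

    def __init__(self, write_fn):
        self._write_fn = write_fn  # persists one message; raises on failure
        self._pending = deque()    # FIFO backlog of messages not yet persisted

    def publish(self, message):
        # Enqueue first so nothing is dropped, then attempt to drain.
        self._pending.append(message)
        return self.drain()

    def drain(self):
        """Write queued messages in order, stopping at the first failure.
        Returns the number of messages written in this call."""
        written = 0
        while self._pending:
            try:
                self._write_fn(self._pending[0])
            except Exception:
                break  # writer still unhealthy; keep the backlog intact
            self._pending.popleft()
            written += 1
        return written

    @property
    def backlog(self):
        return len(self._pending)


# Simulated incident: the writer fails, then recovers (assumed behavior).
state = {"healthy": False}
stored = []


def flaky_write(message):
    if not state["healthy"]:
        raise RuntimeError("storage writer overloaded")
    stored.append(message)


writer = BufferedStorageWriter(flaky_write)
writer.publish("m1")
writer.publish("m2")
backlog_during_incident = writer.backlog  # messages waiting, none lost

state["healthy"] = True  # e.g. after the Storage processes restart
drained = writer.drain()
```

In this sketch, a message leaves the backlog only after its write succeeds, so a stalled writer delays persistence without dropping data and preserves publish order.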

The resolution came when we restarted the Storage processes. The incident concluded at about 21:58 UTC (13:58 PST).

Mitigation Steps and Recommended Future Preventative Measures 

We have updated the code to prevent the errors caused by deleted records in the distributed data storage.

Posted Jan 26, 2021 - 18:57 UTC

Resolved
This incident has been resolved.
Posted Jan 18, 2021 - 15:18 UTC
Monitoring
A fix was implemented at 14:46 UTC, and we are monitoring the results for the next 30 minutes.
Posted Jan 18, 2021 - 14:47 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Jan 18, 2021 - 14:32 UTC
Investigating
Starting around 13:42 UTC, customers in our Southern Asia PoP may have experienced delays between the time messages were published and the time they were written to or read from Storage.
Posted Jan 18, 2021 - 14:21 UTC
This incident affected: Realtime Network (Storage and Playback Service) and Points of Presence (Southern Asia Points of Presence).