History, Push Device Registration, and Channel Group services experiencing elevated latencies and errors in Southern Asia

Incident Report for PubNub

Postmortem

Problem Description, Impact, and Resolution

At 14:53 UTC on July 18, 2021, we observed elevated latency and error rates in our History, Push Device Registration, and Channel Group services. We brought new capacity online to replace what appeared to be malfunctioning instances in our History service. Once the new instances were in service, the issue was resolved at 15:41 UTC. Unfortunately, because of a process error, the malfunctioning instances were terminated when they were taken out of service, before we could investigate them. Their state was lost, and the root cause of the malfunction remains unknown.

Mitigation Steps and Recommended Future Preventative Measures

To prevent a similar issue from occurring in the future, we are updating our processes to ensure that malfunctioning nodes are taken offline in a way that preserves their state for analysis. The replacement instances have been operating normally for over two days, and the system is stable. Our team continues to monitor it closely.

Posted Jul 22, 2021 - 16:15 UTC

Resolved

The issues have not resurfaced since 15:41 UTC. We will follow up with a post-mortem soon.

We apologize for the impact this may have had on your service. Please reach out to us by contacting PubNub Support (support@pubnub.com) if you wish to discuss the impact on your service.

Posted Jul 18, 2021 - 16:58 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Jul 18, 2021 - 16:06 UTC

Identified

The issue has been identified and a fix is being implemented.

Posted Jul 18, 2021 - 15:47 UTC

Investigating

We are currently investigating elevated latency and errors from our History, Push Device Registration, and Channel Group services in our Southern Asia point-of-presence.

Posted Jul 18, 2021 - 15:40 UTC

This incident affected: Realtime Network (Storage and Playback Service, Mobile Push Gateway) and Points of Presence (Southern Asia Points of Presence).