Beginning at 19:52 EDT on July 7th, 2021, History, Push Device Registration, and Channel Group changes experienced elevated latencies and errors. At 20:13 EDT, Kustomer engineering was alerted that PubNub, a third-party real-time service that Kustomer relies on to power chat 2.0 and team pulse, was reporting degraded performance in its system. This prevented these components from working properly until the issue was resolved.
At about 10:31 UTC (06:31 EDT), Storage began to experience elevated latencies and errors in the Asia Northeast PoP. https://pubnub.statuspage.io/
Services impacted:
Details on the root cause can be found here: https://status.pubnub.com/incidents/5377g4wvyxf3
At 10:31 UTC on 07 July 2021, we observed elevated latencies and errors in our Tokyo point of presence which affected History. This issue occurred because of an atypical combination of usage patterns causing CPU over-utilization in that region, ultimately resulting in the latencies and errors we observed. The incident was fully resolved at 11:07 UTC.
The incident was fully resolved at 11:07 UTC. PubNub gave no further information on the resolution.
PubNub added checks to prevent this from recurring in the short term. In the coming weeks, they will analyze the causal usage patterns and work to separate the affected services, with the goal of reducing the chance that these multiple process failures recur.