Platform Latency
Incident Report for Kustomer
Postmortem

Summary

Beginning at 4:00 PM EST on October 6, 2021, Kustomer engineering was alerted to an issue in which one of the database shards entered a bad state. At a high level, our third-party cloud database (MongoDB) was unable to process a high volume of write transactions on our messages during peak traffic hours. The incident lasted approximately 3 hours.

Impact / Alerts

During this time, inbound and outbound messages generated within Prod1 Kustomer organizations were held until the database issue was cleared and the held items began to be redriven. The Kustomer system also experienced latency, including the inability to access conversations and customer timelines.

Root Cause

This issue stemmed from a bug on the side of our third-party cloud database (MongoDB), in their server-side logic around transactions.

Resolution

Kustomer engineering worked with our third-party cloud database provider, and the issue was rectified at approximately 5:33 PM EST. All items were redriven, with residual events completed by 7:20 PM EST.

Lessons/Improvements

  • Continue to engage our third-party cloud database provider to resolve the defect on their end.
Posted Oct 15, 2021 - 16:56 EDT

Resolved
We have seen no further instances of this and are confident the issue has been resolved.
Posted Oct 07, 2021 - 10:17 EDT
Update
Systems look normal now. We are still monitoring the situation as we investigate the root cause of the issue.
Posted Oct 06, 2021 - 21:55 EDT
Monitoring
We are continuing to monitor the situation as latency issues improve.
Posted Oct 06, 2021 - 20:08 EDT
Investigating
We are currently investigating new reports of platform latency.

Please reach out to our Support team with any additional questions. You can reach us by going to https://help.kustomer.com/ and clicking "Contact Support" at the top of the page.
Posted Oct 06, 2021 - 19:19 EDT
Update
We are continuing to monitor for any further issues.
All queued messages are being redriven and we are continuing to investigate for the root cause.
Posted Oct 06, 2021 - 17:42 EDT
Monitoring
The platform latency should be lessening and we are currently monitoring the situation.
Posted Oct 06, 2021 - 17:24 EDT
Investigating
We are currently experiencing increased latency with Kustomer that is causing some messages to not send immediately.

If you have additional questions, please reach out to our Support team here: https://kustomer.kustomer.help/contact/contact-kustomer-support-HJ_cRX5mK
Posted Oct 06, 2021 - 16:48 EDT
This incident affected: Prod1 (US) (API, Web Client).