Site Slowness
Incident Report for DrChrono
Postmortem

Summary

The platform experienced significant slowness on 07/06/2021. One main factor contributed to the slowness throughout the day.

Timeline (EST, 24-hour clock)

Date/Time Activity
2021-07-06 11:42 The Support Team received notification of site slowness.
2021-07-06 11:47 DevOps tried to enable session affinity in CloudFlare.
2021-07-06 11:50-12:05 DevOps ramped up traffic to AWS from 0% to 20%.
2021-07-06 12:05 Issue found with routing traffic to AWS.
2021-07-06 12:05 DevOps Team dropped AWS traffic to 0%.
2021-07-06 12:15 DevOps Team continued to find a solution.
2021-07-06 13:29 Hotfix solution is being prepared for routing traffic to AWS.
2021-07-06 13:37 Status Page updated to reflect investigating status for site slowness.
2021-07-06 14:24 Status Page updated to identified for site slowness.
2021-07-06 14:59 Hotfix solution continues to be worked on.
2021-07-06 16:24 DevOps implements hotfix solution.
2021-07-06 16:24 Status Page updated to resolved for site slowness.
2021-07-07 05:10 DevOps monitors traffic flow into AWS.
2021-07-07 06:02 DevOps tested application functionality with switch to AWS.
2021-07-07 09:23 DevOps identifies an increase in queue times and determines that adding system workers would help resolve the queue backup.
2021-07-07 11:20 DevOps increases system pool resources.
2021-07-07 12:41 Hotfix solution thoroughly tested and verified.

Contributing Factors

Several factors contributed to the site slowness during this time:

  1. Routing 20% of traffic to AWS on the Friday before the holiday weekend.
  2. Splitting traffic between RackSpace and AWS.

The major contributing factor was splitting the traffic flow between RackSpace and AWS.
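
For readers unfamiliar with split-traffic rollouts, the following is a minimal illustrative sketch of a percentage-based split that also keeps each session on one backend, which is the general goal of the session affinity mentioned in the timeline. It is not DrChrono's or CloudFlare's actual mechanism; the function name pick_backend, the backend labels, and the hashing scheme are all assumptions made for illustration.

```python
import hashlib


def pick_backend(session_id: str, aws_percent: int) -> str:
    """Map a session to a backend (hypothetical helper, for illustration only).

    Hashing the session ID, instead of drawing a random number per request,
    sends every request from one session to the same backend.
    """
    if not 0 <= aws_percent <= 100:
        raise ValueError("aws_percent must be between 0 and 100")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return "aws" if bucket < aws_percent else "rackspace"


if __name__ == "__main__":
    # Made-up session IDs; a 20% setting should send roughly 20% of them to AWS.
    sessions = [f"session-{i}" for i in range(10_000)]
    aws_share = sum(pick_backend(s, 20) == "aws" for s in sessions) / len(sessions)
    print(f"share of sessions routed to AWS at 20%: {aws_share:.3f}")
```

In this sketch, a session that lands on AWS at a lower percentage stays on AWS as the percentage is raised, so a gradual ramp does not move users back and forth between environments.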

Impact

This issue affected customers' ability to navigate and use the website, including tasks such as modifying appointments.

Corrective Actions

Ensure all traffic is running through AWS with no more splitting of traffic between RackSpace and AWS. This solution will prevent a similar situation from happening again.

Posted Jul 12, 2021 - 13:04 PDT

Resolved
This incident has been resolved. Please let our Support team know if you are still experiencing any issues with system performance.
Posted Jul 06, 2021 - 13:24 PDT
Identified
We have identified the issue and our Engineers are currently working on a hotfix. We will post additional updates here as we have more information.
Posted Jul 06, 2021 - 11:24 PDT
Investigating
We are currently investigating reports of sitewide slowness and trouble saving data that appear to have begun at approximately 8:45 am PST. We will provide an update with additional information as soon as possible.
Posted Jul 06, 2021 - 10:37 PDT
This incident affected: drchrono.com, drchrono iPad EHR, and drchrono iPad Check-In Kiosk Application.