Alert! - RESOLVED: Toronto Data Center Degraded Service | INC153997
Incident Report for Central 1
Postmortem

Postmortem: Toronto Data Center Degraded Service | INC153997 P2 

On Friday, March 3rd, at approximately 10:55 a.m. PT (1:55 p.m. ET), Central 1 services and applications across several portfolios (Payments, Treasury, Digital Banking, IT Services) began experiencing latency and became sporadically unresponsive. Services recovered on their own at 11:18 a.m. PT (2:18 p.m. ET). In the 11:24 a.m. PT (2:24 p.m. ET) Status Page notice, Central 1 originally reported the impact as limited to the Secure Site and noted a problem in our Toronto Data Center (TOHC). The resolved Status Page notice reflected that services in TOHC, such as Digital Banking, Branch Capture, and other applications, were impacted. No major outages were observed during this time.

The point of failure was a third-party company that Central 1 uses for penetration testing services. The company uses highly trained security testers to find vulnerabilities and gaps in Central 1’s perimeter before malicious actors do. During this year’s Red Team exercise on Friday morning, they accidentally saturated Central 1’s services: specifically, a firewall in TOHC ran out of memory (ran at 100% CPU) and prevented some traffic from reaching the servers. This caused sporadic message failures, because traffic was still being sent through and only the volume above 100% of capacity was dropped.
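As a rough illustration of why the failures were sporadic rather than total, the sketch below models a device that forwards traffic up to a fixed capacity and drops only the excess. The function name and all numbers are hypothetical; they are not measurements from the TOHC firewall.

```python
def forwarded_and_dropped(offered: int, capacity: int) -> tuple[int, int]:
    """Split offered traffic into (forwarded, dropped) units.

    Hypothetical model: a saturated device forwards up to `capacity`
    units per interval and drops whatever exceeds it.
    """
    if offered < 0 or capacity < 0:
        raise ValueError("offered and capacity must be non-negative")
    forwarded = min(offered, capacity)
    dropped = offered - forwarded
    return forwarded, dropped


# Illustrative numbers only (not taken from the incident).
print(forwarded_and_dropped(offered=800, capacity=1000))   # (800, 0)
print(forwarded_and_dropped(offered=1250, capacity=1000))  # (1000, 250)
```

In the second case most traffic still gets through, which matches the pattern of intermittent failures rather than a full outage.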

Central 1 received alerts and located the point of failure during the incident, and has since worked closely with our vendor to ensure they properly measure their scanning volume. The vendor has worked with Central 1 for years and clearly understands that intrusive testing must not be performed. They completed a full review of their scripts before continuing any additional scanning. Given their strong history with Central 1, we believe the risk of a recurrence is near zero.
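One common way to bound request volume is a token bucket. The sketch below is a generic illustration of that idea only; the vendor's actual controls are not described in this postmortem, and the class name, rate, and burst values are assumptions chosen for the example.

```python
class ScanBudget:
    """Hypothetical token bucket that caps how many requests a scan may send."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        if rate_per_sec <= 0 or burst <= 0:
            raise ValueError("rate_per_sec and burst must be positive")
        self.rate = rate_per_sec    # tokens added per second
        self.burst = burst          # maximum tokens that can accumulate
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Return True if one request may be sent at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Illustrative values only: 2 requests per second, burst of 3.
budget = ScanBudget(rate_per_sec=2, burst=3)
print([budget.allow(now=0.0) for _ in range(5)])  # [True, True, True, False, False]
print(budget.allow(now=1.0))                      # True
```

The caller passes the current time explicitly, which keeps the sketch deterministic; a real scanner would typically read a monotonic clock instead.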

Actions:

RITM331018 – Network to complete a Silverline review

Assigned to: Network

Due Date: End of May 2023

Review additional configurations and services available to mitigate this type of attack.

RITM331020 – InfoSec Actions with 3rd Party Review

Assigned to: Information Security Team

Due Date: Completed

Received the 3rd party's postmortem, with actions to mitigate the risk of another production-impacting scan.

 

At Central 1, we take the quality of our service delivery very seriously. We understand that when our services are not working as expected, it can have a significant impact on our customers and their businesses. We believe that the best way to address any issues that arise is to be transparent about them and work diligently to improve our processes and systems.

If you have any questions about this postmortem, please reach out to me to discuss.

Jason Seale

Director, Client Support Services

jseale@central1.com | 778.558.5627

Posted Apr 01, 2023 - 11:14 PDT

Resolved
Services have remained stable. The point of failure was identified and will be reviewed in the upcoming postmortem.

Services are expected to remain stable throughout the weekend.
Posted Mar 03, 2023 - 16:28 PST
Monitoring
Originally reported as only a Secure Site outage, Central 1’s services in our Toronto Data Center were degraded between 10:55 a.m. and 11:18 a.m. PT (1:55 to 2:18 p.m. ET).

Technical support is investigating with our vendor for root cause analysis.

We continue to monitor system stability and will provide an update by 4 p.m. PT (7 p.m. ET).

Central 1 - Support@central1.com - 1.888.889.7878, press 1
Posted Mar 03, 2023 - 11:47 PST
Investigating
The Secure Site (www.secure.central1.com) is currently unavailable. Technical support is investigating.

We will provide an update by 12:15 p.m. PT (3:15 p.m. ET).


Central 1 - Support@central1.com - 1.888.889.7878, press 1
Posted Mar 03, 2023 - 11:23 PST
This incident affected: Central 1 Services, Payment Services, and Incident Alerting.