Resolved - INC157454 - Please do not use today's banking reports and files

Incident Report for Central 1

Postmortem

Postmortem: Incident – INC157454 – CAS Production Day End Job failed - High (P2)

Overview

On Thursday, April 20, 2023, at 4 a.m. PT, the new banking system, CAS, sent a PagerDuty alert advising that the CAS production end-of-day (EOD) job had failed. The EOD job was then run twice in error, which caused transactions on the banking reports and files, as well as in CAS online, to display April 21 instead of April 19.

Because of the date error, all the banking reports were removed from the FTP folders, and the banking system was taken offline while the database was restored to the 4 a.m. PT (7 a.m. ET) snapshot. Once the restore was complete, the EOD job was run, which set the database to the correct date. After the date was confirmed in CAS, the reports and the PREC and OREC files were rerun and placed in each credit union's FTP folder.

While CAS was offline for the snapshot restore, credit unions could not access CBS Online. Outgoing wires could not post to CAS (INC157479), which delayed wires being sent to other financial institutions (FIs).

CAS was offline for 1 hour and 31 minutes, from 5:13 a.m. to 6:44 a.m. PT (8:13 a.m. to 9:44 a.m. ET).
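The outage window above can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the fixed UTC offsets (UTC-7 for Pacific, UTC-4 for Eastern) assume daylight saving time was in effect on April 20, 2023.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed daylight-time offsets for April 2023.
PT = timezone(timedelta(hours=-7), "PDT")
ET = timezone(timedelta(hours=-4), "EDT")

offline = datetime(2023, 4, 20, 5, 13, tzinfo=PT)  # CAS taken offline
online = datetime(2023, 4, 20, 6, 44, tzinfo=PT)   # CAS back online

outage = online - offline
print(outage)  # 1:31:00

# The same window expressed in Eastern time.
et_window = (
    offline.astimezone(ET).strftime("%H:%M"),
    online.astimezone(ET).strftime("%H:%M"),
)
print(et_window)  # ('08:13', '09:44')
```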

The root cause of this incident was that the CAS end-of-day step failed and did not retry because of a missing exception lock. During recovery, the team restarted the job normally, expecting it to resume from the last failed step. Instead, the job started from the very beginning, so the day end ran again and moved the system to the following day, resulting in the incorrect system date.
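The difference between restarting a multi-step job from the beginning and resuming it from the failed step can be illustrated with a toy model. This is a minimal sketch and not the CAS implementation: the DayEndJob class, its step names, and the starting date are all hypothetical, chosen only to show how a second full run advances the date one extra day.

```python
from datetime import date, timedelta


class DayEndJob:
    """Hypothetical three-step job that remembers which steps completed."""

    STEPS = ("roll_date", "update_totals", "build_reports")

    def __init__(self, business_date):
        self.business_date = business_date
        self.completed = set()        # checkpoint of finished steps
        self.fail_totals_once = True  # simulate one transient failure

    def roll_date(self):
        self.business_date += timedelta(days=1)

    def update_totals(self):
        if self.fail_totals_once:
            self.fail_totals_once = False
            raise RuntimeError("simulated failure while updating totals")

    def build_reports(self):
        pass  # placeholder: a real job would write reports here

    def run(self, resume):
        for name in self.STEPS:
            if resume and name in self.completed:
                continue  # skip steps that finished in an earlier attempt
            getattr(self, name)()
            self.completed.add(name)


def run_with_one_restart(resume):
    job = DayEndJob(date(2023, 1, 2))  # illustrative starting date
    try:
        job.run(resume)
    except RuntimeError:
        job.run(resume)  # operator restarts the job after the failure
    return job.business_date


naive_date = run_with_one_restart(resume=False)
resumed_date = run_with_one_restart(resume=True)
print(naive_date)    # 2023-01-04: the date rolled twice
print(resumed_date)  # 2023-01-03: the date rolled once
```

In this sketch the checkpoint is held in memory; a real job would need to persist it so that a restart in a new process can still see which steps had finished.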

Actions

Pending

RITM333831 - (CASM-3674) Application enhancements to error handling on the end-of-day job – Q2

Complete

PRB011107 - What caused the EOD job to fail?

The CAS end-of-day step failed and did not retry because of a missing exception lock. During recovery, the team restarted the job normally, expecting it to resume from the last failed step. Instead, the job started from the very beginning, so the day end ran again and moved the system to the following day.

CHG135404 – Deployed hotfix/1.17.11 to production on Thu, Apr 20

· Fixed the day-end locking issue.

· Fixed retry not working when a job raises an exception during the update of Accounts_Net_Totals.
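One common way a lock can block retries is when an exception leaves the lock held, so the next attempt cannot acquire it. The report does not describe the actual CAS locking code, so the following is only a sketch under that assumption: day_end_lock, run_step_with_retry, and flaky_update are hypothetical names, and the lock is released in a finally block so that each retry starts clean.

```python
import threading

# Hypothetical lock guarding the day-end step (an assumption, not CAS code).
day_end_lock = threading.Lock()


def run_step_with_retry(step, attempts=3):
    """Run step under the lock, retrying if it raises an exception."""
    last_error = None
    for _ in range(attempts):
        if not day_end_lock.acquire(blocking=False):
            raise RuntimeError("day-end lock is still held; cannot retry")
        try:
            return step()
        except Exception as exc:
            last_error = exc  # remember the failure and try again
        finally:
            day_end_lock.release()  # always release, even on exception
    raise RuntimeError("step failed after all retries") from last_error


calls = {"count": 0}


def flaky_update():
    """Simulated totals update that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("simulated exception during totals update")
    return "totals updated"


result = run_step_with_retry(flaky_update)
print(result, calls["count"])  # totals updated 3
```

Without the finally block, the first exception would leave the lock held and the second attempt would stop at the acquire check instead of retrying the step.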

RITM334278 - Update the Standard Operating Procedures to clearly indicate how to re-run EOD

Posted 2 years ago. May 12, 2023 - 12:24 PDT

Resolved

All the banking reports, including the PREC and OREC files, are now in your organization's FTP folder.

CAS online is also available.

Central 1 - Support@central1.com - 1.888.889.7878, press 1
Posted 2 years ago. Apr 20, 2023 - 10:28 PDT

Update

We continue to work on restoring the banking files and reports.

Another update will be provided by 10 a.m. PT (1 p.m. ET).
Posted 2 years ago. Apr 20, 2023 - 08:37 PDT

Identified

Please do not use the AIBC (AIOC), AIBU (AIOU), or DAIL (CDJ) banking reports, or the PREC and OREC files, created today, April 20, 2023.

Central 1’s technical team will be deleting the banking files and reports from your FTP folder that were created today.

We will be recreating the reports and files with the correct data.

An update will be provided at 8 a.m. PT (11 a.m. ET), or sooner if the reports and files have been recreated.

Central 1 - Support@central1.com - 1.888.889.7878, press 1
Posted 2 years ago. Apr 20, 2023 - 07:50 PDT
This incident affected: Incident Alerting.