Spatial Service Disruption for a subset of customers in the ANZ Region

Incident Report for TechnologyOne

Postmortem

Issue Summary

Over the weekend, TechnologyOne observed instability in the spatial data layer following regular storage maintenance by the upstream vendor, and took measures to recover via a database failover on 16 February 2025.
On 17 February 2025 a subset of spatial customers in the ANZ region was unable to access ConfigManager, Intramaps and Nearmaps. Service was restored for a portion of these customers by 2pm AEST, with the remainder brought online by 4pm AEST.
At 7pm AEST, errors appeared in the logs indicating further error states. Action was taken to restore the affected databases from backup, and all were restored by 11pm AEST.
At 11.15pm AEST, replication errors were observed and found to be the result of replication commencing while restoration was still underway. Access was restored and the replications were re-run, with monitoring continuing into the morning.
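
The replication errors described above came from replication starting against databases that were still being restored. As an illustration only, the following is a minimal sketch of a guard that defers replication for any database with a restore in progress. It does not describe TechnologyOne's actual tooling: RestoreGuard, run_replication and the replicate callback are hypothetical names introduced for this example.

```python
import threading


class RestoreGuard:
    """Tracks which databases currently have a restore in progress (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._restoring = set()

    def begin_restore(self, db_name):
        # Mark the database as mid-restore before the restore starts.
        with self._lock:
            self._restoring.add(db_name)

    def end_restore(self, db_name):
        # Clear the mark once the restore has finished.
        with self._lock:
            self._restoring.discard(db_name)

    def is_restoring(self, db_name):
        with self._lock:
            return db_name in self._restoring


def run_replication(db_names, guard, replicate):
    """Replicate each database unless it is mid-restore.

    `replicate` is an assumed callback that performs replication for one
    database. Returns (replicated, deferred) so deferred databases can be
    re-run after their restore completes.
    """
    replicated, deferred = [], []
    for db_name in db_names:
        if guard.is_restoring(db_name):
            deferred.append(db_name)
        else:
            replicate(db_name)
            replicated.append(db_name)
    return replicated, deferred
```

In this sketch a deferred database is simply reported back to the caller, which mirrors the re-run of replications once access had been restored.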

Root Cause

The database cluster became unstable because the Continuous Availability (CA) configuration was not enabled at the spatial storage layer. This configuration prevents database inconsistencies during storage maintenance.

Corrective Actions

A failover was conducted on the Sunday evening; however, many databases continued to enter a recovery state.
Corrected a storage permissions issue.
Brought databases online one at a time, as each attempt to bring them online as a group failed.
Restored from backup the subset of databases found to be in a corrupted state.
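
The last two actions above can be sketched as a simple loop: attempt each database individually so that one failure does not block the rest, and fall back to a restore from backup for those that cannot be brought online. This is a hedged illustration, not the procedure actually run during the incident. The function name, the bring_online and restore_from_backup callbacks, and the use of RuntimeError to signal an unrecoverable database are all assumptions made for the example.

```python
def recover_databases(db_names, bring_online, restore_from_backup):
    """Bring databases online one at a time, restoring any that fail.

    `bring_online` and `restore_from_backup` are assumed callbacks that act
    on a single database name. `bring_online` is assumed to raise
    RuntimeError when a database cannot be recovered in place.
    Returns (online, restored) as two lists of database names.
    """
    online, restored = [], []
    for db_name in db_names:
        try:
            # Handle each database on its own so a single bad database
            # cannot fail the whole group.
            bring_online(db_name)
            online.append(db_name)
        except RuntimeError:
            # Assumption: a failure here means the database needs a restore.
            restore_from_backup(db_name)
            restored.append(db_name)
    return online, restored
```

Processing databases individually is slower than a group operation, but it isolates failures to the specific databases that need a restore.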

Preventative Actions

Enable CA on the database share on the primary storage for spatial. A maintenance window is planned between 8pm and 9pm AEST on 22 February 2025. Updates will be provided via the status page.

Posted Feb 20, 2025 - 16:47 AEST

Resolved

After overnight monitoring, this incident is now resolved.

We will perform a post-incident review to identify the underlying cause and the preventive actions needed to avoid a recurrence, and will post the findings here on completion.

We apologise for how you and your business may have been affected by this incident.
Posted Feb 18, 2025 - 08:43 AEST

Monitoring

Our team has verified that implementation of the fix is complete for the impacted environments.

We will monitor the logs for the next few hours to ensure no further customers are impacted.
Posted Feb 17, 2025 - 22:57 AEST

Identified

Our team has identified a further issue, following the outage earlier today, that is impacting Spatial Services for a subset of customers.

We are currently applying fixes to the impacted environments, with the goal of having the fix applied to all affected customers before overnight processing commences at 11.30pm.

We anticipate that implementing and verifying the fix will take 2 hours.

The next update will be posted when the implementation of the fixes has been completed.
Posted Feb 17, 2025 - 20:53 AEST
This incident affected: Software as a Service - Australia & New Zealand (Spatial Cloud).