Service for all affected services has been restored
Incident Report for Acquia, Inc.
Postmortem

Purpose of This Report

This is a summary and analysis of an issue that occurred with the delivery of an Acquia product or service. The purpose of this document is to share details about what happened and why, so there is a common understanding of what is required to prevent a future occurrence if at all possible. Any remaining issues or risks are identified, as are recommended or pending actions.

Event Summary

On 8 June 2021, between 09:47 UTC and 12:35 UTC, Acquia’s Platform CDN provider experienced an interruption of service that affected customers globally.

Timeline of Actions to Resolution

  • 09:47 UTC - Initial onset of global disruption
  • 09:48 UTC - Global disruption identified by monitoring
  • 10:27 UTC - Platform CDN vendor engineering identified the service configuration at fault
  • 10:36 UTC - Impacted services began to recover as the faulty configuration was reverted
  • 11:00 UTC - Majority of services recovered, additional mitigation measures taken
  • 12:35 UTC - Incident mitigated, pre-incident capacity restored
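
The intervals implied by the timeline above can be worked out with a short calculation. The following is a minimal sketch, not part of the vendor's tooling: the milestone names (`onset`, `detected`, and so on) and the helper `minutes_since_onset` are hypothetical labels introduced here for illustration, and only the timestamps come from this report.

```python
from datetime import datetime

# Timestamps copied from the timeline (all on 8 June 2021, UTC).
# The dictionary keys are illustrative labels, not official terms.
TIMELINE = {
    "onset": "09:47",
    "detected": "09:48",
    "cause_identified": "10:27",
    "recovery_began": "10:36",
    "majority_recovered": "11:00",
    "mitigated": "12:35",
}


def minutes_since_onset(timeline):
    """Return whole minutes elapsed from onset to each milestone."""
    def parse(value):
        return datetime.strptime(value, "%H:%M")

    onset = parse(timeline["onset"])
    return {
        name: int((parse(value) - onset).total_seconds() // 60)
        for name, value in timeline.items()
    }


elapsed = minutes_since_onset(TIMELINE)
for name, minutes in elapsed.items():
    print(f"{name}: +{minutes} min")
```

Under these assumptions, monitoring detected the disruption 1 minute after onset, the configuration at fault was identified after 40 minutes, and the incident was mitigated after 168 minutes (2 hours 48 minutes).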

Identified Root Cause

The root cause of this interruption of service was a Varnish code release originally deployed on 21 May 2021. Investigation of this incident found that the release contained a code path that crashed the Varnish service when it encountered a specific VCL pattern.

Corrective Actions

  1. Acquia’s Platform CDN partner has implemented a permanent fix for the bug discovered (implemented 8-9 June 2021).
  2. Acquia’s Platform CDN partner will actively monitor for Varnish failures following configuration change deployments.
Posted Jun 21, 2021 - 22:56 UTC

Resolved
The interruption of service in CDN services for Acquia Platform CDN impacting some Acquia Cloud Platform Enterprise applications appears to have been resolved. We are currently monitoring. Customers may experience high load on some applications.
Posted Jun 08, 2021 - 12:52 UTC
Monitoring
The interruption of service in CDN services for Acquia Platform CDN impacting some Acquia Cloud Platform Enterprise applications appears to have been resolved. We are currently monitoring. Customers may experience high load on some applications.
Posted Jun 08, 2021 - 11:43 UTC
Identified
We are continuing to monitor an interruption of service in CDN services for Acquia Platform CDN impacting some Acquia Cloud Platform Enterprise applications. We will provide additional information as it becomes available.
Posted Jun 08, 2021 - 10:46 UTC
This incident affected: Cloud Platform Enterprise.