Overview
On June 20th at 09:24 UTC, an incident was declared for the eu-4 region. Platform.sh monitoring detected a number of site-down alerts, and further investigation determined that the issue resulted from degraded gateway performance.
What Happened
At 09:08 UTC, monitoring alerts indicated a high volume of traffic entering the region. A mitigation plan was quickly implemented, and by 09:12 UTC the traffic volume had been mitigated. We then declared an incident to review the issue further and identify the root cause. The investigation determined that a small number of sources were responsible for the traffic volume.
Platform.sh's roadmap includes plans to put appropriate rate limiting in place to prevent such traffic spikes from bringing down several sites in the region.
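For readers unfamiliar with the technique, the following is a minimal sketch of per-source rate limiting using a token bucket. It is an illustration only and does not describe the design Platform.sh intends to ship; the class names, the `rate` and `capacity` parameters, and the use of a source string as the key are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TokenBucket:
    """Hypothetical token bucket: refills at `rate` tokens/second up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = 0.0
    updated: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the last request.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        # Spend one token per request; reject when the bucket is empty.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSourceLimiter:
    """Hypothetical limiter keeping one bucket per traffic source (e.g. client IP)."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, source: str, now: float) -> bool:
        bucket = self.buckets.get(source)
        if bucket is None:
            # A new source starts with a full bucket, so short bursts pass.
            bucket = TokenBucket(self.rate, self.capacity,
                                 tokens=self.capacity, updated=now)
            self.buckets[source] = bucket
        return bucket.allow(now)
```

Because each source has its own bucket, a small number of high-volume sources exhaust only their own allowance, while requests from other sources continue to be admitted. Timestamps are passed in explicitly (in seconds) to keep the sketch deterministic; a real deployment would typically read a monotonic clock and evict idle buckets.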
Impact
The incident was classified as a partial outage, with most sites returning to normal operating levels by 09:17 UTC. The longest observed outage lasted 14 minutes, while most monitored sites experienced an outage of only 1 to 3 minutes.