Elevated errors and connectivity problems
Incident Report for Fly.io
Resolved
We experienced a temporary outage in the wireguard mesh connecting all of our physical hosts which resulted in a number of internal systems losing connectivity with each other. The mesh has been repaired, but some regions required manual intervention, notably: AMS, BOM, CDG, HKG, IAD, MIA, SJC, SYD, YYZ. The following was impacted (a non-exhaustive list, this was a full network outage): fly token validation (including github actions and flyctl auth), 6pn networking calls, log delivery, routing requests to user machines.
Posted Apr 24, 2024 - 21:28 UTC
Update
We have identified a small number of physical servers that are still impacted and are recovering them.
Posted Apr 24, 2024 - 19:56 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Apr 24, 2024 - 19:25 UTC
Investigating
We are observing elevated errors and connectivity issues across Fly.io
Posted Apr 24, 2024 - 18:58 UTC
This incident affected: Platform and Tools (API, Deployments).