Resolved -
We experienced a temporary outage in the wireguard mesh connecting all of our physical hosts which resulted in a number of internal systems losing connectivity with each other. The mesh has been repaired, but some regions required manual intervention, notably: AMS, BOM, CDG, HKG, IAD, MIA, SJC, SYD, YYZ. The following was impacted (a non-exhaustive list, this was a full network outage): fly token validation (including github actions and flyctl auth), 6pn networking calls, log delivery, routing requests to user machines.
Apr 24, 21:28 UTC
Update -
We have identified a small number of physical servers that are still impacted and are recovering them.
Apr 24, 19:56 UTC
Monitoring -
A fix has been implemented and we are monitoring the results.
Apr 24, 19:25 UTC
Investigating -
We are observing elevated errors and connectivity issues across Fly.io
Apr 24, 18:58 UTC