YYZ - application host failure

Incident Report for Fly.io

Resolved

This incident has been resolved.
Posted Oct 09, 2022 - 05:17 UTC

Monitoring

All applications have now been restarted on other working hosts, and we are monitoring the results.
Posted Oct 09, 2022 - 02:43 UTC

Update

Applications without volumes have all been restarted on other hosts in the region.
The failed host can't be recovered quickly, so we're restoring volumes from their most recent daily snapshot and will restart them on another host. For volumes that are not part of a multi-node cluster, data written since that snapshot will be unavailable.
Posted Oct 09, 2022 - 02:25 UTC

Update

We're working on replacing the failed boot drive, but don't have a firm ETA yet. We also identified an issue preventing applications from being restarted on other servers in the region and are working on a fix.
Posted Oct 09, 2022 - 00:29 UTC

Identified

There was a boot-drive failure on a host in the YYZ region, and the failed drive will need to be replaced. Applications without volumes will be restarted on other servers.

Volumes on the affected host may be unavailable until the host is fully restored. Volumes are stored on separate drives, so we don't expect any customer data to be lost.
Posted Oct 08, 2022 - 22:50 UTC

Investigating

We are investigating a network connectivity issue affecting a server in the YYZ region.
Posted Oct 08, 2022 - 22:14 UTC
This incident affected: Regional Availability (YYZ - Toronto, Canada).