On January 15, 2021, at 2:38 a.m. ET, an urgent ticket was submitted reporting that publishes for some Pages clients were timing out. A problem with downloading Pages files was preventing some sites from publishing. During the investigation, from 6:52 a.m. ET to 9:49 a.m. ET, a server was rolled back to a previous state, which caused a secondary issue: the Code Editor stopped working for Hitchhikers. Once the root cause was identified, engineers deployed a fix for the timeouts at 1:08 p.m. ET. After the fix, sites whose publishes had previously failed published successfully on their next publish. No data was lost as a result of this outage.
A bug in the caching layer used when fetching Pages content caused data to be cached twice, which increased overall publishing latency and caused some publishes to time out.
The root cause has been fixed, and we will be implementing additional telemetry around the caching layer.
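As an illustration only, telemetry of the kind described above might resemble the following Python sketch. It is not the actual Pages caching code: the class name InstrumentedCache, its counter names, and the in-memory store are all hypothetical, chosen to show how a redundant second write of the same data could be counted and surfaced.

```python
from collections import Counter
from typing import Callable, Dict


class InstrumentedCache:
    """Hypothetical in-memory cache that counts hits, misses, writes,
    and duplicate writes (same key written again with identical bytes)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self.counters: Counter = Counter()

    def put(self, key: str, value: bytes) -> None:
        self.counters["writes"] += 1
        if self._store.get(key) == value:
            # Identical data is already cached: redundant work worth alerting on.
            self.counters["duplicate_writes"] += 1
        self._store[key] = value

    def get_or_fetch(self, key: str, fetch: Callable[[str], bytes]) -> bytes:
        if key in self._store:
            self.counters["hits"] += 1
            return self._store[key]
        self.counters["misses"] += 1
        value = fetch(key)  # assumed to be the slow download of a Pages file
        self.put(key, value)
        return value


if __name__ == "__main__":
    cache = InstrumentedCache()
    cache.get_or_fetch("site/index.html", lambda _key: b"<html></html>")
    # Simulate the class of bug described above: the same data cached a second time.
    cache.put("site/index.html", b"<html></html>")
    print(dict(cache.counters))
```

In a sketch like this, a non-zero duplicate_writes counter would be the signal to alert on, since it indicates the cache is repeating work that adds latency without changing what is stored.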