[Tokyo Region] Presto Query - Partial Outage
Incident Report for Treasure Data
Resolved
We have confirmed the issue has been resolved. The Presto coordinator cluster is now operational.
However, during this incident our internal retry mechanism did not work as expected. All Presto jobs failed between 11:14 and 11:34 JST due to an infrastructure issue. We will investigate the root cause and apply appropriate remediation.
Please rerun your failed Presto jobs if necessary.
We apologize for any trouble this may have caused.
Posted Jul 11, 2021 - 20:50 PDT
Monitoring
A fix has been implemented and we are monitoring the Presto coordinator cluster.
Posted Jul 11, 2021 - 20:01 PDT
Identified
We have identified an issue with the Presto coordinator and are working on applying a fix.
Posted Jul 11, 2021 - 19:40 PDT
Update
We are continuing to investigate this issue.
Posted Jul 11, 2021 - 19:39 PDT
Investigating
We have observed a high rate of 503 errors for Presto jobs in the Tokyo Region. Customers may see Presto job errors.
We are investigating the issue.
Posted Jul 11, 2021 - 19:27 PDT
This incident affected: Tokyo (Presto Query Engine).