Elevated rendering errors
Incident Report for imgix
Postmortem

What happened?

On April 26, 2021, at 21:00 UTC, the imgix service experienced elevated latency when retrieving images via our origin cache. This caused an increase in error rates for some uncached derivative images. Our engineering team implemented a remediation: error rates began subsiding by 21:28 UTC, and the service was fully restored by 21:45 UTC.

How were customers impacted?

During the incident, customers may have noticed some uncached derivative images returning an error. Up to 10% of requests failed to return successfully.

Cached derivative images were not impacted and continued to be served as normal. 

What went wrong during the incident?

Our engineers were alerted to an increased number of failed origin requests. The cause was quickly identified as traffic that fell outside our terms of service. We isolated that traffic while working with the affected parties on a longer-term solution.

What will imgix do to prevent this in the future?

Our team's response was quicker thanks to ongoing projects addressing suspicious traffic. However, we still see room to improve our processes so that we can respond faster. Moving forward, we will continue to evaluate how to further optimize our service to prevent situations like this.

Posted May 03, 2021 - 11:52 PDT

Resolved
This incident has been resolved.
Posted Apr 26, 2021 - 14:42 PDT
Monitoring
A fix has been implemented and error rates have returned to normal levels.
Posted Apr 26, 2021 - 14:33 PDT
Update
Fixes are continuing for this incident and error rates are continuing to recover.
Posted Apr 26, 2021 - 14:30 PDT
Identified
The issue has been identified and a fix is being implemented.
Posted Apr 26, 2021 - 14:26 PDT
Investigating
We are currently investigating elevated render error rates for uncached derivative images. We will post an update once we obtain more information.

Previously cached derivatives are not impacted.
Posted Apr 26, 2021 - 14:16 PDT
This incident affected: Rendering Infrastructure.