Ambra Incident

Incident Report for Ambra status

Postmortem

We have identified an issue in our real-time transcoding and rendering service: the autoscaling logic did not work correctly, which led to memory shortages. When the system tried to scale up to handle a higher workload, the newly created instances overused resources and failed. To address this, we have refined our autoscaling process so that new instances use the available memory more efficiently. This allows the system to scale horizontally and maintain performance without overburdening the infrastructure.
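As an illustration only, the kind of memory-aware scale-up check described above could look like the following sketch. It is not Ambra's actual autoscaling code: the function name `plan_scale_up`, its parameters, and the 20% headroom default are all hypothetical, chosen to show how a scale-up request can be capped by the memory that is actually free.

```python
import math


def plan_scale_up(pending_jobs, jobs_per_instance, free_memory_mb,
                  memory_per_instance_mb, headroom_pct=20):
    """Return how many new instances to start (hypothetical sketch).

    The count is the smaller of what the workload asks for and what the
    free memory can hold, after reserving headroom_pct percent as a buffer.
    """
    if jobs_per_instance <= 0 or memory_per_instance_mb <= 0:
        raise ValueError("per-instance capacity and memory must be positive")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in the range 0..99")

    # Instances the workload would need if memory were unlimited.
    wanted = math.ceil(pending_jobs / jobs_per_instance)

    # Memory that may be handed to new instances once the buffer is reserved.
    usable_mb = free_memory_mb * (100 - headroom_pct) // 100

    # Instances that fit into the usable memory.
    affordable = usable_mb // memory_per_instance_mb

    return max(0, min(wanted, affordable))
```

With these assumed numbers, a request that needs five instances but has room for only three is capped at three, so new instances are not started into memory they cannot get.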

Posted May 08, 2024 - 11:27 EDT

Resolved

The incident has been fully resolved by adding transcoding nodes, and service is back to normal levels. Our team will conduct a root cause analysis and share the findings as soon as possible. We will continue to monitor the situation to ensure there are no further issues.

Posted Apr 18, 2024 - 17:38 EDT

Investigating

We have received reports of intermittent viewing issues on the Ambra platform. Engineering teams are currently investigating. Additional information will be provided as soon as it is available.

Posted Apr 18, 2024 - 17:05 EDT

This incident affected: Web Services, Image Processing, and Image Viewing.