Find Incident - EU Region (EMEA13 & EMEA14)
Incident Report for Optimizely Service
Postmortem

SUMMARY

Optimizely Search & Navigation (formerly Find) is a cloud-based enterprise search solution that delivers enhanced relevance and powerful search functionality to websites. Between August 9, 2021 and August 12, 2021 we experienced events which impacted the functionality of the service in the EU Digital Experience Cloud region. Details of the incident are described below.

DETAILS

Between August 9, 2021 02:44 UTC and August 12, 2021 08:30 UTC, a limited subset of clients using Search & Navigation clusters EMEA13 and EMEA14 experienced intermittent service degradation.

The problem occurred during a capacity planning process, and the source of the malfunction was determined to be human error. As soon as the root cause was identified, appropriate actions were taken to mitigate the impact. The service was fully operational as of August 12, 2021 08:30 UTC.

TIMELINE

August 9, 2021

02:44 UTC – First alert triggered and troubleshooting started.

03:10 UTC – Issue was identified and workaround initiated. 

06:15 UTC – Functionality impacts were assessed, and mitigating plans were devised.

08:08 UTC – Status page updated.

08:10 UTC – Long-term mitigation plan started.

August 12, 2021

08:30 UTC – Mitigation actions completed. Service operational and incident closed.

ANALYSIS

The incident was caused by human error during a standard capacity operation, which resulted in data inconsistency on the platform.

IMPACT

During the events, a limited subset of clients may have experienced network timeouts or an inactive service error message.

CORRECTIVE MEASURES

Short-term mitigation

  • A fix was applied to return the data to its original state.

Long-term mitigation

  • The subset of clients affected by this issue was given instructions on how to mitigate the impact.
  • To improve our processes and prevent a recurrence, the team has reviewed the chain of errors and implemented checks to catch errors before they become incidents.

FINAL WORDS

We apologize for the impact to affected customers. We are strongly committed to delivering high availability for our Search & Navigation service. We will continue to prioritize our efforts to overcome these recent difficulties, and will do everything we can to learn from the event and avoid a recurrence.

Posted Sep 20, 2021 - 13:16 UTC

Resolved
This incident has been resolved.
We will continue our investigation to establish the root cause. A post mortem will be published as soon as it becomes available.
Posted Aug 12, 2021 - 08:50 UTC
Monitoring
A fix has been implemented and we are closely monitoring the results.
Posted Aug 11, 2021 - 13:15 UTC
Update
Mitigation efforts remain ongoing at this time. Thank you for your patience.
Posted Aug 11, 2021 - 10:05 UTC
Update
Mitigation efforts remain ongoing at this time. Thank you for your patience.
Posted Aug 10, 2021 - 20:22 UTC
Update
The mitigation efforts are in progress. We will keep posting the latest updates whenever they are available.
Thank you for your patience.
Posted Aug 10, 2021 - 13:52 UTC
Update
We are continuing our efforts to apply the mitigation. We will continue to provide updates as more information becomes available.
Posted Aug 10, 2021 - 08:54 UTC
Update
We are continuing to work on a fix for this issue and will continue to provide updates as information becomes available.
Posted Aug 09, 2021 - 19:56 UTC
Update
We are continuing our efforts to apply the mitigation. We will continue to provide updates as more information becomes available.
Posted Aug 09, 2021 - 12:47 UTC
Update
We have identified the cause of the issue and we are working on applying mitigation efforts.
We will provide more updates as soon as they become available.

Thank you for your patience.
Posted Aug 09, 2021 - 09:29 UTC
Identified
We are currently investigating an event that is impacting the Search & Navigation (FIND) service in the EU region (EMEA13 & EMEA14).
A subset of clients may be impacted by the service disruption.
Posted Aug 09, 2021 - 08:08 UTC