On April 21 at 10am ET, a misconfiguration caused an automated provisioning process to create a very large number of new Answers Experiences. At 11:20am ET, the rapid increase overwhelmed the available resources, halting data updates and the creation of other new Answers Experiences. We identified the misconfiguration and implemented mitigation steps to clean up the in-progress provisioning. We completed that work and verified that all delayed data updates had gone through at 1:47pm ET. No data updates were lost.
When provisioning Answers for a new set of customers managed by a partner, we accidentally provisioned Answers for all of the partner's accounts, which number in the tens of thousands. We did not have enough capacity to absorb such a large increase in Answers search indexes, and after some time the system exhausted its resources and stopped processing all subsequent requests.
In response, we plan to create a separate queue for the work required to provision new Answers Experiences, so that updates to existing Answers Experiences are unaffected if a similar scenario occurs in the future. Additionally, we are adding an alert that fires when the Answers index limit is being approached. Finally, we are evaluating different designs for Answers search indexes that would avoid this resource limit entirely.
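As a rough illustration of the first two remediation items, here is a minimal Python sketch of keeping provisioning work in its own queue and raising an alert as the index count nears a limit. It is not the actual implementation: the class name `Dispatcher`, its methods, the limit of 1000 indexes, and the 80% alert threshold are all hypothetical, since this report does not state the real values or design.

```python
from collections import deque

# Hypothetical values: the real index limit and alert threshold are not given in this report.
INDEX_LIMIT = 1000
ALERT_RATIO = 0.8


class Dispatcher:
    """Toy model: provisioning and data updates live in separate queues."""

    def __init__(self, index_limit=INDEX_LIMIT, alert_ratio=ALERT_RATIO):
        self.update_queue = deque()      # updates to existing Answers Experiences
        self.provision_queue = deque()   # creation of new Answers Experiences
        self.index_count = 0             # search indexes created so far
        self.index_limit = index_limit
        self.alert_ratio = alert_ratio
        self.alerts = []

    def submit_update(self, experience_id):
        self.update_queue.append(experience_id)

    def submit_provision(self, experience_id):
        self.provision_queue.append(experience_id)

    def process_updates(self):
        """Drain the update queue; a provisioning backlog cannot block it."""
        processed = list(self.update_queue)
        self.update_queue.clear()
        return processed

    def process_provisioning(self, max_items):
        """Create up to max_items indexes, stopping at the index limit."""
        created = []
        while self.provision_queue and len(created) < max_items:
            if self.index_count >= self.index_limit:
                break  # leave the remaining work queued instead of exhausting resources
            created.append(self.provision_queue.popleft())
            self.index_count += 1
            # Alert once when the count crosses the threshold.
            if not self.alerts and self.index_count >= self.alert_ratio * self.index_limit:
                self.alerts.append(
                    f"index count {self.index_count} is nearing limit {self.index_limit}"
                )
        return created
```

In this sketch, a flood of provisioning requests fills only `provision_queue`, so `process_updates` continues to drain data updates regardless of how much provisioning work is pending.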