Starting on March 12, 2024 our customers may have experienced several instances of outages for their Talemetry Career Sites. Customers would have been impacted briefly on the following dates for periods between 20-50 minutes per outage on March 12, March 18, April 17, April 18, April 25, April 30 and May 2.
Our engineers were engaged for investigation upon being alerted of the incidents. Through extensive investigation, our findings pointed to unoptimized procedures in our infrastructure including rise in bot traffic, inefficient queries, load balance, and some performance limitations in the search infrastructure and toolset version. This resulted in our team taking the site down to resolve each instance of the occurrence during this time frame to expedite the short-term resolution while our teams in parallel continuously worked towards long term optimizations.
To mitigate this situation from occurring in the future we deployed various optimizations including:
The above work was concluded on May 2, 2024.
We are also in the process of evaluating bot mitigation technologies to further stabilize and enhance our performance.