We also maintain a list of Known Product Issues separate from this site here.
We recently addressed issues affecting the availability of Box Drive and the All Files page. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.
Between the afternoon of April 27, 2021 and the evening of April 28, 2021, some users may have experienced difficulties while working in Box. During this time, Box Sync and Box Drive requests would have failed intermittently. Additionally, on April 27, 2021, between 2:26 PM PDT and 2:31 PM PDT, the Box All Files page would have sporadically failed to load for some users.
Analysis
The issue resulted from a recent code change made as part of our ongoing effort to improve the performance and stability of our database and caching systems. Central to this incident was the persistent corruption of two hot Memcache key-value pairs. A bug kept writing the corrupted value back into Memcache, which defeated manual attempts to fix the corruption. These bad cache values were treated as cache misses, so each affected request had to fetch the uncached record from the database, which increased load on the DB. The increased load pushed the DB into a degraded state, visible as increased database read latency and high CPU utilization.
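The failure mode described above can be illustrated with a minimal sketch. It is not Box's code: the dictionary standing in for Memcache, the `CountingDB` class, the `is_valid` check, and the `write_back` hook are all hypothetical names chosen for illustration. It only shows how a bug that keeps writing a bad value back into the cache turns every read of a hot key into a database read.

```python
class CountingDB:
    """Stand-in for the database; counts how many reads reach it."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.rows[key]


def is_valid(value):
    # Hypothetical validity check: a well-formed record is a dict with an "id".
    return isinstance(value, dict) and "id" in value


def read_through(cache, db, key, write_back):
    """Cache-aside read: a missing or malformed cache entry counts as a miss."""
    cached = cache.get(key)
    if is_valid(cached):
        return cached
    record = db.fetch(key)
    # write_back models the code path that repopulates the cache.
    cache[key] = write_back(record)
    return record


def simulate(write_back, requests=1000):
    """Issue repeated reads for one hot key and return the DB read count."""
    db = CountingDB({"hot": {"id": 1}})
    cache = {}
    for _ in range(requests):
        read_through(cache, db, "hot", write_back)
    return db.reads


healthy_reads = simulate(lambda record: record)       # cache stores the record
buggy_reads = simulate(lambda record: "corrupted")    # bug stores a bad value
print(healthy_reads, buggy_reads)
```

In this sketch the healthy path reaches the database once and serves the rest from cache, while the buggy path reaches the database on every request, because each repopulation stores a value that the next read rejects.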
We shed load on the DB by temporarily hardwiring DB responses, which helped mitigate the impact to the site. We were then able to resolve the issue permanently by pushing a bug fix to our production codebase. Once the fix was deployed, we reverted the temporary hardwiring of the DB responses.
Corrective Actions
The following corrective actions have been completed or are planned:
Add robust logging for unexpected values returned by the data access layer
Add more metrics to alert internal teams quickly when there is an unexpected increase in bad values returned from the data access layer
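As a rough illustration of what these two actions could look like, the sketch below logs each unexpected value and tracks the share of bad values over a sliding window of recent reads. The `BadValueMonitor` class, its window size, and its threshold are assumptions made for this example and do not describe Box's actual logging or alerting systems.

```python
import logging
from collections import deque

logger = logging.getLogger("data_access")


class BadValueMonitor:
    """Tracks the ratio of bad values among the most recent reads."""

    def __init__(self, window=100, threshold=0.05):
        self.results = deque(maxlen=window)  # True = good read, False = bad
        self.threshold = threshold

    def record(self, key, value, ok):
        if not ok:
            # Log enough context to identify the affected key and value.
            logger.warning(
                "unexpected value from data access layer: key=%s value=%r",
                key, value,
            )
        self.results.append(ok)

    def bad_ratio(self):
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_alert(self):
        # Alert when the share of bad values in the window exceeds the threshold.
        return self.bad_ratio() > self.threshold


monitor = BadValueMonitor(window=10, threshold=0.2)
for _ in range(10):
    monitor.record("hot", {"id": 1}, ok=True)
alert_before = monitor.should_alert()

for _ in range(5):
    monitor.record("hot", "corrupted", ok=False)
alert_after = monitor.should_alert()
```

A ratio over a sliding window is used here, rather than a raw count, so that the signal reflects a sudden increase in bad values instead of the slow accumulation of occasional ones.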
We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter.
Sincerely,
The Box Team