We also maintain a list of Known Product Issues separate from this site here.
We recently addressed issues affecting uploads, downloads, and Box Notes. We would like to take the opportunity to further explain these issues and the steps we have taken to keep them from happening in the future.
On January 8, 2021, between 12:52 PM PST and 1:34 PM PST, some users may have experienced difficulties while working in Box. During this time, users may have seen partial failures in Uploads, Downloads, Box Notes, Preview and Windows Conversion. The issue was caused by maintenance activities related to an internal process for splitting database shards. We resolved the issue by reverting a problematic change from the shard split process to a previously known good configuration. The rollback took longer than anticipated because another change had been made after the problematic shard-split change, which resulted in the revert steps not working as expected.
Analysis
As part of scaling our database systems, we executed an internal process to split database shards. After beginning this process, we identified that systems that are part of the shard split process weren't meeting their performance SLOs, and we decided to roll back the change in order to correct the performance issue. While executing the rollback procedure, we encountered unexpected issues in two places:
1) Our configuration tool for database clients produced an invalid configuration
2) Our database application service would return errors for a percentage of requests when trying to resolve the invalid configuration
Additionally, we identified a situation where our rollback procedure may fail unexpectedly if another change to database client configuration occurs before a rollback is executed. Our team has identified improvements based on the above analysis in order to prevent a recurrence of these failures in the future.
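The rollback failure mode described above can be illustrated with a minimal sketch. This is not Box's actual tooling: the `ConfigChange` record, the `revert` function, and the use of `None` to mean "key was absent" are all hypothetical, chosen only to show how a revert can detect that a later change touched the same settings and stop with an explicit conflict instead of producing an unexpected configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigChange:
    """One recorded change to a client configuration (hypothetical model)."""
    change_id: int
    before: dict  # values of the touched keys before the change (None = key absent)
    after: dict   # values of the touched keys after the change


class RevertConflict(Exception):
    """A later change touched the same keys as the change being reverted."""


def revert(config: dict, history: list, change_id: int) -> dict:
    """Return a copy of `config` with `change_id` undone, or raise RevertConflict."""
    index = next(
        (i for i, c in enumerate(history) if c.change_id == change_id), None
    )
    if index is None:
        raise ValueError(f"change {change_id} is not in the history")
    target = history[index]

    # Refuse to revert if any later change modified the same keys.
    for later in history[index + 1:]:
        overlap = set(target.after) & set(later.after)
        if overlap:
            raise RevertConflict(
                f"change {later.change_id} also modified {sorted(overlap)}"
            )

    # Restore the earlier values; keys touched only by later changes are kept.
    reverted = dict(config)
    for key, old_value in target.before.items():
        if old_value is None:
            reverted.pop(key, None)
        else:
            reverted[key] = old_value
    return reverted
```

In this sketch, a later change to an unrelated key does not block the revert, while a later change to the same key raises `RevertConflict` so that an operator can decide how to proceed.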
Corrective Actions
The following corrective actions have been completed or are planned:
Harden the database configuration tooling so it cannot generate an invalid client configuration - Planned
Harden the database application service to alert if it receives an invalid client configuration and fail gracefully - Planned
Update our shard split procedures to review the alerts from our database application service as part of deployment and rollback - Planned
Harden the database configuration tooling to resolve conflicts when reverting a client configuration change - Planned
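As an illustration of what validating a client configuration and failing gracefully can look like, here is a minimal sketch. It does not describe Box's actual systems: the shard map format (a list of `(start, end, host)` ranges with an exclusive end), the `KEYSPACE_SIZE` constant, and the function names are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("db_client_config")

KEYSPACE_SIZE = 1024  # hypothetical size of the key space covered by the shards


def validate_shard_map(shard_map):
    """Return a list of problems found; an empty list means the map is valid.

    `shard_map` is a list of (start, end, host) tuples, with `end` exclusive.
    """
    if not shard_map:
        return ["shard map is empty"]

    problems = []
    expected_start = 0
    for start, end, host in sorted(shard_map, key=lambda r: (r[0], r[1])):
        if not host:
            problems.append(f"range [{start}, {end}) has no host")
        if start >= end:
            problems.append(f"range [{start}, {end}) is empty or inverted")
        if start > expected_start:
            problems.append(f"gap: [{expected_start}, {start}) is unassigned")
        elif start < expected_start:
            problems.append(f"overlap: range starting at {start} is already covered")
        expected_start = max(expected_start, end)

    if expected_start < KEYSPACE_SIZE:
        problems.append(f"gap: [{expected_start}, {KEYSPACE_SIZE}) is unassigned")
    elif expected_start > KEYSPACE_SIZE:
        problems.append(f"ranges extend past the key space ({KEYSPACE_SIZE})")
    return problems


def apply_config(candidate, last_known_good):
    """Accept `candidate` if valid; otherwise log an alert and keep the old map."""
    problems = validate_shard_map(candidate)
    if problems:
        logger.error(
            "rejected invalid client configuration: %s", "; ".join(problems)
        )
        return last_known_good
    return candidate
```

The design choice in this sketch is that an invalid configuration is never applied: the service keeps serving from the last known good map and emits a log entry that alerting can pick up, rather than returning errors for requests that hit the broken ranges.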
We are continuously working to improve Box and want to make sure we are delivering the best product and user experience we can. We hope we have provided some clarity here and we would be happy to answer any questions you may still have regarding this matter.
Sincerely,
The Box Team