Flexera One - APAC/NAM/EU - Customers may be unable to log on
Incident Report for Flexera System Status Dashboard
Postmortem

Description: Flexera One UI – All Regions – Intermittent Login Disruptions

Timeframe: March 6th, 11:07 AM to March 6th, 7:27 PM PST

Incident Summary

On Wednesday, March 6th, 2024, at 5:17 PM PST, customers began reporting intermittent login failures and unexpected logouts while navigating the Flexera One UI. The problem was sporadic and affected customers across all regions.

Our technical teams began investigating immediately. After reviewing the logs, they pinpointed the problem at 6:27 PM PST and linked it to a recent scheduled change.

The issue stemmed from a change deployed earlier that day, at 11:07 AM PST. The change was intended to fix a previous bug in which users were logged out after short periods of inactivity, and it added a mechanism to verify token validity before processing requests. However, this mechanism inadvertently introduced a new bug affecting users with multiple browser tabs open.

Specifically, when one tab held a token that had become inactive after prolonged inactivity, while another tab held an active token from a new session started after that period of inactivity, a race condition occurred, resulting in simultaneous logouts and logins.
The issue went undetected during both manual and automated testing because the scenario could not be effectively reproduced in a test environment.

To resolve the issue, the token validity check was removed at 7:27 PM PST, allowing users to navigate the Flexera One UI without encountering unexpected logouts.

After conducting successful health checks and obtaining customer validation, the incident was officially declared resolved.

Root Cause

The issue stemmed from a change intended to fix a previous bug in which users were logged out after short periods of inactivity. The change added a mechanism to verify token validity before processing requests. However, this mechanism inadvertently introduced a new bug affecting users with multiple browser tabs open.

Specifically, when one tab held a token that had become inactive after prolonged inactivity, while another tab held an active token from a new session started after that period of inactivity, a race condition occurred, resulting in simultaneous logouts and logins.
The issue went undetected during both manual and automated testing because the scenario could not be effectively reproduced in a test environment.
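
The multi-tab scenario can be illustrated with a small, deterministic model. The sketch below is not Flexera's implementation: the SharedSession and Tab classes, the token names, and the resync_first option are all hypothetical. It assumes that all tabs share one browser-wide session while each tab caches its own copy of the token. Under those assumptions, it shows how a pre-request validity check run against a stale cached token could end a session that another tab had just renewed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SharedSession:
    """Browser-wide session state shared by every tab (hypothetical model)."""
    active_token: Optional[str] = None
    events: List[str] = field(default_factory=list)

    def login(self, token: str) -> None:
        self.active_token = token
        self.events.append(f"login:{token}")

    def logout(self, triggered_by: str) -> None:
        self.active_token = None
        self.events.append(f"logout:{triggered_by}")


@dataclass
class Tab:
    """One browser tab holding its own cached copy of the token."""
    name: str
    session: SharedSession
    cached_token: Optional[str] = None

    def request(self, valid_tokens: Set[str], resync_first: bool = False) -> bool:
        if resync_first:
            # Assumed mitigation: adopt the shared token before judging validity.
            self.cached_token = self.session.active_token
        # Pre-request validity check, run against the tab's cached token.
        if self.cached_token not in valid_tokens:
            # Logging out clears the session for every tab, not just this one.
            self.session.logout(self.name)
            return False
        return True


def simulate(resync_first: bool = False) -> SharedSession:
    session = SharedSession()
    valid_tokens: Set[str] = set()
    tab_a = Tab("tab-a", session)
    tab_b = Tab("tab-b", session)

    # Both tabs start out on the same session token.
    session.login("t1")
    valid_tokens.add("t1")
    tab_a.cached_token = tab_b.cached_token = "t1"

    # Prolonged inactivity: the server side expires t1.
    valid_tokens.discard("t1")

    # The user returns in tab-b and starts a new session.
    session.login("t2")
    valid_tokens.add("t2")
    tab_b.cached_token = "t2"

    # tab-a now sends a request while still caching the expired t1.
    tab_a.request(valid_tokens, resync_first=resync_first)
    return session


if __name__ == "__main__":
    print(simulate(resync_first=False).events)
    print(simulate(resync_first=True).events)
```

With resync_first=False, the model ends with no active token even though tab-b had just logged in. With resync_first=True (one conceivable mitigation, not necessarily the one that will be adopted), the renewed session survives.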

Remediation Actions

  1. Identified root cause: Investigated the issue and traced the problem to the recent scheduled change.
  2. Validated the bug: Confirmed the presence of the bug through analysis of customer reports and system logs, establishing the issue's scope and impact.
  3. Determined how the bug was introduced: Recognized that the recent change inadvertently introduced a race condition affecting users with multiple browser tabs.
  4. Implemented the fix: Deployed a fix that removed the token validity check, enabling users to navigate the Flexera One UI without encountering unexpected logouts.

Future Preventative Measures

  1. Update Original Bug Fix: Rework the original bug fix so that it addresses the root cause directly, with additional testing as needed to confirm a complete resolution.
  2. Improved Testing: Enhance testing procedures to include scenarios involving multiple browser tabs and varied token states to identify similar issues during the testing phase.
  3. Comprehensive Monitoring: Implement monitoring that detects unusual behavior patterns, such as simultaneous logouts and logins, in real time, allowing for prompt intervention.
  4. Thorough Code Reviews: Conduct thorough code reviews to identify potential bugs or unintended consequences introduced by system changes, particularly those related to authentication mechanisms.
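
As a rough illustration of the monitoring measure above, the following sketch flags users whose sessions repeatedly flip from logout to login within a short window. It is a minimal example under stated assumptions, not a description of Flexera's monitoring: the event tuple layout, the find_session_churn function, and the default window and threshold values are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

# Assumed event shape: (timestamp, user_id, kind), where kind is "login" or "logout".
Event = Tuple[datetime, str, str]


def find_session_churn(
    events: Iterable[Event],
    window: timedelta = timedelta(seconds=10),
    threshold: int = 3,
) -> Dict[str, int]:
    """Return users with at least `threshold` logout-then-login flips,
    where each login follows the user's previous logout within `window`."""
    last_logout: Dict[str, datetime] = {}
    flips: Dict[str, int] = defaultdict(int)

    # Process events in time order; the sort is stable for equal timestamps.
    for ts, user, kind in sorted(events, key=lambda e: e[0]):
        if kind == "logout":
            last_logout[user] = ts
        elif kind == "login":
            previous = last_logout.pop(user, None)
            if previous is not None and ts - previous <= window:
                flips[user] += 1

    return {user: count for user, count in flips.items() if count >= threshold}
```

A real deployment would more likely run this kind of check continuously over an authentication event stream and raise an alert when the number of flagged users climbs; the batch function here only shows the detection rule itself.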
Posted Mar 26, 2024 - 17:35 PDT

Resolved
This incident has been resolved.
Posted Mar 06, 2024 - 21:13 PST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Mar 06, 2024 - 19:33 PST
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 06, 2024 - 18:52 PST
Investigating
Incident Description:
Customers attempting to log on to Flexera One in any region may find that the logon process gets stuck in a loop.

Priority: 2

Restoration activity:
Technical teams have been engaged and are currently investigating.
Posted Mar 06, 2024 - 18:21 PST
This incident affected: Flexera One - IT Asset Management - North America (IT Asset Management - US Login Page), Flexera One - IT Asset Management - Europe (IT Asset Management - EU Login Page), and Flexera One - IT Asset Management - APAC (IT Asset Management - APAC Login Page).