There was a change to the boolean logic for detecting empty Postgres LSNs which moved the code path into the wrong path. This reset Postgres LSNs for all connectors that synced between the daily deployment (~9:30 am PDT) and when we reverted that change (~11:30 am PDT).
11/17/2020 ~9:30 am - The first issues were popping up after the day's deployment
11/17/2020 ~11:30 am - The problem PR was reverted and the deployment was reverted to 11/16's build
The change was reverted which fixed the immediate failures and prevented any further incorrect state changes.
XMIN Postgres connectors were fixed after the change was reverted as their states were unaffected.
The affected WAL Postgres connectors were identified and we’ve emailed the affected customers with the next recovery options.
Improved release management monitoring process. We are scoping out additional monitoring to check for connector failure spikes after a code release.