11 Dec 2023 14:45 Sync process accidentally triggered
11 Dec 2023 14:45 Impact on production environment noticed by developer
11 Dec 2023 16:41 Failure in last stage of process noticed
11 Dec 2023 16:50 Cause of failure identified
11 Dec 2023 16:57 Failure rectified, process resumed
11 Dec 2023 20:33 Process completed, system restored
11 Dec 2023 23:18 Recommendations bug discovered
12 Dec 2023 07:15 Users notified on Teams of Recommendation issues
12 Dec 2023 08:00 Patch to Recommendation UI deployed successfully, issue resolved
12 Dec 2023 ~08:00 Users notified of resolution

Root Cause(s)

Why was an unscheduled full production data-resynchronisation triggered?

Why was the sync not terminated?

...

Action Items

Action Items	Owner
Determine why “traineeInfo” → “recommendationInfo” broke, and revert any workarounds added in the front end to compensate
Add a safeguard against “accidental” production triggers (specific mitigation still to be decided)
Build automated backups (or similar) into the sync process so that it can be aborted and the data restored as required
Introduce batch messaging to speed up the biggest bottleneck; judging by the work on the overnight doctor sync, this could reduce the whole process to a couple of hours
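
As a starting point for the batch messaging item, the idea can be sketched as below. This is an illustration only, not the sync's actual code: `send_batch`, the batch size of 100 and the shape of the records are all hypothetical placeholders for whatever the real messaging client and payloads look like.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to batch_size records, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Final, possibly smaller, batch.
        yield batch


def send_in_batches(
    records: Iterable[T],
    send_batch: Callable[[List[T]], None],  # hypothetical: publishes one message holding many records
    batch_size: int = 100,                  # assumed value, to be tuned against the real queue limits
) -> int:
    """Send records in batches and return the number of send calls made."""
    calls = 0
    for batch in batched(records, batch_size):
        send_batch(batch)
        calls += 1
    return calls
```

With one message per batch rather than one per record, the number of send calls drops roughly by a factor of the batch size. Whether that yields the hoped-for speed-up depends on where the time is actually spent in the real sync.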

...