2021-08-02 Placements not being exported to ESR
Date | Aug 2, 2021 |
Authors | @Marcello Fabbri (Unlicensed) @Joseph (Pepe) Kelly |
Status | Resolved |
Summary | Placements added by ESR late on 30/07/2021 for 04/08/2021 starters had a |
Impact | Some records might not be exported back to ESR in time, meaning that some hiring will have to be carried out manually instead. |
Non-technical Description
AppRecords meant to be exported on 31/07/2021 haven’t been exported back to ESR.
Trigger
Placements added on 30/07/2021 did not go through the ESR interface as expected on 31/07/2021.
Detection
MS Teams: message on the ESR Support channel.
Resolution
GeneratedAppRecords with a start date after 03/08/2021 were identified. For each deanery that appeared to have non-exported GeneratedAppRecords, an appfilegenerationcommand file was manually placed on the queue leading to the EsrDataExporter.
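The selection step described above can be sketched as follows. This is a minimal illustration only: the `GeneratedAppRecord` fields (`deanery`, `start_date`, `status`) and the function name are assumptions made for the example, not the actual data model or code of the ESR interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class GeneratedAppRecord:
    # Hypothetical shape; the real record has other fields and names.
    deanery: str
    start_date: date
    status: str


def deaneries_needing_export(records, cutoff):
    """Return the sorted deanery codes that still have unexported records.

    A record counts if it is still marked TO_EXPORT and starts after `cutoff`.
    One command would then be queued manually per returned deanery.
    """
    return sorted(
        {
            r.deanery
            for r in records
            if r.status == "TO_EXPORT" and r.start_date > cutoff
        }
    )
```

Queuing one command per deanery, one at a time, mirrors the manual approach taken during the incident.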
Timeline
Jul 30, 2021: Some placements were added in the late afternoon for August 4th starters.
Jul 31, 2021: GeneratedAppRecords related to those placements were not exported as expected.
Aug 2, 2021: 12:10 PM BST: MS Teams message reporting that these GeneratedAppRecords were still marked as TO_EXPORT.
Aug 2, 2021: 3:40 PM BST: Some “command” messages to export GeneratedAppRecords for MER were re-sent on the appropriate queue in the ESR interface to retrigger their export.
Aug 2, 2021: 4:53 PM BST: “Command” messages to export GeneratedAppRecords for EOE, LDN, OXF and WES were re-sent on the appropriate queue in the ESR interface to retrigger their export.
Aug 2, 2021: 5:06 PM BST: Users were notified via MS Teams that the export of these files had been completed.
– meanwhile other GeneratedAppRecords qualifying for file generation were not being picked up –
Aug 11, 2021: 4:45 PM BST: Manual trigger of the generation of files for the SEV deanery was successful.
Aug 12, 2021: 09:33 AM BST: Manual trigger of the generation of files for the rest of the deaneries was successful.
Aug 12, 2021: Retry policy for messages re-enabled
Aug 13, 2021: Generation of export files (both app and notification) dated 01/July/2021 has been manually triggered, in an attempt to export these data to ESR.
Root Cause(s)
Still investigating why those placements have not been picked up on 31/07/2021.
The issue probably affected all deaneries and had been ongoing for days: several GeneratedAppRecords were stuck in TO_EXPORT status and were not being picked up for the generation of exportable files, even though they matched the criteria for export.
When the cron job sends (every day at 13:00 UTC) the “command” (a RabbitMQ message on esr.queue.appfilegenerationcommand.create.dataexporter) to generate the exportable app files, it fails because of a MongoDB WriteConflict (com.mongodb.MongoCommandException: Command failed with error 112 (WriteConflict)).
There is no conflict when a command is sent manually, so the generation of files that qualified for export was triggered manually. The likely reason is that a developer sends the manual “command” messages one at a time, whereas the cron job sends them all at once, so they end up conflicting. In particular, the conflicts occurred when processing a message involved updating the Counter collection in the EsrDataExportService database. The conflicts were normally resolved by allowing the messages to be replayed; however, a change in the retry policy no longer allowed this, so the conflicts remained unresolved.
The retry policy that was in place was probably lost while migrating to AWS. Now that it has been re-enabled, the ESR interface operates as it used to.
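To illustrate why replaying a failed operation resolves a transient WriteConflict, here is a minimal in-process sketch. It is not how the ESR interface implements this: the actual fix was a retry policy on the message queue, and the real service raises com.mongodb.MongoCommandException. `CommandFailed`, `run_with_retry` and `FlakyCounter` are hypothetical names made up for this example; only the error code 112 comes from the report.

```python
import random
import time

WRITE_CONFLICT_CODE = 112  # MongoDB WriteConflict code quoted in the report


class CommandFailed(Exception):
    """Hypothetical stand-in for a driver exception carrying an error code."""

    def __init__(self, code, message=""):
        super().__init__(message or f"command failed with error {code}")
        self.code = code


def run_with_retry(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call `operation`, replaying it when it fails with a WriteConflict.

    Other errors, or a conflict on the final attempt, are re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except CommandFailed as exc:
            if exc.code != WRITE_CONFLICT_CODE or attempt == max_attempts:
                raise
            # Exponential backoff with jitter, so that concurrent writers
            # do not all retry at the same instant and collide again.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


class FlakyCounter:
    """Simulates a counter update that conflicts a fixed number of times."""

    def __init__(self, conflicts_before_success):
        self.remaining_conflicts = conflicts_before_success
        self.value = 0

    def increment(self):
        if self.remaining_conflicts > 0:
            self.remaining_conflicts -= 1
            raise CommandFailed(WRITE_CONFLICT_CODE, "WriteConflict")
        self.value += 1
        return self.value
```

With retries disabled (`max_attempts=1`), the first conflict propagates and the work is left undone, which corresponds to records staying in TO_EXPORT during the incident.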
Lessons Learned
Check that policies, such as message retry policies, are still in place after migrating services to a different platform.
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213