Date

Authors

Marcello Fabbri (Unlicensed) Joseph (Pepe) Kelly

Status

Investigating

Summary

Placements added by ESR late on 30/07/2021 for 04/08/2021 starters remained in TO_EXPORT status instead of being exported.

Impact

Some records might not be exported back to ESR in time, so some hiring will have to be done manually instead.

Non-technical Description

AppRecords meant to be exported on 31/07/2021 haven’t been exported back to ESR.


Trigger

Placements added on 30/07/2021 did not go through the ESR interface as expected on 31/07/2021.


Detection

MS Teams: message on the ESR Support channel.

Resolution

  • GeneratedAppRecords with a start date after 03/08/2021 were identified, and an appfilegenerationcommand message was manually placed on the queue leading to the EsrDataExporter for each deanery that appeared to have non-exported GeneratedAppRecords.
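The selection step in the resolution above can be sketched as follows. This is only an illustration: the `AppRecord` class, its field names and the `deaneriesNeedingExport` helper are hypothetical stand-ins, not the actual GeneratedAppRecord model or the query that was run.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class StuckRecordFinder {

    // Hypothetical, simplified view of a GeneratedAppRecord; the real
    // document has more fields and may name these differently.
    static final class AppRecord {
        final String deanery;
        final LocalDate startDate;
        final String status;

        AppRecord(String deanery, LocalDate startDate, String status) {
            this.deanery = deanery;
            this.startDate = startDate;
            this.status = status;
        }
    }

    // Returns, in alphabetical order, the deaneries that still have
    // TO_EXPORT records with a start date after the cutoff.
    static Set<String> deaneriesNeedingExport(List<AppRecord> records, LocalDate cutoff) {
        Set<String> deaneries = new TreeSet<>();
        for (AppRecord record : records) {
            if ("TO_EXPORT".equals(record.status) && record.startDate.isAfter(cutoff)) {
                deaneries.add(record.deanery);
            }
        }
        return deaneries;
    }
}
```

Each deanery returned by such a check would then receive one manually sent command message.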

Timeline

  • : Some Placements were added in the late afternoon for August 4th starters.

  • : GeneratedAppRecords related to those placements were not exported as expected.

  • : 12:10 BST: MS Teams message reporting that these GeneratedAppRecords were still marked as TO_EXPORT.

  • : 15:40 BST: The “command” messages to export GeneratedAppRecords for MER were re-sent on the appropriate queue in the ESR interface to retrigger their export.

  • : 16:53 BST: The “command” messages to export GeneratedAppRecords for EOE, LDN, OXF and WES were re-sent on the appropriate queue in the ESR interface to retrigger their export.

  • : 17:06 BST: Users were notified via MS Teams that the export of these files had been completed.

  • – meanwhile other GeneratedAppRecords qualifying for file generation were not being picked up –

  • : 16:45 BST: Manual trigger of the generation of files for the SEV deanery was successful.

  • : 09:33 BST: Manual trigger of the generation of files for the rest of the deaneries was successful.

  • : Retry policy for messages re-enabled.

  • : Generation of export files (both app and notification) dated 01/July/2021 was manually triggered, in an attempt to export this data to ESR.

Root Cause(s)

Still investigating why those placements have not been picked up on 31/07/2021.

The issue probably affected all deaneries and had been ongoing for days: several GeneratedAppRecords were stuck in TO_EXPORT status and were not being picked up for the generation of exportable files, even though they matched the criteria for export.

Every day at 13:00 UTC, the cron job sends the “command” (a RabbitMQ message on esr.queue.appfilegenerationcommand.create.dataexporter) to generate the exportable app files. Processing of that command fails because of a MongoDB WriteConflict (com.mongodb.MongoCommandException: Command failed with error 112 (WriteConflict)).

There’s no conflict when a command is sent manually, so the generation of files that qualified for export has been triggered manually. The likely reason is that sending “commands” manually involves a developer placing the appropriate “command” messages on the queue one by one, whereas the cron job sends them all at the same time, so they end up conflicting. In particular, the conflicts occurred when processing a message involved updating the Counter collection in the EsrDataExportService database. The conflicts were normally resolved by allowing the messages to be replayed; however, a change in the retry policy no longer allowed this, so the conflicts remained unresolved.

The retry policy that was in place was probably lost during the migration to AWS. Now that it has been re-enabled, the ESR interface operates as it used to.
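To illustrate why replaying a message resolves a transient write conflict, here is a minimal in-process sketch of retry with backoff. It is not the ESR interface's actual mechanism, which relies on the message retry policy of the queue. `WriteConflictRetry`, `withRetry` and `WriteConflictException` are hypothetical names. The exception stands in for a `com.mongodb.MongoCommandException` with error code 112 so that the sketch compiles without the MongoDB driver.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

class WriteConflictRetry {

    // Hypothetical stand-in for the driver's MongoCommandException
    // carrying error code 112 (WriteConflict).
    static final class WriteConflictException extends RuntimeException {
        WriteConflictException(String message) {
            super(message);
        }
    }

    // Runs the action and replays it after a write conflict, up to
    // maxAttempts times in total. Any other exception is not retried.
    static <T> T withRetry(Supplier<T> action, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1 || baseDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts must be >= 1 and baseDelayMillis >= 0");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (WriteConflictException e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted: surface the conflict
                }
                // Exponential backoff plus random jitter, so that commands
                // sent at the same moment do not collide again in lockstep.
                long backoff = baseDelayMillis * (1L << (attempt - 1));
                long jitter = ThreadLocalRandom.current().nextLong(baseDelayMillis + 1);
                sleep(backoff + jitter);
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }
}
```

With retries disabled (the equivalent of `maxAttempts = 1` here), the first conflict is final. That matches the behaviour seen after the retry policy was lost.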

Lessons Learned

Check that policies (such as message retry policies) are retained when migrating services to a different platform.
