2021-03-16 ESR APP generation and TIS sync job failures
Date | Mar 16, 2021 |
Authors | @Andy Dingley |
Status | Done |
Summary | |
Impact |
|
Non-technical Description
New functionality was released to allow Permit to Work to use values from the reference tables, instead of being a fixed set of outdated data. Some of our other applications/services, namely ESR and our overnight sync jobs, could not handle that change and began failing. Those projects were updated to allow them to correctly process the new Permit to Work data and we manually made “no change updates” to TIS records to generate missing applicants.
Trigger
TCS deployed with changes to the PermitToWork field, changing it from an enumeration to a string.
Detection
ESR failures detected by Sentry and notified on Slack
TIS-SYNC failures notified on Slack during overnight sync jobs
Resolution
tcs-client and tcs-persistence versions updated in ESR and TIS-SYNC projects.
Timeline
Mar 16, 2021 - 14:30 - TCS service deployed to production
Mar 16, 2021 - 14:42 - Error in ESR APP generation notified in Slack
Mar 16, 2021 - 14:44 - Issue picked up by devs
Mar 16, 2021 - 16:16 - Fix deployed for EsrAppRecordGeneratorService
Mar 16, 2021 - 16:32 - Fix deployed for EsrNotificationGeneratorService
Mar 16, 2021 - 17:39 - Fix deployed for EsrInboundDataWriterService
Mar 16, 2021 - 18:48 - Fix deployed for TIS-EsrReconciliationService
Mar 17, 2021 - 00:09 - TIS overnight sync jobs failed
Mar 17, 2021 - 02:30 - Issue picked up by devs
Mar 17, 2021 - 03:55 - Fix deployed to stage for TIS-SYNC - decision made not to merge to prod at this time due to auto-start logic before 05:00
Mar 17, 2021 - 05:20 - TIS-SYNC fix deployed to production environment and sync jobs triggered - NIMDTA jobs didn’t fully trigger, possibly due to permissions
Mar 17, 2021 - 10:45 - NIMDTA jobs re-ran and completed successfully
Mar 19, 2021 - 18:20 - Retriggered applicant generation based on trainees reported in Sentry errors
Root Cause(s)
TCS deployed with changes to the PermitToWork field, changing it from an enumeration to a string.
The (de)serialization of the RightToWork object in ESR and TIS-SYNC projects began to fail.
Outdated versions of tcs-client and tcs-persistence were used, so those projects still tried to treat PermitToWork as an enumeration.
Updating dependencies in those projects was missed, as a search for usages of PermitToWorkType did not find them.
Those projects do not directly use PermitToWork, but do (de)serialize the Person object, which has it nested.
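The failure mode described above can be illustrated with a minimal sketch. It uses only the Java standard library and is not the actual TCS, ESR or TIS-SYNC code: the enum constants, the class name PermitToWorkDemo and its helper methods are hypothetical, chosen only to show why a consumer that still maps the field onto a fixed enumeration breaks as soon as the producer starts sending free-form strings.

```java
// Hypothetical stand-in for the enumeration bundled in an outdated client library.
// The constants are illustrative, not the real PermitToWork values.
enum PermitToWorkType {
    INDEFINITE_LEAVE,
    TIER_2_VISA
}

class PermitToWorkDemo {

    // Old consumer behaviour (assumed): the incoming value must match an enum constant.
    static PermitToWorkType parseAsEnum(String raw) {
        // Enum.valueOf throws IllegalArgumentException when no constant has this name.
        return PermitToWorkType.valueOf(raw);
    }

    // Updated consumer behaviour (assumed): the value is kept as a plain string.
    static String parseAsString(String raw) {
        return raw;
    }

    // Reports whether a consumer using the old enum mapping could read the value.
    static boolean oldClientCanRead(String raw) {
        try {
            parseAsEnum(raw);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A value that existed in the old fixed set is still readable by an old consumer.
        System.out.println(oldClientCanRead("TIER_2_VISA"));
        // A value that only exists in the reference tables is not.
        System.out.println(oldClientCanRead("NEW_REFERENCE_VALUE"));
        // A consumer that treats the field as a string accepts either.
        System.out.println(parseAsString("NEW_REFERENCE_VALUE"));
    }
}
```

Because the enum mapping in this sketch sits inside a helper rather than in the calling code, a search for the enum type in the consuming project would not necessarily surface it, which is similar in spirit to the nested Person object case described above.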
Action Items
Action Items | Owner |
---|---|
Ensure all devs have adequate permissions to run sync jobs | |
Lessons Learned
Need to be more wary of breaking changes and the effect on services calling the affected API.
Devs not having the correct permissions may have slowed down the full resolution.
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213