Date | |
Authors | |
Status | Done |
Summary | |
Impact | |
Non-technical Description
New functionality was released to allow Permit to Work to use values from the reference tables, instead of being a fixed set of outdated data. Some of our other applications/services, namely ESR and our overnight sync jobs, could not handle that change and began failing. Those projects were updated to allow them to correctly process the new Permit to Work data.
Trigger
TCS deployed with changes to the PermitToWork field, changing it from an enumeration to a string.
Detection
ESR failures detected by Sentry and notified on Slack
TIS-SYNC failures notified on Slack during overnight sync jobs
Resolution
The tcs-client and tcs-persistence versions were updated in the ESR and TIS-SYNC projects.
Timeline
- 14:30 - TCS service deployed to production
- 14:42 - Error in ESR APP generation notified in Slack
- 14:44 - Issue picked up by devs
- 16:16 - Fix deployed for EsrAppRecordGeneratorService
- 16:32 - Fix deployed for EsrNotificationGeneratorService
- 17:39 - Fix deployed for EsrInboundDataWriterService
- 18:48 - Fix deployed for TIS-EsrReconciliationService
- 00:09 - TIS overnight sync jobs failed
- 02:30 - Issue picked up by devs
- 03:55 - Fix deployed to stage for TIS-SYNC - decision made not to merge to prod at this time due to auto-start logic before 05:00
- 05:20 - TIS-SYNC fix deployed to production environment and sync jobs triggered - NIMDTA jobs didn’t fully trigger, possibly due to permissions
- 10:45 - NIMDTA jobs re-ran and completed successfully
Root Cause(s)
TCS deployed with changes to the PermitToWork field, changing it from an enumeration to a string.
The (de)serialization of the RightToWork object in ESR and TIS-SYNC projects began to fail.
Outdated tcs-client and tcs-persistence versions were in use, so those projects still tried to treat PermitToWork as an enumeration.
Updating the dependencies in those projects was missed because a search for usages of PermitToWorkType did not find them.
Those projects do not use PermitToWork directly, but they do (de)serialize the Person object, which has it nested.
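The failure mode described above can be illustrated with a minimal sketch in plain Java. It does not use the real tcs-client classes or any particular serialization library. The enum constants, the sample values, and the helper names (`readAsEnum`, `readAsString`, `oldClientCanRead`) are all hypothetical. The assumption is that an enum-typed field is ultimately resolved by a name lookup similar to `Enum.valueOf`, which rejects any value outside the fixed set.

```java
// Hypothetical stand-in for the enumeration in the outdated client library.
// The constant names are illustrative only, not the real TCS values.
enum PermitToWorkType {
    INDEFINITE_LEAVE,
    TIER_2
}

class PermitToWorkDemo {

    // Old client behaviour (assumed): map the incoming text onto the fixed enumeration.
    // Enum.valueOf throws IllegalArgumentException when no constant has that name.
    static PermitToWorkType readAsEnum(String raw) {
        return PermitToWorkType.valueOf(raw);
    }

    // Updated client behaviour (assumed): keep the value as a plain string,
    // so any entry from the reference table is accepted.
    static String readAsString(String raw) {
        return raw;
    }

    // Reports whether the old, enum-based client could read the given value.
    static boolean oldClientCanRead(String raw) {
        try {
            readAsEnum(raw);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String legacyValue = "TIER_2";                     // matches a hypothetical enum constant
        String referenceTableValue = "Skilled Worker visa"; // hypothetical reference-table entry

        System.out.println("old client reads legacy value: " + oldClientCanRead(legacyValue));
        System.out.println("old client reads new value: " + oldClientCanRead(referenceTableValue));
        System.out.println("new client reads new value: " + readAsString(referenceTableValue));
    }
}
```

Because the field is nested inside a larger object, a consumer can hit this failure without ever referencing the enumeration type in its own code, which is consistent with the usage search not finding ESR or TIS-SYNC.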
Action Items
Action Items | Owner |
---|---|
Ensure all devs have adequate permissions to run sync jobs | |
Lessons Learned
Need to be more wary of breaking changes and the effect on services calling the affected API.
Devs not having the correct permissions may have slowed down the full resolution.