...
Date | |
Authors | Joseph (Pepe) Kelly, Ashley Ransoo |
Status | Awaiting clean file from ESR |
Summary | ESR applicant load failed on 13th July 2019 due to bad data in an RMF file received from ESR. The ETL received a 500 error. |
Impact | Positions expected for WMD Derby Hospitals NHS Foundation Trust were not successfully loaded as a result. James Harris reported that 85 new positions were expected to be received. |
Jira reference
Not
Impact
Applicants for West Midlands have not been fully processed by TIS; the PositionReconciliationRecord has not been created/updated for all of the relevant positions. As a result, these positions do not have the correct status in TIS, which may in turn cause only a partial set of notifications/applicants to be sent to ESR.
...
- Check which files (if any) weren't processed (looking in logs for number of records saved, queries against the ESR database)
- Requested that ESR clean and reproduce the file.
- ...
- Validate file contents before processing.
- Once a clean file is received and TIS has been notified, it is expected to be processed by the ETL along with the other files as business as usual (BAU).
Detection / Timeline
- 2019-07-13 14:33: Ansible message to #esr_operations channel reporting failure.
- 2019-07-15 15:22: Flagged for further investigation.
- 2019-07-16: Identified that the problem was limited to the West Midlands file. Notified ESR about the problem with the data.
Action Items
- Raise ticket to include retries (as one type of service resilience) for connection issues, e.g. a configurable list of HTTP status codes.
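
As a rough illustration of what such a ticket might ask for, the sketch below retries a call when it returns one of a configurable set of HTTP status codes or raises a connection error. It is not the ETL's actual code: the function name `call_with_retries`, the default status codes, the attempt count and the backoff values are all assumptions chosen for the example.

```python
import time
from typing import Callable, Iterable, Tuple

# Hypothetical default; this report does not say which codes should be retried.
DEFAULT_RETRY_STATUSES = frozenset({502, 503, 504})


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    retry_statuses: Iterable[int] = DEFAULT_RETRY_STATUSES,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    send() is any zero-argument callable returning (status_code, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    retryable = frozenset(retry_statuses)
    for attempt in range(1, max_attempts + 1):
        try:
            status, body = send()
        except ConnectionError:
            # Connection-level failure: give up only on the last attempt.
            if attempt == max_attempts:
                raise
        else:
            # Return on success, on a non-retryable status, or when out of attempts.
            if status not in retryable or attempt == max_attempts:
                return status, body
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Keeping the list of status codes configurable matters because retries only help with transient failures; a response caused by bad input data would fail again on every attempt and should not be in the list.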
Lessons Learned
- Validate file contents before processing. (Joseph (Pepe) Kelly - need your help on this)
- RMF (Full) files are known from experience to contain bad data. RMF files are only sent on day 1 of a LO going live or upon request. Although ESR informed us and asked for confirmation before sending this file, they did not wait for confirmation from our side. Comms with ESR need to be managed properly, making sure we are aware when RMF files are sent to TIS. (cc Nazia AKHTAR)
- More than enough reason to rebuild a proper ESR interface from scratch with best-in-class interface technology.
- It is intricate and time-consuming to work out the impacts on data and resolve them accordingly.
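
The point about validating file contents before processing could be approached along the lines of the sketch below. The RMF layout is not described in this report, so the delimiter, the expected field count and the required field positions are placeholders, and `validate_records` is a hypothetical helper rather than anything that exists in TIS.

```python
from typing import Iterable, List, Sequence


def validate_records(
    lines: Iterable[str],
    expected_fields: int,
    required_indexes: Sequence[int] = (),
    delimiter: str = ",",
) -> List[str]:
    """Return a list of problems found; an empty list means the file passed.

    All layout parameters are assumptions to be replaced with the real
    file specification.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if not record:
            continue  # ignore blank lines
        fields = record.split(delimiter)
        if len(fields) != expected_fields:
            problems.append(
                f"line {number}: expected {expected_fields} fields, found {len(fields)}"
            )
            continue
        for index in required_indexes:
            if not fields[index].strip():
                problems.append(f"line {number}: required field {index} is empty")
    return problems
```

Running a check like this before the load starts would let a bad file be rejected and reported as a whole, instead of failing part-way through processing.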
What went well
- It was simple to rectify the issue in this particular instance, after a lot of conversations to work it out.
What went wrong
- The verbosity of the logs makes using them more difficult.
- We didn't start further investigation until Monday afternoon because the issue happened on a non-working day, and more recent Slack messages on #esr_operations had already made the failure message less visible.
Where we got lucky
- We did not have to re-run the applicant load: apart from the WMD RMF file, the remaining RMC files were processed successfully.