Date | |
Authors | Joseph (Pepe) Kelly |
Status | Awaiting clean file from ESR |
Summary | ESR applicant-export failed on 21st May due to a connection error. The ETL received a 401 (Unauthorized) exception but we don't yet know why. |
Impact | New Applicants for Yorkshire & Humberside were not received by ESR until |
Jira reference
Not - TISNEW-3001.
Impact
Applicants for West Midlands have not been fully processed by TIS; the PositionReconciliationRecord has not been created/updated for all of the relevant positions. As a result, TIS does not hold the correct status, which may in turn cause only a partial set of notifications/applicants to be sent to ESR.
Details of the scheduled jobs are here: ESR Schedules.
Root Causes
- A 500 (Internal Server Error) response was received for an HTTP request from ESR-ETL to ESR.
- A MySQL error:
Hibernate: insert into EsrOutboundPositionReconciliationRecord (createdDate, deaneryNumber, deleteChangeIndicator, esrLocation, esrOrganisation, hostLeadEmployerIndicator, lastModifiedDate, managingLocalOffice, matched, odsEmployerCode, positionId, positionNumber, positionTitle, recordType, tisStatus, vpdCode) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2019-07-13 14:33:07.333 WARN 1 --- [ XNIO-2 task-14] o.h.engine.jdbc.spi.SqlExceptionHelper : SQL Error: 1406, SQLState: 22001
2019-07-13 14:33:07.334 ERROR 1 --- [ XNIO-2 task-14] o.h.engine.jdbc.spi.SqlExceptionHelper : Data truncation: Data too long for column 'deaneryNumber' at row 1
- Logging shows numerous whole rows included as a national post number:
2019-07-13 14:30:43.540 INFO 1 --- [main] c.t.hee.tis.esr.processor.Processor : 1.1: Fetching Posts by NPN:...
- The applicant file 'DE_WMD_RMF_20190713_00002309.DAT' included specialties with a backslash (\) character.
Trigger
- File from ESR containing escape characters.
Resolution
- Check which files (if any) weren't processed, by looking in the logs for the number of records saved and by querying the ESR database.
- Requested that ESR clean and reproduce the file.
- ...
- Validate file contents before processing.
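As a rough illustration of the "validate before processing" step, the sketch below scans a file's lines and reports any that contain a backslash or an over-long field, so the file can be rejected before any rows reach the database. It is not the ESR-ETL implementation: the function name `find_suspect_lines`, the `~` delimiter, and the 255-character limit are all assumptions chosen for the example and would need to match the real file format and column sizes.

```python
from typing import Iterable, List, Tuple


def find_suspect_lines(
    lines: Iterable[str],
    delimiter: str = "~",       # assumed field delimiter, not confirmed for ESR files
    max_field_len: int = 255,   # assumed limit; use the real column sizes
    forbidden: str = "\\",      # characters that should never appear in a record
) -> List[Tuple[int, str]]:
    """Return (line_number, reason) pairs for lines that should block processing."""
    problems: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        # Reject the record outright if it contains any forbidden character.
        if any(ch in record for ch in forbidden):
            problems.append((number, "contains forbidden character"))
            continue
        # Otherwise check that no single field exceeds the allowed length.
        longest = max(len(field) for field in record.split(delimiter))
        if longest > max_field_len:
            problems.append(
                (number, f"field longer than {max_field_len} characters")
            )
    return problems
```

A caller could refuse to process the file whenever the returned list is non-empty and post the line numbers to the operations channel, rather than discovering the problem through a SQL truncation error partway through the load.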
Detection / Timeline
- 2019-07-13 14:33: Ansible message to #esr_operations channel reporting failure.
- 2019-07-15 15:22: Flagged for further investigation.
- 2019-07-16 : Identified the problem was limited to the West Midlands file. Notified ESR about the problem with data.
Action Items
- Raise ticket to include retries (as one type of service resilience) for connection issues, e.g. a configurable list of HTTP status codes.
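One possible shape for the retry behaviour in that ticket is sketched below: a helper that repeats a request while the response status is in a configurable set, with exponential backoff between attempts. Everything here is hypothetical and only meant to make the idea concrete: the name `call_with_retries`, the default status codes, and the assumption that the request is wrapped in a zero-argument callable returning the HTTP status code.

```python
import time
from typing import Callable, Collection


def call_with_retries(
    send: Callable[[], int],
    retry_statuses: Collection[int] = frozenset({500, 502, 503, 504}),
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status outside retry_statuses.

    send is a hypothetical wrapper around the real HTTP call that returns
    the response status code. The last status is returned even if it is
    still retryable, so the caller decides how to report the failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 0
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status not in retry_statuses or attempt == max_attempts:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
    return status
```

Keeping the status codes in configuration means a response such as 401 is returned immediately by default, and could be added to the list later if investigation shows it is transient.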
Lessons Learned
- .
What went well
- It was simple to rectify the issue with this particular instance.
What went wrong
- The verbosity of the logs makes them more difficult to use.
- We didn't start further investigation until Monday afternoon.