
Date

Authors

Joseph (Pepe) Kelly, Ashley Ransoo, Andy Dingley

Status

Resolved

Summary

ESR sent a large number of files, and it initially looked as though we hadn’t captured everything from them. In fact we had, but a discrepancy between ESR’s specification and their implementation meant that they rejected the notifications we sent them.

https://hee-tis.atlassian.net/browse/TIS21-1265

Impact

Applicants appeared to be missing from, or not sent to, ESR.

Non-technical Description

On 1st of March ESR sent through a number of FULL FILES that either did not load or did not fully load and reconcile. As a result, applicants for St Helens & Knowsley Trust could not subsequently be sent against them, which delayed sending Applicant Files to ESR.

ESR had previously told us that they would send at most 3 full files per week. In the week commencing 1st of March they sent 9 files: 8 distinct, plus 1 additional.


Trigger

An unexpected number of FULL FILES (RMF) overloaded the TIS-ESR interface.

DE_EMD_RMF_20210301_00002598.DAT
DE_KSS_RMF_20210301_00002788.DAT
DE_EOE_RMF_20210301_00002798.DAT
DE_MER_RMF_20210301_00002766.DAT
DE_OXF_RMF_20210301_00002759.DAT
DE_NWN_RMF_20210301_00002766.DAT
DE_LDN_RMF_20210301_00003186.DAT
DE_WMD_RMF_20210301_00002906.DAT
DE_WMD_RMF_20210302_00002907.DAT
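
A check along the following lines could flag a week in which more full files arrive than expected. This is only a sketch: the filename layout (`DE_<region>_RMF_<YYYYMMDD>_<sequence>.DAT`) is inferred from the names listed above rather than taken from ESR's specification, the limit of 3 is the figure ESR quoted to us, and the function names are hypothetical and not part of the TIS-ESR code.

```python
import re
from collections import Counter
from datetime import date

# Assumed layout, inferred from the file names in this log (not from ESR's spec).
RMF_NAME = re.compile(
    r"^DE_(?P<region>[A-Z]+)_RMF_(?P<day>\d{8})_(?P<seq>\d{8})\.DAT$"
)

# ESR told us to expect at most 3 full files per week.
EXPECTED_MAX_PER_WEEK = 3

FILES = [
    "DE_EMD_RMF_20210301_00002598.DAT",
    "DE_KSS_RMF_20210301_00002788.DAT",
    "DE_EOE_RMF_20210301_00002798.DAT",
    "DE_MER_RMF_20210301_00002766.DAT",
    "DE_OXF_RMF_20210301_00002759.DAT",
    "DE_NWN_RMF_20210301_00002766.DAT",
    "DE_LDN_RMF_20210301_00003186.DAT",
    "DE_WMD_RMF_20210301_00002906.DAT",
    "DE_WMD_RMF_20210302_00002907.DAT",
]


def parse_rmf_name(name):
    """Return (region, day, sequence) for an RMF file name, or None if it does not match."""
    match = RMF_NAME.match(name)
    if match is None:
        return None
    day = match.group("day")
    parsed_day = date(int(day[:4]), int(day[4:6]), int(day[6:]))
    return match.group("region"), parsed_day, int(match.group("seq"))


def full_files_per_week(names):
    """Count RMF files per ISO (year, week); names that do not match are ignored."""
    counts = Counter()
    for name in names:
        parsed = parse_rmf_name(name)
        if parsed is None:
            continue
        iso = parsed[1].isocalendar()
        counts[(iso[0], iso[1])] += 1
    return counts


def weeks_over_limit(names, limit=EXPECTED_MAX_PER_WEEK):
    """Return the weeks in which more than `limit` full files were received."""
    return {week: n for week, n in full_files_per_week(names).items() if n > limit}


if __name__ == "__main__":
    for week, n in weeks_over_limit(FILES).items():
        print(f"ISO week {week}: {n} full files (expected at most {EXPECTED_MAX_PER_WEEK})")
```

Grouping by ISO week is a design choice for the sketch; a rolling 7-day window would work equally well if ESR's "per week" turns out not to align with calendar weeks.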

Detection

  • Alerting in our monitoring channel

  • An issue regarding missing applicants, raised on Teams by Liam Lofthouse (NWM Data Lead)


Resolution

  • Filtered the notification files for those that St. Helens needed (VPD: 96) and modified the VPD to be 3 digits

  • Modified the part of the system that writes to CSV to ensure the VPD is at least 3 digits.
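
The VPD fix can be illustrated with a minimal sketch. It assumes that "at least 3 digits" means left-padding with zeros (so 96 becomes 096); the function names and CSV column layout are hypothetical and do not reflect the actual exporter code.

```python
import csv
import io


def pad_vpd(vpd, width=3):
    """Left-pad a numeric VPD with zeros so it has at least `width` digits.

    Assumption: zero-padding is what ESR expects; longer VPDs are left unchanged.
    """
    text = str(vpd).strip()
    if not text.isdigit():
        raise ValueError(f"VPD must be numeric, got {vpd!r}")
    return text.zfill(width)


def write_rows(rows):
    """Write (vpd, value) rows to CSV text, padding the VPD column on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for vpd, value in rows:
        writer.writerow([pad_vpd(vpd), value])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical rows: the second column is a placeholder, not a real ESR field.
    print(write_rows([(96, "example-a"), ("1234", "example-b")]), end="")
```

Padding at the point of writing, rather than changing stored values, keeps the fix local to the CSV output.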


Timeline

  • - 17:28-17:38 - 7 RMF files received (EMD, KSS, EOE, MER, OXF, NWN, LDN)

  • - 17:45 - MongoDB down

  • - 18:09 - MongoDB manually restarted

  • - 18:10 - MongoDB up

  • - 19:10 - MongoDB down

  • - 19:20 - MongoDB up

  • - 21:36 - MongoDB down

  • - 22:31 - MongoDB up

  • - 15:19 - Live defect https://hee-tis.atlassian.net/browse/TIS21-1265 created

  • - 16:57 - WMD RMF received

  • - 18:30 - WMD RMF received

  • - 15:21 - Query raised on Teams about unreceived data


Root Cause(s)

  • We sized Mongo on the basis we wouldn’t get many full files.

  • There isn’t anything to signal that Mongo is struggling before a fatal VM failure.

  • The VM doesn’t stop and start in a reasonable time.

  • The replica set isn’t as ‘highly available’ as it should be (not on separate VMs)


Action Items

  • Test loading full files with a larger instance. Owner: Joseph (Pepe) Kelly

  • [not critical] Clean up Applicant records (for past placements) marked as TO_EXPORT, e.g. ESRExporter - ProdGeneratedapprecord•60117906459cf5418067a1e8

  • Calendar of releases/milestones would be useful. Andy Nash (Unlicensed) has started this in the team calendar


Lessons Learned

  • Time to crack on with Tech Improvement work