2021-03-03 Person Search Sync Failed [Unable to parse response body]

Date

Mar 3, 2021

Authors

@Joseph (Pepe) Kelly @Reuben Roberts

Status

Resolved

Summary

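The nightly Person search sync job failed because one page of the sync request was too large. The person search page showed incomplete data until the job was re-run with a smaller page size.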

Impact

The person search page was missing some data between 01:30 and 09:00.

Non-technical Description

The overnight sync procedure for TIS was unable to run. This meant only some trainees were being shown on the person search page.

Person details to be synchronized are sent in batches (pages), and one of the pages contained too much data to be processed correctly.
To resolve this, the page size used during the synchronization has been reduced from 8000 to 5000.


Trigger

 


Detection

  • The issue was detected when errors were noted in the ‘Sync [Person sync job] started’ Slack notification, and the ‘Sync [Person sync job] finished’ notification failed to appear.

 


Resolution

  • Reduced the page size from 8000 to 5000 to bring the request size down.

  • Enabled the page size to be set by an environment variable on deploy.

  • Manually triggered the Person sync job to rebuild the Person elasticsearch index.
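
The configurable page size described above could look roughly like the sketch below. It is illustrative only and is not the sync service's actual code: the variable name `SYNC_PAGE_SIZE` and the function `resolve_page_size` are hypothetical, and only the default of 5000 comes from this report.

```python
import os

# Default taken from this incident's fix (8000 reduced to 5000).
DEFAULT_PAGE_SIZE = 5000


def resolve_page_size(env=None, var_name="SYNC_PAGE_SIZE"):
    """Return the sync page size, preferring an environment override.

    `SYNC_PAGE_SIZE` is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None or raw.strip() == "":
        # No override supplied on deploy: fall back to the safe default.
        return DEFAULT_PAGE_SIZE
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{var_name} must be positive, got {value}")
    return value
```

Failing fast on an invalid value means a bad deploy setting shows up at start-up rather than partway through the overnight run.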


Timeline

  • Mar 3, 2021 01:46 - Sync job failed

  • Mar 3, 2021 08:31 - Sync job failed when re-run

  • Mar 3, 2021 08:34 - Fix submitted

  • Mar 3, 2021 08:42 - Fix deployed

  • Mar 3, 2021 08:53 - Sync job completed successfully

 


Root Cause(s)

  • The nightly sync job failed.

  • With a page size of 8000 the request was too large: our instance size limits us to 10 MB per request.
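
A fixed page size only indirectly controls the request size, because it depends on how large each person record is. As a sketch of an alternative, a batch can be closed when either a record count or a byte budget is reached. This is not what the fix in this incident does. The function name `split_by_size` is hypothetical, the 10 MB limit is read here as 10 × 1024 × 1024 bytes, and the size estimate ignores any per-document metadata a real bulk request would add.

```python
import json

# The 10 MB per-request limit from the root cause, read as 10 MiB (assumption).
MAX_REQUEST_BYTES = 10 * 1024 * 1024


def split_by_size(documents, max_bytes=MAX_REQUEST_BYTES, max_count=5000):
    """Yield lists of documents whose serialized size stays within max_bytes.

    A batch is also closed once it holds max_count documents.
    """
    batch, batch_bytes = [], 0
    for doc in documents:
        # Approximate payload size: JSON bytes plus one newline separator.
        doc_bytes = len(json.dumps(doc).encode("utf-8")) + 1
        if doc_bytes > max_bytes:
            raise ValueError("single document exceeds the request size limit")
        over_budget = batch_bytes + doc_bytes > max_bytes
        if batch and (over_budget or len(batch) >= max_count):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch
```

With this approach an unusually large page is split automatically instead of failing the whole job, at the cost of serializing each record once to measure it.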


Action Items

Action Items

Owner

n/a

 


Lessons Learned

  • Making the page size a configurable property allows easier tuning in future, without a code change.