2019-07-30 ElasticSearch Sync Job failed

Date

 

AuthorsJoseph (Pepe) Kelly
StatusComplete
Summary


ImpactNone found.

Root Cause(s)

ElasticSearch Sync job failed during the night at 1:34am (UTC) failed to complete.

The docker container restarted

Trigger

The TIS-SYNC restarted.

Resolution

Re-ran job as on Sync Service.

Detection

Alerting in Slack.

Action Items

Action ItemTypeOwnerIssue
Restart the jobsRestore serviceJoseph (Pepe) Kelly

TISNEW-3183 - Getting issue details... STATUS


Timeline (BST)

  • 2:34 am TIS-SYNC job failed
  • 7.30 am Slack alerting of failed job in #monitoring (5 hour delay is to ensure all sync jobs running complete (or fail) before reporting on relative successes/failures - this has now been replaced by real-time reporting)
  • 7.43 am Picked up alert and re-ran job (Ola)
  • 7:48 am Re-run of jobs completed