2019-07-30 ElasticSearch Sync Job failed
Date |
|
Authors | Joseph (Pepe) Kelly |
Status | Complete |
Summary | |
Impact | None found. |
Root Cause(s)
ElasticSearch Sync job failed during the night at 1:34am (UTC) failed to complete.
The docker container restarted
Trigger
The TIS-SYNC restarted.
Resolution
Re-ran job as on Sync Service.
Detection
Alerting in Slack.
Action Items
Action Item | Type | Owner | Issue |
---|---|---|---|
Restart the jobs | Restore service | Joseph (Pepe) Kelly |
Timeline (BST)
- 2:34 am TIS-SYNC job failed
- 7.30 am Slack alerting of failed job in #monitoring (5 hour delay is to ensure all sync jobs running complete (or fail) before reporting on relative successes/failures - this has now been replaced by real-time reporting)
- 7.43 am Picked up alert and re-ran job (Ola)
- 7:48 am Re-run of jobs completed
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213