2016-12-19 Elasticsearch snapshots deleted on production
Date | 2016-12-19 |
Authors | Grante Marshall, Graham O'Regan |
Status | Complete |
Summary | Elasticsearch snapshots were deleted in production |
Impact | None; service was not affected |
Root Cause
Curator on production was configured to retain only a single snapshot. When the snapshot process failed, Curator removed the only remaining snapshot from Azure Blob Storage.
Trigger
The nightly snapshot jobs run from Jenkins failed because snapshots already existed. Curator then ran and deleted the only remaining snapshot.
Resolution
The number of snapshots retained was increased to 5 (one per day).
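The effect of the retention change can be illustrated with a minimal Python sketch. It assumes that retention is age-based (the "5 days of snapshots" wording suggests this, but the Curator configuration itself is not shown here). The function name `snapshots_to_delete` and the timestamps are hypothetical; this is not Curator's actual code or API.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshot_times, now, retain_days):
    """Return the snapshot timestamps that fall outside the retention window.

    Hypothetical model of an age-based retention rule; not Curator's API.
    """
    cutoff = now - timedelta(days=retain_days)
    return [t for t in snapshot_times if t < cutoff]


# Hypothetical scenario: the previous night's snapshot exists,
# tonight's snapshot job failed, and the cleanup runs afterwards.
now = datetime(2016, 12, 19, 23, 30)
existing = [datetime(2016, 12, 18, 23, 0)]

# Retaining one day: the only snapshot is already past the cutoff.
deleted_with_1 = snapshots_to_delete(existing, now, retain_days=1)

# Retaining five days: a single failed night leaves the snapshot in place.
deleted_with_5 = snapshots_to_delete(existing, now, retain_days=5)

print(len(deleted_with_1), len(deleted_with_5))
```

Under these assumptions, a one-day window leaves no margin for a failed run, while a five-day window tolerates several consecutive failures before the last good snapshot is removed.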
Detection
The Jenkins jobs that run the Elasticsearch snapshots failed, and the failures were reported to the #dev channel in Slack.
Action Items
Action Item | Type | Owner | Issue |
---|---|---|---|
Increase the number of snapshots retained | prevent | Grante Marshall | |
Timeline
- 11pm Jenkins job ran
- Jenkins sent notification to Slack
- N changed the Curator settings to retain 5 days of snapshots.
Supporting Information
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213