2016-12-19 Elasticsearch snapshots deleted on production

Date: 2016-12-19

Authors: Grante Marshall, Graham O'Regan
Status: Complete
Summary

Curator deleted the only remaining Elasticsearch snapshot in production after the nightly snapshot job failed.

Impact: Service was not affected.

Root Cause

The Curator configuration on production was set to retain only a single snapshot. When the snapshot process failed, Curator removed the only remaining snapshot in Azure Blob Storage.

Trigger

The nightly snapshot jobs run from Jenkins failed because snapshots already existed. Curator then ran and deleted the only remaining snapshot.

Resolution

The number of retained snapshots was increased to 5 (one per day).
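
The retention behaviour can be illustrated with a minimal Python sketch. This is not the Curator configuration used on production; the `Snapshot` class, the `select_snapshots_to_delete` function and the `retain` parameter are hypothetical names introduced only for illustration. The sketch assumes each snapshot carries an Elasticsearch state such as `SUCCESS` or `FAILED`, and it counts only successful snapshots towards the retention limit, so a failed nightly run cannot push out the last good snapshot.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record describing one snapshot in the repository."""
    name: str
    started: datetime
    state: str  # e.g. "SUCCESS", "FAILED", "PARTIAL", "IN_PROGRESS"


def select_snapshots_to_delete(snapshots, retain=5):
    """Return the snapshots that may be pruned.

    The `retain` newest SUCCESS snapshots are always kept, and snapshots
    still IN_PROGRESS are never selected. Everything else (older successful
    snapshots and failed or partial ones) is returned for deletion.
    """
    if retain < 1:
        raise ValueError("retain must be at least 1")

    # Newest successful snapshots first.
    successful = sorted(
        (s for s in snapshots if s.state == "SUCCESS"),
        key=lambda s: s.started,
        reverse=True,
    )
    keep = {s.name for s in successful[:retain]}

    return [
        s for s in snapshots
        if s.name not in keep and s.state != "IN_PROGRESS"
    ]
```

With `retain=1`, a successful snapshot from the previous night and a failed snapshot from tonight, only the failed snapshot is selected for deletion, so the last good snapshot survives. Raising `retain` to 5 adds a further margin of several days.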

Detection

The Elasticsearch snapshots are run by Jenkins jobs. These jobs failed, and the failures were reported to the #dev channel in Slack.

Action Items

Action Item | Type | Owner | Issue
Increase the number of snapshots retained | prevent | Grante Marshall | TISDEV-1459

Timeline

  • 11pm Jenkins job ran
  • Jenkins sent notification to Slack
  • N changed the Curator settings to retain 5 days of snapshots.

Supporting Information
