Date

 

Authors: Graham O'Regan (Unlicensed)
Status: Complete
Summary: Elasticsearch indexing appeared to fail because it exceeded the 30-second timeout. However, the snapshot had in fact been created, so the job could not complete when it was run again.
Impact: None

Root Cause

The task used the default 30-second timeout, but taking the snapshot and copying it to Blob Storage took longer than that.

Trigger

The nightly Elasticsearch index snapshot job on Jenkins.

Resolution

The timeout was increased to 20 minutes to give the snapshot more time to complete before an error is raised.
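
The report does not say where the timeout was configured, so the following is only a minimal sketch of one way a snapshot request could be issued with an explicit 20-minute timeout. The host, repository name, snapshot name, and the helper names `snapshot_url` and `create_snapshot` are hypothetical; the `_snapshot` endpoint and the `wait_for_completion` parameter are part of the Elasticsearch snapshot API.

```python
import urllib.parse
import urllib.request

# 20 minutes, matching the value chosen in the resolution.
SNAPSHOT_TIMEOUT_SECONDS = 20 * 60


def snapshot_url(base_url: str, repository: str, snapshot: str) -> str:
    """Build the create-snapshot URL and ask Elasticsearch to block until done."""
    repo = urllib.parse.quote(repository, safe="")
    name = urllib.parse.quote(snapshot, safe="")
    return f"{base_url.rstrip('/')}/_snapshot/{repo}/{name}?wait_for_completion=true"


def create_snapshot(base_url: str, repository: str, snapshot: str,
                    timeout: float = SNAPSHOT_TIMEOUT_SECONDS) -> int:
    """Send the PUT request and return the HTTP status code.

    Note: `timeout` here is a socket-level timeout for blocking operations,
    not a hard limit on total elapsed time.
    """
    request = urllib.request.Request(
        snapshot_url(base_url, repository, snapshot), method="PUT"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `create_snapshot("http://localhost:9200", "nightly", "snap-1")` would then wait up to the configured timeout on each blocking socket operation instead of giving up after 30 seconds.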

Detection

The Jenkins job posts failure alerts to the #dev Slack channel.

Action Items

Action Item | Type | Owner | Issue
Increased timeout to 20 minutes | prevent | Graham O'Regan (Unlicensed) |

Timeline

  • The nightly job failed and alerted #dev in Slack.
  • An attempt to run the snapshot manually with curl also failed, this time with a different exception, because the snapshot already existed.
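
The second failure in the timeline suggests that a re-run is not safe once a snapshot with the same name exists. One possible guard, offered here as an assumption and not as what the Jenkins job actually does, is to check for the snapshot before trying to create it. The function name `snapshot_exists`, the `opener` parameter, and the example host and names are hypothetical; the sketch relies on `GET /_snapshot/<repository>/<snapshot>` returning 404 when the snapshot is missing.

```python
import urllib.error
import urllib.request


def snapshot_exists(base_url, repository, snapshot, opener=urllib.request.urlopen):
    """Return True if the named snapshot is already present in the repository.

    `opener` is injectable so the check can be exercised without a live cluster.
    """
    url = f"{base_url.rstrip('/')}/_snapshot/{repository}/{snapshot}"
    try:
        with opener(url, timeout=30) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            # Elasticsearch reports a missing snapshot with a 404.
            return False
        raise  # Any other HTTP error is unexpected, so let it surface.
```

A job could call `snapshot_exists(...)` first and skip creation, or pick a new snapshot name, when it returns `True`.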

Supporting Information

Link to the initial failed task:

https://build-hee.transformcloud.net/jenkins/job/elasticsearch-snapshot-prod/25/
