Date

Authors

Liban Hirey (Unlicensed)

Status

Documenting

Summary

Prod Green server went down

Impact

Bulk Upload unavailable

Non-technical Description

  • The Prod Green server went down; its instance state showed as “stopped” in the AWS console


Trigger

  • Appears to have been an AMI issue: the instance details display the following error:

    EC2 can't retrieve the name because the AMI was either deleted or made private

Detection

  • Slack Alert at 4:02 AM on


Resolution

  • The server was restarted and resumed functioning normally. However, the AMI error message is concerning, so we will look at recreating the server with a new AMI.
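One possible first step toward recreating the server is to capture a fresh AMI from the instance now that it is running again. The sketch below assumes a boto3 EC2 client is passed in; the function name `capture_ami` and the instance ID and image name in the usage note are hypothetical, and this is not necessarily the approach the team will take.

```python
def capture_ami(ec2, instance_id, name):
    """Create a new AMI from an existing instance and return its image ID.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=name,
        # Assumption: avoid rebooting the recovered server. Without a reboot,
        # file-system consistency of the image is not guaranteed.
        NoReboot=True,
    )
    return response["ImageId"]
```

Usage would look like `capture_ami(boto3.client("ec2"), "i-0123456789abcdef0", "prod-green-recovery")`, where both arguments are placeholders. Setting `NoReboot=False` trades a short outage for a consistent image.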


Timeline

  • ~04:00 - Components started shutting down

  • 04:02 - Alerts triggered in Slack

  • 08:50 - VM restarted in cloud console. Generic Upload available again.
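For reference, the time from first alert to restoration can be worked out from the timeline entries. This small sketch assumes both timestamps fall on the same day, since the incident date is not recorded on this page; the helper name `minutes_between` is illustrative.

```python
from datetime import datetime

FMT = "%H:%M"

# Timestamps taken from the timeline above (same day assumed).
alert = datetime.strptime("04:02", FMT)
restored = datetime.strptime("08:50", FMT)


def minutes_between(start, end):
    """Return the whole number of minutes between two datetimes."""
    return int((end - start).total_seconds() // 60)


time_to_restore = minutes_between(alert, restored)
print(f"{time_to_restore // 60}h {time_to_restore % 60}m")  # 4h 48m
```

Measured from the approximate start of the shutdown (~04:00) rather than the alert, the outage lasted roughly 4h 50m.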


Root Cause(s)

  • The AMI used by the instance was deleted or made private, per the error shown in the instance details


Action Items

Action Item | Owner
Recreate EC2 instance with a new AMI |
Further investigate, as a number of our EC2 instances are showing the same AMI deleted/made private message |
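To scope the second action item, a script along these lines could list which instances reference an AMI that the account can no longer see. It is a sketch under assumptions: the function names are hypothetical, the region is a parameter you would supply, and an AMI is treated as "deleted or made private" simply when `describe_images` no longer returns it.

```python
def find_orphaned_instances(instance_amis, visible_amis):
    """Return sorted instance IDs whose AMI is not among the visible AMI IDs."""
    return sorted(
        instance_id
        for instance_id, ami_id in instance_amis.items()
        if ami_id not in visible_amis
    )


def collect_from_aws(region_name):
    """Gather {instance_id: ami_id} and the set of AMI IDs still visible."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)

    instance_amis = {}
    for page in ec2.get_paginator("describe_instances").paginate():
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                instance_amis[instance["InstanceId"]] = instance["ImageId"]

    visible_amis = set()
    ami_ids = sorted(set(instance_amis.values()))
    # Filtering by image-id (rather than passing ImageIds) returns only the
    # AMIs that still exist and are visible, without erroring on missing ones.
    for start in range(0, len(ami_ids), 100):
        batch = ami_ids[start:start + 100]
        response = ec2.describe_images(
            Filters=[{"Name": "image-id", "Values": batch}]
        )
        visible_amis.update(image["ImageId"] for image in response["Images"])

    return instance_amis, visible_amis
```

A run might look like `find_orphaned_instances(*collect_from_aws("eu-west-1"))`, with the region name being a placeholder. Keeping the comparison in a pure function makes it easy to check the logic without AWS credentials.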


Lessons Learned

  • .
