Date |
|
Authors | |
Status | Documenting |
Summary | Prod Green server went down |
Impact | Bulk Upload unavailable |
Non-technical Description
The Prod Green server went down, and its instance state was “stopped” when viewed in AWS.
...
Trigger
This appears to have been an AMI issue, as the following error is displayed in the instance details:
EC2 can't retrieve the name because the AMI was either deleted or made private
...
Detection
Slack Alert at 4:02 AM on
...
Resolution
The server was restarted and resumed functioning normally. However, the AMI error message is concerning, so we will look at recreating the server with a new AMI.
...
Timeline
~04:00 - Components started shutting down
04:02 - Alerts triggered on Slack
08:50 - VM restarted in cloud console. Generic Upload available again.
...
Root Cause(s)
The AMI used by the instance was deleted.
...
Action Items
Action Items | Owner |
---|---|
Recreate EC2 instance with a new AMI | |
Investigate further, as a number of our EC2 instances show the same "AMI deleted or made private" message | |
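
As a possible starting point for the second action item, the sketch below lists EC2 instances whose AMI can no longer be described by the account, which is one plausible reading of the "deleted or made private" message. It is an illustration under assumptions, not a script that was run during this incident: the function names are hypothetical, the batch size of 100 is a conservative guess rather than a documented limit, and it relies on boto3's `describe_instances` paginator and the `image-id` filter of `describe_images`.

```python
from collections import defaultdict


def instances_by_ami(ec2):
    """Map each AMI ID to the IDs of the instances launched from it."""
    by_ami = defaultdict(list)
    for page in ec2.get_paginator("describe_instances").paginate():
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                by_ami[instance["ImageId"]].append(instance["InstanceId"])
    return dict(by_ami)


def visible_amis(ec2, ami_ids, batch_size=100):
    """Return the subset of ami_ids that the account can still describe."""
    ami_ids = sorted(ami_ids)
    found = set()
    for start in range(0, len(ami_ids), batch_size):
        batch = ami_ids[start:start + batch_size]
        # Using a filter (instead of ImageIds=...) means AMIs that are gone
        # are simply absent from the response rather than causing an error.
        response = ec2.describe_images(
            Filters=[{"Name": "image-id", "Values": batch}]
        )
        found.update(image["ImageId"] for image in response["Images"])
    return found


def instances_with_missing_ami(ec2):
    """Return {ami_id: [instance_id, ...]} for AMIs that are not visible."""
    by_ami = instances_by_ami(ec2)
    found = visible_amis(ec2, by_ami.keys())
    return {ami: ids for ami, ids in by_ami.items() if ami not in found}


def main():
    # Assumes boto3 is installed and AWS credentials and region are configured.
    import boto3

    missing = instances_with_missing_ami(boto3.client("ec2"))
    for ami, instance_ids in sorted(missing.items()):
        print(ami, ", ".join(instance_ids))
```

Calling `main()` in each region in use would print one line per missing AMI, followed by the affected instance IDs. That list could help decide which instances to recreate first.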
...
Lessons Learned
.