
Date

Authors

Liban Hirey

Status

Documenting

Summary

Prod Green server went down due to a scheduled retirement of the instance

Impact

Bulk Upload unavailable

Non-technical Description

  • One of the servers that the TIS application is load-balanced between became unavailable.

  • After we contacted AWS Support, we learned that the server was shut down as part of a scheduled retirement of the instance, caused by underlying hardware issues.

  • We did not receive any emails from AWS about this scheduled retirement, because the address AWS sends them to (caaa@hee.nhs.uk) is not managed by our team.


Trigger

  • The server instance state showed as “stopped” when viewed in AWS

Detection

  • Slack Alert at 4:02 AM on


Resolution

  • The server was restarted and resumed normal operation


Timeline

  • ~04:00 - Components started shutting down

  • 04:02 - Alerts triggered on Slack

  • 08:50 - VM restarted in cloud console. Generic Upload available again.

  • 10:35 - Ticket opened with AWS Support

  • 10:55 - Response received from AWS Support


Root Cause(s)

  • The AMI used by the instance was deleted

  • Could this be due to the recent log4j vulnerability?

  • AWS notified us that the instance was stopped due to a scheduled retirement caused by an “unrecoverable issue with the underlying hardware”.


Action Items

  • Recreate EC2 instance with a new AMI

  • Further investigate, as multiple EC2 instances are showing the same AMI deleted/made private message

  • Investigate what triggered the server to go down: https://hee-tis.atlassian.net/browse/TIS21-2514

  • Mitigate this happening again by making sure we receive emails from AWS: https://hee-tis.atlassian.net/browse/TIS21-2514
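
As a starting point for the AMI investigation, the sketch below lists instances whose AMI is no longer visible to the account (deregistered or made private). It is a minimal sketch, not something taken from the incident: the function names and the `region_name` parameter are hypothetical, and it assumes `boto3` is installed and credentials allow `ec2:DescribeInstances` and `ec2:DescribeImages`. It also assumes the number of distinct AMIs is small enough to fit in a single filter.

```python
def instances_with_missing_ami(
    instance_images: dict[str, str], visible_image_ids: set[str]
) -> dict[str, str]:
    """Return {instance_id: image_id} for instances whose AMI is not visible."""
    return {
        instance_id: image_id
        for instance_id, image_id in instance_images.items()
        if image_id not in visible_image_ids
    }


def find_instances_with_missing_ami(region_name: str) -> dict[str, str]:
    """Hypothetical helper: query EC2 and report instances with a missing AMI."""
    import boto3  # imported lazily so the pure helper above works without it

    ec2 = boto3.client("ec2", region_name=region_name)

    # Map every instance in the region to the AMI it was launched from.
    instance_images: dict[str, str] = {}
    for page in ec2.get_paginator("describe_instances").paginate():
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                instance_images[instance["InstanceId"]] = instance["ImageId"]

    # Filtering by image-id returns only AMIs the account can still see,
    # without raising an error for IDs that no longer exist.
    image_ids = sorted(set(instance_images.values()))
    visible: set[str] = set()
    if image_ids:
        response = ec2.describe_images(
            Filters=[{"Name": "image-id", "Values": image_ids}]
        )
        visible = {image["ImageId"] for image in response["Images"]}

    return instances_with_missing_ami(instance_images, visible)
```

An empty result would suggest every running instance still has a visible AMI; any entries returned are candidates for recreation with a new AMI.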


Lessons Learned

  • Make sure we receive emails from AWS, instead of them going to an address not managed by our team
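
Independently of email delivery, scheduled events such as an instance retirement can also be read from the EC2 API. The sketch below is one possible way to do this, assuming `boto3` is installed and credentials allow `ec2:DescribeInstanceStatus`; the function names and the `region_name` parameter are hypothetical and not part of the existing TIS tooling. The output could, for example, be posted to the same Slack channel that receives alerts.

```python
from typing import Any

# AWS prefixes the description of finished or cancelled events with these markers.
_DONE_PREFIXES = ("[Completed]", "[Canceled]")


def find_scheduled_events(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a DescribeInstanceStatus-style response into one row per pending event."""
    rows = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            description = event.get("Description", "")
            if description.startswith(_DONE_PREFIXES):
                continue  # skip events that already happened or were cancelled
            rows.append(
                {
                    "instance_id": status["InstanceId"],
                    "code": event.get("Code"),  # e.g. "instance-retirement"
                    "not_before": event.get("NotBefore"),
                    "description": description,
                }
            )
    return rows


def fetch_scheduled_events(region_name: str) -> list[dict[str, Any]]:
    """Hypothetical helper: query EC2 for pending scheduled events in one region."""
    import boto3  # imported lazily so the parsing helper works without it

    ec2 = boto3.client("ec2", region_name=region_name)
    rows: list[dict[str, Any]] = []
    paginator = ec2.get_paginator("describe_instance_status")
    # IncludeAllInstances=True also reports instances that are not running.
    for page in paginator.paginate(IncludeAllInstances=True):
        rows.extend(find_scheduled_events(page))
    return rows
```

Running `fetch_scheduled_events` on a schedule would give the team a second notification path that does not depend on who manages the AWS account's contact address.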
