Date | 2nd October 2017 |
Authors | Chris Mills |
Status | In progress |
Summary | Multiple containers went down on production applications and monitoring |
Impact | Users were unable to log in |
Root Cause
The root cause is split into sections, one for each part of the system this incident affected.
Monitoring:
The monitoring containers were not running as a result of redeploying the stack following the Keycloak issue on the 28th.
Prod Applications:
We are currently investigating the root cause of this.
Trigger
Investigating
Resolution
Investigating
Detection
We were notified by Reuben Noot, who reported on Slack that he was unable to view production:
Reuben Noot [8:56 AM]
production doesn't look to healthy - getting internal server error just trying to get to https://apps.tis.nhs.uk/ui/
Action Items
Action Item | Type | Owner | Issue |
---|---|---|---|
Add monitoring to the monitoring server so that it's active at all times | prevent | DevOps | TISDEV-2684 |
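
As a rough illustration of the action item above, the following is a minimal sketch of an HTTP health check that could run from a host outside the monitored stack. It is not the check tracked in TISDEV-2684. The function names (`fetch_status`, `is_healthy`, `check_all`) and the rule that only 2xx and 3xx responses count as healthy are assumptions made for this example. The only URL used is the production one quoted in the Detection section; the monitoring server's address is not recorded here and would need to be added.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def is_healthy(status):
    """Assumed rule: 2xx and 3xx are healthy; anything else counts as down."""
    return status is not None and 200 <= status < 400


def check_all(urls, fetch=fetch_status):
    """Return a dict mapping each URL to True (healthy) or False (down)."""
    return {url: is_healthy(fetch(url)) for url in urls}


if __name__ == "__main__":
    # Production URL taken from the Slack message in the Detection section.
    for url, ok in check_all(["https://apps.tis.nhs.uk/ui/"]).items():
        print(f"{url}: {'OK' if ok else 'DOWN'}")
```

Running a check like this on a schedule from a separate host would have flagged the internal server error without relying on a user report, and it keeps working when the monitoring containers themselves are down.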
Timeline
Supporting Information