Date

Authors

Phil James (Unlicensed)

Status

Resolved

Summary

The database ran out of disk space, which caused a system failure.

Impact

Users were unable to log into TIS for approximately 20 minutes.

Uptime Robot reported 11 minutes of downtime.

https://hee-tis.atlassian.net/browse/TISNEW-5417

...

Action Items

Action Items

Owner

Fix monitoring:

Alertmanager should send to #monitoring-prod rather than #monitoring?

Uptime Robot didn’t report the outage until Keycloak was unavailable

Error messages need to be clear

Look at disk management

Decide on bigger disk?
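
As a starting point for the disk management item, a minimal sketch of a disk-usage check is shown here. It is an illustration only, not the monitoring currently in place: the function names, the 80% and 90% thresholds, and the checked path are all assumptions, and a real check would point at the database's data volume and feed into the existing alerting.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def classify_usage(used_percent: float, warn: float = 80.0, crit: float = 90.0) -> str:
    """Map a usage percentage to a severity label.

    The 80/90 thresholds are hypothetical defaults, not agreed values.
    """
    if used_percent >= crit:
        return "critical"
    if used_percent >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # "." is a placeholder; point this at the database data directory.
    percent = disk_usage_percent(".")
    print(f"disk usage: {percent:.1f}% ({classify_usage(percent)})")
```

Alerting at a warning threshold well before the disk is full would give time to resize or clean up before the database stops accepting writes.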

...