Date

Authors

Phil James (Unlicensed)

Status

Resolved

Summary

The database ran out of disk space, which caused a system failure.

Impact

Users were unable to log in to TIS for approximately 20 minutes.

https://hee-tis.atlassian.net/browse/TISNEW-5190

Root Cause(s)

  • Database ran out of space

    • Slow logs appeared to take up a disproportionate amount of space

Trigger

Resolution

Detection

Timeline

Action Items

Action Items

Owner

Fix monitoring:

  • Alertmanager should send alerts to #monitoring-prod rather than #monitoring

  • UptimeRobot did not report the outage until Keycloak was unavailable

Look at disk management
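
As a possible starting point for the disk management item, here is a minimal sketch of a disk usage threshold check. The function names, the 80% and 90% thresholds, and the example path are assumptions for illustration only; they are not taken from this incident or from the existing monitoring setup.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of used space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def classify(percent_used: float, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Map a usage percentage to a severity label (thresholds are assumed values)."""
    if percent_used >= crit_at:
        return "critical"
    if percent_used >= warn_at:
        return "warning"
    return "ok"


def check_disk(path: str, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Check the filesystem holding `path` and return its severity label."""
    return classify(disk_usage_percent(path), warn_at, crit_at)


if __name__ == "__main__":
    # "." is a placeholder; point this at the database data volume in practice.
    print(check_disk("."))
```

A check like this could be run on a schedule against the database volume so that a warning is raised before the disk is full, rather than after logins start failing.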

Lessons Learned
