Monitoring and Alerts - Are we drowning?

Actions

Session

Description

When / Links etc

Review Alerts - ETL

Go through the alerts for the ETLs

  • NDW

  • ESR

NDW Alert

Review Alerts - Infrastructure

  • AWS notifications, e.g. storage / usage / uptime alerts

  • Any other infra alerts?

 

Review Alerts - Tests / Pipelines

  • Jenkins

  • GitHub Actions

 

Sync Jobs

  • Person Sync

  • Post Sync

 

Rabbit Messages

  • Dead letter queues

  • Errors

 

Logging Standards

Format of logs, what to log, what not to log, and level of detail

CloudWatch / Docker / App Logs

Tools

Look at all the tools we currently have for notifications

For each one: should we keep it or sack it off?

Grafana

Prometheus

Sentry

UptimeRobot

Summary of Breakout Sessions

  • Review tools after the lift-and-shift move to AWS

  • Team members are not sure what many of the existing alerts mean, and the alerts could be improved

  • Logs - standards for where they are kept and what we log need to be improved

Team Blue

 
