Monitoring and Alerts - Are we drowning?

Actions

Session	Description	When / Links etc
Review Alerts - ETL	Go through the alerts for the ETLs (NDW, ESR)	NDW Alert
Review Alerts - Infrastructure	AWS notifications, e.g. storage / usage / uptime alerts. Any other infra alerts?
Review Alerts - Tests / Pipelines	Jenkins, GitHub Actions
Sync Jobs	Person Sync, Post Sync
Rabbit Messages	Dead letter queues, errors
Logging Standards	Format of logs, what to log, what not to log, level of detail	CloudWatch / Docker / App Logs
Tools	Look at all the tools we currently have for notifications. Should we keep each one or sack it off?	Grafana, Prometheus, Sentry, UptimeRobot

Summary of Breakout Sessions

Review the tools after the lift-and-shift move to AWS
Many of the existing alerts are unclear: team members are not sure what they mean, and they could be improved
Logs: the standards for where logs are kept and what we log need to be improved
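
As a starting point for the logging standards discussion, here is a minimal sketch of what a structured log format could look like. It is not something the team has agreed. The field names, the `context` attribute and the `SENSITIVE_KEYS` list are all assumptions made for illustration. It only uses Python's standard `logging` and `json` modules.

```python
import json
import logging

# Hypothetical list of keys we would not want in logs; the real list
# would come out of the Logging Standards session.
SENSITIVE_KEYS = {"password", "token", "email"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional context passed via `extra={"context": {...}}`.
        # Values for sensitive keys are replaced before output.
        context = getattr(record, "context", {})
        payload["context"] = {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else value
            for key, value in context.items()
        }
        return json.dumps(payload, default=str)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A sync job could then call `build_logger("person-sync").info("Sync finished", extra={"context": {"records": 120}})`. One JSON object per line is generally easier to search in CloudWatch or Docker logs than free text, but whether it suits our services is for the session to decide.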

Team Blue

TIS21 Confluence Space