Date

Authors

Andy Dingley

Status

Done

Summary

The TCS service was repeatedly trying and failing to process an event, causing a large volume of additional log entries.

Impact

A significant increase in monthly logging costs: usually ~$300/month, but already at ~$1300 partway through the current month.

...

  • : 14:35 - TCS starts logging excessively due to retry loop

  • : 10:30 - Additional CloudWatch costs identified via AWS cost/billing tools

  • : 11:29 - The failing message was moved to a DLQ to stop the logging

  • : 11:01 - Permanent fix deployed to preprod

  • : 15:03 - Permanent fix deployed to prod

  • : ??:?? - Permanent fix deploy to NIMDTA - DEPLOY NOT YET APPROVED
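
The pattern in the timeline above (a message retried indefinitely, then moved to a DLQ to stop the logging) can be illustrated with a minimal sketch. This is not the TCS code or the deployed fix; the names `drain`, `MAX_ATTEMPTS` and the message dictionary layout are hypothetical, and an in-memory queue stands in for the real message broker. It only shows how capping attempts bounds the number of failure log entries a single bad message can produce.

```python
import logging
from collections import deque

logger = logging.getLogger("tcs_sketch")  # hypothetical logger name

MAX_ATTEMPTS = 3  # hypothetical cap; the real service's setting is not stated


def drain(queue, dead_letters, handler, max_attempts=MAX_ATTEMPTS):
    """Process every queued message, parking repeat failures in dead_letters.

    Returns the number of failed attempts, a rough proxy for the number
    of error log entries written.
    """
    failed_attempts = 0
    while queue:
        message = queue.popleft()
        try:
            handler(message["body"])
        except Exception as exc:
            failed_attempts += 1
            message["attempts"] += 1
            logger.warning(
                "attempt %d failed for message %s: %s",
                message["attempts"], message["id"], exc,
            )
            if message["attempts"] >= max_attempts:
                # Stop retrying: park the message for manual inspection.
                dead_letters.append(message)
            else:
                # Put the message back on the queue for another attempt.
                queue.append(message)
    return failed_attempts
```

With one good and one permanently failing message, the failing message is attempted `max_attempts` times and then parked, rather than being retried (and logged) without limit.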

Root Cause(s)

Why was there such excessive logging?

...