Date |
Authors | Cai Willis, Doris.Wong, Jayanta Saha, Joseph (Pepe) Kelly, Adewale Adekoya
Status | Patched, Root Cause Found, Solution to Root Cause in Progress, Resolved
Summary |
Impact | A large volume of logs was generated, requiring ~5 minutes of downtime for recommendations. The application had already been slowing down beforehand due to message processing.
...
9:03 - Cai Willis reported the errors
9:35 - Investigation started
9:40 - Issue reported to users and Recommendation paused for 5 minutes
9:50 - Temporary fix made
9:50 - Comms sent to users that recommendation was back
10:40 - Preventative measure deployed to recommendation service (prevent requeuing)
12:05 - Likely root cause discovered
13:15 - Root cause solution deployed to production environment
...
Root Cause(s)
Poor handling of null values in deferral reasons
Default behaviour of requeuing messages when an exception is thrown
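
The way these two causes interact can be illustrated with a minimal, self-contained sketch. It is not the recommendation service's actual code: the report does not show the consumer or name the message broker, so the in-memory queue, the `deferral_reason` field, and every function name below are assumptions made for illustration. The sketch shows how a handler that fails on a null value, combined with requeue-on-exception, redelivers the same message indefinitely, and how either null-safe handling or rejecting without requeue breaks the loop.

```python
from collections import deque
from typing import Optional


def describe_deferral(reason: Optional[str]) -> str:
    """Null-safe handling: a missing reason becomes a placeholder instead of raising."""
    # "unspecified" is a hypothetical placeholder value, not taken from the report.
    return reason.strip().lower() if reason is not None else "unspecified"


def unsafe_handler(body: dict) -> str:
    # Raises AttributeError when "deferral_reason" is None.
    return body["deferral_reason"].strip().lower()


def safe_handler(body: dict) -> str:
    return describe_deferral(body.get("deferral_reason"))


def consume(messages, handler, requeue_on_error: bool, max_deliveries: int = 1000):
    """Drain an in-memory queue and return (processed, dead_lettered, deliveries).

    max_deliveries only exists so the sketch terminates; a real broker
    would keep redelivering a requeued message without such a cap.
    """
    queue = deque(messages)
    processed, dead_lettered, deliveries = [], [], 0
    while queue and deliveries < max_deliveries:
        body = queue.popleft()
        deliveries += 1
        try:
            processed.append(handler(body))
        except Exception:
            if requeue_on_error:
                # The failing message goes back on the queue and fails again.
                queue.append(body)
            else:
                # Parked for inspection instead of being retried.
                dead_lettered.append(body)
    return processed, dead_lettered, deliveries


messages = [{"deferral_reason": " Busy "}, {"deferral_reason": None}]

# Both root causes present: the null message is redelivered until the cap is hit.
looped = consume(messages, unsafe_handler, requeue_on_error=True, max_deliveries=50)

# Preventative measure (no requeue): the bad message is delivered once and parked.
parked = consume(messages, unsafe_handler, requeue_on_error=False)

# Root cause fix (null-safe handler): every message is processed on first delivery.
fixed = consume(messages, safe_handler, requeue_on_error=True)
```

In this sketch the two mitigations are independent: disabling requeue bounds the damage from any failing message, while the null-safe handler removes this particular failure. That mirrors the order in the timeline, where the requeue change was deployed first and the root cause fix followed.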
...