...
9:03 - Cai Willis reported the errors
9:35 - Investigation started
9:40 - issue reported to users and Recommendation paused for 5 minutes
9:50 - Temporary fix made
9: 50- Comms sent to users that recommendation was back
10: 40- Preventative measure deployed to recommendation service (prevent requeuing)
12: 05- Likely root cause discovered
13: 15- Root cause solution deployed to production environment
...
Root Cause(s)
???
Poor handling of null values in deferral reasons
Default behaviour of requeuing messages when exception thrown
...