...
All times are in BST unless otherwise indicated.
: 22:50 - RabbitMQ instability begins.
: 15:15 - RabbitMQ instability ends.
: 10:30 - Missing CoJ in TIS reported.
: 16:24 - Audit of CoJ messages completed and missing CoJ resent to TIS.
...
RabbitMQ became unstable because a resource limit was reached (memory, due to excessive messages being held and not processed).
The TIS Self-Service code that submits messages to RabbitMQ did not check that they were processed successfully.
CoJs were successfully saved within TIS Self-Service, but not received by TIS due to messaging failure.
The lack of alerting on the failures, or on the resulting data discrepancy, meant that we relied on user reports to become aware of the issue.
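
The two gaps described above (no check on publish success, and no detection of the resulting discrepancy) could be closed along the following lines. This is a minimal sketch and not the TIS Self-Service implementation: `CojSubmitter`, `publish`, `alert` and `find_missing` are hypothetical names, and `publish` is assumed to return `True` only once the broker has confirmed receipt (RabbitMQ offers publisher confirms for this purpose).

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class CojSubmitter:
    """Hypothetical wrapper that records and alerts on unconfirmed publishes."""

    # Assumption: publish(coj_id) returns True only if the broker confirmed it.
    publish: Callable[[str], bool]
    # Assumption: alert(text) stands in for whatever alerting channel is used.
    alert: Callable[[str], None]
    unconfirmed: list[str] = field(default_factory=list)

    def submit(self, coj_id: str) -> bool:
        try:
            confirmed = self.publish(coj_id)
        except Exception:
            # Treat any publish error as "not confirmed" instead of ignoring it.
            confirmed = False
        if not confirmed:
            # Keep the ID so it can be resent, and raise an alert immediately.
            self.unconfirmed.append(coj_id)
            self.alert(f"CoJ {coj_id} was saved but not confirmed by the broker")
        return confirmed


def find_missing(saved_ids: Iterable[str], received_ids: Iterable[str]) -> list[str]:
    """Return IDs saved by the sender that the receiver never got, sorted."""
    return sorted(set(saved_ids) - set(received_ids))
```

Running `find_missing` on a schedule against the IDs held by each system would surface a discrepancy without depending on user reports, and the `unconfirmed` list gives a ready set of messages to resend once the broker recovers.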
...