...
Reinitialising the data directory for the failed node and allowing a full synchronisation to take place.
Restarting the machine the database was on following the synchronisation.
...
Timeline
16:30 BST - Backup run started
16:44 BST - Disk full
17:46 BST - After investigating several options, began a full re-sync of the node
19:19 BST - Node transitioned to secondary, but the VM was still non-responsive and the EC2 status check started failing
21:44 - 22:10 BST - Restarted the machine and confirmed the replica set reported as healthy
00:04 BST - All integration services restarted and a problem message cleared from the message broker
Root Cause(s)
Backup written to the same device used for data & transaction logs
...
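The disk filled 14 minutes after the backup run started, so a pre-flight free-space check is one possible guard. The following is a minimal sketch, not the process that was in place: the function name `has_room_for_backup`, the `estimated_bytes` input and the 20% headroom are all assumptions made for illustration.

```python
import shutil


def has_room_for_backup(target_dir: str, estimated_bytes: int, headroom: float = 0.2) -> bool:
    """Return True if target_dir has enough free space for the backup.

    estimated_bytes is the caller's own estimate of the backup size
    (hypothetical input; how it is obtained is outside this sketch).
    headroom is an assumed safety margin on top of that estimate.
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= estimated_bytes * (1 + headroom)
```

A backup wrapper could call this before starting and abort with a clear error rather than letting the volume fill.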
Action Items
Action Items | Owner | |
---|---|---|
Increase storage available to Mongo | ||
Create a repeatable process for copying data to stage environment | ||
...
Lessons Learned
Never write backup output to the same device that holds the database's data and transaction logs.
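This lesson can be enforced mechanically before a backup starts. The sketch below assumes a POSIX host and compares the device IDs reported by `os.stat`; the function names and the idea of calling this from a backup wrapper are illustrative assumptions, not part of the existing tooling. Note that differing device IDs do not prove separate physical disks (for example, two partitions or logical volumes on one disk), so this only catches the simplest case.

```python
import os


def same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths are on the same filesystem device (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def check_backup_target(data_dir: str, backup_dir: str) -> None:
    """Refuse to proceed if the backup target shares a device with the data directory.

    data_dir and backup_dir are hypothetical parameters supplied by the caller.
    """
    if same_device(data_dir, backup_dir):
        raise RuntimeError(
            f"Backup target {backup_dir!r} is on the same device as data directory {data_dir!r}"
        )
```

Calling `check_backup_target` with the database data directory and the intended output directory at the top of a backup script would stop the run before any data is written.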