Date

Authors

Cai Willis Yafang Deng Jayanta Saha Joseph (Pepe) Kelly

Status

Impacted

Summary

TIS21-3846

Impact

The recommendations search page was not updated for a number of hours during the day.

Non-technical Description

.


Trigger


Detection

  • Messages from 2 users


Resolution

  • Ran a “force redeployment” update of the recommendation service

  • Gave users links to the details pages of doctors who could not be found in the search list
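
The “force redeployment” step can also be scripted. The following is a minimal sketch, assuming the boto3 ECS client is used as the Python counterpart of aws ecs update-service --force-new-deployment. The function name and the cluster and service names are hypothetical; the real values are not recorded on this page.

```python
from typing import Any


def force_new_deployment(ecs_client: Any, cluster: str, service: str) -> str:
    """Ask ECS to replace the running tasks of a service.

    The task definition is left unchanged; only a fresh deployment is started.
    `ecs_client` is expected to behave like boto3.client("ecs").
    """
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,  # same effect as --force-new-deployment
    )
    # The response describes the updated service; return its name as a receipt.
    return response["service"]["serviceName"]
```

With real credentials this might be called as force_new_deployment(boto3.client("ecs"), "example-cluster", "example-recommendation"), where both names are placeholders. Passing the client in keeps the function easy to exercise against a stub.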


Timeline

BST unless otherwise stated

  • 02:26 to 08:07 - Queue of ‘doctor view’ update messages for the recommendation service built steadily to ~83K

  • 08:21 - First report in user channel

  • 12:07 - Picked up for investigation

  • 12:07 to 14:00ish - Checked database & ElasticSearch index

  • 13:00 - Checked the list returned by GMC for the North West

  • 14:00ish - Found that messages in reval.queue.masterdoctorview.updated.recommendation were not being consumed

  • 16:00ish - Forced a new deployment of the recommendation service



Root Cause(s)

  • Doctors reported as not showing in the search list

  • ElasticSearch index for the recommendation service was not being updated

  • Large backlog of messages stuck on the queue that updates the index

  • Message consumers had disappeared from the queue; after the final aws ecs update-service --force-new-deployment, the consumer count dropped to one before going back up to 3

  • ?


Action Items

Action Items | Comments | Owner

Add monitoring for queue depth, consumption or some other combined metric that shows whether messages are being processed ‘acceptably’. | |
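
One way the monitoring action could judge whether messages are being processed ‘acceptably’ is sketched below. This is an illustration under stated assumptions, not an agreed design: the sample fields, the function name and both thresholds are hypothetical, and how the samples are collected from the broker is left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QueueSample:
    """One observation of a queue; field names are illustrative."""
    minute: int      # minutes since the start of the observation window
    depth: int       # messages waiting on the queue
    consumers: int   # consumers attached to the queue


def is_processing_acceptably(samples: Sequence[QueueSample],
                             max_depth: int = 10_000,
                             max_growth_per_min: float = 50.0) -> bool:
    """Return False when the samples suggest messages are not being consumed.

    Both thresholds are placeholder values that would need tuning.
    """
    if not samples:
        return True  # nothing observed, nothing to alert on
    first, latest = samples[0], samples[-1]
    # Messages waiting with no consumer attached is always a problem.
    if latest.depth > 0 and latest.consumers == 0:
        return False
    # Backlog above the assumed ceiling.
    if latest.depth > max_depth:
        return False
    # Sustained growth across the window, in messages per minute.
    elapsed = latest.minute - first.minute
    if elapsed > 0 and (latest.depth - first.depth) / elapsed > max_growth_per_min:
        return False
    return True


# Shape of this incident: ~83K messages built up between 02:26 and 08:07
# (341 minutes). The starting depth and consumer counts are assumed values.
incident = [QueueSample(0, 0, 3), QueueSample(341, 83_000, 3)]
print(is_processing_acceptably(incident))
```

Combining consumer count, absolute depth and growth rate means a queue that is large but draining, or briefly without consumers while empty, does not raise an alert on its own.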


Lessons Learned
