Done

Date

Authors

Cai Willis, Yafang Deng, Jayanta Saha, Joseph (Pepe) Kelly

Status

Impacted

Summary

The GMC sync job did not update the indexed trainee data in Elasticsearch (ES), so users may have seen stale data.

Jira: TIS21-3846

Impact

The recommendations search page was not updated for a number of hours during the day.

Non-technical Description


...

Trigger

...

Detection

  • Messages from 2 users (assumed to share a similar or the same root cause?)

...

Resolution

  • Got the index up to date

  • Triggered a “force redeployment” update of the recommendation service

  • Gave users links to details pages for doctors which could not be found

...

Timeline

BST unless otherwise stated

  • - 01:05 Slack alert for the failed job

  • Checked database

  • ElasticSearch

  • Aug 30, 2022 - 09:07 Sync was started manually and it still failed

  • Aug 31, 2022 - 01:05 Slack alert for the failed job

  • Aug 31, 2022 - 10:21 Found a Sentry alert for the Profile service

  • Sep 01, 2022 - 01:05 Slack alert for the failed job

  • Sep 01, 2022 - 11:39 A PR was merged to remove doctors with a null GMC number from the list sent to the Profile service, but it did not fix the failure

  • Sep 01, 2022 - 12:01 Found that the TIS-GMC-SYNC image running on Prod had been built 3 years earlier

  • Sep 01, 2022 - 12:15 A PR was merged to bump the build version of the TIS-GMC-SYNC service in TIS-DEVOPS, but it did not fix the failure

  • Sep 01, 2022 - 13:32 The image version issue was fixed on Prod, but Slack still alerted on the failed job

  • Sep 01, 2022 - 14:29 PR was merged to add debug logging for duplicate GMC numbers

  • Sep 01, 2022 - 15:01 The TIS-GMC-SYNC job failed again, but this time the duplicate GMC number was logged

  • Sep 01, 2022 - 15:30ish The duplicate GMC number was identified from curl output

  • Sep 01, 2022 - 15:37 The PR that added debug logging was reverted

  • Sep 05, 2022 - 11:30ish Got a reply from GMC and checked the logs from around midnight. We were still receiving duplicates. Rob sent a further email to GMC.

  • Sep 13, 2022 - 10:15ish We checked the logs and found no duplicates. https://hee-nhs-tis.slack.com/archives/C03GBMYGZD4/p1663060704518199?thread_ts=1663060383.191659&cid=C03GBMYGZD4

  • Sep 20, 2022 - Sentry monitoring was added for duplicates from GMC.

Root Cause(s)

  • GMC sent us a duplicate doctor record with the same GMC number for HEE South West (1-AIIDMQ). The GMC number is 7134553. The doctor was being sent to us twice, which had never happened before and was not expected.

  • Based on the assumption that GMC would not send us duplicates, the Profile service uses the GMC number as the key of a Map without filtering out duplicate GMC numbers, which caused the error.
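To illustrate how a duplicate GMC number can break a Map keyed on GMC number, here is a minimal Java sketch. It is not the Profile service code: the `Doctor` record, the class and method names, and the use of `Collectors.toMap` are all assumptions made for illustration, and `1111111` is a made-up GMC number. The sketch shows one way such a failure can arise, one tolerant alternative (keep the first record seen), and a helper that finds the duplicated numbers so they can be logged or alerted on.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DuplicateGmcSketch {

    // Hypothetical stand-in for a doctor entry received from GMC.
    record Doctor(String gmcNumber, String designatedBody) {}

    // Assumed failure mode: Collectors.toMap without a merge function
    // throws IllegalStateException when two elements share a key.
    static Map<String, Doctor> indexStrict(List<Doctor> doctors) {
        return doctors.stream()
            .collect(Collectors.toMap(Doctor::gmcNumber, Function.identity()));
    }

    // Tolerant variant: keeps the first record seen for each GMC number.
    static Map<String, Doctor> indexTolerant(List<Doctor> doctors) {
        return doctors.stream()
            .collect(Collectors.toMap(
                Doctor::gmcNumber,
                Function.identity(),
                (first, second) -> first,
                LinkedHashMap::new));
    }

    // Returns the GMC numbers that occur more than once in the input.
    static Set<String> duplicateGmcNumbers(List<Doctor> doctors) {
        Set<String> seen = new HashSet<>();
        return doctors.stream()
            .map(Doctor::gmcNumber)
            .filter(number -> !seen.add(number)) // add() is false on a repeat
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<Doctor> doctors = List.of(
            new Doctor("7134553", "1-AIIDMQ"),
            new Doctor("1111111", "1-AIIDMQ"),  // made-up number
            new Doctor("7134553", "1-AIIDMQ")); // same doctor sent twice

        System.out.println("Duplicates: " + duplicateGmcNumbers(doctors));
        System.out.println("Tolerant index size: " + indexTolerant(doctors).size());
        try {
            indexStrict(doctors);
        } catch (IllegalStateException e) {
            System.out.println("Strict index failed: " + e.getMessage());
        }
    }
}
```

Keeping the first record is only one possible policy. Whether duplicates should be dropped, merged, or rejected is a decision for the service owners, and the sketch does not imply which one the actual fix chose.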

  • 02:26 to 08:07 - Queue to recommendation for ‘doctor view’ update built steadily to ~83K


  • 08:21 - First report in user channel

  • 12:07 - Picked up for investigation

  • 12:07 to - Checked database & ElasticSearch index


...

Root Cause(s)

  • Doctors reported as not showing in the search list

  • ElasticSearch Index for Recommendation Service isn’t being updated

  • Large backlog of messages stuck on a queue for updating the index

...

Action Items

Check if this issue affects new revalidation

Action Items

Comments

Owner

Send an email or message to check with GMC

Rob Pink

Do we want to improve the CI CD process for TIS-GMC-SYNC?

It may not be worth the effort to fix this at the moment

Investigate why the exceptions were not recorded in the Profile CloudWatch logs

We need a ticket to look into this. We found the issue via Sentry.

Jira: TIS21-3426

Yafang Deng

Do we need to double-check what the new Reval Sync does with the duplicates received from GMC?

Jayanta Saha, Cai Willis

Isolate duplicates in TIS-GMC-SYNC service

Done

Fix isolate duplicates from gmc by hihilary · Pull Request #30 · Health-Education-England/TIS-GMC-SYNC (github.com)

Yafang Deng, Jayanta Saha

Create Sentry alert for TIS-GMC-SYNC to capture logging when GMC sends duplicates

Done

Jira: TIS21-3513

Yafang Deng, Jayanta Saha

...

Lessons Learned

...

Add more debug logging if the current logging is not enough to identify the cause.

...