Background

The Reval Recommendation and Connection lists show data from Elasticsearch (indices: recommendationindex and masterdoctorindex).
A full sync reads the current data and inserts it into the ES indices.
After the full sync:
When there is an update on TIS, TIS syncs that single doctor's info to Reval.
Every minute, the CDC process picks up the changes in the Reval DB (DoctorsForDB and Recommendation collections) and propagates them to Elasticsearch.
Data sources are:
TIS data
Revalidation data

Current issue

Reval data sometimes appears unreliable to admin users.

Why?
[Cai’s notes:
1. Reval is “passive”, so if an update is missed or a message fails, there is no mechanism to refetch it
2. An unknown issue is producing duplicates - probably a bug in the upsert logic
]
Is this mostly a TIS issue? How do we know?

Trigger a full resync

Data flow is as below (embedded from Elastic Search Rebuild Sync Job):

How long does it take?
Taking the run on 19/03/2025 as an example: it took 1 hour 45 minutes (16:50 - 18:35) to clear the messages in reval.queue.connection.syncdata, and then about 4 hours 40 minutes (18:37 - 23:17) to clear the messages in the SQS queue tis-revalidation-sync-gmc-queue-prod.
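As a quick check on those figures, the phase durations and the end-to-end time can be worked out from the quoted clock times. This is only an arithmetic sketch; the helper and variable names are illustrative and not part of any Reval code.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M"


def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two 'dd/mm/yyyy HH:MM' timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


# Clock times quoted for the 19/03/2025 run.
rabbit_phase = elapsed("19/03/2025 16:50", "19/03/2025 18:35")
sqs_phase = elapsed("19/03/2025 18:37", "19/03/2025 23:17")
end_to_end = elapsed("19/03/2025 16:50", "19/03/2025 23:17")

print(rabbit_phase)  # 1:45:00
print(sqs_phase)     # 4:40:00
print(end_to_end)    # 6:27:00
```

So the SQS phase accounts for roughly 72% of the 6 hours 27 minutes from start to finish (280 of 387 minutes).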
Monitoring:
https://monitoring.tis.nhs.uk/grafana/d/Upe4ssXMk/rabbitmq-metrics-from-rabbitmq-exporter-granular?from=2025-03-19T16:00:00.000Z&to=2025-03-19T23:59:59.000Z&timezone=browser&var-datasource=e7UUtnuMk&var-environment=PROD&var-service=reval&refresh=5s
https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F430723991443%2Ftis-revalidation-sync-gmc-queue-prod
The discrepancies alias still needs to be added manually,
so we have to wait a few hours for the sync process to finish before adding the alias.

[Cai’s notes: it’s worth noting that the “full” sync system shares a lot of components with the CDC sync system, so improving efficiency in these areas will also improve the general running of the system]

Periodically run Reval resync as a scheduled job
On TIS, every night we run a sync from the TIS DB to the ES person index to make sure the TIS person list is up to date. If we can reduce the sync time for Reval, we will probably be able to run the Reval sync periodically too.
Challenge: how to reduce Reval sync time? - Batch processing [Cai’s notes: SQS processing takes the majority of the time. If we introduced batch processing it may cut this down by a huge amount, as seen in ESR]
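A minimal sketch of what batch processing could look like, assuming the consumer currently handles one message and one index request at a time (not confirmed here). It groups messages into batches of up to 10, the most SQS returns from a single ReceiveMessage call, and turns each batch into one Elasticsearch bulk request body. The message shape and the gmcReferenceNumber field name are assumptions for illustration; a real consumer would send each body to the _bulk endpoint.

```python
import json
from itertools import islice
from typing import Iterable, Iterator

SQS_MAX_BATCH = 10  # SQS returns at most 10 messages per ReceiveMessage call


def batched(items: Iterable[dict], size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def to_bulk_body(docs: list, index: str, id_field: str) -> str:
    """Build an NDJSON body for the Elasticsearch bulk API (one request per batch)."""
    lines = []
    for doc in docs:
        # Action line, followed by the document source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline


# Hypothetical messages: 25 doctor updates, as they might arrive from the queue.
messages = [{"gmcReferenceNumber": str(1000000 + i), "doctorLastName": f"Doctor{i}"}
            for i in range(25)]

bodies = [to_bulk_body(batch, "masterdoctorindex", "gmcReferenceNumber")
          for batch in batched(messages, SQS_MAX_BATCH)]

print(len(bodies))  # 3 bulk requests instead of 25 single-document requests
```

The saving comes from fewer round trips to both SQS and Elasticsearch; whether it matches the improvement seen in ESR would need measuring.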
Get rid of recommendationindex and use another alias on masterdoctorindex for the recommendation list (https://hee-tis.atlassian.net/browse/TIS21-5482)
Advantage: no need to maintain 2 indices of duplicate data
[Cai’s notes: Also this contributes a lot to the amount of time it takes to run the sync, and introduces additional complexity and points of failure for messages]
Disadvantage: loss of flexibility
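To illustrate the alias idea, here is a sketch of the request body for the Elasticsearch _aliases API that would add a filtered alias on masterdoctorindex. The alias name recommendation_list and the filter field recommendationId are hypothetical; the real filter depends on how recommendation data ends up being stored in masterdoctorindex.

```python
import json


def filtered_alias_action(index: str, alias: str, filter_query: dict) -> dict:
    """Build the body for POST /_aliases that adds a filtered alias to an index."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": filter_query}}
        ]
    }


body = filtered_alias_action(
    index="masterdoctorindex",
    alias="recommendation_list",  # hypothetical alias name
    # Hypothetical filter: only doctors that have a recommendation attached.
    filter_query={"exists": {"field": "recommendationId"}},
)

print(json.dumps(body, indent=2))
```

Searches against the alias would then only see the documents matching the filter, so the recommendation list could read from masterdoctorindex without a second copy of the data.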
Support single doctor resyncing from TIS? - is this already supported? [Cai’s notes: Yes, in that the query used to fetch TIS data already allows for gmcnumber(s) to be specified]

https://hee-tis.atlassian.net/browse/TIS21-6275 - this will hopefully remove the perception of “lag”
Refactoring the upsert logic in the integration service (logic for matching TIS ID vs GMC ID, null and empty checks, use of transactions)
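A rough sketch of the kind of matching and null/empty handling the refactor could aim for, using an in-memory list in place of the DoctorsForDB collection. It is not the integration service's current logic: the field names gmcNumber and tisPersonId and the "match on GMC number first, then TIS ID" rule are assumptions for illustration. In the real service the lookup and write would need to happen in a transaction or a single atomic upsert so that concurrent messages cannot create duplicates.

```python
from typing import Optional

ID_FIELDS = ("gmcNumber", "tisPersonId")  # hypothetical field names


def clean(value: Optional[str]) -> Optional[str]:
    """Treat None, empty and whitespace-only strings as 'no value'."""
    if value is None:
        return None
    value = value.strip()
    return value or None


def upsert(store: list, incoming: dict) -> dict:
    """Update the matching doctor record, or insert a new one if none matches."""
    gmc = clean(incoming.get("gmcNumber"))
    tis = clean(incoming.get("tisPersonId"))
    if gmc is None and tis is None:
        raise ValueError("record has neither a GMC number nor a TIS ID")

    # Match on GMC number first, then fall back to the TIS ID.
    match = None
    if gmc is not None:
        match = next((d for d in store if d.get("gmcNumber") == gmc), None)
    if match is None and tis is not None:
        match = next((d for d in store if d.get("tisPersonId") == tis), None)

    # Never overwrite a stored identifier with a null or empty one.
    update = {k: v for k, v in incoming.items() if k not in ID_FIELDS}
    if gmc is not None:
        update["gmcNumber"] = gmc
    if tis is not None:
        update["tisPersonId"] = tis

    if match is None:
        store.append(update)
        return update
    match.update(update)
    return match


store = []
upsert(store, {"gmcNumber": " 1234567 ", "tisPersonId": "", "name": "A"})
upsert(store, {"gmcNumber": "1234567", "tisPersonId": "42"})
upsert(store, {"gmcNumber": None, "tisPersonId": "42", "name": "B"})
print(store)  # a single record: the three messages resolve to the same doctor
```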