STILL WIP
Problems:
• Hypothesis: It would all be much simpler if we were copying all programme memberships separately
Principles:
Solutions:
Three ways of caching data; the first might not be viable because of connection-specific info:
...
The approach of “pre-sorting” the data was also fine before as the exact same code was used for CDC and the ES Resync job. However, in order to repeat the massive time saving we achieved in
(Jira Legacy ticket link)
Summary
Having multiple indexes makes GET requests simpler
Performance has been raised as a potential benefit, but when more complex queries on large data sets already take less than a second, it is questionable how much benefit this would really give.
Multiple indexes mean duplicating data
Multiple indexes require multiple updates for a single data change
Because we have separate CDC and Resync processes, and because the Java approach is prohibitively slow for the Resync process, we would have to write and maintain the same business logic in two places and in two languages
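To make the duplication point concrete, here is a minimal sketch of how a single data change fans out into one write per index, compared with one write when a single index is used and filtered at query time. The index names, document id and field name are hypothetical and not taken from the actual services, and the sketch assumes every index holds its own copy of the document.

```python
from typing import Any

# Hypothetical index names, for illustration only.
SPLIT_INDEXES = ["index_a", "index_b", "index_c"]
SINGLE_INDEX = "index_all"


def writes_for_change(
    doc_id: str, change: dict[str, Any], indexes: list[str]
) -> list[dict[str, Any]]:
    """Return one update operation per index that holds a copy of the document."""
    return [{"index": name, "id": doc_id, "doc": change} for name in indexes]


# Hypothetical field name and value.
change = {"exampleEndDate": "2025-08-01"}

split_writes = writes_for_change("doc-123", change, SPLIT_INDEXES)
single_writes = writes_for_change("doc-123", change, [SINGLE_INDEX])

print(len(split_writes), len(single_writes))  # 3 1
```

With separate indexes, every one of those writes must succeed (or be retried) for the copies to stay consistent, which is the maintenance cost the summary refers to.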
Tasks to complete TIS21-3774 with this approach
Copy the reindexing implementation used by recommendations
Implement an Elasticsearch query that matches the Java logic in the CDC process, for use as part of the reindex API call
Clearly document that any changes made to the CDC logic in Connections must be repeated in Integration (is there something fancy we can do in GitHub?)
BONUS: figure out whether we can run the reindex for connections and recommendations separately (might be impractical)
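As a rough illustration of the second task, the sketch below builds a request body for Elasticsearch's `POST _reindex` endpoint, with the filter expressed as a query on the source index. The index names, field names and the filter itself are hypothetical placeholders; the real query would have to mirror whatever the Java CDC logic does.

```python
import json
from typing import Any


def build_reindex_body(
    source_index: str, dest_index: str, query: dict[str, Any]
) -> dict[str, Any]:
    """Build a body for Elasticsearch's `POST _reindex` endpoint."""
    return {
        "source": {"index": source_index, "query": query},
        "dest": {"index": dest_index},
    }


# Hypothetical stand-in for the CDC business rule: keep documents that are
# flagged and whose end date is today or later. Field names are assumptions.
example_query = {
    "bool": {
        "filter": [
            {"term": {"exampleFlag": True}},
            {"range": {"exampleEndDate": {"gte": "now/d"}}},
        ]
    }
}

# Hypothetical index names.
body = build_reindex_body("master_index", "filtered_index", example_query)
print(json.dumps(body, indent=2))
```

Keeping the query in one named constant would at least give a single place to point to when documenting that it must stay in step with the Java logic.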
...