Trainee Profile database indexing baseline tests
Experiment plan and results | Trainee Self Service performance: with and without database indexing |
---|---|
Experiment owner | @Reuben Roberts |
Reviewers | @Andy Nash (Unlicensed) |
Approver | |
Optimizely link |
|
Jira ticket(s) | |
Status | In review / In progress / Complete |
On this page |
Stakeholder summary
Adding an index to the Trainee Details database improved overall speed to retrieve a trainee profile by a factor of two.
Experiment planning
Overview
This was a baselining experiment to determine the performance impact of database indexing for the Trainee Details component of TIS Self-Service.
Improved performance would mean that trainee users would be able to see their data more quickly, as this would reduce the load time of the Trainee Details interface.
Improved performance would increase the rate at which changes from other parts of the TIS system can be synchronised with the Trainee Self-Service component, reducing the possibility of confusion resulting from contradictory trainee details appearing in different parts of the TIS system.
Hypothesis
We hypothesize that adding a database index
will decrease the time taken to retrieve a Trainee Profile from the database
Metrics
Time to retrieve a complete Trainee Profile using the Trainee Details APIprimary metric
Database-level processing time for the above secondary metric
Targeting
This experiment would impact
Trainee sync (tis-trainee-sync)
Trainee User Interface (tis-trainee-ui)
Environment
Profiling was conducted on a Dell running Windows 10 and equipped with an Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz and 32.0GB RAM. No specific standardisation was applied to system resources available to the software, since the profiling was intended to be comparative (i.e. A/B).
The MongoDB database server was running locally, with 79 602 TraineeProfile records, equivalent in number to the Trainee Self-Service staging database at the point the tests were run.
Test design
Testing was conducted with the database state modified:
without the traineeTisId field indexed (test A) and
with a regular ascending index applied to the traineeTisId field (test B).
200 trainee TIS Ids were selected at random from the complete set, and the Trainee Details API was called iteratively to retrieve trainee details for each trainee TIS Id. (Result 1, below)
CPU profiling using Visual VM was used to identify wait time, to identify when the application was waiting for a response from the database. (Result 2, below)
MongoDB Query Performance was assessed using the 'Explain Plan' output in MongoDB Compass for a query to retrieve a single TraineeProfile document using traineeTisId as the filter. (Result 3, below)
Notes
It is necessary to allow a newly created database index to stabilise for a minute or two before running tests, otherwise performance metrics will be misleading.
Results
Test A: (Without index)
Retrieving 200 random trainee profiles via the API: 87.4s (+/- 4.8s) (n=5)
getTraineeProfileByTraineeTisId waiting time: 985ms
Query fetch execution time: 5ms (documents examined: 79602)
Test B: (With index)
Retrieving 200 random trainee profiles via the API: 38.6s (+/- 6.3s) (n=5)
getTraineeProfileByTraineeTisId waiting time: 75ms
Query fetch execution time: 0ms (documents examined: 1)
| A: Control | B: Variation | Change |
---|---|---|---|
Retrieving profiles via API | 87.4s | 38.6s | -56% |
getTraineeProfileByTraineeTisId waiting time | 985ms | 75ms | -92% |
Query fetch execution time | 5ms | 0ms | -100% |
Conclusions
Highlights
Overall speed to retrieve a trainee profile using the API using the indexed database was over twice that of the unindexed database.
Database-level metrics of performance were even better than this: the difference reflects the fact that the API performs other work apart from database access. This work is unaffected by the indexing improvements.
Takeaways
Given the low amount of effort required, adding an index to the database is a cost-effective way of improving performance.
Follow-up
Roll-out database index to staging and production environments
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213