/
Trainee Profile database indexing baseline tests

Trainee Profile database indexing baseline tests

 

Experiment plan and results

Trainee Self Service performance: with and without database indexing

Experiment owner

@Reuben Roberts

Reviewers

@Andy Nash (Unlicensed)

Approver

Optimizely link

 

Jira ticket(s)

https://hee-tis.atlassian.net/browse/TIS21-1274

Status

In review / In progress / Complete

On this page

 

Stakeholder summary

Adding an index to the Trainee Details database improved overall speed to retrieve a trainee profile by a factor of two.

 

 

Experiment planning

Overview

This was a baselining experiment to determine the performance impact of database indexing for the Trainee Details component of TIS Self-Service.

  • Improved performance would mean that trainee users would be able to see their data more quickly, as this would reduce the load time of the Trainee Details interface.

  • Improved performance would increase the rate at which changes from other parts of the TIS system can be synchronised with the Trainee Self-Service component, reducing the possibility of confusion resulting from contradictory trainee details appearing in different parts of the TIS system.

Hypothesis

We hypothesize that adding a database index

will decrease the time taken to retrieve a Trainee Profile from the database

Metrics

  • Time to retrieve a complete Trainee Profile using the Trainee Details APIprimary metric

  • Database-level processing time for the above secondary metric

Targeting

This experiment would impact

  • Trainee sync (tis-trainee-sync)

  • Trainee User Interface (tis-trainee-ui)

Environment

Profiling was conducted on a Dell running Windows 10 and equipped with an Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz and 32.0GB RAM. No specific standardisation was applied to system resources available to the software, since the profiling was intended to be comparative (i.e. A/B).

The MongoDB database server was running locally, with 79 602 TraineeProfile records, equivalent in number to the Trainee Self-Service staging database at the point the tests were run.

Test design

Testing was conducted with the database state modified:

  • without the traineeTisId field indexed (test A) and

  • with a regular ascending index applied to the traineeTisId field (test B).

200 trainee TIS Ids were selected at random from the complete set, and the Trainee Details API was called iteratively to retrieve trainee details for each trainee TIS Id. (Result 1, below)

CPU profiling using Visual VM was used to identify wait time, to identify when the application was waiting for a response from the database. (Result 2, below)

MongoDB Query Performance was assessed using the 'Explain Plan' output in MongoDB Compass for a query to retrieve a single TraineeProfile document using traineeTisId as the filter. (Result 3, below)

Notes

It is necessary to allow a newly created database index to stabilise for a minute or two before running tests, otherwise performance metrics will be misleading.

 

Results

Test A: (Without index)

  1. Retrieving 200 random trainee profiles via the API: 87.4s (+/- 4.8s) (n=5)

  2. getTraineeProfileByTraineeTisId waiting time: 985ms

  3. Query fetch execution time: 5ms (documents examined: 79602)

Test B: (With index)

  1. Retrieving 200 random trainee profiles via the API: 38.6s (+/- 6.3s) (n=5)

  2. getTraineeProfileByTraineeTisId waiting time: 75ms

  3. Query fetch execution time: 0ms (documents examined: 1)

 

A: Control

B: Variation

Change

 

A: Control

B: Variation

Change

Retrieving profiles via API

87.4s

38.6s

-56%

getTraineeProfileByTraineeTisId waiting time

985ms

75ms

-92%

Query fetch execution time

5ms

0ms

-100%

 

Conclusions

Highlights

  • Overall speed to retrieve a trainee profile using the API using the indexed database was over twice that of the unindexed database.

  • Database-level metrics of performance were even better than this: the difference reflects the fact that the API performs other work apart from database access. This work is unaffected by the indexing improvements.

Takeaways

  • Given the low amount of effort required, adding an index to the database is a cost-effective way of improving performance.

Follow-up

  • Roll-out database index to staging and production environments