Prevent repeated submission

This issue was raised in TIS21-2837: Prevent creation of multiple matching placements when user repeatedly clicks save.

BE investigation

Database unique index:

A unique index can stop duplicate records from being inserted into the DB, but it’s not good to rely on the DB to do the validation.

pros:

  • Simple and easy to implement.

cons:

  • If duplicate records are allowed in the table, this method is not applicable.

  • By the time the data reaches the database, the backend has already validated it. If the data in the request needs to be saved in multiple tables, the whole transaction has to be rolled back, so this approach might waste backend resources.

Validation in service:

Locking method:

When the backend receives a request from the frontend, it generates a key (e.g. URL + IP + user ID or name) and caches it with an expiry time (e.g. 3s). While the key exists, the endpoint is locked for this particular user, and the backend rejects any repeated requests.

The key:

It’s very important to decide on the rule for generating the key, because that rule defines what is identified as a repeated submission.

We probably can’t use the hash of the JSON object in the request body as part of the key, because some of the data can be auto-generated by the frontend, e.g. a timestamp.
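The locking method can be sketched as below. This is only a minimal illustration, not the code from the investigation branch: the class name RepeatSubmitLock, the buildKey helper and the in-memory map are all hypothetical, and the map stands in for a shared cache such as Redis. The current time is passed in as a parameter so the expiry behaviour is easy to check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the shared cache (Redis in a real deployment). */
class RepeatSubmitLock {
  private final Map<String, Long> expiryByKey = new ConcurrentHashMap<>();
  private final long ttlMillis;

  RepeatSubmitLock(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Builds the key from URL + IP + user ID; which fields to include is the design decision. */
  static String buildKey(String url, String ip, String userId) {
    return url + "|" + ip + "|" + userId;
  }

  /**
   * Returns true if the request may proceed (and locks the key for ttlMillis),
   * or false if an unexpired key already exists, i.e. a repeated submission.
   */
  boolean tryAcquire(String key, long nowMillis) {
    boolean[] acquired = {false};
    // compute() runs atomically per key, so two concurrent requests cannot both acquire.
    expiryByKey.compute(key, (k, expiry) -> {
      if (expiry == null || expiry <= nowMillis) {
        acquired[0] = true;
        return nowMillis + ttlMillis;
      }
      return expiry; // still locked: keep the existing expiry
    });
    return acquired[0];
  }
}
```

With a 3s expiry, a first call to tryAcquire for a given key succeeds, a second call one second later is rejected, and a call after the expiry has passed succeeds again. A different user ID produces a different key and is not affected.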

pros:

  • No frontend changes are needed.

cons:

  • For server-side requests, the expiry time is too long, so when the user is recognised as a machine user, the expiry time needs to be set separately or the check needs to be skipped.

Token method:

The procedure is:

  1. When the form is loading, the frontend sends a POST request to /api/token to get a token.

  2. The backend service generates the token, caches it with an expiry time, then returns it to the frontend.

  3. When the form is submitted, the frontend sends the request with the token in the header.

  4. The backend service checks the token first. If the token exists, the backend deletes it from the cache and saves the form. If it doesn’t exist, the backend rejects the submission.
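The issue-then-consume flow in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the TokenService from the branch: the TokenStore class and its method names are hypothetical, and an in-memory map stands in for Redis. The important detail is that checking and deleting the token happen in one atomic operation, so two near-simultaneous submissions cannot both succeed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical one-time token store; the map stands in for a shared cache such as Redis. */
class TokenStore {
  private final Map<String, Long> expiryByToken = new ConcurrentHashMap<>();
  private final long ttlMillis;

  TokenStore(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Steps 1-2: generate a random token and cache it with an expiry time. */
  String issue(long nowMillis) {
    String token = UUID.randomUUID().toString();
    expiryByToken.put(token, nowMillis + ttlMillis);
    return token;
  }

  /**
   * Step 4: returns true only the first time a known, unexpired token is presented.
   * remove() is atomic, so concurrent submissions with the same token cannot both win.
   */
  boolean consume(String token, long nowMillis) {
    if (token == null) {
      return false; // header missing
    }
    Long expiry = expiryByToken.remove(token);
    return expiry != null && expiry > nowMillis;
  }
}
```

A token returned by issue is accepted once by consume; a second attempt with the same token, an unknown token, a missing token, or an expired token is rejected.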

Pros:

  • The token can be a random UUID (or with some prefix).

  • This method may also prevent duplicate requests caused by network retransmissions (not sure if this scenario exists).

Cons:

  • An additional request is required to get the token, for both frontend and server-side requests (the check can also be skipped for server-side requests).

About cache - why Redis:

On the one hand, the key/token is only used for a short period of time, so there’s no need to persist it, which makes MySQL a poor option.

On the other hand, we use a load balancer / blue-green strategy to distribute user requests, so other cache options such as HttpSession or Spring cache do not seem applicable.

So Redis is probably a good choice. Redis is an in-memory database which can be shared by all other services.

ElastiCache - Redis

This is an option for setting up Redis on AWS.

An implementation of the Token method

https://github.com/Health-Education-England/TIS-TCS/tree/feat/investigate_noRepeatSubmit

How to test?

  • Install Redis on your local machine:

Redis is not officially supported on Windows. To install Redis on Windows, you first need to enable WSL2 and install an Ubuntu subsystem, then install Redis on Ubuntu. Refer to: Install Redis on Windows | Redis

  • Run the tcs service and use Postman to send a POST request (http://localhost/tcs/api/token) to get the token. Then send another POST request (http://localhost/tcs/api/placements) with this token in the header; the header name is PPS. If the token is not set, you will get an error, and if the token has already been used, you will get an error too.

  • If you want to prevent repeated submission on another endpoint, just add a @NoRepeatSubmit annotation to it.

Code structure:

  • Configure Redis in the project:

In application.yml, we need to configure the host (on local, this is localhost) and the port (6379 by default). The other configs are in the RedisConfig class.

  • Why not use ElastiCache directly for this investigation?

Because by default ElastiCache clusters cannot be accessed from outside AWS. How to enable this: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/accessing-elasticache.html#access-from-outside-aws.

  • Add a RedisRepository to do operations on Redis

  • Add a TokenService to create/check the token. The Token class is just a placeholder at the moment.

The expiry of the token is taken care of by Redis itself: when the token expires, it is removed from Redis. But if we don’t store the token as a root (top-level) key, we will need to handle the expiry of the token ourselves. Some explanation can be found here: EXPIRE | Redis.

  • Add the annotation @NoRepeatSubmit config and the NoRepeatSubmitInterceptor to deal with the requests where the validation is required.
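As an illustration of how a marker annotation can select the endpoints to protect, here is a minimal plain-Java sketch. It is not the code from the branch: the annotation below is a local stand-in that only shares the name, PlacementHandlers and NoRepeatSubmitCheck are hypothetical, and the real interceptor would be wired into Spring and would read the PPS header rather than being called directly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Stand-in marker annotation; RUNTIME retention is required so it can be read via reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface NoRepeatSubmit {}

/** Hypothetical handler class with one protected and one unprotected endpoint. */
class PlacementHandlers {
  @NoRepeatSubmit
  public String createPlacement() {
    return "created";
  }

  public String getPlacement() {
    return "fetched";
  }
}

/** The decision an interceptor would make before validating the token. */
class NoRepeatSubmitCheck {
  static boolean requiresToken(Class<?> handlerClass, String methodName) {
    for (Method m : handlerClass.getDeclaredMethods()) {
      if (m.getName().equals(methodName) && m.isAnnotationPresent(NoRepeatSubmit.class)) {
        return true;
      }
    }
    return false;
  }
}
```

Only the annotated handler is subject to the token check, so enabling the protection on another endpoint comes down to adding the annotation, with no change to the checking logic.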

Another implementation:

API-gateway + lambda, an example is here: TIS-Lambda/lambda_function.py at main · Health-Education-England/TIS-Lambda (github.com)