About
This explains the process for identifying and deleting placement or programme memberships which still exist in TIS Self-Service after having been deleted from TIS. The process is unoptimised, requires several manual steps and includes workarounds that may have better alternatives, but it does the job in a pinch.
Steps
1. Get a list of IDs from TIS
A simple SQL query can be used, with the results exported to a CSV file. This must be run against the database directly, as Metabase limits exports to 1,048,576 rows.
Placement
SELECT traineeId, id AS tisId FROM tcs.Placement ORDER BY traineeId ASC, id ASC;
Programme Membership
SELECT personId AS traineeId, uuid AS tisId FROM tcs.ProgrammeMembership ORDER BY traineeId ASC, uuid ASC;
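If you have shell access to the database, the export can be scripted. This is a sketch: the host and user below are placeholders to adjust for the real TIS replica, and mysql's --batch output is tab-separated so it is converted to commas for CSV.

```shell
# Placeholder host/user - point these at a TIS read replica.
# --batch prints tab-separated rows; tr converts them to CSV.
mysql --host=tis-db-replica --user=readonly -p --batch -e \
  "SELECT traineeId, id AS tisId FROM tcs.Placement ORDER BY traineeId ASC, id ASC" \
  | tr '\t' ',' > tcs_placement_ids.csv
```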
2. Get a list of IDs from TSS
This is a little trickier to get; a Mongo shell script could be used to simplify it, but this approach uses Metabase instead. Use the following query and export the result to a CSV named metabase_tss_ids.csv.
Placement
[{"$project":{"_id": 0, "traineeId":"$traineeTisId","tisId":"$placements.tisId"}}]
Programme Membership
[{"$project":{"_id": 0, "traineeId":"$traineeTisId","tisId":"$programmeMemberships.tisId"}}]
You’ll get a CSV with the IDs in an array.
traineeId,tisId
315594,"[""ae5350af-2a3c-4439-b118-f57ea1d8bc94""]"
140114,"[""e92a313c-d83d-11ec-9eb2-0638a616fc76"" ""e92a3259-d83d-11ec-9eb2-0638a616fc76""]"
The following script will expand the array into individual rows. Change the input filename as needed and send the output to a new CSV file: ./my_script_file.sh > expanded_tss_ids.csv
#!/bin/bash
echo "traineeId,tisId"
while IFS="," read -r traineeId tisIds
do
  # Strip the brackets and quotes, then turn the space separators into commas.
  tisIds=${tisIds//[\[\]]/''}
  tisIds=${tisIds//[\"]/''}
  tisIds=${tisIds//[ ]/','}
  IFS="," read -ra array <<< "$tisIds"
  for tisId in "${array[@]}"
  do
    echo "$traineeId,$tisId"
  done
done < <(tail -n +2 metabase_tss_ids.csv)
3. Import to SQL database
Spin up a local MySQL instance (or use the TSS local DB from the handbook) and create two tables, one for the IDs from TIS and the other for the IDs from TSS.
Placement
CREATE TABLE `tcs_placement_ids` (
  `traineeId` int NOT NULL,
  `tisId` int NOT NULL
);

CREATE TABLE `tss_placement_ids` (
  `traineeId` int NOT NULL,
  `tisId` int NOT NULL
);
Programme Membership
CREATE TABLE `tcs_pm_ids` (
  `traineeId` int NOT NULL,
  `tisId` varchar(36) NOT NULL
);

CREATE TABLE `tss_pm_ids` (
  `traineeId` int NOT NULL,
  `tisId` varchar(36) NOT NULL
);
Then import the previously exported CSVs into the appropriate tables.
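One way to do the import is with LOAD DATA LOCAL INFILE. This is a sketch: it assumes local_infile is enabled on the server, and the database name is a placeholder; repeat for each CSV/table pair.

```shell
# Placeholder database name (orphan_check); run once per CSV/table pair.
mysql --local-infile=1 -u root -p orphan_check -e \
  "LOAD DATA LOCAL INFILE 'tcs_placement_ids.csv' INTO TABLE tcs_placement_ids
   FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
   IGNORE 1 LINES;"
```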
4. Query for Orphans
Run the appropriate query to identify orphaned placements/programme memberships.
Placement
SELECT traineeId, tisId
FROM (
    SELECT tss.traineeId, tss.tisId
    FROM tss_placement_ids tss
    WHERE tss.traineeId > 0
      AND tss.tisId NOT IN (SELECT tisId FROM tcs_placement_ids tcs)
    GROUP BY tss.traineeId, tss.tisId
) AS distinct_results
ORDER BY traineeId ASC, tisId ASC;
Programme Membership
SELECT traineeId, tisId
FROM (
    SELECT tss.traineeId, tss.tisId
    FROM tss_pm_ids tss
    WHERE tss.traineeId > 0
      AND tss.tisId NOT IN (SELECT tisId FROM tcs_pm_ids tcs)
    GROUP BY tss.traineeId, tss.tisId
) AS distinct_results
ORDER BY traineeId ASC, tisId ASC;
Export the result to a new CSV named orphaned_ids.csv.
5. (Risky Approach) Publish Messages
At the time of writing, the below scripts can be used to publish suitable messages for Placement and PM deletions. Check that the message structure is still valid and the queue name is correct before use.
An optional, but recommended, buffer queue can be used if publishing directly to the live queue is undesirable.
Placement
#!/bin/bash
while IFS="," read -r traineeId tisId
do
  # xargs trims surrounding whitespace from the CSV values.
  trimmedTisId=$(echo "$tisId" | xargs)
  trimmedTraineeId=$(echo "$traineeId" | xargs)
  aws sqs send-message \
    --region eu-west-2 \
    --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/tis-trainee-sync-prod-record.fifo \
    --message-group-id corrective_delete \
    --message-body '{"data":{"id":"'"$trimmedTisId"'","traineeId":"'"$trimmedTraineeId"'"},"metadata":{"record-type":"data","partition-key-type":"schema-table","schema-name":"tcs","table-name":"Placement","operation":"delete"}}'
done < <(tail -n +2 orphaned_ids.csv)
Programme Membership
#!/bin/bash
while IFS="," read -r traineeId tisId
do
  # xargs trims surrounding whitespace from the CSV values.
  trimmedTisId=$(echo "$tisId" | xargs)
  trimmedTraineeId=$(echo "$traineeId" | xargs)
  aws sqs send-message \
    --region eu-west-2 \
    --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/tis-trainee-sync-prod-record.fifo \
    --message-group-id corrective_delete \
    --message-body '{"data":{"uuid":"'"$trimmedTisId"'","personId":"'"$trimmedTraineeId"'"},"metadata":{"record-type":"data","partition-key-type":"schema-table","schema-name":"tcs","table-name":"ProgrammeMembership","operation":"delete"}}'
done < <(tail -n +2 orphaned_ids.csv)
5. (Safer Approach) Publish Messages Using “Buffer” Queue
A temporary queue can be created as the target for publishing; this allows all messages to be collected and verified before they are sent to the live system. Publishing from a local machine is a slow process, so the buffer queue is recommended to avoid any issues if the process is interrupted and has to be restarted.
Create a queue in SQS (since this is temporary, creating it in the console is fine, just clean up when you're done!) and use the same scripts with your temporary queue URL instead.
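The queue can also be created from the CLI. The name below is an example; it must end in .fifo, and content-based deduplication is enabled here because the publish scripts do not set a deduplication ID.

```shell
# Example queue name; FIFO queues require the .fifo suffix.
aws sqs create-queue --region eu-west-2 \
  --queue-name temp_my-buffer-queue.fifo \
  --attributes FifoQueue=true,ContentBasedDeduplication=true
```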
Placement
#!/bin/bash
while IFS="," read -r traineeId tisId
do
  # xargs trims surrounding whitespace from the CSV values.
  trimmedTisId=$(echo "$tisId" | xargs)
  trimmedTraineeId=$(echo "$traineeId" | xargs)
  aws sqs send-message \
    --region eu-west-2 \
    --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/temp_my-buffer-queue.fifo \
    --message-group-id corrective_delete \
    --message-body '{"data":{"id":"'"$trimmedTisId"'","traineeId":"'"$trimmedTraineeId"'"},"metadata":{"record-type":"data","partition-key-type":"schema-table","schema-name":"tcs","table-name":"Placement","operation":"delete"}}'
done < <(tail -n +2 orphaned_ids.csv)
Programme Membership
#!/bin/bash
while IFS="," read -r traineeId tisId
do
  # xargs trims surrounding whitespace from the CSV values.
  trimmedTisId=$(echo "$tisId" | xargs)
  trimmedTraineeId=$(echo "$traineeId" | xargs)
  aws sqs send-message \
    --region eu-west-2 \
    --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/temp_my-buffer-queue.fifo \
    --message-group-id corrective_delete \
    --message-body '{"data":{"uuid":"'"$trimmedTisId"'","personId":"'"$trimmedTraineeId"'"},"metadata":{"record-type":"data","partition-key-type":"schema-table","schema-name":"tcs","table-name":"ProgrammeMembership","operation":"delete"}}'
done < <(tail -n +2 orphaned_ids.csv)
Once messages have been published you can use the console to receive messages to verify them.
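The verification can also be done from the CLI. This only peeks at messages; they return to the queue once the visibility timeout expires (do this before configuring the redrive policy below, since each receive counts towards maxReceiveCount once the policy is set).

```shell
# Spot-check a sample of buffered message bodies without deleting them.
aws sqs receive-message --region eu-west-2 \
  --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/temp_my-buffer-queue.fifo \
  --max-number-of-messages 10 \
  --query 'Messages[].Body'
```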
To send them to the live queue we'll use some DLQ trickery: setting the "target" live queue as the DLQ for our temporary queue, then using receives to trigger the messages to be sent to the "DLQ".
When you are ready to send the messages to the real queue, set the following in the queue's configuration:
Visibility timeout: 1 second
DLQ:
Max receives: 1
Queue ARN: your target queue
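The same configuration can be applied via the CLI. The URL and ARN below assume the example buffer queue and the production queue used earlier; note the RedrivePolicy attribute value is itself a JSON string.

```shell
# 1-second visibility timeout plus maxReceiveCount=1 means each received
# message is redriven to the target queue on its next receive attempt.
aws sqs set-queue-attributes --region eu-west-2 \
  --queue-url https://sqs.eu-west-2.amazonaws.com/430723991443/temp_my-buffer-queue.fifo \
  --attributes '{
    "VisibilityTimeout": "1",
    "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:eu-west-2:430723991443:tis-trainee-sync-prod-record.fifo\",\"maxReceiveCount\":\"1\"}"
  }'
```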
In the “Send and receive messages” section of the queue in the AWS console, set the polling settings:
Poll duration: 90 seconds
Maximum message count: 999
This combination of settings ensures that we receive as many messages as quickly as possible; after each receive the message will instantly be sent to the “DLQ”.
Now just click “Poll for messages” and watch as up to 999 messages are sent to your target queue; repeat as needed until the buffer queue is empty.
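If console polling gets tedious, a rough CLI loop can drain the buffer instead; each receive beyond maxReceiveCount redrives messages to the target queue. The queue URL is the example buffer queue from above.

```shell
# Keep receiving until the buffer queue reports no messages.
QUEUE_URL=https://sqs.eu-west-2.amazonaws.com/430723991443/temp_my-buffer-queue.fifo
while true
do
  count=$(aws sqs receive-message --region eu-west-2 \
    --queue-url "$QUEUE_URL" \
    --max-number-of-messages 10 --wait-time-seconds 5 \
    --query 'length(Messages)' --output text)
  # 'None' means the Messages key was absent, i.e. the queue is drained.
  if [ "$count" = "None" ] || [ "$count" = "0" ]; then
    break
  fi
done
```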