Linking programme memberships and FormRs
Trainees must now manually link their FormR submissions to specific programme memberships, and indicate whether the form is submitted as a new starter or for ARCP. There are fairly large numbers of forms that were submitted before this feature was included that consequently lack this linkage information. This makes reporting more complex, and hinders attempts to improve the FormR submission process and compliance.
This document sets out the outcomes of an investigation into automating FormR linking, both retrospectively and for future submissions.
Outputs of investigation
Rules to link FormR to programme membership, with confidence measure
Rough numbers in each confidence level of the above
Rules to link FormR to ARCP vs New Starter, with confidence measure
Rough numbers in each confidence level of the above
Summary
Using the two-step matching process described in more detail below, 82% of FormRs can be linked with no obvious errors. This leaves ~46 000 FormRs that cannot be linked automatically.
Additional steps can be added to link ~95% of FormRs (leaving ~12 000 unlinked), with an estimated error rate of 0.02% (a projected ~50 mis-linked forms).
Certain explicitly linked forms were tagged as dubious and ignored for the purposes of estimating error. These included forms where:
The linked programme membership had finished more than 18 months ago
The linked programme membership had started more than 18 months ago but the form was linked in a non-ARCP capacity
The linked programme membership no longer exists
(For PartA) the form start and end dates and specialty match a different programme membership to the one linked to
Other similar checks were also applied. These attempts to identify and mitigate user error would need to be reviewed, as would the set of linking rules applied.
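As an illustration only, three of these checks (programme finished more than 18 months ago, non-ARCP link to a programme that started more than 18 months ago, and programme membership no longer exists) could be sketched as below. This is not the query used in the investigation. The `ProgrammeMembership` shape, the `dubious_reasons` helper, measuring the 18 months relative to the form submission date, and approximating 18 months as 548 days are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumption: 18 months approximated as 548 days.
EIGHTEEN_MONTHS = timedelta(days=548)


@dataclass
class ProgrammeMembership:  # hypothetical shape, not the TIS schema
    start_date: date
    end_date: date


def dubious_reasons(pm: Optional[ProgrammeMembership],
                    is_arcp: bool,
                    submitted: date) -> List[str]:
    """Return the reasons a trainee-made link looks unreliable (empty if none)."""
    if pm is None:
        # The linked programme membership has since been deleted.
        return ["programme membership no longer exists"]
    cutoff = submitted - EIGHTEEN_MONTHS
    reasons = []
    if pm.end_date < cutoff:
        reasons.append("programme finished more than 18 months ago")
    if not is_arcp and pm.start_date < cutoff:
        reasons.append("non-ARCP link to programme started more than 18 months ago")
    return reasons
```

A form with a non-empty result would be excluded from the error estimate rather than counted as a mismatch.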
ARCP / non-ARCP metrics have not yet been assessed: given our sample data excludes the peak period for ARCPs, this might need to be more strongly rule-based as opposed to modelled off existing form linkages. While all trainee-linked forms for ARCP were submitted on or after the start of the programme, so were a significant percentage of the non-ARCP forms.
Approach
FormR and TIS data as of early Oct 2024 were retrieved and used to evaluate potential methods of linking FormRs and programme memberships. Both submitted and draft forms were assessed.
Given that a subset of these FormRs had been manually linked by trainees to particular programme memberships, these manual linkages were used to assess the potential accuracy of automated linkages. If the programme membership automatically selected for a given FormR was the same as that chosen by the trainee, then it was considered a ‘correct’ linkage; if it differed then it was considered erroneous. Two factors made this assumption less clear-cut:
Trainees' manual form linkages are not infallible, and in some cases appear to be incorrect.
The subset of manually linked forms is not particularly large, is limited to a specific recent time period, and is not necessarily representative of ‘business as usual’ (in particular, it does not cover the ARCP-intensive period of Apr-May).
Nevertheless, having manually linked forms allowed us to take some basic rules for automated linking and to refine these with respect to edge-cases and other scenarios not initially considered.
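The comparison of automated and manual linkages can be sketched as follows. This is a minimal illustration rather than the code used in the investigation: the `evaluate` function, the dictionary inputs keyed by form ID, and the `dubious` exclusion set are assumptions.

```python
def evaluate(auto_links, manual_links, dubious=frozenset()):
    """Compare automated links with trainee-made links, form by form.

    auto_links / manual_links: dict of form ID -> programme membership ID.
    dubious: form IDs whose manual link is considered unreliable and skipped.
    """
    compared = correct = 0
    for form_id, manual_pm in manual_links.items():
        # Only forms that have both a trusted manual link and an
        # automated link can be scored.
        if form_id in dubious or form_id not in auto_links:
            continue
        compared += 1
        if auto_links[form_id] == manual_pm:
            correct += 1
    errors = compared - correct
    error_rate = errors / compared if compared else 0.0
    return {"compared": compared, "correct": correct,
            "errors": errors, "error_rate": error_rate}
```

Because only manually linked forms can be scored, the resulting error rate is an estimate for the wider dataset, subject to the two caveats listed above.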
An iterative ‘divide-and-conquer’ approach was used to establish automated linkages. This permitted more clear-cut scenarios to be dealt with before handling more complex situations, without having to deal with every eventuality in a single step. A disadvantage of this method is that no single overarching rule is available, and the linkages established are highly dependent on the sequencing of the steps involved: later steps implicitly depend on earlier steps having already dealt with forms that the rules of the later steps do not address (or would link incorrectly).
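The sequencing dependency can be made concrete with a small sketch. The `link_in_sequence` function and the representation of forms as dictionaries with an `id` key are assumptions for illustration; each rule is taken to be a function that returns a programme membership ID or `None`.

```python
def link_in_sequence(forms, rules):
    """Apply rules in order; each rule only sees forms left unlinked so far.

    rules: list of (name, rule_fn) pairs, where rule_fn(form) returns a
    programme membership ID, or None if the rule cannot link the form.
    Returns (links, remaining), with links mapping form ID -> (pm_id, rule name).
    """
    links = {}
    remaining = list(forms)
    for name, rule in rules:
        still_unlinked = []
        for form in remaining:
            pm_id = rule(form)
            if pm_id is None:
                still_unlinked.append(form)
            else:
                links[form["id"]] = (pm_id, name)
        # Later rules never revisit forms linked by an earlier rule.
        remaining = still_unlinked
    return links, remaining
```

Recording the rule name against each link keeps it possible to report matches and errors per rule, and reordering the `rules` list changes which forms each rule gets to see.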
Pre-processing and standardisation
FormR local office names are captured manually by trainees, and as such may vary from the ‘canonical’ name. In addition, official local office names have recently been updated as part of the standardisation to merge Health Education England into the NHS proper.
Details of the rules for standardising FormR local offices are given in item A.
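To give a flavour of what such standardisation involves, the sketch below normalises two kinds of variation that appear in the example forms later in this document: a "Health Education England" prefix, and "Yorkshire and the Humber" versus "Yorkshire and Humber". It is not a reproduction of the item A rules; the `normalise_local_office` function and its specific transformations are assumptions.

```python
import re

# Assumption: the legacy prefix can simply be dropped for matching purposes.
LEGACY_PREFIX = re.compile(r"^health education england\s+", re.IGNORECASE)


def normalise_local_office(name: str) -> str:
    """Reduce a local office name to a comparable key (illustrative only)."""
    cleaned = " ".join(name.split())        # trim and collapse whitespace
    cleaned = LEGACY_PREFIX.sub("", cleaned)
    cleaned = re.sub(r"\bthe\s+", "", cleaned, flags=re.IGNORECASE)
    return cleaned.casefold()
```

Comparing normalised keys rather than raw names lets a form and a programme membership match even when one uses the older naming.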
To make TIS programmes and specialties more consistent, TIS programme names were standardised and assigned a parent as defined in items B1 and B2.
Linking rules
A. Trainee has only a single programme membership
Rule:
If a trainee has only one programme membership, then any forms they have submitted must relate to that programme membership. SQL.
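A minimal sketch of this rule, separate from the linked SQL, is shown below. The `link_single_membership` function and the dictionary shapes (`trainee_id` on the form, a mapping of trainee ID to a list of programme memberships) are assumptions for illustration.

```python
def link_single_membership(form, memberships_by_trainee):
    """Rule A sketch: link only when the trainee has exactly one membership."""
    memberships = memberships_by_trainee.get(form["trainee_id"], [])
    if len(memberships) == 1:
        return memberships[0]["id"]
    return None  # zero or several memberships: leave for later rules
```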
Matches:
FormR PartA: 33271 (27.9% of test dataset); 798 manually linked forms
FormR PartB: 40381 (28.6% of test dataset); 1310 manually linked forms
Errors:
7 trainee-linked forms referred to programme memberships that no longer exist, and were mis-linked to new programme memberships. As such, any rule to link forms in this way must allow for the possibility of programme memberships being deleted after having been linked to a form. This may best be handled by automatically delinking any form that refers to a programme membership that is being deleted.
B. One matching programme membership active within 1 year before and 4 months after submission date
Rule:
Trainee has only one current programme membership in the time-period window of interest (from 12 months before form submission date to 4 months after form submission). SQL.
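The window test can be sketched as below, independently of the linked SQL. The sketch assumes that "current" means the programme membership's date range overlaps the window, with inclusive boundaries; the `add_months` and `link_by_window` helpers and the dictionary keys `start`, `end` and `id` are hypothetical.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the month."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(d.day, last_day))


def link_by_window(submitted: date, memberships, before=12, after=4):
    """Rule B sketch: link when exactly one membership overlaps the window."""
    window_start = add_months(submitted, -before)
    window_end = add_months(submitted, after)
    active = [pm for pm in memberships
              if pm["start"] <= window_end and pm["end"] >= window_start]
    return active[0]["id"] if len(active) == 1 else None
```

Keeping `before` and `after` as parameters makes it straightforward to test how sensitive the match counts are to the choice of window.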
Matches:
FormR PartA: 62934 (52.8% of test dataset); 1788 manually linked forms
FormR PartB: 76637 (54.2% of test dataset); 2542 manually linked forms
Errors:
None detected.
C. One matching programme membership for submission local office and specialty active within 1 year before and 4 months after submission date
Rule:
Trainee has only one current programme membership in the time-period window of interest (from 12 months before form submission date to 4 months after form submission) whose local office and programme specialty match those on the form. SQL.
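As a sketch of how the extra criteria narrow the candidates, the function below takes the window bounds as arguments and assumes local office and specialty values have already been standardised so that they can be compared directly. The `link_by_office_and_specialty` name and the dictionary keys are hypothetical, and the linked SQL remains the reference for the actual rule.

```python
def link_by_office_and_specialty(form, memberships, window_start, window_end):
    """Rule C sketch: one in-window membership matching office and specialty."""
    matches = [
        pm for pm in memberships
        if pm["start"] <= window_end and pm["end"] >= window_start
        and pm["local_office"] == form["local_office"]
        and pm["specialty"] == form["specialty"]
    ]
    # Link only when the combined criteria leave exactly one candidate.
    return matches[0]["id"] if len(matches) == 1 else None
```

This is the step most exposed to the quality of the standardisation: a specialty recorded as a parent programme on one side and a sub-specialty on the other will not match unless both have been mapped to the same value.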
Matches:
FormR PartA: 17156 (14.6% of test dataset); 629 manually linked forms
FormR PartB: 17886 (15.3% of test dataset); 659 manually linked forms
Errors:
27, of which 4 were not captured as trainee errors. However, on inspection, at least half of these 4 appear to be trainee error after all:
FormR B (ambiguous)
ID 94f9360d-39f3-47d6-a4cf-ffdda6daf570
trainee 151236
submitted 27-09-2024
LO North Central and East London
programme specialty: ACCS Emergency Medicine
linked to programme membership 8cda9360-2b22-46d9-839e-6a9df2904100 specialty Emergency Medicine.
autolinked to e9829933-d83d-11ec-9eb2-0638a616fc76 specialty ACCS.
FormR B (trainee linked to PM with wrong LO)
ID 3439addb-29f2-46d9-933b-babb9b7f7fe1
trainee 264960
submitted 01-09-2024
LO Health Education England North East
programme specialty: Paediatrics
linked to programme membership 8d815dfa-84e1-4515-a7a9-89958758b33b specialty Paediatrics but LO Yorkshire and Humber
autolinked to dffa0afc-7fc3-4c8d-b4a9-25145e5a90d1 specialty Paediatrics, LO North East
FormR A
ID b0e02e9f-f096-4908-a12a-a9cc0c8d1dd9
trainee 287773
submitted 16-08-2024
LO Health Education England East of England
programme specialty: Internal Medicine Training Stage One (cct1: Renal medicine)
linked to programme membership e26a2708-e323-4c71-9d03-7342059a592a specialty Renal medicine, LO East of England
autolinked to eb31441e-d83d-11ec-9eb2-0638a616fc76 specialty Internal Medicine Training Stage 1, LO East of England
FormR A (trainee linked to PM with wrong LO)
ID e29c8357-b4c7-483d-af9b-5edbbad6ca35
trainee 292919
submitted 08-09-2024
LO Yorkshire and the Humber
programme specialty: Geriatric Medicine
linked to programme membership cadfa7f6-79f0-41d5-8508-ffdb104e0e9e specialty Geriatric medicine, LO East of England
autolinked to f11e2738-2aa3-4525-9863-f7f9d32ae898 specialty Geriatric Medicine, LO Yorkshire and the Humber
Related content
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213