2023-06-26 Recommendations failing to submit to GMC

Date

Jun 26, 2023

Authors

@Steven Howard @Yafang Deng

Status

Documenting https://hee-tis.atlassian.net/browse/TIS21-4706

Summary

 

Impact

Staff would have had to resubmit recommendations after the fix was introduced.

Non-technical Description

 


Trigger

  • When a Reval admin submits a Recommendation, a success message is shown in the UI followed by an error, but no submission is made to GMC


Detection

  • A user reported the issue on Teams


Resolution

  •  


Timeline

All times in BST unless otherwise indicated

  • Jun 26, 2023 09:16 : User reports on Teams
    I have tried to make a revalidation recommendation in TIS this morning and got 'An error occurred! - Please try again'. i have checked GMC Connect and it looks like it has gone though, can you advise if there are any issues.

  • Jun 26, 2023 10:15 : User contacted requesting GMC number

  • Jun 26, 2023 11:19 : Call with user to demo the issue

  • Jun 26, 2023 11:49 : Huddle to debug and understand if isolated or system wide

  • Jun 26, 2023 13:00 : Identified that requests from the integration service to the profile service were not appearing in the logs, so redeployed the integration service

  • Jun 26, 2023 13:10 : Call with user to test. Still broken.

  • Jun 26, 2023 13:30 : Other admins confirm they are also experiencing the same issue.

  • Jun 26, 2023 15:30 : Checked API-Gateway logs and found no logs for submitting recommendations on Monday.

  • Jun 26, 2023 16:00 : Found many timeouts in prod-revalidation-api-gateway-authoriser, then increased its timeout from 3s to 10s.

  • Jun 26, 2023 19:30 : Upgraded the Node.js runtime of the prod-revalidation-api-gateway-authoriser Lambda from version 12 to 16.

  • Jun 27, 2023 08:45 : Checked awslogs-prod-tis-revalidation-recommendations logs for "submitting request to GMC for recommendation:"; these entries were appearing again this morning.

  • Jun 27, 2023 09:00 : Verified with users that recommendations were being submitted.

5 Whys (or other analysis of Root Cause)

  1. Why was the user seeing both a success and fail message on submission of recommendation?
    The client's error handling for the submit-to-GMC method is not set up correctly: whenever there is an error, it shows both the success and the error message.

  2. Why was the user seeing the error message in the browser?
    An error was thrown in the client application when calling the submit-to-GMC endpoint, but the front end (FE) captured no details of the error.
    The recommendation service contained no logs for submitting recommendations to GMC.

  3. Why were no logs showing for submissions in recommendation, profile or integration services?
    Requests were failing at the API Gateway and not reaching the other services - hence nothing to log.

  4. Why were requests for submission failing at API Gateway?
    Requests to the prod-revalidation-api-gateway-authoriser Lambda were hitting its 3-second timeout limit. However, only submission requests failed; other requests through the same function succeeded.

  5. Why were only submissions failing in the Lambda and not, for example, saves?
    ????
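
The first "why" above (both messages shown on error) could be avoided by only reporting success after the request has settled. The following is a minimal sketch of that idea, not the actual TIS client code: `Notifier`, `submitToGmc`, `submitRecommendation` and the message text are all hypothetical names chosen for illustration.

```typescript
// Hypothetical names throughout: Notifier, submitToGmc and the message text
// are illustrative and are not taken from the TIS codebase.
interface Notifier {
  success(message: string): void;
  error(message: string): void;
}

type SubmitFn = (recommendationId: string) => Promise<void>;

async function submitRecommendation(
  recommendationId: string,
  submitToGmc: SubmitFn,
  notifier: Notifier,
): Promise<boolean> {
  try {
    // Wait for the request to settle before telling the user anything.
    await submitToGmc(recommendationId);
  } catch (err) {
    // Keep the error detail so it can be shown or logged, rather than discarded.
    const detail = err instanceof Error ? err.message : String(err);
    notifier.error(`Submission to GMC failed: ${detail}`);
    return false;
  }
  // Reached only when the request succeeded, so exactly one message is shown.
  notifier.success("Recommendation submitted to GMC");
  return true;
}
```

With this shape a failed request produces only the error message, and the message carries the underlying error detail, which also addresses the missing error details noted in the second "why".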



Action Items

| Action Items | Owner | Comments |
| Fix error handling in UI for submissions to GMC | @Steven Howard | |
| Extend logging in Lambda authoriser | @Yafang Deng | |
| Set an alarm when authoriser is approaching/hitting the timeout maximum | | |
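
As one possible starting point for the logging and timeout actions above, the authoriser handler could be wrapped so that every invocation logs how long it took and how close it came to the timeout. This is a sketch under assumptions, not the existing authoriser code: `withTimingLog`, `TimingLog`, `TimeoutContext` and the `warnBelowMs` threshold are hypothetical names. The only Lambda feature relied on is the `getRemainingTimeInMillis()` method of the Node.js context object.

```typescript
// Minimal stand-in for the part of the Lambda context object used here.
interface TimeoutContext {
  getRemainingTimeInMillis(): number;
}

// Shape of the log entry (hypothetical, for illustration only).
interface TimingLog {
  elapsedMs: number;
  remainingMs: number;
  nearTimeout: boolean;
}

// Wraps a handler so each completed invocation logs its duration and
// flags runs that finished with less than warnBelowMs left before timeout.
function withTimingLog<E, R>(
  handler: (event: E, context: TimeoutContext) => Promise<R>,
  warnBelowMs: number,
  log: (entry: TimingLog) => void = (entry) => console.log(JSON.stringify(entry)),
): (event: E, context: TimeoutContext) => Promise<R> {
  return async (event, context) => {
    const startedAt = Date.now();
    try {
      return await handler(event, context);
    } finally {
      // Runs on success and on thrown errors, but not if Lambda kills the
      // invocation at the hard timeout.
      const remainingMs = context.getRemainingTimeInMillis();
      log({
        elapsedMs: Date.now() - startedAt,
        remainingMs,
        nearTimeout: remainingMs < warnBelowMs,
      });
    }
  };
}
```

Because the wrapper cannot log an invocation that is terminated at the hard timeout, it only surfaces runs that are slow but still complete. An alarm on the Lambda `Duration` metric in CloudWatch would likely be needed alongside it to catch actual timeouts.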


Lessons Learned

  •