Date

Authors

Steven Howard, Yafang Deng

Status

Documenting

Jira: TIS21-4706

Summary

Impact

Staff would have had to resubmit recommendations after the fix was introduced.

Non-technical Description

...

  • When a Reval admin submits a Recommendation, a success message is shown in the UI followed by an error, but no submission is made to GMC

...

  • 09:16 : User reports on Teams
    I have tried to make a revalidation recommendation in TIS this morning and got 'An error occurred! - Please try again'. i have checked GMC Connect and it looks like it has gone though, can you advise if there are any issues.

  • 10:15 : User contacted requesting GMC number

  • 11:19 : Call with user to demo the issue

  • 11:49 : Huddle to debug and understand if isolated or system wide

  • 13:00 : Identified that requests from the integration service to the profile service were not appearing in logs, so redeployed the integration service

  • 13:10 : Call with user to test. Still broken.

  • 13:30 : Other admins confirm they are also experiencing the same issue.

  • 15:30 : Checked API-Gateway logs and found no logs for submitting recommendations on Monday.

  • 16:00 : Found many timeouts in the prod-revalidation-api-gateway-authoriser Lambda, then increased its timeout from 3s to 10s.

  • 19:30 : Updated the Node.js runtime of the prod-revalidation-api-gateway-authoriser Lambda from version 12 to 16.

  • 08:45 : Checked awslogs-prod-tis-revalidation-recommendations logs for "submitting request to GMC for recommendation:" which were showing this morning

  • 09:00 : Verified with users that recommendations were being submitted.

5 Whys (or other analysis of Root Cause)

  1. Why was the user seeing both a success and fail message on submission of recommendation?
    The client's error handling for the submit-to-GMC method is not set up correctly, so it shows both messages whenever there is an error

  2. Why was the user seeing the error message in the browser?
    An error was thrown in the client application on calling the submit to GMC endpoint but no details of the error were captured by the FE
    Recommendation service contained no logs for submitting recommendations to GMC

  3. Why were no logs showing for submissions in recommendation, profile or integration services?
    Requests were failing at the API Gateway and not reaching the other services - hence nothing to log.

  4. Why were requests for submission failing at API Gateway?
    Requests to the prod-revalidation-api-gateway-authoriser Lambda were reaching the 3 second timeout limit. However, other requests to this function were not failing; only submissions were.

  5. Why were just submissions crashing in the Lambda and not saving, for example?
    ????
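
The client-side behaviour described in item 1 (a success message followed by an error) is typical of code that reports success before the request has resolved. The sketch below shows one way to make the two messages mutually exclusive and to keep the error detail that the FE was not capturing. Every name in it (Notice, SubmitFn, submitAndNotify, the message texts) is hypothetical and not taken from the TIS codebase; the real client may be structured quite differently.

```typescript
// Hypothetical types and names for illustration only.
type Notice = { kind: "success" | "error"; text: string };

// Stand-in for the real HTTP call to the submit-to-GMC endpoint (assumed).
type SubmitFn = (recommendationId: string) => Promise<void>;

async function submitAndNotify(
  submit: SubmitFn,
  recommendationId: string,
): Promise<Notice[]> {
  const notices: Notice[] = [];
  try {
    await submit(recommendationId);
    // Success is reported only after the request has actually resolved.
    notices.push({ kind: "success", text: "Recommendation submitted to GMC" });
  } catch (err) {
    // Keep the error detail so it can be shown or logged instead of discarded.
    const detail = err instanceof Error ? err.message : String(err);
    notices.push({ kind: "error", text: `Submission failed: ${detail}` });
  }
  return notices;
}
```

With this shape a failed request can only ever produce the error notice, so a user would not be left unsure whether the recommendation went through.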

...

Action Items

Action Items | Owner | Comments
Fix error handling in UI for submissions to GMC | Steven Howard |
Extend logging in Lambda authoriser | Yafang Deng |
Set an alarm when authoriser is approaching/hitting the timeout maximum | |
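
As a starting point for the logging and alarm actions, the sketch below wraps an authoriser-like function, records how long each call took, and classifies the duration against the configured timeout. It is only an illustration under assumptions: classifyDuration, withTimingLog, and the 80% warning threshold are hypothetical choices, and the real authoriser's code is not shown in this report. In practice the alarm itself would more likely be a CloudWatch alarm on the Lambda's Duration metric; a structured log line like this one could complement it.

```typescript
// Hypothetical helper names and threshold; not taken from the authoriser's code.
type Pressure = "ok" | "approaching" | "exceeded";

// Classify a call duration relative to the function's configured timeout.
function classifyDuration(
  durationMs: number,
  timeoutMs: number,
  warnFraction = 0.8, // assumed threshold for "approaching"
): Pressure {
  if (durationMs >= timeoutMs) return "exceeded";
  if (durationMs >= timeoutMs * warnFraction) return "approaching";
  return "ok";
}

// Run fn and emit one structured log line per call, whether it succeeds or throws.
async function withTimingLog<T>(
  name: string,
  timeoutMs: number,
  fn: () => Promise<T>,
  log: (line: string) => void = console.log,
): Promise<T> {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    const durationMs = Date.now() - start;
    const pressure = classifyDuration(durationMs, timeoutMs);
    log(JSON.stringify({ name, durationMs, timeoutMs, pressure }));
  }
}
```

For example, a 2900 ms call is classified as "approaching" against the original 3 s limit but as "ok" against the 10 s limit set during the incident.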

...

Lessons Learned