2024-04-05 Reval system was reported slow.

Date

Apr 5, 2024

Authors

@Cai Willis @Yafang Deng @Jayanta Saha @Steven Howard

Status

Documenting

Summary

Users reported that the system was very slow, that records sometimes did not open when selected, and that an "Oops something went wrong" error message had been appearing for a few days.

Impact

Users were unable to retrieve information from the Recommendation and Connection detail list because the system was slow.

Non-technical Description

The Revalidation system was very slow. When users tried to select records, the records sometimes did not open and an "Oops something went wrong" error message appeared; this had been happening for a few days. A user in a different region stated that it was clearly a speed / data retrieval issue. Three users from three different regions (North West, South East and South West) raised the issue via the UR Teams Channel.


Trigger

The core service was failing, but the cause was unclear.


Detection

  • User raised the issue on Teams


Resolution

We restarted the core service and all appeared well soon after.


Timeline

 

  • Apr 5, 2024 10:04 - Message received from a user reporting that the application was slow, followed by error messages

  • Apr 8, 2024 10:06 - First responder informed users that the team had worked on this and asked them to report whether performance was still impacted

  • Apr 8, 2024 15:22 - Another user from a different region reported still having the same issue

  • Apr 8, 2024 15:25 - User commented: “clearly a speed / data retrieval issue - we are having this problem also, even using a direct URL is incredibly slow”

  • Apr 10, 2024 11:00 - Further investigation was commenced to establish the cause of the issue

  • Apr 10, 2024 12:00 - We restarted the Core service in Production

  • Apr 10, 2024 12:34 - No responses or timeouts since

5 Whys (or other analysis of Root Cause)

Q. Why were users unable to load doctor details reliably?
A. The application was intermittently slow/timing out when clicking on individual doctors on the list page

Q. Why was the application slow/timing out?
A. There was a delay of about 2 minutes between the request for the doctor details and the request to retrieve doctor notes, resulting in timeout errors (possibly those reported in X-ray; not confirmed)

Q. Why was there a delay of about 2 minutes between the request for the doctor details and the request to retrieve doctor notes?
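
One way to start on this open question is to scan request logs for unusually long gaps between consecutive requests. The following is a minimal sketch only: it assumes log entries can be reduced to (timestamp, request name) pairs, and the function name, request names, threshold and sample data are all hypothetical and not taken from the incident logs.

```python
from datetime import datetime, timedelta


def find_slow_gaps(events, threshold=timedelta(seconds=30)):
    """Return (previous_name, next_name, gap) for consecutive events
    that are further apart in time than `threshold`."""
    ordered = sorted(events, key=lambda e: e[0])
    gaps = []
    for (t_prev, name_prev), (t_next, name_next) in zip(ordered, ordered[1:]):
        gap = t_next - t_prev
        if gap > threshold:
            gaps.append((name_prev, name_next, gap))
    return gaps


# Illustrative data only (hypothetical request names and times).
sample = [
    (datetime(2024, 4, 10, 11, 0, 0), "get_doctor_details"),
    (datetime(2024, 4, 10, 11, 2, 3), "get_doctor_notes"),
    (datetime(2024, 4, 10, 11, 2, 4), "get_doctor_details"),
]

for prev_name, next_name, gap in find_slow_gaps(sample):
    print(f"{prev_name} -> {next_name}: {gap.total_seconds():.0f}s")
```

The threshold would need tuning against normal traffic; a real investigation would also group entries by user session or trace ID so that unrelated requests are not compared.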

 


Action Items

Action Items

Owner

Comments

Investigate why the core service was failing

@Cai Willis

 

Set up a monitoring alarm

@Cai Willis

 

 

 

 

 


Lessons Learned

  •