2024-10-15 Bulk Upload unavailable
Date | Oct 15, 2024 |
Authors | @Rob Pink @Joseph (Pepe) Kelly |
Status | |
Summary | The Bulk Upload service was deployed with out-of-date configuration values, which made it unusable. |
Impact | Bulk upload was unavailable for c. 3-4 working hours. https://hee-tis.atlassian.net/browse/TIS21-6637 |
Non-technical Description
A user attempting to do a bulk upload repeatedly saw a message that the “server took too long”, after which the page refreshed. On investigation, it was found that an out-of-date piece of configuration information had been released the afternoon before. The latest configuration was then made available, which restored service.
Trigger
Deploying / Approving a deployment
Detection
User alerted via Teams
Resolution
Synchronised the copy of the infrastructure definition used by the build process with the IaC repository, then reran the CI/CD pipeline.
Timeline
All times BST unless otherwise indicated.
“Infrastructure definitions” left out of date.
Oct 15, 2024 ~12:02-13:30 The configuration used for deploying was manually edited and the pipeline executed. It was then released to production.
Oct 15, 2024 14:44 User reported problem.
Oct 16, 2024 09:10 - 09:26 The infrastructure code definitions used by the build process were updated, the pipeline was rerun and users were notified.
5 Whys (or other analysis of Root Cause)
The page was refreshing because API calls returned 401 errors.
401 errors were probably being returned because bulk upload could not communicate with other services. The lack of logs from the service prevents us from saying so with certainty.
The bulk upload service was using out-of-date information.
Actions for an earlier Live Defect had not been completed and this meant that builds were using an earlier copy of our infrastructure definition.
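The root cause above (a build using an earlier copy of the infrastructure definition) is the kind of drift a pre-deployment check could catch. The following is a minimal sketch only, assuming the build server keeps a local directory copy of the definitions alongside a fresh checkout of the IaC repository; the function names `tree_digest` and `is_stale` and the directory layout are hypothetical and do not describe the actual TIS build process.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest covering the relative path and contents of every file under root."""
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by relative path so the digest does not depend on filesystem ordering.
    files.sort(key=lambda p: p.relative_to(root).as_posix())
    digest = hashlib.sha256()
    for path in files:
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def is_stale(build_copy: Path, source_of_truth: Path) -> bool:
    """True when the build's copy (hypothetical location) differs from the repository checkout."""
    return tree_digest(build_copy) != tree_digest(source_of_truth)
```

A pipeline step could call `is_stale(build_copy, repo_checkout)` before deploying and fail the build when it returns `True`, so that a stale copy is flagged before release rather than discovered by users.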
Action Items
Action Items | Owner | |
---|---|---|
“Unresolve” Build Server card until it has been fully resolved | | Done |
Repair persistent logging for bulk upload | @Joseph (Pepe) Kelly | |
Lessons Learned
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213