2021-05-07 TIS not accessible due to expired certificates
Date | May 7, 2021 |
Authors | @Andy Dingley @Liban Hirey (Unlicensed) |
Status | Done |
Summary | The SSL certificates used by TIS were accidentally overwritten with outdated certificates |
Impact | Users unable to access TIS |
Non-technical Description
SSL certificates are used by websites to ensure that communication to and from the website is secure. Each certificate is valid for a fixed length of time.
The certificates used for TIS were due to expire on the 15th April and were replaced on the 6th April.
While some unrelated changes were being made, the outdated certificates were accidentally deployed back to TIS.
Trigger
The api-gateway Ansible playbook ran and replaced the manually updated certificates with the outdated ones
Detection
Detected by dev team and then reported by users shortly afterwards
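Detection in this incident relied on people noticing the outage. As a rough sketch only (not something the team is known to run), an automated probe could read the certificate a host is serving and report how many days remain before it expires. The function names, the default port and the timeout below are assumptions made for illustration; only the Python standard library is used.

```python
import socket
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter time.

    `not_after` uses the format returned by ssl.getpeercert(),
    e.g. "Apr 15 00:00:00 2021 GMT". A negative result means the
    certificate has already expired.
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate served by host:port.

    Hypothetical helper. With the default context the TLS handshake itself
    raises ssl.SSLCertVerificationError if the served certificate is already
    expired, which is also a useful alerting signal.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `fetch_not_after` for each public hostname, pass the result and `time.time()` to `days_until_expiry`, and raise an alert when the value falls below a chosen threshold or when the handshake fails with a verification error.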
Resolution
@Liban Hirey (Unlicensed) recovered the new certificates from the stage environment and copied them to the prod blue/green servers
The forgotten branch containing the new certificates was pushed to GitHub and merged
Prod was tested and found to be working well.
Timeline
May 7, 2021: 14:55 BST - Ansible playbook ran to apply changes to api-gateway
May 7, 2021: 14:59 BST - Detected by dev team and users
May 7, 2021: 15:30 BST - Fix deployed
May 10, 2021: Discovered failures in reading files from ESR. There were 3 attempts to trigger processing of the file DE_NWN_APC_20210506_00002693.DAT, all of which failed with an SSL exception.
Root Cause(s)
The Ansible playbook replaced the latest certificates with outdated ones
The updated certificates were added manually, so the playbook didn’t know about them
The branch containing the new certificates had not been pushed to GitHub
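One way to guard against this class of failure is a pre-deployment check that refuses to overwrite a certificate with one that expires earlier. The sketch below is an assumption about how such a check might look, not the fix that was actually applied to the playbook. It takes the two `notAfter` strings as input (for example as printed by `openssl x509 -enddate -noout -in <file>`, with the `notAfter=` prefix removed); the function names and the dates in the usage note are hypothetical.

```python
import ssl


def is_downgrade(candidate_not_after: str, deployed_not_after: str) -> bool:
    """True if the candidate certificate expires before the deployed one.

    Both arguments are notAfter strings such as "Apr 15 12:00:00 2021 GMT".
    Equal expiry times are not a downgrade, so re-running a deployment
    with the same certificate stays idempotent.
    """
    candidate = ssl.cert_time_to_seconds(candidate_not_after)
    deployed = ssl.cert_time_to_seconds(deployed_not_after)
    return candidate < deployed


def assert_safe_to_deploy(candidate_not_after: str, deployed_not_after: str) -> None:
    """Raise RuntimeError instead of silently deploying an older certificate."""
    if is_downgrade(candidate_not_after, deployed_not_after):
        raise RuntimeError(
            "Refusing to deploy: candidate certificate expires "
            f"{candidate_not_after}, deployed certificate expires "
            f"{deployed_not_after}"
        )
```

Called with illustrative values, `assert_safe_to_deploy("Apr 15 12:00:00 2021 GMT", "Apr 06 12:00:00 2022 GMT")` raises, because the candidate expires before the certificate already in place. A check like this could run as a step before the certificate copy task, so that a stale repository copy fails loudly rather than replacing a newer certificate.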
Action Items
Action Items | Owner | Status |
---|---|---|
Ensure certificates can be applied automatically by playbook | @John Simmons (Deactivated) | Done |
Improve alerting from Lambdas (ESR) https://hee-tis.atlassian.net/browse/TIS21-1564 | @Joseph (Pepe) Kelly | |
Lessons Learned
Make sure anything applied manually is also handled by the automation, or automate it from the start
Slack: https://hee-nhs-tis.slack.com/
Jira issues: https://hee-tis.atlassian.net/issues/?filter=14213