Date |
| ||
Authors | |||
Status | In Progress / Completed | ||
Summary |
| Impact |
|
Non-technical summary
GMC added a new security platform
The platform allows them to block incoming requests
They blocked our requests but then disabled the block
They will not block requests anymore until the new reval is live
This will not be an issue with the new reval, which uses more up-to-date language and tech
Timeline
08:08 | |
08:30 | Created ticket and incident page 2021-03-01 Legacy Reval: unable to submit revalidation - User Agent/Cloud Flare GMC Issue |
09:15 | Rerun failed with the same error… Forbidden (HTTP 403) |
09:33 | Pinged users to let them know we have contacted GMC to check the issue |
09:34 | Emailed GMC |
10:43 | Replied to GMC to confirm it's the production/LIVE environment and not our new Reval module |
11:47 | GMC have made some changes over the weekend, they are looking into it |
16:10 | Email to chase GMC |
16:31 | |
- 16:53 | Confirming with the GMC what IPs we are hitting |
17:50 | Email back from GMC |
10:55 | We found that the new reval module has no issues and we are able to get data. However, the issue is with legacy/existing reval. Need to check if the IPs/servers that legacy reval runs on are still whitelisted |
11:01 | |
11:46 | Checking if it is an authentication error |
16:10 | We confused the GMC |
- 11:17 | Call to investigate further - more fault analysis done,
Some Findings
|
- 14:00 |
|
- 14:11 |
|
- 14:41 |
|
- 15:17 |
|
- 15:30 |
|
- 16:33 | Email to GMC |
- 16:38 | So just to sum up our thought process:
Conclusion: most likely cause is a permissions error (Authorization issue) internally on GMC side |
- 16:42 | |
- 16:50 | Updated users on teams with our conclusion |
- 17:06 | Clarification of IPs |
- 05:47 | |
- 22:02 | Email from the GMC - they have been able to re-create the 403 error |
- 22:48 to 23:19 |
|
- 00:05 |
|
- 08:02 |
|
- 08:08 |
|
- 09:59 | Trying to get some clarification from the GMC relating to the user agent |
- 11:26 | GMC disabled the security feature relating to the user agent, so that's why it worked. We need to update the user agent |
- 13:05 | |
| Chasing GMC - no reply regarding the user agent / is the filtering back on? |
- 14:18 | Reply from GMC and some thoughts
|
- 15:37 |
|
- 16:33 |
|
Root Causes
Job failed
Request to “Get to GMC Doctors From GMC API” returned 403 (Forbidden)
We contacted the GMC (peter.mcnair@gmc-uk.org): they moved the API behind some additional security over the weekend, namely Cloudflare, a CDN and DDoS protection platform
Requests sent with our user agent were being blocked by Cloudflare, based on user-agent filtering rules the GMC had set up
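The check that narrowed this down (does the 403 depend on the user agent?) can be sketched as below. This is a minimal illustration, not the code legacy reval actually runs: the helper names `http_status` and `blocked_user_agents` are hypothetical, and the real GMC endpoint and user agent strings are not recorded in this report.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET sent with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return error.code


def blocked_user_agents(
    status_for: Callable[[str], int], user_agents: Iterable[str]
) -> Dict[str, bool]:
    """Map each user agent to True if its request came back 403 Forbidden."""
    return {agent: status_for(agent) == 403 for agent in user_agents}
```

Against a real endpoint this could be called as `blocked_user_agents(lambda ua: http_status(url, ua), candidates)`, where `url` and `candidates` are supplied by the caller. If only some user agents map to `True`, the block is likely rule-based filtering rather than an IP whitelist or authentication problem.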
Trigger
gmc-sync-prod alert in monitoring channel
A user reported in Teams Support Channel and slack message on Monday AM
Resolution
Sending a user agent with the request that is not blocked by GMC
GMC has turned off the filtering and will leave it disabled until the new reval module is deployed
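As a rough sketch of what sending an explicit user agent looks like, the snippet below builds a request with a `User-Agent` header using Python's standard library. The URL and user agent string are placeholders (assumptions for illustration only); the actual values would need to be agreed with the GMC.

```python
import urllib.request

# Hypothetical values: the real GMC endpoint and an accepted user agent
# string are not given in this report.
GMC_API_URL = "https://gmc-api.example.invalid/doctors"
USER_AGENT = "legacy-reval-sync/1.0"


def build_gmc_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )


request = build_gmc_request(GMC_API_URL, USER_AGENT)
# urllib stores header names capitalised, so the lookup key is "User-agent".
print(request.get_header("User-agent"))
```

Setting the header explicitly avoids relying on the HTTP client's default user agent, which is the kind of value a filtering rule may match on.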
Detection
Slack monitoring and user report in Teams Support Channel
Actions
New reval will replace legacy reval; we need to keep it up to date
Confirm with GMC whether they have allowed our user agent, or whether this is going to happen again
Inform GMC when only the new reval module is live so
Lessons Learned (Good and Bad)
Cloudflare adds additional security, e.g. blocking requests based on rules such as user-agent filters
Try and get updates from GMC when they are doing maintenance
Don’t have legacy applications
GMC do server upgrades/maintenance etc. on Sundays/over the weekend