Rerun failed with the same error… Forbidden (HTTP 403)
09:33
Pinged users to let them know we have contacted GMC to check the issue
09:34 AM
Emailed GMC
10:43
Replied to GMC to confirm it's the production/LIVE environment and not our new Reval module
11:47
GMC have made some changes over the weekend, they are looking into it
16:10
Email to chase GMC
16:31
- 16:53
Confirming with the GMC what IPs we are hitting
17:50 PM
Email back from GMC
10:55
We found that the new Reval module has no issues; we are able to get data
However, the issue is with legacy/existing Reval.
Need to check whether the IPs/servers that legacy Reval runs on are still whitelisted
11:01 AM
11:46
Checking whether it is an authentication error
16:10
We confused the GMC
- 11:17
Call to investigate further - more fault analysis done,
reviewed errors and looked at the gmc-sync repo to see where the problem might be
Some Findings
We get the 403 in Prod as it is an authentication issue on our side.
We are able to curl the GMC endpoints, which shows that the credentials are correct, but the Java code is not able to get them due to some restrictions (?) in our Prod2.
We know we get the 99 error in stage as the IPs are not whitelisted by GMC for our stage env3.
We want to add trace logging to the Java code so that we can see whether the credentials are fetched correctly
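A minimal sketch of the kind of trace logging described above, written so that the log confirms a credential was fetched without printing its value. The class name, the `describe` helper and the `GMC_USERNAME` / `GMC_PASSWORD` environment variable names are hypothetical; the log does not say how gmc-sync actually loads its credentials.

```java
import java.util.logging.Logger;

public final class CredentialTrace {
    private static final Logger LOG = Logger.getLogger(CredentialTrace.class.getName());

    // Describes a secret by presence and length only, so the value never reaches the logs.
    public static String describe(String name, String value) {
        if (value == null) {
            return name + "=<null>";
        }
        if (value.isEmpty()) {
            return name + "=<empty>";
        }
        return name + "=<set, length " + value.length() + ">";
    }

    public static void main(String[] args) {
        // Hypothetical variable names, used for illustration only.
        LOG.info(describe("GMC_USERNAME", System.getenv("GMC_USERNAME")));
        LOG.info(describe("GMC_PASSWORD", System.getenv("GMC_PASSWORD")));
    }
}
```

Logging only presence and length is usually enough to tell "credential missing" apart from "credential present but rejected", which is the distinction being investigated here.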
- 14:00
Call to review draft PR created to check authentication issues - what are GMC sending us - what response
1st of the month and timings - nothing has changed for 6 months so why a change now - version updates?
-14:11
Tried to update Java base - didn't do anything
-14:41
Updated PR waiting for review
- 15:17
PR reviewed
Ran gmc-sync-prod and it failed as expected
Investigating now with the extra logging
- 15:30
It's getting the correct username and password but still failing
It throws an exception before it gets to the GMC return code
The app is failing in the SOAP API call
- 16:33
Email to GMC
- 16:38
So just to sum up our thought process:
We can CURL Prod (Authentication is fine)
There have been no significant recent changes to the codebase
The Java app appears to be building the request correctly
The Java app appears to be using the correct credentials
We are still receiving 403 from GMC endpoint
Conclusion: most likely cause is a permissions error (Authorization issue) internally on GMC side
- 16:42
- 16:50
Updated users on teams with our conclusion
- 17:06
Clarification of IPs
-05:47
- 22:02
Email from the GMC - they have been able to re-create the 403 error
- 22:48 to 23:19
Cloudflare (now being used by the GMC) - the Cloudflare Browser Integrity Check seems to block Java 8 user agents by default.
GMC-Sync with Java 9 and Java 11 didn't work (no surprise there)
Changed back to Java 8 and it's working now
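For context on what a "Java 8 user agent" looks like: when nothing else sets the header, the JDK's built-in `HttpURLConnection` sends `Java/<java.version>`, prefixed by the `http.agent` system property if that is set. The sketch below only reconstructs that string so it can be compared with what Cloudflare sees. It assumes the SOAP call ends up on the JDK's built-in HTTP client, which the log does not confirm; the class and method names are illustrative.

```java
public final class DefaultUserAgent {

    // Rebuilds the JDK default: "Java/<version>", or "<http.agent> Java/<version>"
    // when the http.agent system property is set.
    public static String expected(String httpAgent, String javaVersion) {
        String base = "Java/" + javaVersion;
        return httpAgent == null ? base : httpAgent + " " + base;
    }

    public static void main(String[] args) {
        // Prints the value this JVM would send by default.
        System.out.println(expected(System.getProperty("http.agent"),
                                    System.getProperty("java.version")));
    }
}
```

Running this inside the same container image as gmc-sync would show the exact version string being presented, which may help when asking the GMC which agents their rules match.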
- 00:05
00:05 cron run of the GMC-Sync ran successfully, so after the intrepid etl runs at 01:00 Reval should be working again
- 08:02
Slim-buster image (presumably with a later update of Java 8) - we don’t think the GMC modified any of the rules in Cloudflare.
Use a recent, maintained Docker base image (as long as it works with the GMC config) - that would be the slim-buster image.
- 08:08
User reports fixed
- 09:59
Trying to get some clarification from the GMC relating to the user agent
- 11:26
GMC disabled the security feature relating to the user agent, so that's why it worked
We need to update the user agent
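One possible way to update the user agent, sketched with the JDK's plain `HttpURLConnection`. The real gmc-sync SOAP client is not shown in this log, so the equivalent hook there may differ. The `gmc-sync/1.0` value and the `example.invalid` endpoint are placeholders, not agreed values; the actual string would need confirming with the GMC.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public final class ExplicitUserAgent {
    // Placeholder value; the real string should be agreed with the GMC.
    static final String USER_AGENT = "gmc-sync/1.0";

    // Opens (but does not connect) a connection with an explicit User-Agent header.
    public static HttpURLConnection prepare(String endpoint) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        // An explicitly set header is sent instead of the JDK default "Java/<version>".
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn; // no request has been sent at this point
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = prepare("https://example.invalid/soap");
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

Setting the header per request keeps the change local to the GMC call; the `http.agent` system property is an alternative, but the JDK still appends `Java/<version>` to it, which may or may not satisfy the Cloudflare rule.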
- 13:05
Chasing GMC - no reply regarding the user agent / is the filtering back on?
- 14:18
Reply from GMC and some thoughts
What is the LTS status of those two Java 8 versions?
Assuming they're in LTS, are the GMC willing and able to tweak their Cloudflare config to allow them in future?
New Reval is Java 11, so is keeping the filtering disabled until we switch over an option?
Root Causes
Job failed
Request to “Get to GMC Doctors From GMC API” returned 403
The GMC moved the API behind some additional security over the weekend - Cloudflare CDN, DDoS platform
Requests were being sent with a user agent blocked by Cloudflare
Trigger
gmc-sync-prod alert in monitoring channel
A user reported in Teams Support Channel and slack message on Monday AM
Resolution
Sending a user agent with the request that is not blocked by GMC
Detection
Slack monitoring and user report in Teams Support Channel
Actions
New Reval will replace legacy Reval; need to keep it up to date
Confirm with GMC whether they have allowed our user agent, or is this going to happen again?
Lessons Learned (Good and Bad)
Cloudflare blocks certain user agents
Try and get updates from GMC when they are doing maintenance