Date

Authors

Adewale Adekoya, Jayanta Saha, John Simmons

Status

Done

Summary

Impact

Reval users could not manage trainees using the current Reval app.

Non-technical Description

The GMC was inadvertently blocking our connection to their production environment.

What was the issue?

The API issue was coming from the GMC's Cloudflare infrastructure. A quick Google search for the error returns: “Cloudflare Error 1020: Access Denied indicates that you’ve violated a firewall rule and your connection request has been blocked”. This code (1020) was not on the list of GMC codes.

When trying to connect to the GMC database for our nightly synchronisation ETL, our application was getting blocked by the firewall at the GMC. We contacted the GMC to highlight the issue after double-checking our infrastructure to make sure it wasn't anything we were responsible for. They confirmed their oversight: they had implemented some additional security on 20/07/2021 due to some nefarious activity that was hitting their service. That security update was a little too zealous and blocked our connection to the API. On 21/07/2021 at 11am, the GMC added the HEE IP addresses to a ‘whitelist’, which resolved the problem of us connecting to them while still keeping the unwanted traffic away.
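Because the 1020 code was not on the list of GMC codes, a sync failure of this kind can be hard to tell apart from an error raised by the GMC API itself. The following is a minimal sketch of how a failed response could be classified, not a description of the actual Reval sync code. It assumes that the block arrives as an HTTP 403 whose body mentions the Cloudflare error number (for example `error code: 1020`); the function names and the message wording are hypothetical.

```python
import re

# Assumption: a Cloudflare firewall block is returned as HTTP 403 and the body
# mentions the error number, e.g. "error code: 1020" or "Error 1020".
_ERROR_1020 = re.compile(r"\berror(?:\s+code)?\s*:?\s*1020\b", re.IGNORECASE)


def is_cloudflare_firewall_block(status_code: int, body: str) -> bool:
    """Return True if the response looks like a Cloudflare 1020 block."""
    return status_code == 403 and bool(_ERROR_1020.search(body))


def describe_sync_failure(status_code: int, body: str) -> str:
    """Build a short message for the monitoring channel (hypothetical wording)."""
    if is_cloudflare_firewall_block(status_code, body):
        return (
            "GMC sync blocked by a Cloudflare firewall rule (error 1020); "
            "this is not a GMC API error code, so contact the GMC."
        )
    return f"GMC sync failed with HTTP {status_code}."


if __name__ == "__main__":
    print(describe_sync_failure(403, "error code: 1020"))
    print(describe_sync_failure(500, "Internal Server Error"))
```

Separating the firewall case from ordinary API errors means the alert itself can say that the problem sits in front of the API, which shortens the step of ruling out our own infrastructure.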

...

Trigger

Additional security had been added to the GMC's Cloudflare firewall after their service was attacked the previous day. This blocked HEE access to their API.

...

Detection

Alert in monitoring channel:

...

We then re-ran the GMC sync and the associated ETLs, and all completed successfully.

...

Timeline

/07/ at : 09:40 - Joseph Kelly noticed issue in monitoring channel

/07/ at : 10:26 - Katy raised issue on Teams

/07/ at : 10:26 - John raised issue on Slack

/07/ at : 11:33 - Ade raised issue with GMC

/07/ at : 11:55 - GMC requested more details

/07/ at : 12:35 - Ade supplied more details

/07/ at : 14:16 - GMC emailed to confirm the issue was resolved

...

No lessons learned, as the problem was entirely at the GMC end and there isn't anything we could have put in place to mitigate it.

As we'd noticed the issue via our monitoring, we arguably should have alerted Reval Admins in Teams before they highlighted the problem, and then kept them regularly updated on progress and resolution (for this incident, the turnaround was very quick anyway).