Date

Authors

Philip Wilsdon, John Simmons

Status

Resolved

Summary

  • Ran the GMC SYNC ETL and INTREPID REVAL ETLs (intrepid-reval-etl and intrepid-reval-etl-all-prod)

  • We did not add monitoring to the existing sync jobs because new revalidation is going live soon

  • Do need to add monitoring to sync jobs and ETLs for new revalidation - https://hee-tis.atlassian.net/browse/TISNEW-3264

Impact

Non-technical summary

Reval was showing information that had not been updated from GMC. On further investigation, it turned out that the overnight jobs that fetch the data from GMC and add it to TIS had not run successfully. Once they had been fixed and rerun, Reval showed the correct information.

Timeline

- 09:48 AM

- 10:25 AM

Created ticket and incident page https://hee-tis.atlassian.net/browse/TISNEW-5728

2020-11-18 Reval Legacy/Old GMC Sync

- Between 10:25 and 11:21

Ran the jobs intrepid-reval-etl and intrepid-reval-etl-all-prod

This fixed the data refresh, but the under notice values still differed between TIS legacy/existing reval and GMC Connect

12:06

Ran the correct ETLs (gmc-sync-prod and intrepid-reval-etl-all-prod) via Jenkins

12:25

Seems to be fixed

Root Causes

  • During the prod outage on Friday 13th November 2020, the Jenkins scheduled jobs were amended to run on a different server as a quick fix to get the data back into the prod environment. This change was overlooked and not rolled back. As a result, the jobs could not run afterwards because of a conflict in the inventory (as shown in the Jenkins output for each job, and by the fact that each job ran for less than 1 second).

Trigger

  • A user reported in Teams Support Channel that their connections had not been working correctly

Resolution

  • Running the Sync, intrepid-reval-etl and intrepid-reval-etl-all-prod jobs fixed the data refresh, but the under notice values still differed from GMC Connect.

  • Running the correct ETLs (gmc-sync-prod and intrepid-reval-etl-all-prod) fixed the issue.

Detection

  • A user reported in Teams Support Channel

Actions

Lessons Learned (Good and Bad)

  • Still limited knowledge within the existing teams about how the existing module works

  • Need monitoring - we raised this on August 22, 2019 and still have not done it (??? see below ???)

  • Jobs need to be started via Jenkins

  • Check the Jenkins jobs for 1. what they do, and 2. what the logs for that run said.
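
As a rough illustration of the monitoring gap noted above, the sketch below flags a finished Jenkins build that either did not succeed or completed implausibly fast (as happened here, where each job ran for less than 1 second). It is only a sketch under assumptions: the function name `check_build` and the 1000 ms threshold are hypothetical, and the input is assumed to be the build JSON that Jenkins exposes at `<jenkins-url>/job/<job-name>/lastBuild/api/json`, with `result`, `duration` (milliseconds), `building` and `fullDisplayName` fields. Fetching that JSON and sending the alert are left out.

```python
from typing import Optional

# Hypothetical threshold: an ETL that finishes in under a second has
# almost certainly done no real work. Tune this per job.
MIN_EXPECTED_DURATION_MS = 1000


def check_build(build: dict, min_duration_ms: int = MIN_EXPECTED_DURATION_MS) -> Optional[str]:
    """Return a warning message if a finished build looks unhealthy, else None.

    `build` is assumed to be the parsed JSON of a Jenkins build
    (fields: fullDisplayName, building, result, duration in ms).
    """
    name = build.get("fullDisplayName", "<unknown job>")

    # A build that is still running cannot be judged yet.
    if build.get("building"):
        return None

    result = build.get("result")
    if result != "SUCCESS":
        return f"{name}: result was {result}"

    duration = build.get("duration", 0)
    if duration < min_duration_ms:
        return f"{name}: finished in {duration} ms, suspiciously fast"

    return None


if __name__ == "__main__":
    # Example input mimicking a job that exited almost immediately.
    sample = {
        "fullDisplayName": "gmc-sync-prod #101",
        "building": False,
        "result": "SUCCESS",
        "duration": 400,
    }
    print(check_build(sample))
```

Checking the duration as well as the result is a deliberate choice: a job that exits almost immediately may still need investigating even when Jenkins reports it as successful.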
