Date

Authors

Andy Dingley

Status

Done

Summary

NIMDTA reference service could not start

Impact

TIS NIMDTA unavailable

Non-technical Description

The reference service for NIMDTA was stuck in a restart loop, repeatedly failing to start up because of an issue with our database migration tool. An upgrade to the migration tool changed how it handles missing migration steps: it previously ignored them, but now treats them as an error and fails the start-up.

...

Trigger

  • Upgraded Flyway version deployed to production.

...

Detection

  • NIMDTA user reported having an issue loading TIS.

...

Resolution

  • Delete the schema history entries for the seed data loaded by the Consolidated DR ETL.

  • Allow the reference service to restart and re-run Flyway validation.
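The clean-up step above can be sketched as follows. This is an illustration only, not the commands that were run during the incident: it uses an in-memory SQLite database as a stand-in for the production database, a reduced set of columns from Flyway's default `flyway_schema_history` table, and hypothetical script names. The helper `delete_history_entries` is also hypothetical.

```python
import sqlite3

# In-memory stand-in for the production database (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Reduced version of the schema history table; the real table has more columns.
conn.execute(
    """
    CREATE TABLE flyway_schema_history (
        installed_rank INTEGER PRIMARY KEY,
        version TEXT,
        description TEXT,
        script TEXT,
        success INTEGER
    )
    """
)

# Hypothetical rows: two service migrations and one seed-data script.
conn.executemany(
    "INSERT INTO flyway_schema_history VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1", "create tables", "V1__create_tables.sql", 1),
        (2, "2", "add index", "V2__add_index.sql", 1),
        (3, "100", "seed data", "V100__seed_data.sql", 1),
    ],
)


def delete_history_entries(connection, scripts):
    """Delete history rows for the named scripts and return the rows removed."""
    if not scripts:
        return []
    placeholders = ", ".join("?" for _ in scripts)
    where = f"script IN ({placeholders})"
    # Capture the rows first so there is a record of what was deleted.
    removed = connection.execute(
        f"SELECT installed_rank, script FROM flyway_schema_history WHERE {where}",
        scripts,
    ).fetchall()
    with connection:  # commits on success, rolls back on error
        connection.execute(
            f"DELETE FROM flyway_schema_history WHERE {where}", scripts
        )
    return removed


seed_scripts = ["V100__seed_data.sql"]  # hypothetical seed-data script name
removed = delete_history_entries(conn, seed_scripts)
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT script FROM flyway_schema_history ORDER BY installed_rank"
    )
]
print("removed:", removed)
print("remaining:", remaining)
```

Selecting the rows before deleting them keeps a record of what was removed, which is useful if the entries ever need to be restored.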

...

Timeline

  • 12:39 BST - Upgraded Flyway deployed to production.

  • 12:42 BST - Notification in #monitoring-prod, thought to be due to an undeployed ops change.

  • 12:45 BST - Ops change merged and applied.

  • 14:22 BST - NIMDTA user reported being unable to use the application.

  • 15:02 BST - Fix deployed.

Root Cause(s)

  • Flyway migration validation failed after an upgraded version was deployed.

  • Migrations had been applied to the database that were missing from the reference service.

  • The Consolidated DR ETL used Flyway to load seed data, but those scripts were not available to the reference service.

  • The combination of a behaviour change in the latest Flyway versions and the way the migrations were run caused the validation failure. The validation behaviour is correct, but was not foreseen in this case.
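The failure mode described above can be modelled in a few lines. This is a simplified sketch of the idea, not Flyway's actual implementation: the `validate` function, the `ignore_missing` flag, and the script names are all hypothetical. It compares the migrations recorded as applied with the scripts a service can see, and either tolerates or rejects the difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppliedMigration:
    """One row of the schema history table (simplified)."""
    version: str
    script: str


class ValidationError(Exception):
    """Raised when the schema history and the available scripts disagree."""


def validate(applied, available_scripts, ignore_missing=False):
    """Return the applied migrations whose scripts are not available locally.

    Raises ValidationError if any are missing and ignore_missing is False.
    """
    available = set(available_scripts)
    missing = [m for m in applied if m.script not in available]
    if missing and not ignore_missing:
        names = ", ".join(m.script for m in missing)
        raise ValidationError(f"applied migrations not resolved locally: {names}")
    return missing


# Hypothetical history: two service migrations plus one seed-data script
# that was applied by a different tool (the ETL in this incident).
history = [
    AppliedMigration("1", "V1__create_tables.sql"),
    AppliedMigration("2", "V2__add_index.sql"),
    AppliedMigration("100", "V100__seed_data.sql"),
]
service_scripts = ["V1__create_tables.sql", "V2__add_index.sql"]

# Lenient mode: the missing script is reported but start-up continues.
print([m.script for m in validate(history, service_scripts, ignore_missing=True)])

# Strict mode: the same database state now stops start-up.
try:
    validate(history, service_scripts)
except ValidationError as err:
    print(f"start-up aborted: {err}")
```

In this model the database state is identical in both calls; only the strictness of the check changes, which mirrors how an upgrade alone was enough to trigger the incident.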

...

Action Items

Action Items

Owner

Document how to investigate Flyway migration issues.

Andy Dingley

...

Lessons Learned