Date

Authors

Reuben Roberts

Status

Resolved

Summary

PersonElasticSearchSyncJob ran simultaneously on both blue and green production servers (HEE-TIS-VM-PROD-APPS-BLUE and HEE-TIS-VM-PROD-APPS-GREEN). This caused duplicate Person entries in Elasticsearch.

Manually rerunning the PersonElasticSearchSyncJob resolved the issue.

Impact

 Users observed duplicate entries in the list of people.

Non-technical Description

...

Trigger

  • Teams notification

  • Slack notification

...

Detection

  • A user reported the issue in the Teams support channel. It was also raised in the TIS tis-dev-team Slack channel.

  • The overlapping jobs were visible in the server logs and in the monitoring-prod Slack channel (started at 1:29 AM and 1:33 AM).

...

Resolution

...

Action Items

Owner

  • Investigate and resolve Out of Memory (OOM) errors

  • Improve locking mechanism to make it more robust

Reuben Roberts

Marcello Fabbri (Unlicensed)

...

Lessons Learned

  • The ‘locking’ that prevents the job from running in parallel only accounts for scheduled runs. A container restart or a manual run can still cause duplication if it overlaps with the job running on the other server instance.
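
One possible direction for a more robust lock is to take it at the start of every run, whatever triggered it, rather than relying on the schedule alone. The sketch below is illustrative only and is not the TIS implementation: `SharedLockStore`, `run_sync_job`, the one-hour expiry, and the in-memory store are all hypothetical names and values. A real deployment would need a store that both servers can reach (for example a shared database table), which is assumed here rather than taken from the incident.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class SharedLockStore:
    """In-memory stand-in (hypothetical) for a store visible to both servers."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._mutex = threading.Lock()
        self._locks: Dict[str, Tuple[str, float]] = {}  # name -> (owner, expiry)

    def try_acquire(self, name: str, owner: str, ttl_seconds: float) -> bool:
        """Atomically take the lock unless an unexpired lock already exists."""
        with self._mutex:
            now = self._clock()
            held = self._locks.get(name)
            if held is not None and held[1] > now:
                return False
            self._locks[name] = (owner, now + ttl_seconds)
            return True

    def release(self, name: str, owner: str) -> None:
        """Release the lock only if this owner still holds it."""
        with self._mutex:
            held = self._locks.get(name)
            if held is not None and held[0] == owner:
                del self._locks[name]


def run_sync_job(store: SharedLockStore, server: str, trigger: str,
                 job: Callable[[], None], ttl_seconds: float = 3600.0) -> bool:
    """Run the job only if the lock is free. The trigger is used for logging
    only, so scheduled, manual and restart runs all pass the same check."""
    lock_name = "PersonElasticSearchSyncJob"
    if not store.try_acquire(lock_name, server, ttl_seconds):
        print(f"{server}: skipped {trigger} run, lock is already held")
        return False
    try:
        job()
        return True
    finally:
        store.release(lock_name, server)


# Demo: a manual run on GREEN is attempted while BLUE's scheduled run is active.
store = SharedLockStore()
results = []


def blue_job() -> None:
    results.append(run_sync_job(store, "GREEN", "manual", lambda: None))


results.append(run_sync_job(store, "BLUE", "scheduled", blue_job))
```

In this sketch the overlapping manual run is skipped and the scheduled run completes, so `results` ends up as `[False, True]`. The expiry means a container that dies mid-run (for example after an OOM error) cannot hold the lock forever, at the cost of having to choose a value longer than the longest expected run.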