Date

Authors

Joseph (Pepe) Kelly, Marcello Fabbri (Unlicensed)

Status

Live Defect: done. Investigating mitigations for the future.

Summary

Some exported placements show an unknown ESR status (?) on TIS instead of correctly displaying their exported status (✔)

Impact

Inaccurate information about the ESR status of some Placements

Non-technical Description

  • The TIS-ESR interface exports data to ESR daily. Because TIS was momentarily unavailable, the interface failed to notify TIS that the export of some Placements had completed. As a result, the status of these Placements on TIS remained unknown, shown on the frontend as a question mark (?) instead of the tick (✔) that marks the correct exported status.


Trigger

  • TCS was momentarily unavailable, so updates sent via REST calls were not processed.


Detection

  • .


Resolution

  • When Placements are exported to ESR, the Data Export service sends a message via a RabbitMQ queue to the Inbound Data Writer service.


Timeline

  • :

  • :

  • :

  • :

  • :

  • :

Root Cause(s)

  • When Placements are exported to ESR, the Data Export service sends a message via a RabbitMQ queue to the Inbound Data Writer service.

  • The Inbound Data Writer service normally sends the updates via REST call to TCS, which is responsible for updating the PlacementEsrEvent table where this data is stored.

  • TCS was momentarily unavailable at the moment the Inbound Data Writer service sent the REST call, so it did not process the call and nothing was updated.

  • On receiving the specific error caused by TCS’s unavailability, the Inbound Data Writer service followed a clause in its error handling that deliberately did not requeue the message in that case.

  • Because the message was not requeued, re-processing was never attempted and the updates were not applied.
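The chain of events above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the Inbound Data Writer's actual code: `FakeTcs`, `ServiceUnavailable`, `drain`, `run`, the placement ids, and the `max_deliveries` cap are all hypothetical names and values introduced for illustration, and a plain Python `deque` stands in for the RabbitMQ queue.

```python
from collections import deque


class ServiceUnavailable(Exception):
    """Hypothetical error standing in for the one seen while TCS is down."""


class FakeTcs:
    """Hypothetical stand-in for TCS: fails the first `outage_calls` requests."""

    def __init__(self, outage_calls):
        self.outage_calls = outage_calls
        self.esr_events = {}  # placement id -> status, mimics PlacementEsrEvent

    def record_export(self, placement_id):
        if self.outage_calls > 0:
            self.outage_calls -= 1
            raise ServiceUnavailable("TCS unavailable")
        self.esr_events[placement_id] = "EXPORTED"


def drain(queue, tcs, requeue_on_unavailable, max_deliveries=5):
    """Consume every message; optionally requeue when TCS is unavailable."""
    while queue:
        placement_id, deliveries = queue.popleft()
        try:
            tcs.record_export(placement_id)
        except ServiceUnavailable:
            if requeue_on_unavailable and deliveries + 1 < max_deliveries:
                # Put the message back so it is processed again later.
                queue.append((placement_id, deliveries + 1))
            # Otherwise the message is dropped and the status stays unknown.


def run(requeue_on_unavailable):
    """Process two export notifications while TCS fails exactly one call."""
    tcs = FakeTcs(outage_calls=1)
    queue = deque([(101, 0), (102, 0)])
    drain(queue, tcs, requeue_on_unavailable)
    return tcs.esr_events
```

With `run(False)` the update for placement 101 is lost, which mirrors the defect; with `run(True)` the message is redelivered once TCS is back and both placements end up marked as exported. The delivery cap is an assumption added to keep a longer outage from causing an endless redelivery loop.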

Action Items

| Action Items | Owner | Status |
| --- | --- | --- |
| Fix current Placements whose status is currently inaccurate | Edward Barclay | ongoing |
| Make the Inbound Data Writer service more resilient so it requeues the messages when TCS doesn’t respond | Marcello Fabbri (Unlicensed) | done |
| Check elsewhere in the ESR interface for places where requeuing would be appropriate | Marcello Fabbri (Unlicensed) | ongoing |


Lessons Learned

  • Consider more carefully when it is appropriate to requeue a message (re-attempt processing it) and when it is acceptable not to requeue it.
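One way to make that decision explicit is to classify each failure before choosing what to do with the message. The sketch below is an assumption-laden illustration rather than the team's policy: the report does not say which error TCS returned, so the set of status codes treated as transient, the `Decision` names, the `decide` helper, the attempt limit, and the dead-letter outcome are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ACK = "ack"                  # processed, remove from the queue
    REQUEUE = "requeue"          # transient failure, try again later
    DEAD_LETTER = "dead-letter"  # give up, park the message for inspection


# Assumed set of HTTP statuses that usually indicate a temporary problem.
TRANSIENT_STATUSES = {408, 429, 502, 503, 504}


def decide(status_code, attempt, max_attempts=5):
    """Choose what to do with a message after calling the downstream service.

    `status_code` is None when no HTTP response arrived at all
    (for example a connection error or a timeout).
    `attempt` is the 1-based number of the delivery that just finished.
    """
    if status_code is not None and 200 <= status_code < 300:
        return Decision.ACK
    transient = status_code is None or status_code in TRANSIENT_STATUSES
    if transient and attempt < max_attempts:
        return Decision.REQUEUE
    # Permanent errors, or transient ones that exhausted their attempts.
    return Decision.DEAD_LETTER
```

Under these assumptions, a momentary outage (`decide(503, 1)`) leads to a requeue, while a request the service rejects as invalid (`decide(400, 1)`) is not retried, since redelivering the same payload would be expected to fail the same way.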
