Date

Authors

Joseph (Pepe) Kelly, Yafang Deng, Steven Howard

Status

Jira ticket: TIS21-4449

Summary

The bulk upload page was continually refreshing and showing “the server took too long to respond”.

Impact

Users could not use bulk upload as usual.

Sometimes, when the bulk upload service was up for a short period, users were able to upload a file. When the service then restarted, it could miss the responses from other services, leaving the file stalled in progress.

...

The bulk upload service was continually restarting and the bulk upload webpage was continually refreshing. This meant that users were less able to submit and check their bulk uploads for a large part of Friday. Some users were able to submit smaller uploads, which were processed, but the failure continued to recur until the TIS team intervened.

On Thursday afternoon, 3 uploads of 1 or more spreadsheets contained many rows that were blank other than a hyphen in the address field. Bulk upload treated these as rows that required processing, so it produced significant numbers of errors (see below). Once more resources were temporarily allocated to the service that processes uploads, it was able to cope with the additional pressure of letting users know about the number of errors.
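The kind of row that caused the problem can be illustrated with a minimal sketch. It assumes each spreadsheet row is available as a mapping of column name to cell text. The column names, the `PLACEHOLDERS` set and both function names are hypothetical and are not taken from the bulk upload service (which runs on the JVM); the Python below only illustrates the idea of skipping rows that contain nothing but a placeholder.

```python
from typing import Dict, Iterable, List, Optional

# Assumption: cell values that carry no information on their own.
PLACEHOLDERS = {"", "-"}

Row = Dict[str, Optional[str]]


def is_effectively_blank(row: Row) -> bool:
    """Return True when every cell is empty or only a placeholder such as '-'."""
    return all((value or "").strip() in PLACEHOLDERS for value in row.values())


def rows_to_process(rows: Iterable[Row]) -> List[Row]:
    """Keep only the rows that contain at least one real value."""
    return [row for row in rows if not is_effectively_blank(row)]


# Hypothetical sample: one real row and two rows that are blank apart from a hyphen.
rows = [
    {"surname": "Smith", "address": "1 High Street"},
    {"surname": "", "address": "-"},
    {"surname": None, "address": " - "},
]
kept = rows_to_process(rows)
```

With a check like this, only the first sample row would go forward for processing. A value such as "12-14 Road" is kept, because the comparison is against the whole cell and not against any hyphen inside it.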

...

Trigger

  • Three very large files were uploaded and completed with thousands of errors.

...

posted in TIS Support Channel / General at 21 April 2023 09:28:59

...

Resolution

  • Increased the JVM memory from 0.5G to 4G, and the reserved memory (container memory) for bulk upload from 1G to 5G, until the files with lots of errors were no longer loaded on the first bulk upload page.

  • Backed up the ApplicationType records for those 3 large files.

...

Action Items

  • Fix up service deployment configuration (volume mappings for logs & heap dump).
    Comments: Preferred: move the service to ECS. Don’t know if ECS would make the heap dumps available.
    Owner: Joseph (Pepe) Kelly / Jayanta Saha

  • Improve memory use: change what columns are retrieved from the database for the /status search.
    Owner: Yafang Deng

  • Analyse the data uploaded.
    Comments: Attached file: Analysis of rows in files uploaded via bulk upload.pdf. This would be to inform setting limits on the number of rows that are uploaded.
    Owner: Steven Howard or James Harris / Stan Ewenike

  • Get feedback from Local Office about what happened.
    Owner: James Harris
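The memory-use action can be illustrated with a small sketch of a column-limited query. It uses an in-memory SQLite database so that it is self-contained. The table `upload_job`, its columns and the function `list_status` are hypothetical; the real schema and the query behind the /status search are not described in this report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per uploaded file, with a potentially huge error payload.
conn.execute(
    """
    CREATE TABLE upload_job (
        id INTEGER PRIMARY KEY,
        file_name TEXT,
        status TEXT,
        error_count INTEGER,
        error_json TEXT
    )
    """
)
conn.execute(
    "INSERT INTO upload_job (file_name, status, error_count, error_json) "
    "VALUES (?, ?, ?, ?)",
    ("large.xlsx", "COMPLETED", 5000, "x" * 1_000_000),
)


def list_status(connection: sqlite3.Connection) -> list:
    """Return one summary per upload, selecting only the columns a listing needs."""
    cursor = connection.execute(
        "SELECT id, file_name, status, error_count FROM upload_job ORDER BY id DESC"
    )
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


summaries = list_status(conn)
```

Because `error_json` is never selected, the large error payload is not read into application memory for the listing. The `error_count` column still lets a listing show that a file finished with an unusually large number of errors.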

Lessons Learned

  • We noticed the 3 large files at first sight, but didn’t recognise them as the root cause at the very beginning. This was because the data received in the API response doesn’t contain the error messages, even though the backend service does load them from the DB.

  • If something looks unusual (too big!) on the UI, it’s probably the cause.