2022-04-11 TIS Self-Service user could not log in (migration from previous user pool not working)

Date

Apr 12, 2022

Authors

@john o @Andy Dingley

Status

Done

Summary

User couldn’t log in to TIS Self-Service

Impact

The user was unable to view their submitted Form R

Non-technical Description

A TSS user notified TIS that they could not log in to the TSS app with their credentials. They had submitted a Form R in December 2021 and wanted to log in again to view it.


Trigger

  • User emailed TIS to say they couldn’t log in to TIS Self-Service

Detection

Inspecting the user pool configuration revealed several permissions issues:

  • The user pool did not have permission to invoke the lambda

  • The lambda did not have permission to call the admin auth endpoint

  • The lambda did not have permission to call the admin get user endpoint

NOTE: A new user pool (v4) was created for the latest pilot (allowing for self-sign-up). When the user originally logged in and submitted their Form R, they were a member of user pool v3.
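
The three missing permissions listed above can be sketched as two policy fragments. This is a minimal illustration only, not the configuration that was actually applied: the function names and statement ID are hypothetical, and it assumes the lambda is invoked by the new pool (v4) and authenticates against the old pool (v3). The action names (`cognito-idp:AdminInitiateAuth`, `cognito-idp:AdminGetUser`, `lambda:InvokeFunction`) and the `cognito-idp.amazonaws.com` service principal are standard AWS identifiers.

```python
def migration_lambda_role_policy(old_user_pool_arn: str) -> dict:
    """Identity-based policy for the lambda's execution role.

    Assumption: the lambda checks credentials and reads attributes
    from the OLD user pool (v3), so the resource is that pool's ARN.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cognito-idp:AdminInitiateAuth",  # admin auth endpoint
                    "cognito-idp:AdminGetUser",       # admin get user endpoint
                ],
                "Resource": old_user_pool_arn,
            }
        ],
    }


def invoke_permission_args(function_name: str, new_user_pool_arn: str) -> dict:
    """Keyword arguments for the Lambda `add_permission` API call.

    Assumption: the NEW user pool (v4) is the one that invokes the
    lambda; the statement ID is a hypothetical name.
    """
    return {
        "FunctionName": function_name,
        "StatementId": "AllowCognitoUserMigrationInvoke",
        "Action": "lambda:InvokeFunction",
        "Principal": "cognito-idp.amazonaws.com",
        "SourceArn": new_user_pool_arn,
    }
```

With boto3, the second dictionary could be passed as `boto3.client("lambda").add_permission(**invoke_permission_args(...))`; the first would be attached to the lambda's execution role as a JSON policy document.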


Resolution

  • Granting the missing permissions listed above allowed the lambda to complete successfully and migrate the user from prod user pool v3 to v4.
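
For context, a user migration lambda of this kind typically follows the pattern sketched below. This is not the TSS lambda itself: `make_handler`, its parameters, and the choice of the `ADMIN_USER_PASSWORD_AUTH` flow are assumptions for illustration. The sketch shows why both admin permissions are needed: the handler first verifies the credentials against the old pool, then reads the user's attributes so the new pool can create the account.

```python
def make_handler(cognito_client, old_pool_id: str, old_client_id: str):
    """Build a user migration trigger handler (hypothetical helper).

    `cognito_client` is expected to behave like boto3.client("cognito-idp").
    """

    def handler(event, context):
        # Only handle sign-in migration; other triggers pass through unchanged.
        if event["triggerSource"] != "UserMigration_Authentication":
            return event

        username = event["userName"]
        password = event["request"]["password"]

        # Needs cognito-idp:AdminInitiateAuth on the old pool.
        # Raises if the credentials are wrong, which aborts the migration.
        cognito_client.admin_initiate_auth(
            UserPoolId=old_pool_id,
            ClientId=old_client_id,
            AuthFlow="ADMIN_USER_PASSWORD_AUTH",
            AuthParameters={"USERNAME": username, "PASSWORD": password},
        )

        # Needs cognito-idp:AdminGetUser on the old pool.
        user = cognito_client.admin_get_user(
            UserPoolId=old_pool_id, Username=username
        )

        # Copy attributes across, skipping the immutable `sub`.
        attributes = {
            attr["Name"]: attr["Value"]
            for attr in user["UserAttributes"]
            if attr["Name"] != "sub"
        }

        event["response"]["userAttributes"] = attributes
        event["response"]["finalUserStatus"] = "CONFIRMED"
        event["response"]["messageAction"] = "SUPPRESS"
        return event

    return handler
```

Passing the client in as a parameter keeps the handler testable with a stub, which may help with the dummy-data testing mentioned under Lessons Learned.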


Timeline

  • Apr 11, 2022 10:00 Issue flagged up in TSS channel. Initial checks led to the assumption that the user hadn’t signed up yet, so an email reply was sent advising them to sign up.

  • Apr 11, 2022 15:35 Reply flagged up in TSS channel. User had already submitted a Form R and was trying to sign in again using the same credentials.

  • Apr 11, 2022 15:35-16:40 Checks on the migration process and user pool config to try to establish the root cause.

  • Apr 11, 2022 16:45 TSS channel msg asking for email reply to be sent advising user on next steps (try logging in again, sign-up again, or wait for a fix).

  • Apr 12, 2022 15:20 TSS channel msg - email from user saying they have successfully signed in.


Root Cause(s)

  • Missing permissions (see above).

    NOTE: When testing, the console login error message “incorrect username or password” (NotAuthorizedException) was not very helpful. (In this case the email and password were correct; the blocker was the absence of a fake profile in the prod db, which the pre-sign-up lambda trigger was looking for.) This is unlikely to be an issue for “real” users, but it might be worth looking at how the error message text could more closely reflect the underlying issue.


Action Items

Action Items

Owner

Ticket to make Cognito error messages more helpful

https://hee-tis.atlassian.net/browse/TIS21-2895

Terraform the permissions required for the TSS user pool migration lambda

https://hee-tis.atlassian.net/browse/TIS21-2896

Dev checklist for testing some common AWS processes (e.g. migration between user pools)

https://hee-tis.atlassian.net/browse/TIS21-2902


Lessons Learned

  • Probably the first request of its kind, i.e. a user wanting to log in again (via migration between user pools with different configs) to view a submitted Form R. This is a good ‘stress test’!

  • Reminder not to make too many assumptions (with hindsight, there were more clues in the first email from the user). Maybe a quick call would help to clarify things?

  • Testing the AWS processes with dummy data etc. is a bit fiddly. Maybe draw up a checklist of things to remember?