Documenting

Date

Authors

john o Andy Dingley

Status

Done

Summary

User couldn’t log in to TIS Self-Service

Impact

The user was unable to view their submitted Form R

...

  • The user pool did not have permission to invoke the lambda

  • The lambda did not have permission to call the admin auth endpoint

  • The lambda did not have permission to call the admin get user endpoint
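The three missing permissions above can be sketched as data, which may help when they are later captured in infrastructure code. This is a minimal sketch under assumptions, not the configuration actually applied: it assumes the lambda authenticates against the old pool (v3) and is invoked by the new pool (v4), and the function names, statement ID, region, account ID and pool IDs are all hypothetical placeholders.

```python
def user_pool_arn(region, account_id, pool_id):
    """Build the ARN of a Cognito user pool."""
    return f"arn:aws:cognito-idp:{region}:{account_id}:userpool/{pool_id}"


def migration_lambda_policy(old_pool_arn):
    """Identity policy for the lambda's execution role.

    Covers the two admin calls the lambda makes against the old pool
    (assumed here to be v3).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cognito-idp:AdminInitiateAuth",
                    "cognito-idp:AdminGetUser",
                ],
                "Resource": old_pool_arn,
            }
        ],
    }


def invoke_permission_args(function_name, new_pool_arn):
    """Keyword arguments for the Lambda AddPermission call that lets the
    new user pool (assumed here to be v4) invoke the migration lambda."""
    return {
        "FunctionName": function_name,
        "StatementId": "AllowCognitoInvoke",  # hypothetical statement ID
        "Action": "lambda:InvokeFunction",
        "Principal": "cognito-idp.amazonaws.com",
        "SourceArn": new_pool_arn,
    }


# Placeholder values only; none of these identifiers come from the incident.
old_arn = user_pool_arn("eu-west-2", "123456789012", "eu-west-2_OLDPOOL")
new_arn = user_pool_arn("eu-west-2", "123456789012", "eu-west-2_NEWPOOL")
policy = migration_lambda_policy(old_arn)
permission = invoke_permission_args("user-migration", new_arn)
```

With boto3 installed and credentials configured, the last dictionary could be passed as `boto3.client("lambda").add_permission(**permission)`; the policy document would be attached to the lambda's execution role.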

...

NOTE: A new user pool (v4) was created for the latest pilot (allowing for self-sign-up). When the user originally logged in and submitted their Form R, they were a member of user pool v3.

...

Resolution

  • The ideal resolution is for us to find and fix the root cause of the above NotAuthorizedException error.
    In the meantime, a workaround is available: the user signs up again using the same details, and all existing data is restored, including their submitted forms.

  • Resolving the above permissions results in successful lambda completion and migration of a user from prod user pool v3 to v4.
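For context, a Cognito user migration trigger of the kind described here might look roughly like the sketch below. It is not the TSS lambda: the factory function, its parameter names and the choice of the `ADMIN_USER_PASSWORD_AUTH` flow are assumptions for illustration (that flow must be enabled on the old pool's app client). The two client calls are the ones that needed the permissions listed earlier.

```python
def make_migration_handler(cognito, old_pool_id, old_client_id):
    """Return a user-migration trigger handler.

    `cognito` is a boto3 "cognito-idp" client (or a test double with the
    same two methods); the pool and client IDs refer to the OLD user pool.
    All three parameter names are hypothetical.
    """

    def handler(event, context):
        if event["triggerSource"] != "UserMigration_Authentication":
            # This sketch only handles the sign-in migration path.
            raise ValueError(f"Unsupported trigger: {event['triggerSource']}")

        username = event["userName"]

        # Check the supplied password against the old pool.
        # Requires cognito-idp:AdminInitiateAuth on that pool.
        cognito.admin_initiate_auth(
            UserPoolId=old_pool_id,
            ClientId=old_client_id,
            AuthFlow="ADMIN_USER_PASSWORD_AUTH",
            AuthParameters={
                "USERNAME": username,
                "PASSWORD": event["request"]["password"],
            },
        )

        # Fetch the user's attributes from the old pool.
        # Requires cognito-idp:AdminGetUser on that pool.
        user = cognito.admin_get_user(UserPoolId=old_pool_id, Username=username)

        # Copy attributes across, skipping "sub", which the new pool assigns.
        event["response"]["userAttributes"] = {
            attr["Name"]: attr["Value"]
            for attr in user["UserAttributes"]
            if attr["Name"] != "sub"
        }
        event["response"]["finalUserStatus"] = "CONFIRMED"
        event["response"]["messageAction"] = "SUPPRESS"  # no welcome message
        return event

    return handler
```

Passing the client in, rather than creating it inside the handler, makes it possible to exercise the flow with a stub before touching a real pool. In a deployed lambda the handler could be built once at module level, for example with `boto3.client("cognito-idp")` and IDs read from environment variables.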

...

Timeline

  • 10:00 Issue flagged up in the TSS channel. Initial checks led to the assumption that the user hadn’t signed up yet, so an email reply was sent advising them to sign up.

  • 15:35 Reply flagged up in TSS channel. User had already submitted a Form R and was trying to sign in again using the same credentials.

  • 15:35-16:40 Checks of the migration process and user pool config to try to establish the root cause.

  • 16:45 TSS channel msg asking for an email reply to be sent advising the user on next steps (try logging in again, sign up again, or wait for a fix).

  • 15:20 TSS channel msg - email from user saying they have successfully signed in.

...

Root Cause(s)

  • Missing permissions (see above)

...

  • The root cause of the NotAuthorizedException error is not fully understood (see Detection section above).

    NOTE: When testing, the console login error message “incorrect username or password” (NotAuthorizedException) was not very helpful. (In this case the email and password were correct; the blocker was the lack of a fake profile in the prod db, which the pre-sign up lambda trigger was looking for.) This is unlikely to be an issue for “real” users, but it might be worth looking at how the error message text could more closely reflect the underlying issue.
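One possible direction for clearer errors, sketched under assumptions: a pre sign-up trigger that raises a descriptive exception when no profile is found, instead of failing in a way that surfaces as a generic credentials error. The `profile_exists` lookup, the exception class and the message text are hypothetical, and how Cognito relays the message to the end user in the migration flow has not been verified here.

```python
class ProfileNotFound(Exception):
    """Raised when no profile matches the email used to sign up."""


def make_pre_sign_up_handler(profile_exists):
    """Return a pre sign-up trigger handler.

    `profile_exists` is a hypothetical callable taking an email address and
    returning True when a matching profile is present in the database.
    """

    def handler(event, context):
        email = event["request"]["userAttributes"].get("email", "")
        if not profile_exists(email):
            # A specific message is easier to diagnose than a generic
            # "incorrect username or password".
            raise ProfileNotFound(
                "No profile was found for this email address."
            )
        # Returning the event unchanged lets the sign-up proceed.
        return event

    return handler
```

A stubbed lookup such as `make_pre_sign_up_handler(lambda email: email in known_emails)` is enough to try this out locally with dummy data.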

...

Action Items

  • Ticket for making Cognito error messages more helpful (TIS21-2895)

  • Terraform the permissions required for the TSS user pool migration lambda (TIS21-2896)

  • Dev checklist for testing some common AWS processes, e.g. migration between user pools (TIS21-2902)

...

Lessons Learned

  • This was probably the first request of its kind, i.e. a user wanting to log in again (via migration between user pools with different configs) to view a submitted Form R. This is a good ‘stress test’!

  • A reminder not to make too many assumptions (with hindsight, more clues were in the first email from the user). Maybe a quick call might have helped to clarify things?

  • Testing the AWS processes with dummy data etc. is a bit fiddly. Maybe draw up a checklist of things to remember?