Date | |
Authors | |
Status | Documenting |
Summary | The TIS profile service went down |
Impact | TIS could not be used at all |
Non-technical Description
Our “Profile” service, which is used to check user permissions, went down because a breaking change was deployed. Because user permissions could not be checked, the TIS application blocked all user actions. To a user this would have looked like a login failure.
Trigger
A change was deployed to tis-profile which caused the service to fail to start
Detection
Slack notification
Resolution
Reverted the breaking change
Timeline
14:03 - Breaking change deployed to production.
14:05 - Notification sent to Slack channel for stage #monitoring-prod
14:10 - Notification sent to Slack channel #monitoring-prod
14:11 - Issue picked up by dev team.
14:17 - Fix deployed to production.
Root Cause(s)
Profile service failed to start
A change to the Sentry configuration broke service startup
The implemented Sentry configuration requires Spring Boot 2.1.0 or newer (Profile uses 1.5.2)
The build continued despite failures
The alert about stage going down (14:05) was obscured by other alerts.
…
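The version mismatch in the root causes above is the kind of problem a small pre-deployment guard could flag. The following is a minimal sketch in Python, not part of the actual tis-profile build. The function names are hypothetical, and it assumes the pipeline can obtain the Spring Boot version as a plain string. The only values taken from this report are the minimum (2.1.0) and the version Profile uses (1.5.2).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse '1.5.2' or '2.1.0.RELEASE' into a tuple of at least three ints.

    Parsing stops at the first non-numeric component, so suffixes such as
    'RELEASE' are ignored. Missing components are treated as 0.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(actual: str, minimum: str) -> bool:
    """Return True if `actual` is the same as or newer than `minimum`."""
    return parse_version(actual) >= parse_version(minimum)
```

With the versions from this incident, `meets_minimum("1.5.2", "2.1.0")` returns `False`, so a pipeline step built on this check would have stopped before deployment.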
Action Items
Action Items | Owner |
---|---|
Find out why the configuration made the service fail | |
Find a working solution to migrate to sentry-spring-boot-starter 4.3.0 without failures | |
Improve the profile pipeline, e.g.: | |
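The root cause "the build continued despite failures" points to one possible kind of pipeline improvement: stop at the first failing step. The sketch below is an illustration under assumptions, not the actual tis-profile pipeline. The helper name `run_steps` is hypothetical, and it assumes each pipeline step can be expressed as a command with arguments.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> int:
    """Run each command in order and stop at the first failure.

    Returns 0 if every step succeeded, otherwise the exit code of the
    first failing step. Later steps are not run after a failure.
    """
    for command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Step failed: {' '.join(command)}", file=sys.stderr)
            return result.returncode
    return 0


# Example call (the commands are placeholders, not the real build steps):
# sys.exit(run_steps([["./build.sh"], ["./deploy.sh", "stage"]]))
```

Passing the returned code to `sys.exit` lets the surrounding CI job be marked as failed, so a deployment step placed after it would not run.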
Lessons Learned
Test changes properly locally and on stage before pushing to production.
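One way to make the stage check harder to skip is to gate promotion on an automated health check. The sketch below is a hedged illustration in Python. It assumes the service exposes an HTTP health URL on stage that returns status 200 when the service has started. The URL, the function names, and the retry settings are all hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 4xx/5xx response.
        return False


def wait_until_healthy(check: Callable[[], bool],
                       attempts: int = 10,
                       delay: float = 3.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example call (the URL is a placeholder, not the real stage address):
# ready = wait_until_healthy(lambda: is_healthy("https://stage.example/health"))
```

Retrying gives a slow-starting service time to come up, while a service that never starts, as happened in this incident, makes the check return `False` and blocks the promotion to production.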