Date | |
Authors | |
Status | Documenting |
Summary | Reference service failed to start after upgrade: https://hee-tis.atlassian.net/browse/TIS21-2573 |
Impact | Some users had problems accessing TIS for ~10 minutes |
...
The offending component in the reference service was reverted to the last known working version and redeployed.
...
Timeline
13:14 - Alert on Slack: AWS Service 10.160.0.137:8088 is down
13:21 - Docker reports the reference container is unhealthy and boot looping. Syslog:
2022-01-18 13:21:44.810 WARN 1 --- [ main] ConfigServletWebServerApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'inMemorySwaggerResourcesProvider' defined in URL [jar:file:/app.jar!/WEB-INF/lib/springfox-swagger-common-3.0.0.jar!/springfox/documentation/swagger/web/InMemorySwaggerResourcesProvider.class]: Bean instantiation via constructor failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [springfox.documentation.swagger.web.InMemorySwaggerResourcesProvider]: Constructor threw exception; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'swaggerSpringfoxApiDocket' defined in class path resource [io/github/jhipster/config/apidoc/SwaggerConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [springfox.documentation.spring.web.plugins.Docket]: Factory method 'swaggerSpringfoxApiDocket' threw exception; nested exception is java.lang.NoSuchMethodError: springfox.documentation.builders.PathSelectors.regex(Ljava/lang/String;)Lcom/google/common/base/Predicate;
13:24 - Andy Dingley creates a new PR to revert to the known working state
~13:24 - John Simmons approves the PR
13:24 - Jenkins starts building the reverted version
13:27 - Fixed version starts on stage environment and is checked and approved
13:28 - New version deployed to production
13:29 - Fault is fixed in prod and everything is now working as it should be.
...
Action Items
...
Lessons Learned
Add health check monitoring to the pipeline to stop boot-looping containers from reaching the production environment.
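As a rough illustration of what such a pipeline gate could look like, the sketch below polls a health endpoint and only reports success after several consecutive healthy responses, so a container that answers once and then restarts would not pass. This is a minimal sketch and not the team's actual pipeline step: the function names, the thresholds, and the endpoint URL in the usage note are all hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts and HTTP error statuses all count as unhealthy.
        return False


def wait_until_stable(
    probe: Callable[[], bool],
    required_successes: int = 3,   # hypothetical threshold
    max_attempts: int = 30,        # hypothetical limit
    interval: float = 5.0,         # seconds between probes (assumption)
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the probe succeeds `required_successes` times in a row."""
    streak = 0
    for attempt in range(max_attempts):
        # Any failed probe resets the streak, so intermittent health does not pass.
        streak = streak + 1 if probe() else 0
        if streak >= required_successes:
            return True
        if attempt < max_attempts - 1:
            sleep(interval)
    return False
```

A deployment step could call `wait_until_stable(lambda: http_probe("http://stage.example.invalid/health"))` against the stage environment (placeholder URL) and abort the promotion to production when it returns `False`.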