...
13:14 - Alert on Slack: AWS Service 10.160.0.137:8088 is down
13:21 - Docker reports reference container is unhealthy and boot looping (syslog:
2022-01-18 13:21:44.810 WARN 1 --- [ main] ConfigServletWebServerApplicationContext : Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'inMemorySwaggerResourcesProvider' defined in URL [jar:file:/app.jar!/WEB-INF/lib/springfox-swagger-common-3.0.0.jar!/springfox/documentation/swagger/web/InMemorySwaggerResourcesProvider.class]: Bean instantiation via constructor failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [springfox.documentation.swagger.web.InMemorySwaggerResourcesProvider]: Constructor threw exception; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'swaggerSpringfoxApiDocket' defined in class path resource [io/github/jhipster/config/apidoc/SwaggerConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [springfox.documentation.spring.web.plugins.Docket]: Factory method 'swaggerSpringfoxApiDocket' threw exception; nested exception is java.lang.NoSuchMethodError: springfox.documentation.builders.PathSelectors.regex(Ljava/lang/String;)Lcom/google/common/base/Predicate;
)
13:24 - Andy Dingley creates a new PR to revert to the known working state
~13:24 - John Simmons approved PR
13:24 - Jenkins started building repaired version
13:27 - Fixed version starts on stage environment and is checked and approved
13:28 - New version deployed to production
13:29 - Fault is fixed in prod and everything is now working as it should be.
...
Root Cause(s)
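An updated component broke the reference service and was not fully checked before being approved into the production environment. The syslog entry in the timeline shows a java.lang.NoSuchMethodError for springfox's PathSelectors.regex, raised while creating the 'swaggerSpringfoxApiDocket' bean from io/github/jhipster/config/apidoc/SwaggerConfiguration with springfox-swagger-common-3.0.0.jar on the classpath. This points to a binary incompatibility between the updated springfox library and the JHipster configuration class that calls it, which stopped the application context from starting and left the container boot looping.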
...
Action Items
Action Items | Owner |
---|---|
|
|
|
|
|
|
...
Lessons Learned
Add health-check monitoring to the deployment pipeline so that boot-looping containers are stopped before they reach the production environment.
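
The health-check gate could be sketched roughly as below. This is a minimal illustration, not the team's actual pipeline step: the function names (`http_probe`, `wait_until_stable`), the thresholds, and the health URL passed on the command line are all assumptions. The idea is to require several consecutive healthy responses from the stage deployment, so that a container which briefly comes up and then restarts does not pass the gate.

```python
import http.client
import sys
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a broken response.
        return False


def wait_until_stable(
    probe: Callable[[], bool],
    required_successes: int = 5,   # assumed threshold, tune per service
    max_attempts: int = 30,        # assumed upper bound on probes
    interval: float = 2.0,         # seconds between probes
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `probe` succeeds `required_successes` times in a row.

    Any failed probe resets the streak, so a boot-looping service that is
    only healthy intermittently never reaches the threshold.
    """
    consecutive = 0
    for attempt in range(max_attempts):
        if probe():
            consecutive += 1
            if consecutive >= required_successes:
                return True
        else:
            consecutive = 0
        if attempt < max_attempts - 1:
            sleep(interval)
    return False


if __name__ == "__main__":
    # Hypothetical usage: python health_gate.py <health-url-of-stage-service>
    health_url = sys.argv[1]
    ok = wait_until_stable(lambda: http_probe(health_url))
    # A non-zero exit code lets the CI job fail and block the production deploy.
    sys.exit(0 if ok else 1)
```

Run as a pipeline stage between the stage deployment and the production deployment, a non-zero exit would fail the build before the production step. The probe is passed in as a function so the streak logic can be tested without a running service.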