Release approach

Status

Decided

Decision leader

@Andy Nash (Unlicensed)

Contributors

@Dev team, Chris Mills and PO group

Date

Oct 11, 2018

Outcome

To transition from the current weekly release train to a CI/CD approach (many small, timely, individual releases throughout the Sprint)

Background

1. Where we were:

  • Originally we had no release train, only occasional releases on an ad-hoc basis. Devs deployed locally functioning code through manual steps until it eventually reached Stage, where E2E tests were kicked off and POs were invited to approve a release bundling multiple code changes.

  • We then moved to a weekly release train: to Stage on a Friday (sometimes), which resulted in a release to Prod on a Tuesday (sometimes).

  • The approach was very much a transitional one in advance of a move towards full CI/CD.

  • This necessitated lots of manual testing, which wasn't foolproof anyway, and a waterfall-style hand-off to Shivani and the POs for testing/approval.

  • There was a very slow pace of development.

  • The existence of many repos causes difficulties for Devs working across multiple services for one ticket/feature.

2. Where we are now:

  • Small incremental releases that can be done multiple times per day.

  • There is the potential to fail fast (each incremental release contains a tiny amount of code, and therefore represents a tiny amount of risk) and to roll back (with each release being tiny, roll-back is comparatively simple). Roll-back itself is functionality that never previously existed; it used to be a fiddly manual Dev task.

  • Time previously spent on manual testing of a weekly or two-weekly release can now be spent getting the ~40 failing E2E tests fixed, getting other tests up to scratch, and investing in newer testing to increase confidence in all future releases.

  • Developers carry more responsibility for writing good code, which should result in better code.

  • The pace of development will increase, as deployment is no longer held up by so much of a dependency queue.

3. Where we are going:

  • The team will build up a dependency graph, with a view to then removing dependencies over time.

  • Moving to a mono-repo, allowing work for one ticket/feature to sit in one branch. This will be easier to PR. Additionally, it will reduce the incidence of back-end or front-end code being deployed in isolation from the other, meaning Jira Stories are released rather than individual Sub-tasks.

  • Integrating notifications of failure into Slack.

  • Detecting errors in logs. Once detected, they can be tracked. With that information the team can decide whether to roll back, or to fix and deploy.

  • All services except Reval, Generic upload, NDW and ESR are set up with the new pipeline. Devs can apply the new pipeline to these remaining services themselves (as a good test/practice, especially for those Devs who haven't done this sort of thing before).
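
As a starting point for the dependency graph mentioned above, the Maven POMs alone can give a first cut. The following is only a minimal sketch under stated assumptions: the service names (service-a, service-b, common-lib) and the function names are hypothetical, the POMs are assumed to declare the standard Maven 4.0.0 namespace, and dependencies that exist only as REST calls between services would not be picked up and would have to be added separately.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Standard namespace declared by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def parse_pom(pom_xml: str) -> tuple[str, set[str]]:
    """Return (artifactId, set of declared dependency artifactIds) for one POM."""
    root = ET.fromstring(pom_xml)
    artifact = root.findtext("m:artifactId", namespaces=POM_NS)
    deps = {
        dep.findtext("m:artifactId", namespaces=POM_NS)
        for dep in root.findall("m:dependencies/m:dependency", POM_NS)
    }
    return artifact, deps


def build_dependency_graph(poms: list[str]) -> dict[str, set[str]]:
    """Map each service to the other in-house services it depends on.

    Third-party libraries are dropped: only artifacts whose own POM
    was supplied are treated as in-house.
    """
    parsed = dict(parse_pom(p) for p in poms)
    internal = set(parsed)
    return {name: deps & internal for name, deps in parsed.items()}


# Hypothetical POMs, for illustration only.
SAMPLE_POMS = [
    """<project xmlns="http://maven.apache.org/POM/4.0.0">
         <artifactId>service-a</artifactId>
         <dependencies>
           <dependency><groupId>org.example</groupId><artifactId>common-lib</artifactId></dependency>
           <dependency><groupId>junit</groupId><artifactId>junit</artifactId></dependency>
         </dependencies>
       </project>""",
    """<project xmlns="http://maven.apache.org/POM/4.0.0">
         <artifactId>service-b</artifactId>
         <dependencies>
           <dependency><groupId>org.example</groupId><artifactId>service-a</artifactId></dependency>
           <dependency><groupId>org.example</groupId><artifactId>common-lib</artifactId></dependency>
         </dependencies>
       </project>""",
    """<project xmlns="http://maven.apache.org/POM/4.0.0">
         <artifactId>common-lib</artifactId>
       </project>""",
]

graph = build_dependency_graph(SAMPLE_POMS)
for service in sorted(graph):
    print(service, "->", sorted(graph[service]))
```

Services with an empty dependency set are the easiest candidates to release independently; those with many inbound edges are where removing dependencies is likely to pay off first.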

Action items (my suggested actions and owners)

@Panos Paralakis (Unlicensed) : Lead on building up a dependency graph (using POMs and REST calls)
@Simon Meredith (Unlicensed) : Lead on coordinating the move to a mono-repo in Git (can you delegate to someone for your week off, next week, please?)
@Chris Mills (Unlicensed) : Work on integrating failure notifications into a Slack channel (presuming this will require a new Slack channel)
@Chris Mills (Unlicensed) : Work on detecting errors in logs such that they can be tracked and the Dev team / POs can determine whether / when roll-backs need to be initiated
@Oladimeji Onalaja (Unlicensed) : Lead on coordinating getting the new pipeline applied to the remaining services (Generic upload, ESR, Reval - others? @Chris Mills (Unlicensed) are you able to confirm, please?)
@Shivani Rana (Unlicensed) : Lead on addressing the failing E2E tests. The sooner these can be fixed, the sooner we can use a correctly failing E2E test as a blocker for the build
@Chris Mills (Unlicensed) : Can you add links here for: How to roll-back | (team: is there anything else you want Chris to link to here to clarify the new release approach?)
Whole team : Work out, as soon as possible, an approach to clearing the releases currently awaiting approval, so that they do not interfere with the team using the new approach
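
For the log-error tracking action above, a first iteration could be as simple as counting ERROR lines per service and flagging any service that crosses a threshold for a roll-back-or-fix discussion. This is a sketch only: the log line layout (timestamp, level, service, message), the service names and the threshold are all assumptions made for illustration, not a description of the team's actual logs.

```python
from collections import Counter


def count_errors(log_lines):
    """Count ERROR-level lines per service.

    Assumes each line looks like: "<timestamp> <LEVEL> <service> <message>".
    Lines that do not fit that shape are ignored.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=3)
        if len(parts) >= 3 and parts[1] == "ERROR":
            counts[parts[2]] += 1
    return counts


def services_to_review(counts, threshold):
    """Return the services whose error count is at or above the threshold."""
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = [
    "2018-10-11T09:00:00Z INFO service-a started",
    "2018-10-11T09:00:05Z ERROR service-a failed to reach database",
    "2018-10-11T09:00:07Z ERROR service-b timeout calling service-a",
    "2018-10-11T09:00:09Z ERROR service-a failed to reach database",
    "malformed line",
]

error_counts = count_errors(SAMPLE_LOG)
flagged = services_to_review(error_counts, threshold=2)
print(dict(error_counts))
print("Review for roll-back or fix:", flagged)
```

The flagged list is the kind of summary that could later be posted to the Slack channel, leaving the roll-back decision itself with the Dev team and POs.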