Authority to port app over to AWS (given a host of issues we've had with Azure and the lack of professional support received from Microsoft)

Status

Decided

Decision leader

@Simon Meredith

Contributors

@John Simmons, Martin Hall

Date

Aug 22, 2019 

Outcome

Decided to move back to AWS.

The Ops team will build the AWS infra and port TIS over to it when ready.

Background

  1. TIS was originally built on AWS.

  2. HEE took the decision that we are essentially a Microsoft house, and have other hosting with Azure, so TIS should be hosted on Azure.

  3. On Azure we have encountered a few problems which we have raised with Microsoft Azure support.

  4. Support received has been negligible, unhelpful, stressful (calls coming in from Microsoft at all hours of the day and night) and frankly unprofessional (Microsoft unbelievably outsources support for Azure!). Tickets are handled by whichever timezone is operational when the ticket comes in, and are then passed between teams in different timezones as each team clocks off for the day. This results in repeated explanations and re-running of tests to get to the root of the problem.

  5. The team again suggested a move back to the much more stable and mature service from AWS.

  6. On 22 Aug, a conversation between Simon (acting Head of Software Development) and Martin Hall (HEE CTO) resulted in a green light to move back to AWS.

  7. We were also asked to move to UK data centres at the same time (in order to calm the Brexit jitters about public sector data).

  8. In addition, we were asked to ensure that everyone has antivirus (AV) and encrypted hard drives on their Linux machines (@Simon Meredith leading on this).

Options considered

 

Option 1:

Description

Stay with Azure

Pros and cons

POSITIVES

Microsoft product and HEE predominantly uses Microsoft products - therefore economies of scale in terms of licences and support.

NEGATIVES

Microsoft support for Azure is abysmal and not fit for purpose for a live, production, product such as TIS.

Estimated cost

small

Option 2:

Description

Move (back) to AWS

Pros and cons

POSITIVES

Stable, mature hosting service

Opportunity to set up the infra again from the bottom up, with all the knowledge we have from using the app in a live, production, environment for over a year - will enable us to configure it exactly as we want it - security, load balancing, disaster recovery, scaling etc.

Opportunity to sync in with the Kubernetes work we're doing at the same time.

Opportunity to sync in with moving the data to UK datacentres that we were planning to do as well.

NEGATIVES

Will need significant testing before porting the whole app back over to AWS - suggest it is developed in parallel initially, and the migration is handled gradually, with testing baked in throughout.

Estimated cost

small

Action items

Stephen to engage intermediary to procure AWS
TIS Ops team to architect the AWS infra
Architecture to be reviewed by AWS
Migrate TIS over to new AWS EKS architecture
Set up a sync between production version of TIS and this new pre-Prod version on AWS
Test AWS infra by redirecting a small percentage of traffic to the AWS setup
Port whole app over to AWS, assuming all testing passes
Archive / delete Azure infra
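
Illustration of the traffic-redirect step

The action items above include testing the AWS infra with a small percentage of traffic before porting the whole app. This record does not say how that split would be implemented. In practice it is often done at the DNS or load balancer layer (for example with weighted DNS records). The Python sketch below only illustrates the idea under stated assumptions: the function name `choose_backend`, the `user_id` key and the backend labels "aws" and "azure" are all hypothetical and are not taken from TIS.

```python
import hashlib


def choose_backend(user_id: str, aws_percent: float) -> str:
    """Pick a backend for a user, sending roughly aws_percent of users to AWS.

    Hypothetical helper for illustration only; not part of TIS.
    """
    if not 0 <= aws_percent <= 100:
        raise ValueError("aws_percent must be between 0 and 100")

    # Hash the user ID so the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()

    # Map the first 8 bytes of the hash to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000

    # Buckets below the threshold go to AWS; the rest stay on Azure.
    return "aws" if bucket < aws_percent * 100 else "azure"


if __name__ == "__main__":
    # Synthetic user IDs, assumed purely for the demonstration.
    users = [f"user-{i}" for i in range(10_000)]
    on_aws = sum(choose_backend(u, 5) == "aws" for u in users)
    print(f"{on_aws} of {len(users)} synthetic users routed to AWS at 5%")
```

Hashing rather than picking at random keeps each user on one backend for the whole test, which makes problems easier to reproduce. Raising the percentage only adds users to the AWS group and never moves anyone back, so the migration can be stepped up gradually.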