So it's been quite a while since the initial “Moon on the stick” session and we’ve learnt quite a bit since then. It’s been decided that the migration to AWS will happen as soon as possible, with the intention of switching off Azure (as far as TIS is concerned) and serving TIS from AWS. This means that a gradual move to AWS, redesigning parts of TIS along the way, won’t be possible, as running two environments concurrently would increase costs. Instead, an “as is” migration will happen, keeping things as static as possible, with changes to TIS made later via technical debt tickets.
Where are we now?
The following is a diagram of what we have for TIS in both a hardware and software view.
The DevOps team has done some preliminary spikes/investigations to gauge possible technical strategies. This has left us with a number of virtual machines (EC2 instances) and a large part of the TIS services already deployed in AWS.
Methodology/Strategy/Focus
TIS is a medium-sized application with many developers working on it at any one time. It also has downstream dependencies which we don’t control but may influence. With this in mind, we should try to adhere to the following principles.
Developer Flow
Whatever we do, we must bring the rest of the team with us by keeping them “in the know”. We’ll also need to make as little change to the environments and applications as possible, so that if anything happens, developers will not need to do anything “special”, or at least they will know what they need to do to achieve their goal in the new environment. tl;dr: minimise hacks and keep things as close as possible to how they are now.
Seamless
The migration should be completely seamless to a user. Browsing TIS while it is hosted on Azure should look and feel no different from when it is served from AWS.
Downstream dependencies such as the NDW and GMC should also see no impact.
To infinity and beyond
Any work done for the migration should be done with some thought towards the eventual “Moon on a stick” target, so any solution should make getting there easier, not harder.
The work
We currently see a number of parts to this migration work. These consist of:
Data
We currently have MySQL holding the majority of the TIS data, as well as MongoDB holding data used for the ESR (new world) integration work.
Our intention is to get the data migrated into AWS “as is”, with MySQL VMs (just like in Azure) as EC2 instances and AWS Database Migration Service (DMS) streaming updates from Azure into AWS. These DB VMs will have the same network addresses and credentials, so the currently deployed applications will not need to be updated in any way (or have any different environment variables) in order for them to serve data.
These new DB VMs will then have another DMS instance streaming data into the AWS managed database services. These databases will be the final destination of the data, and new systems such as the Trainee UI and the “Moon on a stick” TIS will access their data from them.
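As a rough illustration of what one leg of that DMS chain might look like, the sketch below creates a full-load + CDC replication task with boto3. The region, ARNs, schema name and task identifier are all placeholders rather than the real TIS configuration, and in practice the tasks would be defined through whatever tooling (console, Terraform, Ansible) the DevOps team settles on.

```python
# Sketch only: one DMS full-load + CDC replication task, created with boto3.
# All ARNs, identifiers and the table-mapping rules are illustrative placeholders.
import json
import boto3

dms = boto3.client("dms", region_name="eu-west-2")  # placeholder region

# Replicate every table in a hypothetical "tis" schema as-is.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-tis-schema",
            "object-locator": {"schema-name": "tis", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

response = dms.create_replication_task(
    ReplicationTaskIdentifier="azure-mysql-to-aws-mysql",        # placeholder name
    SourceEndpointArn="arn:aws:dms:...:endpoint:azure-mysql",    # placeholder ARN (Azure MySQL source)
    TargetEndpointArn="arn:aws:dms:...:endpoint:aws-mysql-ec2",  # placeholder ARN (EC2 MySQL target)
    ReplicationInstanceArn="arn:aws:dms:...:rep:tis-migration",  # placeholder ARN
    MigrationType="full-load-and-cdc",  # initial copy, then stream ongoing changes
    TableMappings=json.dumps(table_mappings),
)
print(response["ReplicationTask"]["Status"])
```

A second task of the same shape would then point from the EC2 MySQL instances at the managed database services.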
This data strategy will allow us to have all services running in parallel (both AWS and Azure), then destroy the services in Azure, have AWS serve all traffic, and keep the data for the Trainee UI fresh.
Warning: with this strategy, it's very important that users who have access to the AWS version of TIS while it's still running in Azure are NOT allowed to modify data, as this would create a disconnect over which copy of the data is correct. Only once Azure has been decommissioned should users be able to change data in AWS.
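One possible way to enforce that guard at the database level (a sketch only, assuming the AWS-side MySQL VMs are plain MySQL and we hold admin credentials; host and credentials below are placeholders) is to flip the server's read-only flag until cut-over. The account DMS writes with would need a privilege that bypasses read_only (e.g. SUPER), or the flag would be applied only to the application-facing instances.

```python
# Sketch: put the AWS-side MySQL instances into read-only mode until Azure is
# decommissioned, blocking ordinary application writes. Host and credentials
# are placeholders, not real values.
import pymysql

conn = pymysql.connect(host="10.0.1.10", user="admin", password="***")  # placeholder connection details
with conn.cursor() as cur:
    cur.execute("SET GLOBAL read_only = ON")  # blocks writes from users without SUPER-level privileges
conn.close()
```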
Developers and Services
During the migration, we’re not going to stop developers from doing their jobs. It won’t be feasible to tell the business that all features/bug fixes will need to stop until we’ve completed the migration. With this in mind, we are having to come up with ways to allow the building/deployments of services to continue with zero impact.
To that end, we’ve come up with initial plans to have builds run concurrently in both Azure and AWS. This will allow us to keep both systems in sync in terms of Maven artefacts and services.
This will involve creating additional webhooks for all of the GitHub repositories that call the AWS Jenkins instance on merges, branch creation and PRs. Additional changes to all of the pipelines will also be needed so that Docker images are pushed straight to AWS’s ECR (Docker container registry) - this will save us from paying for two lots of Docker image storage.
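For the webhook part, a sketch of what that could look like with PyGithub is below; the repository list, token and Jenkins URL are placeholders, and the same change could equally be made by hand in the repository settings or via the GitHub API directly.

```python
# Sketch: add an extra webhook to each TIS repository so pushes/merges, new
# branches and PRs also notify the AWS Jenkins instance. Repo names, token and
# the Jenkins URL are placeholders.
from github import Github  # PyGithub

AWS_JENKINS_HOOK = "https://jenkins.aws.example.org/github-webhook/"  # placeholder URL
REPOS = ["hee-tis/example-service"]                                   # placeholder repo list

gh = Github("<personal-access-token>")  # token needs admin:repo_hook scope

for full_name in REPOS:
    repo = gh.get_repo(full_name)
    repo.create_hook(
        name="web",
        config={"url": AWS_JENKINS_HOOK, "content_type": "json"},
        events=["push", "pull_request", "create"],  # pushes/merges, PRs, new branches
        active=True,
    )
```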
An additional change to remove the DEV environment from the pipelines will also be required, as it’s been decided that we will only have PROD and PREPROD environments in AWS. Once this is done, we can destroy the DEV systems in Azure, saving money early on.
Environments
As stated earlier, because the DEV environment in Azure isn’t used much, we'll want to have only two environments in AWS. This will simplify deployments as well as save HEE money.
The new AWS environments will also have exactly the same subnet CIDR blocks. This will enable us to keep all of the same internal IPs in AWS - further simplifying things for developers (they will not need to learn new IPs and machines) as well as simplifying the pipeline deployments, as the Ansible configuration (inventory, vars and vault) will not need to change.
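To illustrate, recreating the address space with boto3 might look like the sketch below. The CIDR blocks shown are placeholders; the real values would be copied from the existing Azure vNet and subnets, and in practice this would more likely live in infrastructure-as-code than in a one-off script.

```python
# Sketch: recreate the Azure address space in AWS so internal IPs (and the
# Ansible inventory) can stay the same. CIDR blocks below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")  # placeholder region

vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")  # same block as the Azure vNet (placeholder)
vpc_id = vpc["Vpc"]["VpcId"]

for cidr in ["10.0.1.0/24", "10.0.2.0/24"]:  # one per existing Azure subnet (placeholders)
    subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock=cidr)
    print(subnet["Subnet"]["SubnetId"], cidr)
```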
Flexibility and Managed services
Tooling used to migrate VMs…
What can we deliver early and ensure value to the customer?
How do we ensure that the existing systems continue to work?