Key Results: Benchmarking AWS and Azure


One of the key results in this quarter's OKR for the AWS migration is to perform a stress test against AWS. This page details that test, along with other benchmarks aimed at different audiences.

Stress Testing

The following tables show HTTP result counts for both Azure and AWS, measured on one of the larger TIS components (TCS) against one of its slower endpoints (get post by id with placements).

Azure:

AWS:

Insights

Here, we can see that AWS has a much lower error response rate under load. This could be because of the way the load is spread across the servers.

Response Times

The following is a graph of response times, in milliseconds, from the same endpoint with 30 concurrent users accessing the same data.

 

Insights

Again, AWS responds to requests faster than Azure, with its worst-case response times 110 milliseconds faster than Azure's.
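A measurement like the ones in the two sections above (HTTP result counts and response times under concurrent load) could be reproduced with a small script. The following is only a minimal sketch, not the tool used for the results on this page: `run_load_test`, `make_http_fetch` and the default of 10 requests per user are hypothetical, and any URL passed in is a placeholder for the TCS endpoint under test.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def make_http_fetch(url: str, timeout: float = 10.0) -> Callable[[], int]:
    """Build a function that GETs `url` (a placeholder) and returns the HTTP status."""

    def fetch() -> int:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            return err.code  # 4xx/5xx responses still carry a status code

    return fetch


def run_load_test(
    fetch: Callable[[], int],
    concurrent_users: int = 30,
    requests_per_user: int = 10,  # assumed value, not taken from the original test
) -> Tuple[Counter, List[float]]:
    """Call `fetch` from several threads; return status counts and sorted latencies (ms)."""

    def one_user() -> List[Tuple[int, float]]:
        results = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                status = fetch()
            except Exception:
                status = 0  # 0 marks a request that failed without any HTTP status
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            results.append((status, elapsed_ms))
        return results

    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(one_user) for _ in range(concurrent_users)]
        samples = [sample for future in futures for sample in future.result()]

    status_counts = Counter(status for status, _ in samples)
    latencies_ms = sorted(ms for _, ms in samples)
    return status_counts, latencies_ms
```

Running `run_load_test(make_http_fetch(url))` once per environment would give a status tally to compare error rates and a sorted latency list whose last element is the worst case for that run.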

More Response times

Ultimately, end users are the main customers of the TIS system, so we need to show that the move has had no detrimental effect on their day-to-day work.

The following are some response times measured in the browser, as users will see them while working on TIS.

Azure TIS after login:

 

AWS TIS after login:

 

Azure TIS view person (person id 28):

 

AWS TIS view person (person id 28):

 

Azure TIS view post:

 

AWS TIS view post:

Insights

The screen grabs above show that the browser spends most of its time running TIS code, but it is clear that idle time (time spent waiting for things such as TIS responses) is greatly reduced on AWS. This could be due to any number of factors (better hardware, closer physical location, etc.), but it shows that users spend less time waiting.

Reliability

Below is a demo of the reliability checks in action. In this video, we run TIS on two servers. Once a server has been disabled, health checks detect it and stop routing traffic to that server, so users can continue to access TIS. Failover takes a while to kick in, but it allows users to carry on with their day-to-day work and gives IT time to fix any issues.

https://www.loom.com/share/7a210877bd4242c3ac56cfa14b6c29f2
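The behaviour shown in the demo can be modelled in a few lines. This is an illustrative sketch only, not how the AWS load balancer is implemented: `HealthCheckedRouter`, the `probe` callback and the threshold of 3 are all hypothetical. It assumes a server is taken out of rotation only after several consecutive failed checks, which would be one explanation for the delay before failover kicks in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Server:
    name: str
    consecutive_failures: int = 0
    healthy: bool = True


class HealthCheckedRouter:
    """Round-robin router that skips servers marked unhealthy (toy model)."""

    def __init__(self, names: List[str], unhealthy_threshold: int = 3) -> None:
        self.servers = [Server(name) for name in names]
        self.unhealthy_threshold = unhealthy_threshold  # assumed value
        self._next = 0

    def run_health_checks(self, probe: Callable[[str], bool]) -> None:
        """Probe every server once; `probe` returns True when the server responds."""
        for server in self.servers:
            if probe(server.name):
                # Simplification: a single successful check restores the server.
                server.consecutive_failures = 0
                server.healthy = True
            else:
                server.consecutive_failures += 1
                if server.consecutive_failures >= self.unhealthy_threshold:
                    server.healthy = False

    def route(self) -> str:
        """Return the name of the next healthy server in round-robin order."""
        healthy = [s for s in self.servers if s.healthy]
        if not healthy:
            raise RuntimeError("no healthy servers available")
        server = healthy[self._next % len(healthy)]
        self._next += 1
        return server.name
```

In this model, with two servers and one of them disabled, `route()` keeps returning both names until the failed server has missed `unhealthy_threshold` checks in a row, after which all traffic goes to the remaining server.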

Insights

There's more work to be done here, as alerting could still be configured, but because AWS provides this feature with minimal effort, we already have something better than what we have in Azure.

Build times

One way to improve the development experience is fast turnaround (feedback) from external systems. Typically, when developing a feature, a developer pushes code to a central repository regularly. This code is a possible release candidate and therefore needs to go through a pipeline of quality checks. The pipeline can take some time to complete, and the developer should not be left waiting for feedback.

Below are several screen captures of the same pipelines, run for the same code/features, showing how long each takes to complete.

Admins UI Azure:

Admins UI AWS:

TCS Azure:

 

TCS AWS:

ESR Inbound reader Azure:

ESR Inbound reader AWS:

Insights

Comparing the pipelines on AWS and Azure, some stages are up to roughly 1 minute faster on AWS. If multiple developers push multiple times a day, the compound savings could potentially be considerable.
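To put a rough number on the compound saving, the arithmetic can be written down as below. This is a back-of-the-envelope sketch: only the "up to roughly 1 minute" figure comes from the comparison above, while the function name, the push frequency and the team size are hypothetical inputs to be replaced with real values.

```python
def minutes_saved_per_week(
    saving_per_run_minutes: float,
    pushes_per_developer_per_day: int,
    developers: int,
    working_days: int = 5,
) -> float:
    """Upper-bound estimate of pipeline minutes saved per week across the team."""
    runs_per_week = pushes_per_developer_per_day * developers * working_days
    return saving_per_run_minutes * runs_per_week


# Hypothetical inputs: 6 developers, 5 pushes each per day, 1 minute saved per run.
print(minutes_saved_per_week(1.0, 5, 6))  # 150.0
```

Because the 1 minute figure is the best case for some stages, the result should be read as an upper bound and not as a measured saving.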

Cost

For other stakeholders (management and C-level staff), cost can be a deciding factor in choosing a cloud provider.

Currently, in Azure, we have an inventory of virtual machines and registries using storage space. The average monthly cost of running TIS in Azure is unknown because we have been denied access to that information. AWS, on the other hand, provides an estimate.

The issue with this estimate is that it currently includes many experiments and resources being used for the migration and for other projects (TIS SS and Reval).

Insights

It's not currently possible to make an apples-to-apples comparison of the cost of running TIS in Azure and in AWS. In addition, we have so far done little to optimise cost and resource usage.

It's probably better to come back to this point at a later date.

 
