...
Below is a video demo of the reliability checks in action. In this video, we run TIS on two servers. Once a server has been disabled, the health checks detect it and stop routing traffic to that server, so users can continue to access TIS. The checks take a while to kick in, but they allow users to carry on with their day-to-day work and give IT time to fix any issues.
https://www.loom.com/share/7a210877bd4242c3ac56cfa14b6c29f2
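To make the behaviour in the video more concrete, here is a minimal sketch of how a health-checked load balancer could decide where to send traffic. It is a toy simulation, not our actual AWS configuration: the `Target` and `LoadBalancer` classes, the server names, and the threshold of 3 are all assumptions made for illustration. The delay before traffic stops reaching a disabled server corresponds roughly to the check interval multiplied by the number of consecutive failures required.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A server behind the load balancer (hypothetical model)."""
    name: str
    up: bool = True        # whether the server actually responds
    healthy: bool = True   # what the load balancer currently believes
    failures: int = 0      # consecutive failed health checks


class LoadBalancer:
    """Round-robin routing over targets that are currently marked healthy."""

    def __init__(self, targets, unhealthy_threshold=3):
        self.targets = targets
        # Assumed value: failed checks in a row before a target is removed.
        self.unhealthy_threshold = unhealthy_threshold
        self._next = 0

    def run_health_checks(self):
        """Run one round of checks; in practice this happens on a timer."""
        for target in self.targets:
            if target.up:
                # Simplification: a single passing check restores the target.
                target.failures = 0
                target.healthy = True
            else:
                target.failures += 1
                if target.failures >= self.unhealthy_threshold:
                    target.healthy = False

    def route(self):
        """Return the name of the target that receives the next request."""
        healthy = [t for t in self.targets if t.healthy]
        if not healthy:
            raise RuntimeError("no healthy targets available")
        target = healthy[self._next % len(healthy)]
        self._next += 1
        return target.name


# Usage: disable one server and watch traffic move to the other.
lb = LoadBalancer([Target("server-a"), Target("server-b")])
lb.targets[0].up = False

lb.run_health_checks()
lb.run_health_checks()
still_in_rotation = lb.targets[0].healthy  # threshold not reached yet

lb.run_health_checks()  # third consecutive failure
routed = [lb.route() for _ in range(4)]
```

Until the third failed check, the disabled server is still in rotation, which is why some requests can fail during the detection window. After that, every request goes to the remaining server.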
Insights
There’s more work to be done here, as alerting could still be configured. But because AWS provides this feature with minimal effort, we’ve already got something better than Azure.
Build times
One way to improve the development experience is to give developers fast turnaround (feedback) from external systems. Typically, a developer working on a feature pushes code to a central repository regularly. Each push is a possible release candidate and therefore needs to go through a pipeline of quality checks. This pipeline can take some time to complete, and you don’t want the developer waiting around for feedback.
...