November 21, 2017
Dave Bakshani and Willie Wheeler (@williewheeler)

re:Invent 2017 – Use EC2 Systems Manager to Perform Automated Resiliency Testing in Your CI/CD Pipeline

DEV338: Use Amazon EC2 Systems Manager to Perform Automated Resiliency Testing in Your CI/CD Pipeline

This is one of the topics Amazon selected for presentation at AWS re:Invent this year. At re:Invent, we will talk about what Resilience Engineering is and why it is important to build resilience into your applications. We will also cover some strategies for improving resiliency and show what a resilience experiment looks like.

As we explored this subject, we realized that while it is important to randomly test the resilience of our applications in the production environment, it is equally important, if not more so, to conduct structured resilience experiments on various aspects of our apps in the test environment. Such testing can help us uncover defects in our applications and possible misapplications of resilience measures in a safer environment, without the cost of an accidental outage on our production site. A CI/CD pipeline is an ideal place to ensure such experiments become a standard part of our build and deployment process.

We evaluated several tools for performing attacks, which are one part of a resilience experiment. One tool we found to be a great fit is Amazon EC2 Systems Manager (aka SSM). SSM lets us remotely launch attacks on the EC2 instances running our applications, so we can programmatically test various aspects of our applications’ resiliency.
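To make the idea of a remotely launched attack concrete, here is a minimal Python sketch of what such a call could look like. It is not the implementation presented in the talk. It assumes the boto3 SSM client, the AWS-managed `AWS-RunShellScript` document, and instances that already run the SSM agent. The CPU busy loop, the function names `build_attack_request` and `launch_attack`, and the instance ID in the usage comment are all hypothetical choices made for illustration.

```python
from typing import Any, Dict, List


def build_attack_request(instance_ids: List[str], duration_seconds: int = 60) -> Dict[str, Any]:
    """Build keyword arguments for SSM send_command (hypothetical CPU attack)."""
    # Illustrative attack only: spin a busy loop until `timeout` stops it.
    # GNU timeout exits with 124 on expiry, so that code is treated as success.
    command = (
        f"timeout {duration_seconds} sh -c 'while :; do :; done' || [ $? -eq 124 ]"
    )
    return {
        "InstanceIds": instance_ids,
        "DocumentName": "AWS-RunShellScript",  # AWS-managed shell document
        "Parameters": {"commands": [command]},
        "Comment": "resilience experiment: CPU attack (illustrative)",
    }


def launch_attack(ssm_client: Any, instance_ids: List[str], duration_seconds: int = 60) -> str:
    """Send the attack through Run Command and return the command ID."""
    response = ssm_client.send_command(
        **build_attack_request(instance_ids, duration_seconds)
    )
    return response["Command"]["CommandId"]


# Usage against a real account (assumes credentials and a test instance):
#   import boto3
#   command_id = launch_attack(boto3.client("ssm"), ["i-0123456789abcdef0"], 30)
```

Passing the client in as an argument keeps the request-building logic separate from AWS access, so a pipeline step can check the request without touching a real instance.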

At re:Invent, we will go into detail on how we use SSM Documents and the “Run Command” feature of Amazon EC2 Systems Manager to attack an application remotely. We will demonstrate using SSM to conduct a resilience experiment and show how our application behaves under such an attack.
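As a rough illustration of how an SSM Document and Run Command fit together, the Python sketch below defines a custom Command document, registers it, and then runs it on a set of instances. It is a sketch under stated assumptions, not the documents used in the talk: the document name `Example-CpuAttack`, the `durationSeconds` parameter, the busy-loop command, and the helper function names are hypothetical, and the client is assumed to be a boto3 SSM client.

```python
import json
from typing import Any, List

DOCUMENT_NAME = "Example-CpuAttack"  # hypothetical name, chosen for illustration


def build_attack_document() -> str:
    """Return the JSON content of a Command document (schema version 2.2)."""
    document = {
        "schemaVersion": "2.2",
        "description": "Illustrative CPU attack for a resilience experiment",
        "parameters": {
            "durationSeconds": {
                "type": "String",
                "default": "60",
                "description": "How long the busy loop should run",
            }
        },
        "mainSteps": [
            {
                "action": "aws:runShellScript",
                "name": "burnCpu",
                "inputs": {
                    # SSM substitutes {{ durationSeconds }} before running.
                    "runCommand": [
                        "timeout {{ durationSeconds }} sh -c 'while :; do :; done'"
                        " || [ $? -eq 124 ]"
                    ]
                },
            }
        ],
    }
    return json.dumps(document)


def register_document(ssm_client: Any, name: str = DOCUMENT_NAME) -> None:
    """Create the custom document in the current account and region."""
    ssm_client.create_document(
        Content=build_attack_document(),
        Name=name,
        DocumentType="Command",
        DocumentFormat="JSON",
    )


def run_document(ssm_client: Any, instance_ids: List[str],
                 duration_seconds: int, name: str = DOCUMENT_NAME) -> str:
    """Run the document via Run Command and return the command ID."""
    response = ssm_client.send_command(
        InstanceIds=instance_ids,
        DocumentName=name,
        Parameters={"durationSeconds": [str(duration_seconds)]},
    )
    return response["Command"]["CommandId"]
```

With a real client, a pipeline stage could call `register_document` once and then `run_document` for each experiment, checking the application's behavior while the command is running.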

I would like to acknowledge Kuldeep Chowhan, who originally came up with the concept of using Amazon EC2 Systems Manager to run attacks, and Jay Spang, who implemented several attack types in a service utilizing SSM. Jay also submitted the presentation proposal that was accepted for re:Invent 2017.

See the complete list of Expedia team members speaking at re:Invent 2017.