Month: August 2019

Disaster Recovery with Terraform, AWS and a few lessons learned

What would happen if you unexpectedly had to rebuild your entire production infrastructure from scratch? Would you be able to perform a full recovery of all services and dependencies to an acceptable level? How long would it take? Hours? Days? Would the engineering team know what to do? What problems would you encounter? What about databases and full data recovery? Are backups available? How would you manage operations and set expectations across the business and with clients? This is the kind of nightmare scenario that keeps any SRE awake at night, especially if you’re running a SaaS platform. There is a common perception that these events are like something out of Black Swan theory: they can have a profound impact when they (rarely) happen, and they always arrive as a surprise. But they are less rare than we think. In the last couple of years, I’ve seen major security incidents…
