A typical software engineer’s job is to build an application. A devops engineer’s or SRE’s job is to deploy that application to end users and keep it running smoothly.
Let’s say you have a web app. I’ll pick a big online store as an example.
The engineers are building the features. They’re writing the code to make the app work well on mobile. They’re adding features so you can pretend to try on a shirt before purchasing it. They’re implementing coupons that give you 40% off and expire at the end of the week. They’re writing algorithms to compute what other items each customer might like based on their past purchases.
These days, web apps run “in the cloud”. Running in the cloud really just means renting a bunch of computers in a big warehouse from companies like Amazon, Google, or Microsoft, which have machines for rent all over the world.
When a customer uses your web app, they’re connecting to these computers in the cloud. Those computers host the website content, large files like images and videos, and the databases where all of the inventory is stored. Those computers also handle transactions with other partner companies – like maybe a payment processing site that you use to take money from customers.
The job of devops / SRE is to make sure all of that stuff is running 24/7. That includes:
* Adding more computers if customer traffic is higher
* Upgrading to machines with larger disks or more RAM if needed
* Running in multiple data centers around the world so that users can connect to one close to them
* Monitoring each new version of the app that’s deployed, and rolling it back to a previous version if something’s wrong
* Dealing with power failures, network failures, anything that can go wrong
* Making backups and restoring from backups in a catastrophe
* Working with engineers to seamlessly make changes to the system, like switching to a new database, or a new payment provider, with minimal downtime
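To make one of those bullets concrete, the roll-back decision can be sketched in a few lines of Python. This is a minimal sketch, not how any particular company does it: the function name, the error rates, and the 1% tolerance are all made up for illustration.

```python
def should_roll_back(old_error_rate, new_error_rate, tolerance=0.01):
    """Return True if the new version fails noticeably more often than the old one.

    Rates are fractions of failed requests (0.05 means 5% of requests failed).
    The tolerance is a hypothetical cushion so tiny fluctuations don't trigger a rollback.
    """
    return new_error_rate > old_error_rate + tolerance


if __name__ == "__main__":
    # Hypothetical numbers: the old version failed 0.2% of the time, the new one 5%.
    if should_roll_back(old_error_rate=0.002, new_error_rate=0.05):
        print("New version looks broken, roll back to the previous one")
    else:
        print("New version looks healthy, keep it")
```

In real life the error rates would come from a monitoring system, and the check would run automatically after each deploy.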
Python scripting: it’s just programming / writing code. The only difference is that instead of writing the code to build the app, it’s writing code to automate things that happen behind the scenes to keep the site running.
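Here is a small example of the kind of behind-the-scenes script I mean: checking whether a machine’s disk is almost full. It’s only a sketch. `shutil.disk_usage` is a real function in Python’s standard library, but the helper names and the 90% threshold are assumptions I picked for the example.

```python
import shutil


def disk_is_nearly_full(used_bytes, total_bytes, threshold=0.9):
    """Return True once the used fraction of the disk reaches the threshold."""
    return used_bytes / total_bytes >= threshold


def check_disk(path="/"):
    """Look at the disk that holds `path` and return a short status message."""
    usage = shutil.disk_usage(path)  # gives back total, used and free bytes
    if disk_is_nearly_full(usage.used, usage.total):
        return f"WARNING: disk at {path} is nearly full"
    return f"OK: disk at {path} has room"


if __name__ == "__main__":
    print(check_disk())
```

A script like this might run every few minutes on every machine and send an alert instead of printing, so that someone (or another script) can add disk space before customers notice a problem.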
Kubernetes: the system that manages all of the software running on those computers in the cloud.
Terraform: a tool that helps you safely make changes to everything you have running in the cloud, with less opportunity for human error.
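The core idea behind that kind of tool is: write down what you want, compare it to what you currently have, and review the difference before anything is changed. Here is a toy version of that comparison in Python. To be clear, this is not Terraform’s code or its API, just an illustration, and the function name and server names are invented.

```python
def plan_changes(current, desired):
    """Compare what exists now with what we want and list the differences.

    Both arguments map a resource name to its settings.
    Nothing is changed here; the result is only a plan to review.
    """
    current_names = set(current)
    desired_names = set(desired)
    return {
        "add": sorted(desired_names - current_names),
        "remove": sorted(current_names - desired_names),
        "update": sorted(
            name
            for name in current_names & desired_names
            if current[name] != desired[name]
        ),
    }


if __name__ == "__main__":
    # Hypothetical setup: two web servers today, and we want a bigger web-1 plus a new web-3.
    current = {"web-1": {"ram_gb": 4}, "web-2": {"ram_gb": 4}}
    desired = {"web-1": {"ram_gb": 8}, "web-2": {"ram_gb": 4}, "web-3": {"ram_gb": 4}}
    print(plan_changes(current, desired))
```

Seeing the list of additions, removals, and updates before applying them is what cuts down on human error: nobody has to click through a dashboard and hope they changed the right machine.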
One last note: devops and SRE are not quite the same role. There are different visions of how this role should work and how responsibility should be shared. The common thread, though, is focusing on deploying and maintaining the app in the cloud rather than building the app.