Site Reliability Engineering is Operations: I saw two emotionally charged opinions on Twitter this week about SRE and Operations. They really made me think. (Dec 19, 2018)
SRE Consensus Building: I put out a call on Twitter asking what I should write about. Two of the responses resonated with me. (Jul 2, 2017)
Release Engineering: How do you release software in a safe way, with reliability in mind? How do you bring together your development process with SRE practices… (May 25, 2017)
Service Level Objectives in Practice: Service Level Objectives, or SLOs, are the fundamental basis of all Site Reliability Engineering. Without them you can’t have error budgets… (May 16, 2017)
Service Level Indicators in Practice: How well is your system working, right now? (May 11, 2017)
Planned Outages: Planned outages can make systems at Google more reliable. (Apr 11, 2017)
Service Level Objectives: Definitions of what an SLI and an SLO are, and a discussion of how to define one. (Apr 5, 2017)
Motivation for Error Budgets: Error budgets represent the amount of failure we expect to actually have. (Apr 2, 2017)
Risk Tolerance of Services: How to decide how fault tolerant you really want to be, and how to define the value of reliability. (Published in HackerNoon.com, Mar 29, 2017)
Commentary on Site Reliability Engineering: In-order index of all my published articles on the SRE book. (Mar 23, 2017)