No, unfortunately this is not about some recently found J.K. Rowling manuscript; it would probably be much more captivating if it were. Rather, I just finished reading The Visible Ops Handbook: Implementing ITIL in 4 Practical and Auditable Steps, which you could call the prequel to The Phoenix Project. Although I don’t think it was the authors’ intention, you could look at The Phoenix Project as a case study and Implementing ITIL as the install guide. One seeks to create understanding via a narrative, while the other is a prescriptive method for implementation.
At first it may seem hard to believe that the same authors who in 2013 published The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win also wrote a book about implementing ITIL. After all, a poorly implemented ITIL strategy can result in layers of bureaucracy and a slowdown of all IT operations. Combine this with the image that many people have of DevOps as a cowboy or shadow IT movement, and the two books at a glance appear to make for odd bedfellows. In reality, ITIL and DevOps can and should be partners in creating a more effective IT organization that meets the needs of the business at increasing velocity. Both can lead to the same results: faster turnaround, greater visibility and security, lower failure rates and less firefighting. So why shouldn’t these two frameworks coexist?
Before going much further, I think it makes sense to take a lesson from the authors and provide a few definitions for the sake of conversation.
- ITIL, as stated on Wikipedia, “is a set of practices for IT service management (ITSM) that focuses on aligning IT services with the needs of business.” It’s a way to define a standard set of processes and controls around IT service. As the authors are fond of pointing out, it can also be used as a universal IT language to define processes, but the true power of ITIL is in cleaning up a messy ITO.
- DevOps is often looked at as more of a culture or movement than a framework, but it seeks to more tightly integrate the Development and Infrastructure (or Operations) sides of IT organizations. Where ITIL is extremely metric driven, DevOps is often very focused on tooling. In the context of DevOps, the Development side of the equation includes traditional Developers as well as Test, QA, integration, etc. Operations refers to what I’d prefer to call the Infrastructure team: Sys Admins, DBAs, Release and so on.
- I think it would also help to define an IT organization, or ITO. In the book, the authors reference Development, testing, release, QA, operations and the traditional Infrastructure groups all as parts of an ITO. In smaller companies these may be broken up into disparate groups, but you just as often see them combined into a larger ITO body. In full disclosure, my opinion and experience are that having these functions combined within one ITO makes for better harmony (aka less finger pointing) and enhanced cooperation right out of the gate, even before you apply the recommendations prescribed in the Handbook.
Now that we are speaking the same language, the Handbook presents a simple framework for how to turn your ITO into a highly functioning organization. The book is broken down into four steps to help guide you on your journey towards being a high performing IT organization.
For those of us in the trenches it can be daunting to figure out how to start. When you are constantly fighting fires it can be hard to see a way forward. The first step prescribed in the Handbook is “Stabilize the Patient”. I prefer to call it “stop the bleeding” because IT practitioners are often prone to death by 1000 cuts. It’s hard to look at process improvement when you have to read 1000 emails a day (not an exaggeration), fix the latest crisis du jour, deal with an executive’s pet project, manage vendors, and the list goes on and on.
The premise of Step 1 is pretty simple, and it’s stated bluntly in the first sentence: “Our goal in this phase is to reduce the amount of unplanned work as a percentage of total work done down to 25% or less.” Now if you’re like me, the immediate reaction is “HOW THE HELL CAN I DO THAT!” And the simple answer is: Change Management. I’m pretty sure there was a collective groan from anyone who may be reading this page. The reason many people react that way is that we’ve all seen change management done badly. I’ve seen change management run the gamut from honor-system spreadsheets to long, droning, monotone CAB meetings. They fail because they focus on the mechanics of the process rather than its spirit.
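To make that 25% target a little more concrete, here is a minimal sketch of how you might measure it. It assumes you already log work items with hours and a planned/unplanned flag; the `WorkItem` shape and the sample data are hypothetical and not something the Handbook prescribes.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """One unit of logged work (hypothetical record shape)."""
    description: str
    hours: float
    planned: bool  # False for firefighting, outages, drive-by requests


def unplanned_percentage(items: list[WorkItem]) -> float:
    """Return unplanned hours as a percentage of all hours logged."""
    total = sum(item.hours for item in items)
    if total == 0:
        return 0.0
    unplanned = sum(item.hours for item in items if not item.planned)
    return 100.0 * unplanned / total


# Made-up sample week for one engineer.
week = [
    WorkItem("Patch database servers (approved change)", 6.0, planned=True),
    WorkItem("Restore service after failed deploy", 9.0, planned=False),
    WorkItem("Executive's pet project request", 5.0, planned=False),
    WorkItem("Build new monitoring dashboard", 20.0, planned=True),
]

pct = unplanned_percentage(week)
print(f"Unplanned work: {pct:.1f}% (target: 25% or less)")
```

In this made-up week, 14 of 40 hours are unplanned, so the team sits at 35% and still has ground to cover.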
If you’ve read any of my previous posts, you’ll know that I’m a proponent of a business first and technology second mentality. In the Introduction to the guidebook a lot of attention is focused on the fact that to succeed you have to have a culture and belief system that supports and believes in three fundamental premises:
- Change Management is paramount and unauthorized change is unacceptable. When you consider that 80% of failures are caused by change (human or otherwise), it’s quickly apparent that all change needs to be controlled.
- A culture of Causality. How many outage calls have you been on where someone has suggested rebooting the widget “just to see if this works”? Not only does this approach burn out your first responders and extend outages, but you never get to the cause, and therefore never to a real resolution. My favorite phrase in the book, “Electrify the fence”, goes more into that, which we’ll discuss soon.
- Lastly, you must instill the belief in Continual Improvement. It’s pretty self-explanatory: you fought hard to get to this point, and if you’ve already put in this level of effort you obviously want to see the organization continue moving forward. If you’re not moving forward, then everyone else is catching up.
Now you can’t just have an executive come in, state that “We have a new culture of causality” and expect everyone to get on board. It’s a process, and by successfully demonstrating the value that the process brings, people will come on board and begin embodying the aforementioned culture. What you do need your executives to get behind, and state to the troops, is that unauthorized or uncontrolled change is not acceptable. They say it over and over in the book: the only acceptable number of unauthorized changes is zero.
But how do you get people to stop with the unauthorized changes?
- You need a change management process. You must must MUST have a change process, and it can’t be burdensome.
- There has to be a process for emergency changes. Don’t just skip the whole change approval process for emergencies, because then you set a precedent that says you can avoid the process when it’s important enough. Once you’ve done this you’ve created an issue where everyone thinks their issue is an emergency.
- Consider having a process for routine, everyday activities with normal levels of risk that get auto-approved. The important part is that they are tracked and that …
- The people implementing the changes are accountable. They are accountable for following the process as well as for executing on the change. Many people don’t want to follow change processes, so they won’t. You have to have a means to monitor, detect and report on changes. Once you have this in place, people can truly be held accountable.
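The points above can be sketched in a few lines of code. This is only an illustration under my own assumptions, not the Handbook’s method: the three change types, the `ChangeLog` class and the change IDs are hypothetical names. Routine changes are auto-approved but still tracked, emergency changes still need an approver, and anything detected in the environment without an approved record is flagged.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChangeType(Enum):
    STANDARD = "standard"    # routine, normal risk: auto-approved but tracked
    NORMAL = "normal"        # needs an approver before work starts
    EMERGENCY = "emergency"  # expedited, yet still needs an approver


@dataclass
class ChangeRequest:
    change_id: str
    summary: str
    change_type: ChangeType
    implementer: str                  # person accountable for the change
    approved_by: Optional[str] = None


class ChangeLog:
    """Tracks every submitted change, approved or not."""

    def __init__(self) -> None:
        self.records: dict[str, ChangeRequest] = {}

    def submit(self, request: ChangeRequest, approver: Optional[str] = None) -> bool:
        """Record the request and return True if it may be implemented."""
        if request.change_type is ChangeType.STANDARD:
            request.approved_by = "auto-approved"
        elif approver is not None:
            request.approved_by = approver
        self.records[request.change_id] = request
        return request.approved_by is not None

    def unauthorized(self, detected_ids: list[str]) -> list[str]:
        """Return detected changes that have no approved record."""
        return [
            cid for cid in detected_ids
            if cid not in self.records or self.records[cid].approved_by is None
        ]


log = ChangeLog()
log.submit(ChangeRequest("CHG-1", "Rotate web server logs", ChangeType.STANDARD, "alice"))
log.submit(ChangeRequest("CHG-2", "Upgrade database engine", ChangeType.NORMAL, "bob"))
log.submit(
    ChangeRequest("CHG-3", "Hotfix for payment outage", ChangeType.EMERGENCY, "carol"),
    approver="on-call manager",
)

# IDs as reported by a (hypothetical) change detection tool.
print(log.unauthorized(["CHG-1", "CHG-2", "CHG-3", "CHG-9"]))
```

Here CHG-2 was implemented without approval and CHG-9 was never submitted at all, so both are reported. The emergency change passes because it went through an expedited approval rather than skipping the process.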
And why do you want to go through this process of reducing unplanned and unauthorized change? The number cited in the book is that 80% of all outages are caused by change. Reducing unplanned changes reduces both the number of outages and their duration. Fewer outages means less firefighting. A formal process that MUST be followed also cuts the number of drive-bys your engineers have to deal with, because all changes go through the process.
Lastly, going through a change process forces the change initiators to think intentionally. Someone I respect immensely had me read Turn the Ship Around! last year (another book I’d highly recommend), and in this book there is a concept of acting intentionally. You state to your teammates “I intend to turn the speeder repeater up to 11.” By stating the actions you are about to take, you are forced to think about them and what their results may be. It can also slow you down a touch when you’re about to just make a “harmless little update.” By acting intentionally through the change process, you consider (and hopefully document) exactly what you’re going to do, what the outcome will be, how you’ll test, and what your rollback plan is. All of these acts will provide for higher quality changes, fewer outages and ultimately provide more time for your engineers to focus on the remaining steps in the handbook.
Now by the time you’ve gotten through step one, I’d argue that the heaviest lifting is done, and you’re hopefully learning more about ITIL as you go. The remaining steps that the authors detail in the Handbook share a lot of commonalities and are where you can really find opportunities to start blending the best parts of ITIL and DevOps into your own special IT smoothie.
Step 2 is pretty straightforward. You have to know what you have in order to effectively support your business. You must create an inventory of your assets and your configurations. Honestly, this step can be summed up pretty succinctly: go build a CMDB and an asset DB, otherwise you’re subject to drift and you can’t hold people accountable for their changes. It’s the bridge between Step 1, where you have to know how the environment is set up and configured, and Step 3, where you begin to standardize.
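As a rough sketch of why the inventory matters for drift, the snippet below compares what a CMDB record says about a server with what a scan of that server reports. The `find_drift` helper, the server contents and the version strings are all made up for illustration; a real CMDB and scanner would have their own data formats.

```python
def find_drift(recorded: dict[str, str], observed: dict[str, str]) -> dict[str, tuple]:
    """Map each drifted key to (recorded value, observed value)."""
    drift = {}
    for key in sorted(recorded.keys() | observed.keys()):
        expected = recorded.get(key)  # None if the CMDB has no entry
        actual = observed.get(key)    # None if the server lacks the item
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


# What the (hypothetical) CMDB says about one web server.
cmdb_record = {"os": "Ubuntu 16.04", "nginx": "1.10.3", "openssl": "1.0.2g"}
# What a scan of the live server reports.
live_scan = {"os": "Ubuntu 16.04", "nginx": "1.12.0", "openssl": "1.0.2g", "php": "7.0"}

for key, (expected, actual) in find_drift(cmdb_record, live_scan).items():
    print(f"{key}: CMDB says {expected!r}, server has {actual!r}")
```

Each reported difference is either an unauthorized change to chase down or a gap in the inventory to fix, which is exactly the accountability link back to Step 1.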
Now bear in mind that when this book was originally written, DevOps hadn’t been coined as a term yet, but you can see a precursor in the title of Step 3: “Establish a repeatable build library”. In 2017 the benefits of this are pretty obvious. If you have a standard build, you can hand that release process off to more junior members, or ideally have the process automated. By having your builds standardized and your configurations documented, your environments are not pets, they are cattle. With a standardized environment your outages are likely to be less frequent, and when they do occur the time to resolution will be dramatically shorter because you have a known footprint.
I did struggle a little bit with sections 2 and 3. Section 2, “Catch and Release”, is six pages long and consists mostly of the benefits that having a known inventory will provide. It’s obvious that the authors find this point important enough to break it out, but if it were an easy task everyone would already have the information and documents the authors specify.
This isn’t necessarily a knock on the authors, as it’s a twelve-year-old book, but section 3, “Establish a Repeatable Build Library”, is a bit dated and heavily focused on the ITIL processes. No doubt having a repeatable process is very important, but as we’ve already pointed out, in this day and age velocity matters, and for that you have to have tooling and automation in place. Again, it’s just that you may be able to find better, more modern guides on how to achieve a build system in 2017.
The final section is really interesting to me as it’s part summary, part recap, and part advisory on the pitfalls to watch out for. Any topic on “Continual Improvement”, the heading of section 4, will obviously have a focus on data and metrics. Typically, ITO metrics revolve around system health or availability: is the system up, is the database running too hot, and so on. The authors instead advise looking at more qualitative and performance metrics. After all, the goal is to control the environment and reduce administrative effort so that your knowledge workers can spend more time on value-add work. As you read this section it’s easy to see that many of the ideas in “Continual Improvement” are the seeds from which The Phoenix Project grew. The biggest takeaway for me is that to become a highly effective ITO, you need fewer six-shooters and cowboy hats and more process, roles and controls. Only by controlling the environment can you actually expect predictable results.
The book effectively wraps up with a summary of the objective: “As opposed to management by belief, you have firmly moved to management by fact.” If you’re struggling to reach this goal, The Visible Ops Handbook may be a good place to start; just be prepared to augment it with up-to-date technologies and data.