Measuring DevOps - Where are you on the cheat scale?
Much has been written on the subject of DevOps metrics, especially how many organizations use manufacturing-like KPIs such as lead/cycle times and defect counts to assess IT performance. Of course these metrics are worth scrutinizing, since the fast delivery of high-quality apps from our software factories has become a major imperative in an application-driven economy.
“Most people don’t want to do the wrong thing, but they will when process and procedure work against them”
When we adopt these metrics, we understand that they’re influenced by many others. For example, a Net Promoter Score or customer satisfaction index will depend on a combo of elements – like how fast you deploy valuable widgets and how quickly you can fix them when things go south. Similarly, customer conversions through mobile channels will be influenced by factors within and beyond our control – like app deployment frequency and ease-of-use/design, but also by device and network reliability.
This is all great and well understood, but often in the quest for metrics nirvana we forget to include the basic and obvious. In fact, ignoring one basic element can derail a DevOps initiative faster than a sneeze disappearing in a snowstorm.
It’s called cheating.
Now this doesn’t involve teams of agile developers juicing up the code-base with illegal substances, or operations participating in illicit cloud gambling activities. Nope, it’s more about any individual or team taking advantage of existing conditions and loopholes to circumvent processes or remove constraints. Cheating might seem a harsh word because the act isn’t necessarily dishonest. However, if the individual is incentivized to do it, or the act puts the organization at risk, then the business has been cheated – it’s that simple.
Cheating manifests itself in many ways across the software pipeline. For example:
Ace up the sleeve – a mobile app developer who needs infrastructure for some critical A/B testing. Frustrated at the amount of time it takes operations to release systems needed for testing and with the cajoling of a business unit manager, he procures some cloud services and starts mocking up some test patterns – easy peasy.
There’s a limit to how often you can use that extra card – it doesn’t scale. When unavailable systems, lack of test data and missing performance information cause people to take shortcuts in design, coding and testing, there’ll eventually be quality and cost problems.
Runner in the crowd – a web developer is assigned some refactoring or documentation work, but doesn’t want to be distracted from working on a cool new project. She knows that the release process is usually delayed, so decides to hold off on the updates; intending to slip them back into the next release at the last possible moment – it never happens.
Some marathon runners have cheated by casually slipping into the race from the crowd just before the finish line. This example is no different – release bottlenecks are causing bad behaviors, resulting in unnecessary technical debt and apps that are hard to support.
Nobbling the ‘competition’ – a new autonomous agile development group ignores advice from the Enterprise Architecture (EA) team. In their minds the current architecture is too rigid and cumbersome, making it completely impractical to implement. By excluding EA they gain some immediate wins, but over time they start to encounter problems integrating and scaling applications; with many new reliability and security problems surfacing.
People will duck and weave when processes and architecture are complex, inflexible and overgoverned. It happens when you have rigid change advisory boards, standardization dictates and many other procedural roadblocks, which while not necessarily wrong, are often out of step with agile and DevOps practices. In the short term these dodges and workarounds might seem fine, but watch out when they’re accepted as “normal behavior” and start compromising the quality of applications.
What’s your Cheat Score?
To start addressing these issues, build what’s called a cheat index – a scale from the ridiculous to the sublime, measuring our tendency to circumvent established procedures or introduce sub-optimal technologies. Doing so requires taking a leaf out of the Lean book with a Gemba Walk: traversing your software pipeline and observing all your poor practices, elements of waste and people behaviors – up close and personal, so to speak.
Next, begin linking and correlating this metric/score to your key indicators of performance – lead times, deployment frequency, customer conversions and so on – gaining a warts-and-all picture of the business outcomes of cheating. Be prepared for some surprises, because you never know – some cheats might actually be beneficial.
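As a rough sketch of that correlation step – assuming you record a per-team cheat score from your Gemba Walks alongside a KPI like lead time (all names and numbers below are hypothetical, not real measurements) – plain Python is enough to get a first signal:

```python
# Hypothetical sketch: correlate each team's observed "cheat score"
# (workarounds counted during a Gemba Walk) with its mean lead time.
# The data is illustrative only.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# One row per team: (cheat score on a 1-10 scale, mean lead time in days)
teams = [(2, 4), (3, 6), (5, 9), (7, 14), (9, 21)]
scores = [t[0] for t in teams]
lead_times = [t[1] for t in teams]

r = pearson(scores, lead_times)
print(f"cheat score vs lead time: r = {r:.2f}")
```

A strongly positive r would suggest cheating and slow delivery travel together; a negative one on some KPI might flag one of those surprise "beneficial cheats" worth a closer look. With more than a handful of teams, you'd reach for pandas or scipy instead, but the idea is the same.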
Now enact a series of initiatives designed to improve your cheat profile. These could be as simple (or difficult depending on your perspective) as education and training, but most likely involve automation to remove any constraints, bottlenecks and waste that cause people to cheat. Then it’s a case of ‘rinse and repeat’ – assessing the impact of initiatives, adjusting your goals/targets, and finding more opportunities for improvement.
Most people don’t want to do the wrong thing, but they will when process and procedure work against them. Often the behavior isn’t seen as wrong; it’s just a “white cheat” needed to get a good job done. Any DevOps initiative should take this into account, examining the people factor at all times. Always consider that a culture is governed by behaviors – anything you can do to improve these will go a long way to increasing employee satisfaction and value.