Size matters. No, I’m not making an off-color joke, but I did get your attention. One thing that seems to get lost on people is the idea of opportunity for a defect. For example, my computer crashed today. My wife’s computer did not. Is my computer unreliable? Not necessarily. Yes, my computer did crash once, but due to my job, I also use my computer almost 12 hours a day (let’s ignore the fact that I probably work too much). My wife, by comparison, just checks her email for a few minutes, let’s say 30 minutes at most.
In any given day, I use my computer 720 minutes; my wife uses hers 30 minutes. Because I use my computer 24 times as much as she does, there are far more chances for my computer to crash. If your computer is off, you can’t crash it.
When we talk about improving processes, we discuss the problems in terms of defects per million opportunities (DPMO). DPMO is useful because it allows you to compare disparate volumes. If I make a million widgets and have 2 defects and you make 100 widgets and have 2 defects, who’s worse? We both had 2 defects. By having a common denominator, you can tell. One could assume that if you had made a million widgets instead of a hundred, you’d have about 20,000 defects. Now we’re comparing apples to apples: my 2 defects per million versus your 20,000 defects per million.
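The widget arithmetic above can be written out as a minimal sketch. The function name dpmo is my own choice for illustration, and the numbers are the ones from the widget example:

```python
def dpmo(defects: int, opportunities: int) -> float:
    """Scale a defect count to defects per million opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    # Multiply before dividing so these simple cases stay exact.
    return defects * 1_000_000 / opportunities


mine = dpmo(2, 1_000_000)   # 2 defects in a million widgets
yours = dpmo(2, 100)        # 2 defects in a hundred widgets

print(f"mine:  {mine:,.0f} DPMO")
print(f"yours: {yours:,.0f} DPMO")
```

The raw counts are identical, but once both are put over the same denominator the difference is four orders of magnitude.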
In most types of process work, the unit of opportunity is obvious. It might be defects per order filled, or defects per transaction processed. Software, well, that isn’t so pretty. What exactly is an opportunity to create a defect when it comes to code?
In fact, I had an experience over the past few days that brought up this exact issue. I was asked “are we getting better at delivering this software?” In other words, are we getting fewer defects in production?
First off, in terms of pure defect counts we seemed to be getting worse. There were simply more defects over the past few months than we had ever had before. Oh, the sky was falling, I tell you! More defects!?!? The world was ending! Ah, not so fast, I argued. What if there were more opportunities to have a defect recently? Maybe things were getting better, or at least not getting worse.
Of course, if you’re the person who is on the receiving end of being told that your software development process is a problem, you’re pretty ready to listen to someone who tells you that it might not be as bad as it seems. So, they said, let’s look at defects per transaction. We ended up with a control chart that looked something like this:

To me, looking at this chart, there are two populations. There’s a population hovering around the lower control limit (LCL) and then a second population around the upper control limit (UCL). Between observations 8 and 9, something appears to have changed, even though, as far as we knew, nothing about the process was different. It seems to me that using defects per transaction split the data into two populations. Since the process was believed to be stable, this seemed odd.
I decided perhaps a different denominator would be appropriate. Maybe a transaction wasn’t the right measure of an opportunity for a defect. This seems to make sense, since software is one of the most repeatable processes we have. It’s digital, after all. Given the same inputs, absent any funny memory leaks, the code should generate the exact same output. Therefore, running more transactions is unlikely to yield more code defects.
Instead, I proposed we look at defects per hour of coding effort. Yes, like lines of code, function points and any other measure you can think of, there are problems with “hour of coding effort” as a measure of opportunity. Fear not, I’m just using it to illustrate a point. When I created the control chart using that denominator, I got this:

Ah ha! A process that appears to be in statistical control if I look at defects per hour of effort. Now, for my unhappy manager who was being yelled at about the code quality, this wasn’t the best story in the world. I saw no evidence of “good” special cause variation. They didn’t appear to be getting better at delivering software, but they didn’t appear to be getting worse either. Crisis averted!
Here’s my proposal. If you don’t know what the correct unit of work is for the denominator, simply throw out a few ideas. Then, create control charts using each as the denominator. If using that denominator doesn’t bring what is supposed to be a stable process under control, it’s probably the wrong choice. In my example above, we discarded defects per transaction because it seemed to create two populations we didn’t expect. Instead, we went with defects per hour of coding because it did stabilize the process. Now, I’m not suggesting you should allow your own bias to find some complicated way of showing there is or isn’t variation. Make your denominator something simple to count. And it’s likely that more than one denominator would work. In my case, not only would hours of effort have worked, but number of lines of code changed, function points created or even requirements filled would probably have had the same effect. The point is to find a good, not necessarily perfect, denominator to normalize for the amount of work being done.
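Here is a rough sketch of that "try a few denominators" idea. A couple of loud assumptions: I never said which kind of control chart I used, so this sketch assumes a u-chart (defects per unit, with 3-sigma limits that vary with the size of each period), and every number in it is made up for illustration, built so the workload jumps after the eighth observation. The names out_of_control, defects, transactions_k and coding_hours are hypothetical too.

```python
from math import sqrt


def out_of_control(defects, units):
    """Return indices of periods outside 3-sigma u-chart limits.

    Assumes a u-chart: center line is total defects / total units,
    and each period gets limits based on its own number of units.
    """
    u_bar = sum(defects) / sum(units)
    flagged = []
    for i, (c, n) in enumerate(zip(defects, units)):
        half_width = 3 * sqrt(u_bar / n)
        lcl = max(0.0, u_bar - half_width)  # a rate cannot go below zero
        ucl = u_bar + half_width
        if not lcl <= c / n <= ucl:
            flagged.append(i)
    return flagged


# Hypothetical monthly data, NOT the real project numbers.
defects = [20, 25, 17, 27, 21, 22, 21, 24, 69, 80, 73, 83]
transactions_k = [100, 110, 95, 105, 100, 98, 102, 100, 101, 99, 103, 100]
coding_hours = [400, 480, 360, 520, 420, 460, 400, 480, 1400, 1560, 1480, 1640]

candidates = {
    "per thousand transactions": transactions_k,
    "per coding hour": coding_hours,
}

for name, units in candidates.items():
    flagged = out_of_control(defects, units)
    verdict = "looks stable" if not flagged else f"signals at {flagged}"
    print(f"defects {name}: {verdict}")
```

With this made-up data, the transaction denominator throws signals in the later months while the coding-hour denominator does not, which is the pattern described above. Note that this only checks points against the limits; a fuller version would also apply run rules, such as a long string of points on one side of the center line.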
Now, when you go to change the process, you’ll have some way of knowing whether you’re getting better or not, even if you do more or less work.