I’m not much of a reader honestly. I know that seems strange, but I like to take information in small quantities, like articles, short chapters, etc. Works that need me to read from beginning to distant end to get the whole story don’t keep my attention. So it’s a bit odd that I would be reading “Fooled by Randomness” recently, but it was given to me by a coworker and I felt some obligation to read it. So far, so good, actually. It’s a little less data than I like, in fact holding what might amount to a handful of truths wrapped by long, long prose.
Regardless, as I read through I came upon “the black swan” which finally allowed me to tie together a blog entry that I’ve been hanging onto for a long, long time.
So, here’s the idea of the black swan. “I have never seen a black swan, therefore no black swans exist.” This is a very difficult statement. On one hand, it might be true, for through sampling of the population of swans randomly, you’d probably not see one. They do in fact exist, in Australia, but even with a really large sample, if you never took a swan from down under, you might conclude that no black swans exist.
The author’s point is that the converse statement is much easier. If you can find even one black swan, you can confidently make the statement “not all swans are white.” Say you sampled as few as 2 swans, and one was white and one was black. Tada, your point has been proved. The first statement, on the other hand, is absolute: no matter how large your sample size, it cannot be proved until you can observe the entire population of all swans that have ever existed or ever will exist.
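To put a rough number on how easy it is to miss the black swans, here is a minimal sketch in Python. It assumes simple random sampling from the whole population, which is generous given that a sampler who never visits Australia would do even worse. The function name and the one-in-a-thousand fraction are made up for illustration; they are not figures from the book.

```python
def chance_of_seeing_no_black_swans(black_fraction: float, sample_size: int) -> float:
    """Probability that a random sample of swans contains zero black ones.

    Assumes each sampled swan is independently black with probability
    `black_fraction` (a hypothetical figure, not a real statistic).
    """
    if not 0.0 <= black_fraction <= 1.0:
        raise ValueError("black_fraction must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must not be negative")
    return (1.0 - black_fraction) ** sample_size


if __name__ == "__main__":
    # Hypothetical: one swan in a thousand is black.
    for n in (10, 100, 1000, 10000):
        chance = chance_of_seeing_no_black_swans(0.001, n)
        print(f"sample of {n:>5}: chance of seeing only white swans = {chance:.4f}")
```

The chance of seeing only white swans shrinks as the sample grows, but it never reaches zero for any finite sample, which is the asymmetry between the two statements.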
Anyway, hearkening way back to the Fixed Price Shop entry we encounter a similar story. I had concluded with a high degree of confidence that all estimates fell within a very small range of results and therefore we were conclusively a fixed price shop.
Karl Popper coined the term falsifiability. It is the ability to prove something false. Scientific research is falsifiable; religion is not. That’s not to say that religion is correct, but because it is untestable it can be neither proved nor disproved. I’m probably paraphrasing too much, but essentially the point of falsifiability is that we take a hypothesis and attempt to beat it to death to prove ourselves wrong. By doing so, once we have excluded the alternative hypotheses which would make our “discovery” a non-event, we can finally call what we have a discovery.
So here we were with me claiming to be a fixed price shop. All my observational data said so! That’s when we needed to exclude the other possibilities. See, being a fixed price development shop meant that, regardless of the work requested, we’d produce the same estimate anyway.
So, what would that mean? Well, it’d have to mean that there was no correlation between the requirements and the price paid for the work. The first issue is that statistics cannot prove a non-event; it can only tell you that it failed to detect an event. It’s subtle, I know, but the point is that just because you can’t find statistical evidence of a relationship doesn’t mean the relationship doesn’t exist. But let’s assume for a minute that my failure to find a relationship meant something. As a non-event option, however, it could mean that all projects were simply the same size. If people always requested about the same amount of work, then it’d make sense that the projects cost about the same.
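The point that “no evidence of a relationship” is not “evidence of no relationship” can be sketched with a small simulation. Everything here is assumed for illustration: the data are synthetic, the true relationship (a slope of 0.5 plus noise) is invented, and the names are hypothetical. The cutoff 0.632 is the standard two-tailed 5% critical value of Pearson’s r for a sample of 10.

```python
import math
import random

SAMPLE_SIZE = 10          # deliberately small sample per simulated study
CRITICAL_R_N10 = 0.632    # two-tailed 5% critical value of r for n = 10 (df = 8)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def miss_rate(trials=2000, slope=0.5, seed=1):
    """Fraction of small studies that fail to detect a relationship that is really there."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
        # y genuinely depends on x, but with a lot of noise on top.
        ys = [slope * x + rng.gauss(0, 1) for x in xs]
        if abs(pearson_r(xs, ys)) < CRITICAL_R_N10:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    print(f"studies that found nothing: {miss_rate():.0%}")
```

Under these assumptions most of the simulated studies come back with “no significant correlation” even though the relationship is built into the data, so an empty result on its own settles very little.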
I admit, it is hard to figure out how big the requirements are, but I just did something simple. I took a sample of projects and counted the number of rows of requirements in the documents. It was very convenient that most of our requirements documents are written in a nice list format.
Indeed, I could find no correlation between the cost of my sample projects and the number of requirements. It sure seemed that the number of requirements varied but the cost did not. The alternative hypothesis that all requests were the same size was defeated, right? Wrong!
What if I found no relationship between the two not because there was no relationship but because my method of counting was no good? Maybe all work really was the same size and the way I sized the work was bad. How could we prove that?
Well, we knew that the method for estimating was essentially: read the requirements, estimate the development effort, tack on all the other surrounding resources. In fact, this is exactly why we believed that the estimates were fixed price. There was a variable price to the project (it is the development work) but it was dwarfed by all the garbage that people put around it. The end result could vary very little because the signal was essentially drowned out by the noise.
Still, there was a potential non-event here with the prior research. We might have a black swan. We knew that the sample projects’ requirements bore no relation to the costs of the sample projects, but all that really meant was that we’d never found a case where they did.
Well, I couldn’t wait to observe all possible projects that ever were or ever will be, so we had to figure this out somehow. Recalling that the estimation methodology, in theory, estimated the development portion of the work off the requirements, if we could show that requirements to development effort were correlated but that requirements to overall effort weren’t correlated, that would show that we were correctly observing the process.
So, we tested the same counts of requirements against just the development portion of the effort. And wouldn’t you know it, the counts of requirements were highly correlated to cost of development!
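The shape of that check can be sketched in Python with made-up numbers. Nothing below is the real project data: the requirement counts, the cost per requirement, and the size of the overhead are all assumptions chosen so that a variable development cost sits inside a much larger, noisy fixed overhead.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_projects(count=500, seed=7):
    """Build synthetic projects: (requirement counts, development cost, total cost)."""
    rng = random.Random(seed)
    # Hypothetical: each project has between 5 and 50 requirement rows.
    requirements = [rng.randint(5, 50) for _ in range(count)]
    # Hypothetical: development cost tracks the requirement count closely.
    development = [8.0 * r + rng.gauss(0, 20) for r in requirements]
    # Hypothetical: everything tacked on around development is large,
    # unrelated to the requirements, and noisier than development itself.
    overhead = [rng.gauss(50000, 1500) for _ in range(count)]
    total = [d + o for d, o in zip(development, overhead)]
    return requirements, development, total


if __name__ == "__main__":
    reqs, dev, total = simulate_projects()
    print(f"requirements vs development cost: r = {pearson_r(reqs, dev):.2f}")
    print(f"requirements vs total cost:       r = {pearson_r(reqs, total):.2f}")
```

With these assumed numbers the first correlation comes out strong and the second close to zero, which is the pattern that separates “my counting method is bad” from “the signal is buried under the overhead.” If the counting method were the problem, neither correlation would show up.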
It’s a long way round to get to a simple point. Once you find that you have proof of something, it is your responsibility to posit the non-events that could undo your discovery. Then, test the non-events to see if you’ve just found something mundane or something truly interesting. And then, if necessary, test the non-events of those non-events. In my story, though I left out some paths for brevity, it goes as follows:
Statement: All projects cost approximately the same because we don’t estimate the work, we just put in arbitrary efforts.
Possibility: But, what if really what is happening is that all work requested is just the same size?
Statement: But we can show that the number of requirements varies from project to project and has no correlation to the cost.
Possibility: But what if really what is happening is that your counting of requirements is no good?
Statement: But we can show that the counts of requirements are highly correlated to the development effort, but not to the overall effort.
It’s my job to prove myself wrong as part of a thorough analysis. It’s my job to explore the possibility that I have found nothing, to show that my claims are potentially falsifiable. And then to show that I can refute those claims of non-events. It’s my job to cut down my own research until the alternative explanations have been pruned away.
It’s your job as well. Try to prove yourself wrong, or you can be sure that someone else will.