Friday, January 27, 2012

The news of Story Points' demise is greatly exaggerated

My friend Vasco Duarte just wrote an interesting piece on the subject of story points considered harmful. We've discussed this at length some years back (oh, the amounts of coffee consumed!) and I think his concept is good. However, the article is fraught with errors and lacks a constructive conclusion.

By writing this blog post I intend to point out and correct some problems in the reasoning in Vasco's article. I don't have any special feelings for or against estimation. I don't think it's good if it becomes a cargo cult, something you just do because Scrum says so. That's true for most practices, by the way. But I believe estimation has a place as a "gateway drug" into collective software architecture and backlog maintenance. Once the team gets there, they can safely dump the story points.

Statistics

Let's start out by noting that statistics can explain many things. Small batches improve flow. More flow means more items flowing through the pipe, which means that the relative statistical variation of any attribute summed or averaged over those items will be smaller. In the context of work queues and backlogs, a rate of perhaps 20-30 work items per week is enough to provide meaningful data based on counting work items (rather than summing up estimates).

The statistics presented by Vasco don't account for the fact that the estimate sum depends on the number of items: more items means a larger sum of estimates. To illustrate, I wrote a small script to output random data for a fictional team with 10 to 20 items completed every sprint and estimates ranging randomly from 1 to 13 using the Fibonacci scale. The correlation for this random data turns out to be around 0.73, and that is entirely the result of variable dependence. As mentioned above, increasing the number of items per sprint makes the statistical variation go down, which increases the correlation between the two variables. At 70-130 items per sprint, the correlation is around 0.92. At 100-200 items it's 0.95.
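As a rough cross-check, the correlation between item count and estimate sum can also be worked out in closed form. This is only a sketch under stated assumptions: the item count N is uniform over an integer range, and every estimate is an independent, uniform pick from the six-value scale with mean mu and variance var_x. Then Cov(N, S) = mu * Var(N) and Var(S) = E[N] * var_x + mu^2 * Var(N), so no real estimating skill is needed to get a high correlation. The function name expected_correlation is made up for this illustration, and the code targets Python 3.

```python
from math import sqrt

SCALE = [1, 2, 3, 5, 8, 13]  # the estimate values used in this post


def expected_correlation(lo, hi, scale=SCALE):
    """Theoretical Pearson correlation between item count and estimate sum.

    Assumes the item count is uniform on the integers lo..hi (inclusive)
    and every estimate is an independent uniform draw from `scale`.
    """
    mu = sum(scale) / len(scale)                            # mean estimate
    var_x = sum((x - mu) ** 2 for x in scale) / len(scale)  # estimate variance
    mean_n = (lo + hi) / 2                                  # mean item count
    var_n = ((hi - lo + 1) ** 2 - 1) / 12                   # discrete uniform variance
    cov = mu * var_n                                        # Cov(N, S)
    var_s = mean_n * var_x + mu ** 2 * var_n                # law of total variance
    return cov / sqrt(var_n * var_s)


for lo, hi in [(10, 19), (70, 130), (100, 200)]:
    print(lo, hi, round(expected_correlation(lo, hi), 2))
```

Under these assumptions the three ranges come out at roughly 0.70, 0.92 and 0.95. The first is a little below the 0.73 seen in a single run of 100 random sprints, which is plausibly just sampling noise.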

Interestingly, three of Vasco's teams are actually doing worse than random. This is probably an argument against estimating. :-)

Complexity, predictability and chaos

Then let's move on to more human issues. Humans are unpredictable, but not as much as one would expect. The reason is that humans are social animals that can live and work together in groups for fun and profit. Indeed, we actually prefer to be part of a society or group. Hermits are rare and, bluntly put, a bit weird.

One of the mechanisms that allows us to live and work in groups is that we collectively construct social identities and institutions that constrain our actions. One such constraint could be: "In our team it's not OK to physically hurt people you disagree with". These identities and behaviors make us much more predictable. We can pretty much rely on Helen not to club down Tom in the meeting room with a chair during a heated technical debate, right?

If these identities don't exist or they are constructed badly, there will be chaos. For example, the Vikings or Mongols of old would perhaps say that "in our society, physical power equals political power". In a conflict situation, Helen would club down Tom and everyone would laugh and nod and say what a nice chop that was, he hardly saw it coming. Next thing you know, Tom's brother Tim has set Helen's house on fire in the middle of the night and butchered everyone who tried to escape. But I digress.

Incomplete argumentation

Finally, I'd like to point out that some of the counter-arguments to the six pro-estimation claims by Mike Cohn are weak. For example, on claim 3 the article states "it can take days of work to estimate the initial backlog for a reasonable size project". It can (and often does) take days of work to construct an initial architecture for a reasonably sized project! You can't just start coding at random.

Vasco's post ignores the most important pro-estimation claim: the real benefit of estimating is that you get the whole team to participate in planning the software. This can be achieved in other ways, but most estimation techniques make it into a game that is fun to play. I was looking in his post for something to replace this activity.

Some other nitpicks:
  • The butterfly effect is commonly used to describe chaos theory, not complexity theory.
  • The complex environment is different from chaos in that it is causal but not predictable: in other words, causality is evident only when looking back but not when looking forward.

Conclusions

All in all, I think the key take-away of Vasco's post has a lot of merit, especially for seasoned Agile teams. However, the article contains some serious leaps of faith, and the "myth of Story Points" is certainly not busted.


EDIT: Here's the Python script I used to create random data.

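import random

# Simulate 100 sprints and print "item count <TAB> estimate sum" for each.
for i in range(100):
  sumitems = 0
  sumestimates = 0
  # randrange(10) returns 0..9, so each sprint has 10 to 19 items.
  for j in range(10 + random.randrange(10)):
    # Each item gets a random estimate from the Fibonacci scale.
    k = random.choice([1, 2, 3, 5, 8, 13])
    sumitems += 1
    sumestimates += k
  # Python 3 print; tab-separated so the output pastes into a spreadsheet.
  print(sumitems, sumestimates, sep="\t")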


I then wimped out and did the correlation calculations in Excel.

Since random.randrange(10) returns a value between 0 and 9 inclusive, the script actually generates random data for sprints with 10 to 19 items, not 10 to 20 as specified. This doesn't invalidate the point, though.
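For anyone who would rather skip the spreadsheet, the correlation step can be sketched in Python as well. To be clear, this is not how the figures in this post were produced (those came from Excel). The names simulate_sprints and pearson are made up for illustration, the code targets Python 3, and it uses randint so that both ends of the item range are included.

```python
import random
from math import sqrt

SCALE = [1, 2, 3, 5, 8, 13]  # the estimate values used in this post


def simulate_sprints(sprints, min_items, max_items, rng):
    """Return (item_count, estimate_sum) pairs for random sprints."""
    data = []
    for _ in range(sprints):
        n = rng.randint(min_items, max_items)  # inclusive on both ends
        total = sum(rng.choice(SCALE) for _ in range(n))
        data.append((n, total))
    return data


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


rng = random.Random(2012)  # fixed seed so reruns give the same numbers
for lo, hi in [(10, 19), (70, 130), (100, 200)]:
    r = pearson(simulate_sprints(1000, lo, hi, rng))
    print(f"{lo}-{hi} items: r = {r:.2f}")
```

Using 1000 sprints per range instead of 100 is a deliberate choice here: it makes the printed correlations wobble less from run to run if you remove the fixed seed.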