Large groups can behave as one, but how predictable are they?
Welcome to Overload 87. If you were thinking the post had something missing, then you'd be right - there's no CVu this month. This is the first of the new monthly mailings, and we're starting with Overload. You'll get the next edition of CVu next month followed by Overload the month after, and so on. In this period of change it would be good to have a think about what we'd like from CVu and Overload, what's good, and what could be improved. I welcome opinions and ideas on this or any other issue.
The domino effect
I'm writing this at the end of September, after a few quite remarkable weeks in international finance and the markets, after a year of the so-called Credit Crunch crisis. Perhaps by the time you read this the causes and consequences will be clearer, but I thought one story in particular was interesting because of the role of technology.
Our world now uses computers to do huge amounts of work. Examples include online newspapers plus their archives, indexing for searching, automatic collation of information sources, processing vast numbers of market trades, and even automating tasks using rule-based systems.
But recently all these examples came together and caused major headaches for United Airlines [ BBC ].
What appears to have happened started with a newspaper, the Florida Sun-Sentinel, which has an online version with an archive of all past stories. For some unknown reason, enough people clicked on a story about United Airlines to put a link to it in its 'Most Popular Stories' section. Unusually, this story was an old one from around 2002, when many airlines were struggling with the aftermath of 9/11, and was about United filing for Chapter 11 protection (which roughly means bankruptcy). However, when you looked at the web page, the only date visible was the current date: September 7th, 2008.
Google News' web crawler found the page and, reasoning that it hadn't seen the link before and the only date it could find was recent, decided it was a new news story and duly indexed and published it. As airlines had been having a rough time recently due to high oil prices, such a story was plausible and of interest, causing plenty of people to read it, which made it rise up the rankings, gaining even more attention. Then someone thought it important enough to put it on the Bloomberg newswire service that is used by the financial markets.
At which point traders and automated trading systems saw the bad news and sold United Airlines stock, causing the price to drop quickly. This triggered automatic stop-loss rules (that is, if the price falls below a pre-set limit, sell your shares to avoid losing even more), which sent the price even lower, triggering more automatic stop-loss rules and panicking traders, and so on in a vicious cycle. By the time the stock was suspended, it had lost around 75% of its value, around $1 billion, in just fifteen minutes! All because a few people had clicked on an old story...
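The feedback loop of stop-loss rules triggering further stop-loss rules can be sketched in a few lines of Python. This is a toy model with invented numbers - in particular, it assumes a crude fixed price impact per share sold - not a real market simulation:

```python
def cascade(price, holders):
    """Run stop-loss rules to exhaustion.

    holders: list of (shares, stop_price) pairs. Each sale pushes the
    price down by a fixed impact per share - an assumption made purely
    for illustration."""
    impact_per_share = 0.001
    sold_total = 0
    triggered = True
    while triggered:
        triggered = False
        remaining = []
        for shares, stop in holders:
            if price <= stop:        # stop-loss fires: sell the lot
                price -= shares * impact_per_share
                sold_total += shares
                triggered = True     # the new, lower price may fire more stops
            else:
                remaining.append((shares, stop))
        holders = remaining
    return price, sold_total, holders

# Bad news knocks the price from 12.00 down to 11.50, tripping the first stop...
final_price, sold, survivors = cascade(
    11.5, [(1000, 11.6), (1000, 11.0), (1000, 10.2), (1000, 9.5), (1000, 5.0)])
```

Each individual sale is locally sensible, yet together they drive the price from 11.50 down to 7.50 before the cascade runs out of stops to trigger - a miniature version of the vicious cycle described above.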
Interestingly, pretty much everyone had acted rationally and could not really be blamed (I would say the exceptions are the original website's developers, for not clearly tagging stories with a date and time, and the journalist who posted the story to Bloomberg without checking it sufficiently). And yet the consequences were anything but rational.
One big problem here was the way computers automated some simple rules which were fine in isolation, but when combined with many other similar rules led to the computer equivalent of a market panic. Markets aren't as perfectly efficient and rational as some simple theories make out, for two reasons: they involve people, who can respond irrationally to rumour, panic or exuberance; and they involve computers blindly carrying out rules that, while locally rational in normal market conditions, interact with the rest of the system to produce an overall irrational result in unusual situations.
This is just one example of how, when things are highly interconnected, effects can ripple out and unexpected emergent behaviour can arise suddenly (a version of the Law of Unintended Consequences [ Wikipedia ]).
These can be almost impossible to predict, and take some tricky mathematics to model, but computers are making it possible to process the vast amounts of data involved, or even simulate these huge networks of autonomous agents, and understand some of the resulting behaviours. This could be very useful for policy makers trying to predict the effects of new laws, taxes, incentives etc, which cause all sorts of unexpected results as people try to work around or game the new systems.
The book Critical Mass [ Ball ] is a readable introduction to how some of these problems can be tackled, looking at the ways large groups act according to the statistical laws that were first used to model molecule velocities in a fluid. So while each agent acts independently and unpredictably according to their own desires and circumstances, the higher-level pattern that emerges is often highly ordered and predictable.
As well as the trading patterns of markets, other examples include the properties of matter arising from the interactions of atoms and molecules; the structure of galactic spiral arms (which are thought to be pressure waves and not static structures [ SpiralArms ]); traffic flow and how congestion occurs; how birds flock; epidemiology; the spread of email viruses along social networks; prediction of power grid usage; population changes and urban planning; and the evolution of populations and genomes.
The latter is an interesting one: the basics of evolution are so simple - all you need is reproduction (so successful populations grow), heritable variation (to produce a range of creatures), and inadequate resources (so that not every creature can reproduce) - and yet the results are enormously complex. This is partly because the sheer enormity of the time involved allows changes to accumulate, but also because of the complex feedback of each creature itself being part of the environment that determines which genomes survive (a classic example is the arms race between predator and prey). For a taster, Richard Harris looks at some of the maths behind how natural selection builds complexity.
The black art of estimating release dates
These sorts of ideas can also be applied to trying to work out when a product will be good enough to ship. A simple rule-of-thumb is to look at a graph of open bugs over time and estimate when it will go down to zero. Of course this is an overly simplistic model - depending on where you are in the development cycle, the number of bugs could be increasing!
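That naive rule of thumb - extrapolate the open-bug curve down to zero - can be made concrete with a least-squares line through recent weekly counts. A minimal sketch, with invented numbers; the function name and data are illustrative only:

```python
def zero_crossing_week(weeks, open_bugs):
    """Fit a least-squares line to (week, open-bug count) pairs and
    return the week at which the fitted line reaches zero, or None
    if the trend isn't actually downward."""
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_b = sum(open_bugs) / n
    slope = (sum((w - mean_w) * (b - mean_b) for w, b in zip(weeks, open_bugs))
             / sum((w - mean_w) ** 2 for w in weeks))
    if slope >= 0:
        return None  # bug count flat or rising: no zero crossing ahead
    intercept = mean_b - slope * mean_w
    return -intercept / slope

# 50 open bugs falling by roughly five a week: zero somewhere past week 11.
ship_week = zero_crossing_week([1, 2, 3, 4], [50, 45, 41, 35])
```

The `None` branch is the point of the exercise: early in development the slope is positive and the extrapolation is meaningless, which is exactly the weakness of the simple model.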
So a more realistic model recognises that development goes through distinct phases, each with its own character: feature development, stabilisation, first mass test, release preparation, and post-release.
We can categorise bugs in many ways, such as the probability of being triggered, how severe their effects are, what effort is needed to fix them, and the risk of the fix causing further bugs (and what sort those are). Each development phase has a different mix of bugs. For example, early on, creating new features will produce many new bugs of many different types, from trivial spelling mistakes to serious design flaws; conversely, close to release most of the easy bugs will have been found and fixed, the design will have settled down, and what remains will be the hard bugs whose fixes will likely cause many more bugs.
Thinking about this reminded me of Ross Anderson's keynote at the conference a few years ago [ Anderson ] which analysed what would be the expected number of security flaws in open and closed source software. Doing a similar analysis of expected bug numbers based on the number of bugs already found, the number of testers, and the rate of fixes, could provide a much better set of rules that can be easily applied. For example, when in the pre-release phase, work out how much effort it took to get bug numbers halved from the peak, and the expected effort to get it to zero will be twice that. (This is just my guess based on the fact that you won't have found all the bugs yet, mostly the hardest bugs will be left, and fixes will cause further bugs).
Others have done just such an analysis: a quick search for terms such as Reliability Growth Models or Software Release Estimation reveals plenty of examples and research into their effectiveness. And yet I've rarely seen or heard of people doing much more than simple extrapolation and guesstimates. This is probably good enough for a rough guess for non-critical software, but as most projects are classed as 'failed' because they were later than predicted, there's plenty of room for improvement.
As an example of how using even slightly more sophisticated models can help, I once worked on a team where things always took longer than the estimates. So I introduced a slightly different technique. Instead of asking how long something would take, I asked for estimates for three situations: a best case if everything went really well, the most likely case (which is the estimate people usually give you), and a worst case if things went badly. These numbers tell you several things: if they vary wildly, the task is risky and probably not very well understood. But interestingly they are rarely symmetrical - e.g. a task that is most likely to take 10 days will have a best case estimate of 7 days, but a worst case of 20. By combining these three numbers in a simple weighted distribution, something like E = (best + 4 x likely + worst)/6, you get the expected time, which was always a little bit longer than the 'likely' time - e.g. (7 + 4x10 + 20)/6 is about 11.17. By using this number in the plans instead (and adding half a day a week for synchronising with the main code base and checking in), the estimates became remarkably accurate, and everyone became a lot happier.
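The weighted combination above is the classic three-point (PERT-style) estimate; as a couple of lines of Python, using the numbers from the text:

```python
def expected_days(best, likely, worst):
    """Three-point estimate: E = (best + 4*likely + worst) / 6.
    The 'likely' value gets most of the weight, but the long tail of
    the worst case pulls the expectation above it."""
    return (best + 4 * likely + worst) / 6

e = expected_days(7, 10, 20)   # the example from the text: about 11.17 days
```

The asymmetry does the work here: because the worst case sits further from 'likely' than the best case does, the expected time always lands a little above the estimate people would otherwise give you.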
[Anderson] Anderson, Ross 'Open and Closed Systems are Equivalent (that is, in an ideal world)' http://www.cl.cam.ac.uk/~rja14/Papers/toulousebook.pdf
[Ball] Ball, Philip Critical Mass: How One Thing Leads to Another
[SpiralArms] http://www.astronomynotes.com/ismnotes/s8.htm
[Wikipedia] http://en.wikipedia.org/wiki/Unintended_consequence