Many products over-promise. Frances Buontempo muses on how to get things just right.
I was too slow to start thinking about what to write an editorial on for this issue. The last two months have flown by in what felt more like a week, though maybe I wasn’t paying full attention. I should try to keep a record of all the ideas I think of writing about. No doubt, trying to decide how to record these and make sure I could understand my notes later when I wanted to use them would give me another excuse for not getting round to writing an editorial. Documentation is a waste of space, right?
Have you ever asked a colleague to demonstrate how to do something, only to watch them typing at 100wpm, using keyboard shortcuts, and ended up none the wiser as to what they just did? Or possibly worse, they say, “I just run a script I found on the internet”. Your colleague has zoomed through what they do and you have learnt nothing from them. “Too fast!” you mutter under your breath. Perhaps you persuade them to write documentation, which can cause another set of problems we’ll come to later. Perhaps they follow the current trend and make a screen capture of what they are up to, making sure they use mouse clicks to show what to do, clearly talking through every single step in full detail. The five-minute task now has a 20-minute video. Better than 200 pages of documentation? Or worse? You can copy and paste the commands from the docs, but not from a screencast. If considerate, they may provide scripts too, but it now takes 20 minutes to watch and you have to make notes on where the pertinent bits are rather than just highlight a couple of lines in a document. The cunning may manage to run the talk at ‘chipmunk’ speed (extra fast, making the voice high-pitched and squeaky) so it only takes five minutes to watch. However you work round this, you have gone from the instructions being way too fast to being way too slow. Sigh. Perhaps new ‘AI’ algorithms combined with speech-to-text APIs would let you build a table of contents, or just translate their words back into text you can grep. Over-engineered? Perhaps; there must be a better, quicker way.
Perhaps you are provided with documentation instead. We have moved away from comprehensive write-ups filling two lever arch files, or a printout of the source code. You’ve seen the pictures of Margaret Hamilton with the stack of Apollo Guidance Computer source code, forming a pile of paper as tall as she is [NASA]. Who prints source code out nowadays? I did get embroiled in an attempt to find a bug in some FORTRAN code a couple of years ago, and a younger team member did print out the unstructured ‘function’, but we couldn’t find a corridor long enough for all the paper. He did find the bug. Respect! Nowadays you probably expect a short README file showing how to get up and running quickly when you try new code. You’re happy to dig into API documentation, or skim through a wiki or tutorial if needed. Sometimes just a README is enough. Sometimes it isn’t. There’s always a sweet-spot, and it depends on the context. How long does it take a new starter on your team to get up to speed? Can they release new functionality within a week? Does it take them a month to even get a computer in the first place? Or two months for a door pass? (True story. Don’t ask.)
Documentation and instructions can be found in various places and formats. You may find an online discussion group for software you are using, or failing that, resort to Stack Overflow. When do you ask a question? As soon as you are stuck? Do you immediately get yelled at for asking a duplicate question? Or do you spend hours reading what’s already been posted? Do you only resort to online help after spending a few weeks trying to solve the problem yourself, only to find the person who answers is sitting next to you? More generally, online applications often have a questions section on their website. How many times does some rough and ready AI/machine learning algorithm suggest items from an FAQ section that frankly have nothing to do with your problem, forcing you to type your question in anyway, with a promise that your query will be responded to within 24 hours? On the dot of 24 hours later, you get an automated response assuring you of the importance of your custom and that your problem is being investigated. To be fair, a timely response is good manners. Promising someone you’ll just be five minutes and then taking an hour or two is rude. If you say five minutes, get back in five minutes with a status update. Conversely, if someone claims they will be “Just five more minutes,” respect that and give them the space they’ve asked for. Of course, the “just five more minutes” is often a symptom of over-optimistic guessing or a play for extra time. Computers can give over-confident completion times too. You’ve been there:
- 1 minute to completion, which then goes up to 1 hour
- 0 seconds left, staying in that state for at least five minutes
- Updating Visual Studio – why does it take so long? What is it doing??
I presume the scripts know how many bytes or steps there are in total and attempt to report the time left by approximating the velocity. It’s often clearer if the code reports in units it can actually measure, avoiding the surprising jumps that happen with a conjectured speed or velocity. Using appropriate units can make things clearer, less confusing and even more accurate.
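The point about reporting in measurable units can be sketched in a few lines of Python. This is a hypothetical progress reporter, not any particular installer’s logic; the byte counts are illustrative:

```python
def progress_line(done, total):
    """Report progress in units the code can actually measure
    (bytes processed out of a known total), rather than guessing
    a time-to-completion from a noisy velocity estimate."""
    pct = 100 * done // total
    return f"{done}/{total} bytes ({pct}%)"

# A copy loop would emit one of these per chunk processed:
print(progress_line(512, 1024))  # 512/1024 bytes (50%)
```

No ‘1 minute remaining’ that becomes an hour: the number printed can only go up, and it means exactly what it says.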
Stating the average fuel consumption in miles per gallon (bearing in mind varying definitions of gallons and other units) seems sensible. Does your car have a fuel consumption of 50 miles per gallon? How often do you manage this? Perhaps the units were correct, but the word ‘average’ needs disclaimers in the small print. What about your CO2 emissions? Perhaps you have an electric car, so try to calculate a miles per gallon equivalent. How far can you go on a full tank (gas or electric)? Does it matter if you have the lights on? Does it depend on how fast or slow you drive? Perhaps you don’t have a car. You may have used middleware though, or have seen statistics on throughput and latency in a datasheet. Some middleware named “Faster than light” guarantees to deliver one million messages per second under specific conditions [Tibco]; there are always conditions, though messages don’t seem to travel faster than light. Jokes aside, having a ball-park maximum can be useful. This allows you to do back-of-the-envelope calculations to see what’s possible, and how to carve up your messages and micro-services or other software. Complexity notation also provides a way to analyse the possible speed or memory usage of an algorithm. An amortised or worst-case scenario is as useful as your average fuel consumption. The worst (or best) case may almost never happen, but allows you to compare algorithms or automobiles. It may almost always happen too, if you keep trying to sort already-sorted data with a naive implementation of the Quicksort algorithm. Either way, if your code is too slow, you may need to start profiling to see what’s really going on, but big-O notation gave you a starting point.
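The Quicksort caveat is easy to demonstrate. Here is a deliberately naive sketch (first element as pivot) that counts comparisons; feed it already-sorted data and the quadratic worst case shows up immediately:

```python
def quicksort(xs):
    """Naive quicksort with the first element as pivot.
    Returns (sorted_list, comparison_count)."""
    if len(xs) <= 1:
        return list(xs), 0
    pivot, rest = xs[0], xs[1:]
    lo = [x for x in rest if x < pivot]
    hi = [x for x in rest if x >= pivot]
    lo_sorted, lo_count = quicksort(lo)
    hi_sorted, hi_count = quicksort(hi)
    # Each element of rest is compared against the pivot once.
    return lo_sorted + [pivot] + hi_sorted, lo_count + hi_count + len(rest)

ordered, count = quicksort(list(range(100)))
print(count)  # 4950: the O(n^2) worst case on already-sorted input
```

On n already-sorted items the first-element pivot gives the worst possible split every time, so this makes n(n−1)/2 comparisons – 4,950 for n = 100 – where a balanced split would need only around n log n.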
Many products give upper bounds on performance. Consider a dandruff shampoo that removes “up to 100% of flakes”. I, for one, am relieved to know it won’t give 110%, presumably dissolving my skull in the process. Some maximum limits are enforced by science – as we know, the speed of light, again under specific conditions, is a hard limit. 100% is often the limit on how much you can remove or lose, unless you trade a contract for difference (CFD) – in which case losses can exceed original investments. One product that almost never gives 100% is broadband, at least in the UK, or certain parts of the UK. How fast is your broadband? Or how slow? What were you promised? An ‘up to’, I presume. Make sure you write in and complain if it ever exceeds this!
I recently heard about a new attempt to break the land-speed record, aiming for 1,000 miles per hour [Bloodhounds]. Managing to go over 100 miles per hour back in 1904 [Redbull] was significant. Wikipedia [Wikipedia_1] tells me the 117 or so miles of a full orbit of the M25 (London Orbital) has been driven in under an hour, late at night when no law enforcement officers were around. The M25 is often more like a car park than a motorway. In fact, signs often suggest a speed of (up to) 40 miles per hour due to congestion. Usually no one does what they are told, instead pootling along at 1 mile per hour. Somebody, somewhere needs to invent a Star-Trek style transporter, quickly. Besides the saving of time and the lack of pollution, “Transporting really is the safest way to travel,” at least according to Geordi La Forge [MemoryAlpha]. You may travel faster than the speed of light, though might miss your target destination by up to (there it goes again) 4 metres.
Back to reality. Do you need to deal with large data sets on a regular basis? Or grep giant log files? As you build up a script to find needles in haystacks, you probably try it out on small data first, to verify it does what you want. You probably build up a regex to hunt in log files gradually, checking it matches some examples and, just as importantly, doesn’t match other close-but-incorrect examples. Under stress, this sounds like it will slow you down, though the temptation to try it on all your data and announce, “There are no matches” can take you more time in the long run. “More haste, less speed”, as an old saying goes. Going too quickly can have the overall effect of slowing you down. If you do try your scripts and programs on small data sets first, where do you get that data? Some people are horrified at the thought of using artificial datasets. However, you can create artificial data to cover all the edge cases and combinations, with one or two examples of each. Real data may not provide the perfect storm you need to test, so make some up. Someone somewhere will tell you this is a waste of time. I disagree. Real data may uncover other problems – partially filled or invalid fields, fragmented records, formats that don’t match the thousand-page document you read and so on. However, a one in a million event will break your system nine times out of ten (to misquote Terry Pratchett), so try out the black swan events in a test setup. Test your code on small sample data too; you need both angles covered.
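Building the regex gradually, and keeping both the matching examples and the close-but-wrong ones around, is cheap to automate. A minimal Python sketch follows; the log format and example lines are made up for illustration:

```python
import re

# Hypothetical log-line pattern, built up step by step:
# a date, a level we care about, then a message.
pattern = re.compile(r"^\d{4}-\d{2}-\d{2} (ERROR|WARN): .+$")

should_match = ["2024-01-31 ERROR: disk full",
                "2024-01-31 WARN: disk nearly full"]
close_but_wrong = ["2024-1-31 ERROR: disk full",      # one-digit month
                   "2024-01-31 DEBUG: disk full"]     # level we don't want

# Check both angles before unleashing the regex on the real haystack.
for line in should_match:
    assert pattern.match(line), f"should have matched: {line}"
for line in close_but_wrong:
    assert not pattern.match(line), f"should not have matched: {line}"
```

Run after every tweak to the pattern; when the script announces “There are no matches” on the real data, you can at least trust the regex itself.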
To build up your understanding of a problem domain or technology, you probably try a small experiment. Baby steps first. You might even build an end-to-end system, and pay attention to the logging and data feeds, just using static data. Someone might complain that you can’t test end to end until you have a live data feed. Prove them wrong! When you set up your logging, watch out for too much noise. If you cry “Warning!” over and over, will anyone take any notice? Does your log file have known, expected and therefore ignored ‘ERROR’ lines over and over, slowing down your search for something specific? Do they get archived away too quickly for you to even search in the first place?
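Those known-but-ignored ‘ERROR’ lines can be filtered at the source rather than grepped around forever. A minimal sketch using Python’s standard logging module; the noise phrases are hypothetical examples, not from any real system:

```python
import logging

class KnownNoiseFilter(logging.Filter):
    """Drop log records matching known, expected noise so the
    remaining errors are worth reading."""
    # Hypothetical phrases we have decided are expected and ignorable.
    KNOWN_NOISE = ("heartbeat missed", "retrying connection")

    def filter(self, record):
        message = record.getMessage()
        # Keep the record only if it matches none of the known noise.
        return not any(phrase in message for phrase in self.KNOWN_NOISE)

# Attach to a logger (or an individual handler):
# logging.getLogger("app").addFilter(KnownNoiseFilter())
```

Whether to silence the noise or fix whatever keeps emitting it is a separate question, of course; the filter at least stops it drowning out the line you were actually searching for.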
Some things are too fast. Some things are too slow. Some things, once in a while, are just right. How can you achieve this Goldilocks sweet-spot? The planet Earth is in the so-called Goldilocks zone – at just the right distance from our Sun to support life. This circumstellar habitable zone [Wikipedia_2] has a long history, and gets refined over time. Starting with supporting liquid water, we’ve added atmosphere requirements, and this will probably continue to change. And yet, of several million planets in Goldilocks zones, we’ve only found one with life so far. Are you doing things just right, to allow life, ideas and creativity to flourish? I’m not suggesting that you need to do something earth-moving to hit a sweet-spot, but the analogy with a code-base, project plan or team being habitable is a recurring theme. Neither too hot nor too cold, but just right.