The pandemic could be a ‘Once in a Lifetime’ opportunity to stop and think. Frances Buontempo misses this chance and muses on lyrics by Talking Heads instead.
Though I should write an editorial, I might take a break this time, and think about the purpose of an Overload editorial instead. As I muse, I am looking out of the window watching rain dripping off trees, shoots and leaves. Without a strong sense of purpose, a compelling reason behind a chore, distractions creep in. Not that an editorial is a chore – I’ve never written one, so wouldn’t know. Even with a sense of purpose, fear can creep in if you suspect your opinion may be in the minority, or controversial. I could write about diversity and inclusion. I could call into question excuses as to why there are fewer women in programming, and science or mathematics, in some parts of the world. I could discuss why some people claim women don’t enjoy mathematics or their brains aren’t wired up properly for it, or I could consider racial prejudice, and its preconceptions about innate ‘racial’ ability. Such biases tend to become self-fulfilling prophecies and even feedback loops. If most people in prison are black, police are more likely to stop and search black people, so in all probability more black people end up in prison, reinforcing the conception that ‘they are criminals’. I’ve been reading geneticist Adam Rutherford’s How to argue with a racist [Rutherford20], which shows why the idea of race is ill-formed to begin with. Highly recommended.
One very striking point made by Rutherford is that race, either in terms of genetics or other measures, is based on clustering. This is a statistical technique, often also used in machine learning. Data points, say people, are described by a tuple of features, say genes, skin colour, IQ, whatever. The clustering algorithm is then run, and reports back who belongs in which cluster. There are various ways to assign tuples to different groups, though all rely on a distance measure and almost all need to know how many groups you want. So, how many races are there? Two. Great, you’ll get the tuples of people split between two groups. Five. Ditto. What is the point of clustering data? Well, sometimes you get a clear ‘decision boundary’, helping you disambiguate between labels. If I plot the height of a random collection of mice and elephants, against specimen number, I can draw a straight line between the two groups. Above the line means elephant; below means mouse. If I asked for three clusters, I’d get three. Each data point would be labelled as belonging to one of three groups, but that wouldn’t help spot elephants, in the room or otherwise. Why try clustering? To fish around and see if you do have some ‘different’ potential labels going on. If you do, you can then ask which features correlate best with the labels. Even if you discover something via clustering, you still need to think about what you have found out. If you discover AI can work out someone’s ‘criminality’ by their skin colour, you may have discovered systemic racism. Just saying. Statistics often forms a null hypothesis, ‘nothing to see here’, and an alternative hypothesis, ‘that’s significant’. The purpose of subsequent analysis is to decide which hypothesis is more likely correct. Some machine learning is applied without an up-front purpose. That’s ok. A data pipeline producing visualisations for people to consider can be useful or interesting.
Data science often does this – it’s an initial step, but hasn’t fully covered the ‘science’ part to my mind. It can be the start of some science though.
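To make the mice and elephants concrete, here is a minimal sketch of one-dimensional k-means in Python. The heights are made up, and the function name k_means_1d and its parameters are invented for illustration rather than taken from any library. The thing to notice is that k goes in as an argument: the algorithm hands back as many groups as you request, whether or not the data contains that many.

```python
import random


def k_means_1d(values, k, iterations=20, seed=0):
    """Cluster numbers into k groups using a basic k-means loop."""
    rng = random.Random(seed)
    centres = rng.sample(values, k)  # start from k of the data points
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # The distance measure: absolute difference in height
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups


# Made-up heights in metres: four mice and four elephants
heights = [0.03, 0.04, 0.05, 0.06, 2.8, 3.0, 3.2, 3.3]

two = k_means_1d(heights, k=2)
three = k_means_1d(heights, k=3)

print("k=2:", two)
print("k=3:", three)
```

With k=2 the split falls between the mice and the elephants. With k=3 three groups still come back, but the extra one is an artefact of the question rather than a new species.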
That’s enough of controversial subjects. As I was saying, without a sense of purpose, random distractions creep in. If you’ve ever worked on a legacy code base, you may have found yourself deep in a call stack wondering how you got there. ‘And you may ask yourself, “How do I work this?”’ [Talking Heads]. Furthermore,
And you may ask yourself, ‘What is that beautiful house?’
And you may ask yourself, ‘Where does that highway go to?’
And you may ask yourself, ‘Am I right? Am I wrong?’
And you may say to yourself, ‘My God! What have I done?’
Actually, listening to music can help you concentrate and not get too distracted by the highways in the code. If you get to the end of a playlist or album and haven’t got anywhere, it’s time to get up and walk, grab a drink, do something else for a bit to get your focus back. Legacy code can be hard work. If you understand what part of it is trying to achieve, for example if you have some kind of code test round a bit of it, you can experiment with the code and try to refactor to something clearer. Without tests, you’re probably trying to dig yourself out of a hole though:
Water dissolving and water removing
There is water at the bottom of the ocean
Under the water, carry the water
Remove the water at the bottom of the ocean
If you’re trying to bail out the ocean, you need to give up and learn to swim. Don’t drown in a confusing code base. There are many resources to help with such code, for example Working Effectively with Legacy Code by Michael Feathers, or The Legacy Code Programmer’s Toolbox by Jonathan Boccara. Make sure you are using version control; then you won’t be afraid to try changes to see what happens. Don’t trust the comments, though they may show the original purpose of some code. Think about what you need to achieve, and don’t get lost in a maze of twisty passages, all alike. Use a heuristic – always go left, always delete the comments, never change code without a test. Don’t get put off – keep your eyes on the prize. Above all, don’t drown.
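One way to follow the ‘never change code without a test’ heuristic, when nobody knows what the code should do, is to record what it does today and check any refactor against that record. The sketch below assumes a made-up function, legacy_price, with prices in pence; the names, the numbers and the tidied version are all hypothetical.

```python
# Hypothetical legacy function, invented for illustration. Prices are in pence.
def legacy_price(quantity, customer_type):
    total = quantity * 999
    if customer_type == "trade":
        total = total * 9 // 10
    else:
        if quantity > 10:
            total = total - 500
    return total


# Outputs recorded by running legacy_price as it stands today.
# They describe what the code does, not what it ought to do.
RECORDED = {
    (1, "retail"): 999,
    (11, "retail"): 10489,
    (11, "trade"): 9890,
}


def behaves_as_recorded(price_fn):
    """Return True if price_fn reproduces every recorded output."""
    return all(price_fn(*args) == expected for args, expected in RECORDED.items())


# A candidate refactor: same behaviour on the recorded cases, fewer branches.
def tidied_price(quantity, customer_type):
    total = quantity * 999
    if customer_type == "trade":
        return total * 9 // 10
    return total - 500 if quantity > 10 else total


print(behaves_as_recorded(legacy_price))
print(behaves_as_recorded(tidied_price))
```

Three recorded cases are not proof that the two functions agree everywhere, but they give you something to hold on to while you experiment.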
Our garden looks slightly drowned at the moment, which is a distraction. Several of the bushes look a bit overgrown, and a few weeds are trying to take over. Looking at it, I can see how it possibly was, a while ago. I can sense what sort of shape might be trying to happen – where to prune, and cut back. However, I’ll get soaked if I go out there so these refactors will have to wait. If you look at legacy code you can sometimes see an older, simpler code base in the undergrowth and behind the weeds. I often see branches spring up to bolt-on new behaviour. Rather than grafting new growth on to established root-stocks, a cutting in a pot has been balanced in a tree, fallen over, smashed and turned into some weird looking triffid. OK, perhaps the analogy isn’t perfect, but you get the idea. Why does this happen? Because the simplest way to shoe-horn new features into code is often by slapping an else in. If code has been designed with potential future variations in mind, there may be a place to use a new strategy, send in a lambda, or similar. The Open/Closed principle (OCP) might be driving at this idea. Jon Skeet blogged about the OCP a while ago [Skeet13]. He questions what ‘open’ and ‘closed’ mean here, and quotes Wikipedia’s summary of Bertrand Meyer’s version:
The idea was that once completed, the implementation of a class could only be modified to correct errors; new or changed features would require that a different class be created. That class could reuse coding from the original class through inheritance. The derived subclass might or might not have the same interface as the original class.
Kinda like repurposing the original, via inheritance, or as Jon puts it, “A ghastly abuse of inheritance”. He is having more of a dig at Uncle Bob’s annotation than Meyer’s original quote though. Jon talks about ‘Protected variation’ as a clearer idea:
Identify points of predicted variation and create a stable interface around them.
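As a rough illustration of the difference between slapping an else in and sending in a lambda, the sketch below prices shipping both ways. The shipping example, the function names and the numbers are all invented; the only point is where the variation lives.

```python
from typing import Callable


# The bolt-on approach: every new feature means another branch in here.
def shipping_cost_with_elses(weight_kg: float, method: str) -> float:
    if method == "standard":
        return 3.0 + 0.5 * weight_kg
    elif method == "express":
        return 6.0 + 1.0 * weight_kg
    else:
        raise ValueError(f"unknown method: {method}")


# The point of predicted variation gets a stable interface:
# anything that maps a weight to a cost will do.
CostRule = Callable[[float], float]


def shipping_cost(weight_kg: float, rule: CostRule) -> float:
    if weight_kg < 0:
        raise ValueError("weight cannot be negative")
    return rule(weight_kg)


def standard(weight_kg: float) -> float:
    return 3.0 + 0.5 * weight_kg


def express(weight_kg: float) -> float:
    return 6.0 + 1.0 * weight_kg


print(shipping_cost(2.0, standard))
# A new rule arrives later as a lambda; shipping_cost itself does not change.
print(shipping_cost(2.0, lambda w: 10.0 + 2.0 * w))
```

The second version costs a little more thought up front, since someone had to predict that the pricing rule was the part likely to vary.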
In order to achieve this, some forward thinking is required. On the face of it, this seems to be in conflict with the YAGNI principle – ‘You aren’t going to need it’ – a mantra from Extreme Programming. The words ‘on the face of it’ are important. There is a difference between building something now, that you might need in the future, versus architecting your code so that you can add new features in the future. Otherwise the twisty maze of elses will happen. Martin Fowler talks about YAGNI in a blog [Fowler15]. He asks devs to consider what refactoring would be needed to introduce a new feature later on, and as a side effect, to:
…add something that’s easy to do now, adds minimal complexity, yet significantly reduces the later cost. Using lookup tables for error messages rather than inline literals are an example that are simple yet make later translations easier to support.
A little bit of forward thinking and a sense of purpose can make your life easier.
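Fowler’s lookup table for error messages might look something like the sketch below. The error codes, the wording and the error_message helper are hypothetical, and the French entry is there only to show that a later translation means adding data without touching the code that reports errors.

```python
# Hypothetical error codes and wording, invented for illustration.
MESSAGES = {
    "en": {
        "file_missing": "Cannot find file {name}",
        "disk_full": "No space left on {device}",
    },
}


def error_message(code, language="en", **details):
    """Look up the template for an error code and fill in the details."""
    return MESSAGES[language][code].format(**details)


print(error_message("file_missing", name="editorial.txt"))

# Later on, supporting another language is a new table entry, not a rewrite.
MESSAGES["fr"] = {
    "file_missing": "Fichier introuvable : {name}",
    "disk_full": "Plus d'espace sur {device}",
}

print(error_message("file_missing", language="fr", name="editorial.txt"))
```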
One advantage the pandemic has brought is that many local Meetups now take place online. You can therefore join people who are miles away, and virtually attend talks you would otherwise not have been able to get to. A case in point, for me, is the Norfolk Developers [Nor(Dev)]. On 2nd July, Jez Higgins gave a talk entitled, ‘Journey into space’ [Higgins20]. He started by reminiscing about computers from the 1980s and how he learnt to code. He talked about good code, agile, and how he ‘read an article by one of the original signatories of the manifesto for agile software development, and accidentally ended up writing a version of Asteroids for [his] phone.’ The talk was excellent and is available on YouTube. By talking about previous work on a long project, which had been designed up front, Jez reminded us to ask why the agile manifesto had been written. I suspect many scrum masters and agile managers haven’t worked on an old-school Waterfall project, so don’t fully appreciate the context for this. I’ve heard people talk about being ‘rigidly agile’, or insisting you need specified meetings for prescribed lengths of time. Four-hour backlog grooming meetings, etc. The first of the four better ways of developing software stated in the manifesto says, ‘Individuals and interactions over processes and tools.’ [Agile]. Ron Jeffries, the signatory to whom Jez referred, blogged about ‘Dark Scrum’ [Jeffries16]. He talks about how Scrum can end up being a tool of oppression. If you understand why you’re using Scrum, and how to keep communication open, things will improve. If the goal is working software, then being able to show new features, no matter how small/lean, that do something useful is a win for everyone. Getting to that point can be hard work though.
Jeffries talks about testing quite a bit in the blog. Who doesn’t? Well, I know I do, and it annoys some people. He also talks about continuous integration. I notice build/test/deploy pipelines referred to frequently, for example on Jenkins’ ‘Continuous Delivery’ articles [Jenkins]. Recently, the UK prime minister announced, “We’re going to build, build, build, and deploy jobs, jobs, jobs.” To build and deploy without testing is asking for trouble. When we test, we check what effect our construction has, and ask if it does what we expected. This inspired me to write a dreadful little ditty:
We’re going to build, build, build
And deliver jobs, jobs, jobs
sudo crontab -e e e
Didn’t test first. What a berk.
Apologies. I’ll stick to the day job. So, what’s the purpose of an Overload editorial? I’ll go have a think about that for next time, while you read this issue’s articles. Do feel free to contact our writers and encourage them. It gives a sense of purpose, and might spark new ideas to think about, and hopefully more articles in the future.
[Agile] Manifesto for Agile Software Development at: https://agilemanifesto.org/
[Fowler15] Martin Fowler (2015) ‘Yagni’ posted 26 May 2015 at: https://www.martinfowler.com/bliki/Yagni.html
[Higgins20] Jez Higgins (2020) ‘Journey Into Space’, talk on 2 July 2020. Abstract: https://www.meetup.com/Norfolk-Developers-NorDev/events/271181133/ Recording: https://www.youtube.com/watch?v=8BOnppFZo6s
[Jeffries16] Ron Jeffries (2016) ‘Dark Scrum’, posted 8 September 2016 at: https://ronjeffries.com/articles/016-09ff/defense/
[Jenkins] ‘Continuous Delivery Articles’ at: https://www.jenkins.io/solutions/pipeline/
[Nor(Dev)] Norfolk Developers meetups: https://www.meetup.com/Norfolk-Developers-NorDev/
[Skeet13] Jon Skeet (2013) ‘The Open–Closed Principle, in Review’ on Jon Skeet’s coding blog, posted 15 March 2013 at: https://codeblog.jonskeet.uk/2013/03/15/the-open-closed-principle-in-review/
Frances Buontempo has a BA in Maths + Philosophy, an MSc in Pure Maths and a PhD technically in Chemical Engineering, but mainly programming and learning about AI and data mining. She has been a programmer since the 90s, and learnt to program by reading the manual for her Dad’s BBC model B machine.