Reasoning and understanding code have fundamental roles in programming. Lucian Radu Teodorescu highlights the importance of reasoning and its philosophical underpinnings.
This is a two-part article. I wanted to write an article about complexity in software engineering (in memory of Fred Brooks, 1932–2022), but then I realised the complexity of such an endeavour. Reasoning is hard, tackling complexity is also not easy, and, moreover, the relation between reasoning and complexity in software engineering deserves a lot of attention too. Thus, there are multiple subjects that are interconnected. And covering them doesn’t fit the space of a single article.
The first part is structured as an essay. We will highlight the importance of reasoning in software engineering, and how our field is close to philosophy.
The second part (to appear in the next issue of Overload) is structured as a play in 14 acts. Here we tackle the problem of complexity, and discuss how much it is essential versus how much it is accidental.
The two parts are deeply related. While discussing complexity, we refer to the first part in two main ways: the approach itself involves a form of pure reasoning, and also the object of our study, i.e., dealing with complexity, involves reasoning.
Reasoning in software engineering
How do we reason about things in software engineering? Moreover, do we even need to reason about things, and, if so, to what extent do we need to reason about them?
Software engineering is, as the name says, an engineering discipline. There are people who argue differently, but I think we have sufficient evidence to say that it is indeed an engineering discipline [Wayne21a, Wayne21b, Wayne21c, Farley21]. Like any other engineering discipline, our field needs to be based on facts, and ultimately knowledge. We should be able to conduct experiments to acquire empirical evidence.
But the sad reality is that we have relatively little definitive empirical knowledge in software engineering [Wayne19]. Moreover, we lack this knowledge for some of the fundamental practices we use. For example, we don’t have good empirical knowledge on whether object-oriented programming is better than functional programming. There are multiple reasons for this, but probably the most important one is the fact that the most important instrument in developing software is our mind; not the compilers, not the programming languages, and not even the computers. And our minds work in mysterious ways.
The main job while programming is not writing code, but reading, reasoning and understanding code; then, after one forms a mental model of how the program should be written, the job is to translate that mental model into a concrete form that both humans and computers can understand and reason about. As Kevlin usually puts it, programming is applied epistemology [Henney19]. And epistemology is a branch of philosophy.
Software engineering and philosophy
If there is one book that every programmer should read, then that book is The Mythical Man-Month by Fred Brooks. Especially the 1995 edition, which contains the No Silver Bullet article, published initially in 1986. In a nutshell, Brooks argues that software is essential complexity1. Ever since I read this, many years ago, it remained with me as one of the few fundamental ideas of software engineering.
However, we cannot actually prove that software is essential complexity; we can only argue in its favour. The core reason this statement cannot be proved is that it is a metaphysical statement, and we cannot prove metaphysical statements. In the article, Brooks mentions that he is following the distinction between essential and accidental made by Aristotle. This distinction appears in Aristotle’s ‘Metaphysics’ [Aristotle-1].
This definition of software implies that there is a strong connection with philosophy at the core of software engineering.
Another very influential book is Elements of Programming by Alex Stepanov and Paul McJones [Stepanov09]. Reading it feels very similar to reading Aristotle. It has a similar reasoning style, it uses terms like entity, species, and genus that Aristotle introduces [Aristotle-1], and it even gives examples with Socrates (chapter 1); later on, in chapter 5.4, it directly refers to Aristotle’s ‘Prior Analytics’ [Aristotle-2].
The important aspect to notice about Elements of Programming is how the foundations of programming are exposed: through philosophical reasoning.
All these seem to converge to the idea that philosophical reasoning is somehow fundamental to software engineering. Reasoning seems far more utilised in our field compared to having controlled experiments. While we are still in an engineering discipline, the fundamentals appear to be very close to philosophy.
The extent of reasoning
Now that we’ve established that at the core of software engineering there needs to be philosophical reasoning, let’s look at some examples. We’ll use these examples, somewhat like a qualitative study, to infer some possible characteristics of the type of reasoning we need.
Functions should be small
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.
~ Robert C. Martin [Martin08]
This is regarded as very good advice2. However, the way it is explained in the book doesn’t convince me. The main reason for this is that the statement is not properly argued. One can easily read this advice as “Functions should be very small because Uncle Bob said so”. In fact, I got an overall impression of the Clean Code book [Martin08] as being perhaps too dogmatic. This is not something that we can put at the foundation of software engineering.
If I need to reason about the size of the function, I would do it along these lines:
- the main activity in programming is reading and reasoning about code; thus, functions should be written in order to be easily reasoned about
- there is a fundamental limitation of the brain that makes it harder for us to reason about systems made of too many parts (studies show that we can hold in our mind about 7 different things at once)
- thus, in order to reason easily about functions, we should make them small enough to accommodate this limitation
- however, there is a downside to this as well: making small functions pushes the complexity outside the functions
- there is also a cost of abstraction that we need to pay – any abstraction makes the code harder to read and reason about
- thus, we should make functions small, but not so small that we introduce too many extra abstractions and push too much complexity outside the functions.
This line of thought, at least the first part, can be found in [Seeman21]. The reader can see a big difference between the two stories, even if the result is (roughly) the same. One offers fragile reasoning, the other more thorough reasoning. One doesn’t touch on consequences, the other does.
As Hillel points out, the most popular paradigm in our software industry is charisma-driven development: we do things because high-profile speakers and authors are telling us what to do [Wayne19]. Instead of blindly following advice, we should look at the arguments behind the advice. In the absence of better empirical evidence, we should at least evaluate whether the argument makes sense, and only follow the advice if the argument is convincing.
One thing that I’ve learned in my career is that there isn’t a perfect solution to a problem. Every solution consists of a (potentially long) series of design choices, and each choice has advantages and disadvantages. That is, for every design choice, one can argue in favour, or against it.
Moreover, on each side, there isn’t just a simple argument that one can make. Usually, we have a long chain of assumptions and inferences that lead to arguing pro or against a design choice.
The way I would like to put it is that if one cannot argue on both sides of an argument, one must be confused.
To choose an adequate solution to a problem, we need to find arguments that support it over the alternative solutions. That is, we need to be able to reason about selecting a solution for our problem. Considering that we constantly need to make design decisions, the process of reasoning must be constantly present in our development activities.
Thus, at a higher level, we should constantly employ reasoning.
‘for’ loops and reasoning frameworks
Let’s now turn our attention to lower levels, and reason about how we reason about code.
In a previous article, ‘How We (Don’t) Reason About Code’ [Teodorescu21], we explored a possible meaning of reasoning about code. We first tried to define a somewhat formal framework for understanding what the implications of code are; later, we discussed a model that also takes into consideration the previous experience of the programmer.
In the first approach, we defined the reasoning complexity (or reasoning effort) of a piece of code as the difference between the post-conditions and the pre-conditions of that code. If our minds were purely mathematical, we would write lemmas for everything that’s important to ensure we understand the implications. The reasoning complexity is equal to the number of such lemmas we need.
As an example, we tried to compute the associated complexity for using a classic for loop (with an incrementing variable) and for using a ranged for loop. In our example, the classic for loop has a complexity of 20, compared to a complexity of 8 for the ranged for loop. That is, a classic for loop is much harder to reason about than a ranged for loop.
The result of comparing the reasoning complexities of the two styles of for loops may not be that helpful. What I find significant is that we defined a framework for reasoning about code. We can take this framework and apply it to other problems; we can expand our reasoning capabilities.
In the second part of that article, we introduced another way of reasoning about code, one that is less formal and that takes the developer’s previous experience into account when considering complexity. We called this the inference complexity.
For this new complexity metric, we cannot calculate the complexity of the code just by looking at it. We have to factor in the experiences of its readers, which is very difficult. Nevertheless, it still gives us a method of reasoning about code. And, especially if we put it next to the previous complexity metric, it can tell us about the biases that we, the programmers, have.
We can have reasoning frameworks that are more formal, and we can have reasoning frameworks that are less formal. And, it appears that both of them are useful.
Programming paradigms and guarantees
Reasoning is often hard. To make the reasoning process simpler, we typically perform it within limited bounds. That is, we draw some bounds within which we reason, and we start from assumptions. If we were to start reasoning without any bounds, we would have to start from metaphysics in order to analyse the merits of a design choice.
Programming paradigms are often useful bounds for the process of reasoning about code. They impose certain restrictions about what can be done in a program that respects that paradigm. Let’s take a couple of examples.
In structured programming, a function is an abstraction. Once we understand that abstraction, we can use the same reasoning each time the function invocation is seen. The meaning of the function doesn’t change depending on the context in which it is being used. If the function can mutate just one single global variable, we don’t expect it to mutate other global variables in any of the invocations.
In functional programming, all the functions are pure. If we look at a function call, we know that the function doesn’t have any side effects, and is idempotent. If we call the function twice, it will have the same effect on the correctness of the program as calling it once. Furthermore, the output of the function depends only on the arguments of the function. This reduces the amount of reasoning we need to perform. If we see the same function called twice, we don’t have to reason about it twice. Moreover, two functions that are not chained together cannot interfere with each other; we can reason about them in isolation.
In my previous article, ‘Value-Oriented Programming’ [Teodorescu23], we discussed a new programming paradigm which focuses on value semantics and on removing references as first-class entities in a language. If programs conform to this paradigm, then the user doesn’t have to reason about safety issues (they cannot appear), and all the reasoning can be done locally (no spooky action at a distance).
As discussed in this article, a programming paradigm imposes a set of constraints on the possible programs that can be written, and by doing this it can provide some guarantees about the programs written in that paradigm. These guarantees can simplify our reasoning about the code.
Adding restrictions to a language can simplify it by reducing the number of things we need to reason about. On the other hand, it may make some problems harder to solve; this can increase the amount of reasoning we need to perform. It’s always a compromise.
If we generalise the notion of programming paradigm, the main takeaway of this section should be that, whenever we have to reason about a software problem, we should find the right frame that would be sufficient to contain the problem, and yet provide enough guarantees to simplify our reasoning process.
What can we reason about?
This one is easy: probably every aspect of software engineering. To make it more useful for the reader, let’s provide a few examples:
- Code: What’s the best way to reason about code? How can we easily find out the implications of a code snippet? What’s the right organisation of code? Are there any paradigms that we can set to make the reasoning easier? Etc.
- Software design: How should we design the software? When does a top-down approach work best, and when does a bottom-up one? What are the fundamental units of design? How can we measure the quality of a design? What does it take to understand a design? What’s the easiest way to document a design? Etc.
- Software architecture: What is architectural and what is not? How do we make architecture coherent? How much time should we spend in defining the architecture upfront? How can we best document architecture? How to evolve the architecture? Etc.
- Processes: How many processes does a software organisation need to have? Do some processes work better than others? How much do the processes depend on the organisation? How much can we influence the productivity of a software organisation with processes? How to tell when processes help and when they are just bureaucracy?
- Testing: How much testing does a piece of software need? Do the testing needs differ (significantly) depending on the type of software? Are some testing methods better than others? How to structure our tests?
- Complexity: How can we measure the complexity of a problem? What is essential and what is accidental? How should we reason about complexity? Etc.
- Reasoning: What’s the extent of our reasoning, and how can we ensure that whatever we do has any connection with the reality? How can we move from pure reasoning (prone to errors) into empirical data (better matching reality)? Etc.
This article (this part and the one appearing in the next issue) approaches the last two items on our list.
A critique of pure reason
The reader may infer from the above text that I’m strongly arguing to drop any empirical arguments in software engineering and start using pure reasoning instead. This is far from my intention.
We should use this reasoning only in limited scenarios:
- to make sense of things when we don’t have empirical data, or to make sense of incomplete empirical data
- to form hypotheses/models that can later be tested empirically.
A great example of reasoning in the first category is Brooks’ reasoning on essential and accidental complexity. We don’t have empirical data to make the distinction between essential and accidental complexity; after all, what is essential complexity?
The second category is far more interesting. Let’s take, for example, the Lean Software principles. They may have originated in practice at Toyota, but they originated in a different domain. In software, they first appeared as a conceptual framework [Poppendieck03]. It took us some time to validate these principles empirically and show that they can lead to successful software organisations; the main results can be found in the Accelerate book [Forsgren18].
The book provides strong empirical data showing that certain software development practices lead to success. This is one of the few studies in our field that can show empirically that certain practices are more valuable than others.
This should also be our goal with pure reasoning in software engineering: find models that we can later prove empirically.
Interlude (part 1)
We reached the end of the first part of our two-part article. It’s time to have some partial conclusions.
The goal of the entire article is to reason about complexity. In this first part, we discussed the importance of reasoning in software engineering, so that we can apply this reasoning to complexity in the second part.
We started by arguing that software engineering doesn’t have enough good empirical studies to capture best practices in the field. Instead, they are replaced by some kind of reasoning. We then argued that some of the most important foundational works in software engineering draw their inspiration from philosophy. It seems that philosophical reasoning is essential to our field.
We then explored a few ways in which we can reason in software engineering. Exploring this landscape gives us an idea of what strategies we can employ when reasoning about software engineering topics. The reasoning that was used to analyse the two types of for loops is especially important for this article, as we will use the same pattern when reasoning about complexity.
Reasoning is good, but reasoning should not be divorced from practice. Whenever it makes sense, we should test empirically the results of our reasoning.
In the next part, we will look at complexity. We will start by defining the problem: how we are introduced to essential complexity and accidental complexity, and a possible sharp distinction between the two (essential belongs to the problem, while accidental belongs to the solution). With this sharp distinction, we go on to explore essential complexity. We define a framework to associate a number with the essential complexity; this allows us to compare the complexity of two problems. Reasoning about accidental complexity is much harder, as most of the things we do in software engineering end up as accidental complexity. This is why this first part was needed, to start exploring the extent of reasoning. At the end, we try to provide another answer to Brooks’ old question: can we find a ten-fold improvement in productivity for software engineering?
Read the next Overload issue for the continuation.
[Henney19] Kevlin Henney, ‘What Do You Mean?’, ACCU 2019, https://www.youtube.com/watch?v=ndnvOElnyUg
[Teodorescu21] Lucian Radu Teodorescu, ‘How We (Don’t) Reason About Code’, Overload 163, June 2021, https://accu.org/journals/overload/29/163/overload163.pdf#page=13
[Teodorescu23] Lucian Radu Teodorescu, ‘Value-Oriented Programming’, Overload 173, February 2023, https://accu.org/journals/overload/31/173/overload173.pdf#page=16
[Wayne21a] Hillel Wayne, ‘Are we really engineers?’, 2021, https://www.hillelwayne.com/post/are-we-really-engineers/
[Wayne21b] Hillel Wayne, ‘We are not special’, 2021, https://www.hillelwayne.com/post/we-are-not-special/
[Wayne21c] Hillel Wayne, ‘What engineering can teach (and learn from) us’, 2021, https://www.hillelwayne.com/post/we-are-not-special/
[Wayne19] Hillel Wayne, ‘Intro to Empirical Software Engineering: What We Know We Don’t Know’, GOTO 2019, https://www.youtube.com/watch?v=WELBnE33dpY
1. Brooks actually did not put it this way; I just remember it like this, as it makes the problem even more fundamental/metaphysical. Brooks actually argued that software development consists of overcoming essential difficulties and accidental difficulties, and that our main challenge comes from the essential difficulties. I hope the reader will agree with me that the two variants share the same essence. More on the two variants in the second part.
2. At least the first sentence, that functions should be small. The second sentence, “smaller than that”, cannot be taken seriously. It tells us that any reasonable definition of “small” that we find is not good enough; it takes away from us the possibility of defining a reasonable threshold for the size of a function.
Lucian Radu Teodorescu has a PhD in programming languages and is a Staff Engineer at Garmin. He likes challenges; and understanding the essence of things (if there is one) constitutes the biggest challenge of all.