In my early days as a programmer, I often found myself responsible for maintaining poorly written, buggy code with no clear reference to the intended behaviour (or design). Much of it I didn’t even write myself. A common reaction to such circumstances is a desire to rewrite the code in a more comprehensible form. I even tried that a couple of times.
Programmers like to rewrite systems
In most projects code “rots” over the course of time. It is natural for a programmer to make the simplest change that delivers the desired results (rather than the change that results in the simplest code). This is especially true when developers do not have a deep understanding of the codebase – and in any non-trivial project there are always some developers that don't understand part (or all) of the codebase. The result is that when functionality is added to a program or a bug is corrected, the change is more likely to leave the code less elegant, more convoluted and harder to follow than to improve it.
As a result of my early attempts at rewriting, I discovered that there are problems in rewriting software one doesn't understand: the resulting code often doesn't meet the most fundamental requirement – to do all the useful stuff that the previous version did. Even with an existing implementation the task of collecting system requirements isn't trivial – and getting the exercise taken seriously is a lot harder than for a “green field” project. After all, for the users (or product managers, or other domain experts), the exercise has already been done for the old system – and all the functionality is there to be copied.
Of course, for a developer, writing new code of one's own is less frustrating than trying to work out what is happening in some musty piece of code that has been hacked by a parade of developers with varying thoughts on style and levels of skill. So developers rarely need convincing that they could do a better job of writing the system than the muppets that went before. (Even on those occasions when the developers proposing a rewrite are the muppets responsible for the original.)
That doesn't stop them trying.
Managers don't like to rewrite systems
Managers are typically not sympathetic to the desire to rewrite a piece of existing software. A lot of work has been invested in bringing a system to its current state – and delivering the next enhancement is the priority. With software-for-use the users will be expecting something from their “wishlist”; with software-for-sale a new version with more items on the marketing “ticklist” is needed. In both cases, even if the codebase is hard to work with, the cost of working with it is usually trivial compared to the cost of writing all the existing functionality again together with the new functionality.
A typical reason for rewriting to be considered is that the old system is too hard to change effectively – often because the relationship between the code and the behaviour is hard to understand. So it ought not to be a surprise that the behaviour is not understood well enough to replicate. The consequence is that, all too often, by the time the programmers realise that they can't deal effectively with making changes to the codebase, a point has been reached where they are incapable of effectively rewriting it.
In view of this it is understandable that management prefer to struggle on with a problematic codebase instead of writing a new one. A rewrite is a lot of work and there is no guarantee that the result of the rewrite will be any better than the current situation.
And they know from experience that programmers cannot estimate the cost of new work accurately.
When is a rewrite necessary?
There are, naturally, occasions where a rewrite is appropriate. Sometimes it is necessitated by a change of technology – one system I worked on rewriting was a Windows replacement for the previous DOS version. This example worked out fairly well – the resulting codebase is still in use over a decade later. Another project replaced a collection of Excel spreadsheets with programs written in C++; at least that was the intent – years later there are still Excel spreadsheets being used. (The project did meet some of its goals – the greater efficiency of the C++ components supported a massive increase in throughput.)
I remember one system I worked on where there was one module that was evil. Only five hundred lines of assembler – but touching it made brains hurt (twenty-five years later I still remember the one comment that existed in this code, “account requires”, which, as the code dealt with stock levels, was a non sequitur). Not only was the code a mess, it was tightly coupled to most of the rest of the modules: every change caused a cascade of problems throughout the system. After a couple of releases where things went badly following a change to this module – weeks of delay whilst unforeseen interactions were chased down and addressed – I decided that it would be cheaper to rewrite this module than to make the next change to the existing code. I documented the inputs, the outputs, the improved internal design and the work necessary and asked permission. As luck would have it, the piece of code in question had sprung from the brain of the current project manager. He didn't believe things were as bad as I said. We tried conclusions: he would make the next change (whatever it was), and if it took longer to get working than the time (three weeks) I had estimated for a rewrite then I got to do the rewrite. The next change was a simple one – he had a version ready for testing in a couple of hours. He nearly got it working after two months and then conceded the question. I got to do the rewrite, which worked after my estimated three weeks and was shorter and measurably more maintainable (even by the boss).
But that was one of the good times – on most occasions a rewrite doesn't solve the problem. There are lots of reasons why maintaining a system may be problematic – but most of them will also be a problem while rewriting it. A poorly understood codebase is a symptom of other issues: often the developers have been under pressure to cut corners for a long time (and that same pressure will affect the rewrite); on other occasions they are new to the project and, in addition to lacking an understanding of the codebase, they don't have a grasp of the system they propose to reproduce.
So the caution is to be sure that the underlying problem (the real cause of the problems with the codebase) will be addressed by the rewrite. An organisation (it is usually the organisation and not individuals) that produced messy code in the original system is prone to produce messy code in the rewrite.
The worst of all possible worlds
There is one thing worse than rewriting a system without a clear understanding of the causes of problems in the code. And it is seen far more often than common sense would suggest. This is to split up the people with knowledge of the system and put some of them to rewriting the system and some of them to updating the existing one. The result is that both groups come under the sorts of pressure that cause code rot, and that the rewrite is always chasing a moving target. It is hard to maintain morale in both groups – both “working on the old stuff” and “chasing a moving target” are demotivating. This is probably the most expensive of all solutions – it is more-or-less guaranteed that a significant amount of effort and skill will be wasted. And quality will suffer.
A curious change of roles
Recently I was recruited to develop the fourth version of a system in three years. This rewrite was largely motivated by the developers – who felt they could offer a much more functional system by starting again. We never found out the truth of this conclusion as, several months into the project and following a change of line management, the rewrite was cancelled and our effort was diverted into other activities (such as getting the existing system deployed throughout the organisation).
As part of the re-evaluation that followed I had occasion to take a look at the existing codebase and, after talking to the users of the system, it was clear that the most important of the desired new functions were easy to add. The principal problems with the existing code were a lack of specifications and/or tests (and the little user documentation that did exist was wildly inaccurate). Addressing these issues took some time and has involved a few missteps, but after a few months and a couple of releases we have a reasonably comprehensive set of automated tests for both the legacy functionality and for the new functionality we've been adding. It may take another release cycle but we will soon be in the position to know that if the build succeeds then we are able to deploy the results. (And, since the build and test cycle is automated, we know whether the build succeeds half an hour after each check-in.)
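One common way to build automated tests for legacy functionality that has no specification is a "characterisation" test: record what the code does today, then fail the build if that behaviour changes. The following is only a minimal sketch of the idea, written in Python for brevity rather than in the C++ of the system described, and it is not the approach the team actually took. The function `legacy_reorder_quantity`, its arguments and the sample cases are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Case = Tuple[int, int]


def legacy_reorder_quantity(stock_level: int, reorder_point: int) -> int:
    # Hypothetical stand-in for a legacy routine that nobody has specified.
    if stock_level >= reorder_point:
        return 0
    return (reorder_point - stock_level) * 2


def record_baseline(func: Callable[[int, int], int],
                    cases: Sequence[Case]) -> Dict[Case, int]:
    # Capture what the code does today, whether or not it is "right".
    return {case: func(*case) for case in cases}


def find_regressions(func: Callable[[int, int], int],
                     baseline: Dict[Case, int]) -> List[str]:
    # Report every case whose result no longer matches the recording.
    failures = []
    for case, expected in baseline.items():
        actual = func(*case)
        if actual != expected:
            failures.append(f"{case}: expected {expected}, got {actual}")
    return failures


# Illustrative inputs only; real cases would come from talking to users.
CASES: List[Case] = [(0, 10), (5, 10), (10, 10), (15, 10)]
BASELINE = record_baseline(legacy_reorder_quantity, CASES)
```

Running `find_regressions` against the unchanged function returns an empty list; any later change that alters a recorded result shows up as a failure message, which is the point at which someone has to decide whether the old behaviour or the new one is correct.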
So we have discarded the work on the fourth version and are now developing the third and have addressed the most serious difficulties in maintaining it. As I see it our biggest problem now is the amount of work needed to migrate a change from development into production – we need to co-ordinate the efforts of several support teams in each of a number of locations around the world. It can take weeks to get a release out (to the extent that code is now being worked on that will not be in the next release to production, nor the one that will follow, but the one after). In these circumstances it seems strange that a rewrite of the software is now being mooted. The reason? Because it is written in C++ and, in the opinion of the current manager, Java is a more suitable language.
This situation is a novelty for me: none of the developers currently thinks a rewrite is a good idea (neither those that worked on the rewrite, nor those that prefer Java) and, rather than the developers, it is the manager suggesting a rewrite. There may, of course, be good reasons for a rewrite – but the developers have a bug-free, maintainable system to work with and are not keen on replacing it with an unknown quantity.