REVIEW - Software Reliability Engineering - More Reliable Software, Faster Development and Testing


Software Reliability Engineering

More Reliable Software, Faster Development and Testing


John D. Musa



McGraw-Hill/Osborne Media (1999)




Silas S Brown


June 1999



Software reliability engineering (SRE) is not to be confused with formal methods, zero defects, or proving programs correct. It is a compromise between reliability (or integrity against security attacks) and development resources (time and cost), and it uses probabilities to decide matters such as testing schedules. One striking thing about it is that even buffer overruns and database corruption are just 'part of the equation'.

This book, written with some persuasiveness by one of the most prominent contributors to the field, describes the state of the practice at the start of 1998. If you are reading this in a back issue, you should check whether there is a more recent edition, as quite a few things are "not yet known" or "we do not currently have guidelines". The book is meant for the practitioner who doesn't mind reading such a forbidding-looking text (it's not as bad as it looks), and although there is some background information, it is mainly an instruction book on how to do SRE. I would suggest that readers need to be at least studying for a degree to understand it comfortably.

The book is organised so that you don't have to read all of it (there is much redundancy), but there are some things you can easily miss if you don't, and others can remain unclear for a while. Organisation and clarity could be better, and there are some notable omissions from the glossary and index, as I found out when I tried an exercise that mentioned "adjustment factor" before the term had been introduced in the text.

I always get worried when reading about processes that rely on probabilities; I imagine some pathological case coming along and changing them in ways you least expect. One such case is called Y2K, and I'm surprised it is never mentioned in Musa's numerous FAQs. Another is the Xerox poisoned message; mean time between failures (MTBF) breaks down as a measure when whatever caused the failure does not back off. Another is the so-called "unreasonable user". I am one of those users because of a disability, and it's not nice to be told that the accessibility you need is not important in the operational profile of the system you've got to use.

Another thing the book fails to mention is the effect of a "here's how to do it" book on programming psychology. People might switch into a "these instructions must be right so I don't have to stay awake" mode, and a danger in any test-based development method is programmers using "the test will pick it up" as an excuse for sloppy coding (I've done this myself and don't recommend it). This does not mean that testing and plans are bad in themselves, but the issue needs to be addressed.

Overall, worthwhile if you need to know about SRE, although it has notable omissions and is perhaps expensive if you're just curious. There is also a website.

