Software problems have much in common with diseases. Klaus Marquardt has a diagnosis and offers some treatments.
To recap, what is PERFORMITIS, or performance bloat?
Every part of the system is directly influenced by local performance tuning measures. There is either no global performance strategy, or it ignores other qualities of the system such as testability and maintainability.
In the first part of this article we learned about a project one would never like to be associated with. Certainly it is more fun to read about a doomed project than to live within one. However, PERFORMITIS is not doom, it is a disease that can be cured.
The worst thing about PERFORMITIS is that many colleagues probably consider it a solution rather than a problem. Performance is a key issue of the system, and it is being cared for. Thus, when the evaluation of the symptoms indicates that the project suffers from PERFORMITIS, it may be wise not to spread the news bluntly. Even without naming it or any of its non-technical implications, there are many things to do that resonate with sound engineering practices. PERFORMITIS-infected projects normally do not follow these - but the developers should be reasonably familiar with them, or able to understand them, so that the suffering is limited.
Does that help? Well, yes. It reduces the immediate risk: not being able to ship at all. However, the next project likely faces similar problems. In terms of the medical metaphor, we have then soothed the pain and treated some symptoms, but we have not cured the disease.
While PERFORMITIS appears to be a technical diagnosis at first, its real causes are with people and their socialization. The technical symptoms can be attacked by some therapeutic measures or suppressed by extensive processes. These require continued effort and can at best maintain a state of remission. Curative therapies need to address the pathogen.
The first set of therapeutic measures addresses the technical symptoms:
- MEASUREMENT-BASED TUNING can become a relief from thinking too much about tuning up-front and in the least relevant places. Plus, it goes together nicely with tuning attempts in later stages of development.
- Selecting the most efficient places for performance tuning is the topic of ARCHITECTURE TUNING.
- Separating the PERFORMANCE-CRITICAL COMPONENTS is the basic architectural technique to avoid performance bloat - spreading mediocre performance optimization techniques all over the code.
PERFORMITIS has root causes in the team culture and value system. Looking at the techniques to apply, PERFORMITIS-infected teams may not be willing to tackle them without intense discussion. These therapies help to establish a broader view on technical measures:
- The effort expended on performance can be limited and controlled by explicitly assigning room for performance in the development process. VISIBLE QUALITIES allows making a case for performance, but opens room for other qualities as well.
- In teams and organizations that do not define or assign an architect's role, DEDICATED ARCHITECT is a prerequisite.
Both therapies change the way the team cooperates and communicates. Discussions will arise about roles and responsibilities, and finally about the way to ensure performance and other intrinsic qualities. Speaking in the tongue of the medical metaphor, these changes cause a fever. Like an infection in a living organism, newly introduced ideas are attacked. Hot discussions cause friction, like a fever - and finally result in a learning process that helps to improve the balance between different system qualities.
The feverish therapies can and need to be combined with any other therapy of choice. Apart from the stated overdose effects, no combination of the suggested therapies can be harmful to the project. They reinforce each other's effect. However, too many therapeutic changes at the same time might break morale and the team structure. It is worth taking the time to introduce one after another, and to anticipate which one brings the most effect in the current situation.
Table 1, overleaf, gives an overview of the applicability of the individual therapies, and how they can best be combined. Some medical terms are used:
- preventive - keep a disease from happening;
- palliative - reducing the violence of a disease, soothing the symptoms so that the quality of life is maximized;
- remission - relief from suffering, while the disease is still present;
- curative - healing from a disease.
Before treating a system with some proposed therapy, due diligence includes learning about its mechanism, and its overdose and side effects. The education of doctors is taken seriously.
|There are rare systems where the absolute dominance of performance is not pathological but a conscious and justifiable decision. Hold on a minute - in all likelihood this is not your situation. To find out, make your priorities of different qualities explicit as in the VISIBLE QUALITIES therapy. Even if you are there, these technical therapies offer some suggestions for improvement: PERFORMANCE-CRITICAL COMPONENTS, ARCHITECTURE TUNING and MEASUREMENT-BASED TUNING.|
|Pattern form and approach|
Both diagnoses and therapies follow their own forms, including sections that contribute to the medical metaphor.
With each diagnosis, symptoms and examination are discussed and concluded by a checklist. A description of possible pathogens and the etiology closes the diagnosis.
Each diagnosis comes with a brief explanation of applicable therapies. This includes possible therapy combinations and treatment schemes that combine several therapies. These are suggested starting points for a successful treatment of the actual situation.
Therapies are measures, processes or other medications applicable to one or several diagnoses. Their description includes problem, forces, solution, implementation hints and an example or project report. Their initial context is kept rather broad since most can be applied for different diagnoses.
In addition to the common pattern elements, therapeutic measures contain additional, optional sections of pharmaceutical information. These are introduced by symbols:
Applies to projects in domains that require a high system responsiveness, especially when the team is only vaguely familiar with the domain and its specific requirements.
Every developer knows that tuning the system for performance is necessary. The key decisions are when to take measures, and which tuning measures to initiate.
Measures taken early are typically most effective,
... but measures taken on assumptions instead of proper knowledge are often inefficient and compromise other system qualities.
Being afraid is always bad advice,
... but being aware of possible problems is wise.
Therefore, measure where the actual performance bottlenecks are, and start tuning measures there. Do not take preventive measures against assumed performance problems. Spend your performance tuning effort where you know it is most effective.
Whenever you think there is a problem, turn your assumption into knowledge. Critical architectural issues can be clarified by spike solution projects [ Beck99 ] or prototypes [ Cockburn98 ] with the sole purpose of identifying the actual performance issues. These spikes are most useful when you have established load profile scenarios or performance budgets.
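The measurement step can be sketched as follows. This is a minimal illustration, not a replacement for a real profiler or a spike project: the stage names, the budget values and the helpers `measure_stages` and `over_budget` are hypothetical and stand in for the load profile scenarios and performance budgets of an actual system.

```python
import time
from typing import Callable, Dict, List, Tuple


def measure_stages(stages: Dict[str, Callable[[], None]],
                   repeats: int = 3) -> List[Tuple[str, float]]:
    """Run each stage `repeats` times; return (name, best seconds), slowest first."""
    results = []
    for name, stage in stages.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            stage()
            best = min(best, time.perf_counter() - start)
        results.append((name, best))
    return sorted(results, key=lambda item: item[1], reverse=True)


def over_budget(measured: List[Tuple[str, float]],
                budgets: Dict[str, float]) -> List[str]:
    """Names of measured stages that exceed their budget (stages without a budget are skipped)."""
    return [name for name, seconds in measured
            if name in budgets and seconds > budgets[name]]


# Hypothetical stages standing in for real system operations.
def load_records() -> None:
    time.sleep(0.05)      # pretends to wait for a slow data store


def render_view() -> None:
    sum(range(1000))      # cheap in-memory work


ranking = measure_stages({"load_records": load_records,
                          "render_view": render_view})
budgets = {"load_records": 0.01, "render_view": 0.01}   # seconds, assumed values
hot_spots = over_budget(ranking, budgets)
```

Only the stages that show up in `hot_spots` are candidates for tuning; everything else is left alone until a measurement says otherwise.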
Where you cannot gain knowledge for some reason, follow sound practices and 'proactively wait' [ Marquardt99 ]. Resist the temptation to begin with micro tuning. Instead, focus on other qualities, especially on testability and maintainability. Most performance tuning measures on architecture or design level require a clear distribution of responsibilities anyway, and you can spend the structure clean-up effort now, when it hurts least.
|MEASUREMENT-BASED TUNING is not directly effective in a curative way, but frees attention and effort to be put on relevant topics of the project.|
|All development team members and technical management need to be involved with MEASUREMENT-BASED TUNING. Interestingly, the costs of MEASUREMENT-BASED TUNING are often negative. It prevents effort being put into misled measures, and enables you to proceed faster during initial development as well as during performance tuning phases. The costs spent on convincing other stakeholders of the validity of this approach, and of constant observation are typically low.|
|There are no counter indications to MEASUREMENT-BASED TUNING. The side effects are desired: the team does not inefficiently start tuning, and potentially pays more attention to other qualities. Measures are taken in an informed manner. Overdose effects would be to ignore the obvious common sense in your technical domain, or to wait too long before you take corrective action.|
|MEASUREMENT-BASED TUNING has the highest chances to succeed if the architecture has prepared for separation of concerns. One of the key practices to prepare for late tuning measures is to separate PERFORMANCE-CRITICAL COMPONENTS.|
To 'proactively wait' is an important, but difficult virtue. Key to this technique is not to miss the time when decisive action is necessary. This requires self-awareness and constant observation. At some time the performance problems become noticeable, and then it is necessary to dig deeper. The threshold of when to take action is typically subject to personal taste and working style and requires significant experience. However, discussing it openly with colleagues helps not to miss important indications, and the ever-present lack of time keeps you from reacting too eagerly.
When we first used an object oriented database, we put all data in it that needed to be shared between different clients. This design led to a highly consistent system. Unfortunately, it was also horribly slow. We lived with that fact for some time, hoping that increasing our knowledge about the OODBMS would provide us with counter measures. After a handful of iterations, the GUI team decided to stub the database and leave the process of continuous integration. This was a severe warning, and we immediately checked the database performance.
It turned out that the most expensive data the OODBMS was occupied with was transient data; it was distributed among different clients but did not require persistency. We had naively not separated these aspects, hoping the OODBMS would be sufficiently fast.
Two concrete actions were initiated. First, the distribution mechanism became separated from the database access. Second, for the sake of consistent class interfaces, the classes meant to become persistent were no longer derived from the OODBMS base class. Instead we provided a distinct persistence service that we passed the objects to, and maintained the database schema by generating the persistent classes from the application's class model.
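The second action can be illustrated with a small sketch. It assumes nothing about the actual OODBMS or the generated classes: `Item` and `PersistenceService` are hypothetical names, and a dictionary stands in for the database so that the example stays runnable.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class Item:
    """Plain application class: no database base class, no storage code."""
    item_id: str
    label: str


class PersistenceService:
    """Hypothetical persistence service that objects are passed to.

    A dict replaces the real database in this sketch.
    """

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def store(self, item: Item) -> None:
        # Copy the object's fields into a storage record.
        self._rows[item.item_id] = asdict(item)

    def load(self, item_id: str) -> Optional[Item]:
        row = self._rows.get(item_id)
        return Item(**row) if row is not None else None


service = PersistenceService()
service.store(Item("a1", "first item"))
restored = service.load("a1")
```

Because `Item` knows nothing about storage, the class interface stays the same whether an object is persistent, merely distributed, or purely transient.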
Applies to projects that need performance tuning.
A development team experiences severe performance problems, and needs to decide how to tackle them.
Local tuning efforts based on profiling help to improve the system's responsiveness,
... but you'd need to have a lot of local improvements to push the overall performance by orders of magnitude.
Performance can be attacked on every level of development,
... but initiatives on architectural level likely have the most impact.
Therefore, tune the system at the architectural level. Before you initiate any other improvement efforts, make sure that the architecture itself helps the system responsiveness, and exhibits no obvious flaws or 'black holes' in which processing power vanishes.
Ignoring the architecture level might get you lost in an endless sequence of severe battles, each saving in the sub-percent range of CPU load or other system resources. Your actual goal is not only to win the performance war but to win your peace with performance.
Looking at the scope of the architecture, performance is mostly decreased due to one or more of the following mechanisms:
- Random and inefficient use of system resources or infrastructure.
- Inappropriate locking of shared resources, or inappropriate transaction granularity.
- Multiple executions of identical calls without functionality gain.
- Network or inter process communication in many small messages, and amongst many different components.
- Blocking operations in tasks supposed to be responsive.
- Inappropriate distribution of responsibilities among clients and servers, for example how query result sets are transported and kept.
- Inappropriate database schemas, for example inadequate data types, or too little or too much normalization.
- Extensive error checking, tracing and logging.
- Error handling strategies that pollute the standard flow of operation.
- Interfaces requiring multiple data copying or data format conversion.
- Doing everything doable as soon as possible, or as late as possible.
A few changes to the architecture or top-level design can increase the performance by orders of magnitude. Performance tuning typically follows a few fundamental principles [ Marquardt02a ].
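One of the mechanisms listed above, multiple executions of identical calls without functionality gain, lends itself to a short sketch. The function names and prices below are hypothetical; `functools.lru_cache` is just one way to avoid the repetition, and it is only safe when the cached result cannot change while the cache is alive.

```python
from functools import lru_cache
from typing import Iterable, Tuple

backend_calls = 0  # counts how often the expensive backend is really hit


def lookup_unit_price(article: str) -> float:
    """Hypothetical expensive call, for example a remote query."""
    global backend_calls
    backend_calls += 1
    return {"bolt": 0.10, "nut": 0.05}[article]


@lru_cache(maxsize=None)
def cached_unit_price(article: str) -> float:
    """Same result as lookup_unit_price, but each article is fetched only once."""
    return lookup_unit_price(article)


def order_total(lines: Iterable[Tuple[str, int]]) -> float:
    """Sum quantity * price over (article, quantity) order lines."""
    return sum(quantity * cached_unit_price(article)
               for article, quantity in lines)


total = order_total([("bolt", 10), ("nut", 10), ("bolt", 5)])
```

Three order lines cause only two backend calls here, because the second request for the same article is answered from the cache.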
|ARCHITECTURE TUNING is a process to find curative technical solutions to technical symptoms. If the related diagnosis indicates that the pathogen is beyond the technical scope, it can lead to remission.|
|ARCHITECTURE TUNING involves the architect and every developer assisting in analysis and implementation of the performance tuning. Its costs are hardly predictable; they depend on the necessary refactoring effort and the changes actually implemented in the architecture. However, in large projects they are lower than the costs of numerous attempts at local optimization, and the approach is more likely to be effective.|
|There are no counter indications to ARCHITECTURE TUNING - given that you do not consider abandoning the tuning effort, or the project altogether. The side effect is a well-structured system. No overdose effect is currently known.|
|ARCHITECTURE TUNING comes with the least cost when you separate the PERFORMANCE-CRITICAL COMPONENTS early in the project.|
If your system does not allow you to implement the changes you have identified as necessary, you need to refactor it in advance. Typical refactorings for tuning the architecture are the introduction of shared technical components, separation of resource maintenance from resource usage, and separation of performance critical tasks into distinct components.
While the system is developed, it is good practice to make these late changes as convenient as possible. As PERFORMANCE-CRITICAL COMPONENTS explains, this is most efficiently done by maintaining a design with a clear separation of responsibilities, and a dependency structure with few (if any) compromises. Such a structure also helps during development with respect to testing and task assignment, and to maintenance in later project phases.
A contractor had managed to gain a monopoly on expertise at one of my customers. He motivated his queer data model with reasons such as 'using DB/2, comparing integers shows higher performance than comparing strings'. The effect exists, but can be neglected compared to the costs of disk access or joins.
When the system went into production, it needed 1.7 seconds per transaction, which would have caused annual operation costs of several 100,000€. Tuning measures saved around 10-20%, but to reach a performance comparable to similar systems a factor of 100 would have been necessary. I was asked to evaluate the architecture for optimisation possibilities, and found sufficient opportunities to save a factor of 1000 - starting from simple measures like skipping consistency checks of large data structures with every internal call (the much too large structures contained several 100 sets of data), up to fundamental changes to the architecture.
Applies to projects in domains that require high system responsiveness.
A development team that is aware of performance tuning needs to prepare for tuning measures before they are actually done.
It is hard to foresee where changes will become necessary later in the project,
...but preparation early in development saves restructuring effort and time later.
Performance can be attacked on every level of development,
...but initiatives at the architectural level likely have the most impact.
Incremental development can proceed fastest when the user-visible functions are developed independently,
...but shared libraries and common infrastructure can become more mature and efficient than dispersed items all over the system.
Therefore, factor out the performance critical components, and limit preventive tuning measures to these. Start by dividing them in a way that business logic, display, distribution and technical infrastructure are independent of each other and can be tuned individually. Ensure that all applications use the same infrastructure so that central tuning measures become possible.
In systems that have been found to waste performance, these areas were good candidates where a responsiveness gain by an order of magnitude was possible:
- Inefficient use of system resources or infrastructure.
- Multiple repetitions of identical calls without functionality gain.
- Badly designed or agreed interfaces requiring multiple data copying or data format conversion.
- Extensive tracing and logging; error handling strategies that pollute the standard flow of operation.
To avoid these pitfalls, your system needs to be prepared to change internal access strategies and rearrange call sequences. Make sure that you can start with ARCHITECTURE TUNING whenever you need to. Once you start tuning, refactor to the degree that the planned tuning related changes can be made easily [ Fowl99 ]. To avoid large and late refactoring efforts, start with a piece-meal growth approach and a 'clean desk' policy that includes refactoring in the daily work.
Within each of the components above, again distinguish between published and internal functionality. Factor everything specific to your particular environment into distinct components, like services, calls to technical services, database queries, and handle acquisition. Make sure that the responsibilities among all components are clear and concise. Separate the logical contents from the physical execution, with few explicit linkage points.
|PERFORMANCE-CRITICAL COMPONENTS is a preventive therapy that increases the system's adaptivity to further therapies such as ARCHITECTURE TUNING. It fosters many qualities, but does not directly affect performance in and by itself.|
|The entire development team and technical management need to be involved. PERFORMANCE-CRITICAL COMPONENTS increase the costs you would spend on architectural decomposition. Though the initial costs will pay off if you really run into performance problems, consider PERFORMANCE-CRITICAL COMPONENTS as a risk reduction strategy.|
|Do not apply PERFORMANCE-CRITICAL COMPONENTS when your system is very small, or applies standard technology only. Though you gain valuable experience, the costs hardly pay off in these cases. Another counter indication is a weak position of the architect in the project. Side effects include an improved logical structure of your system. Overdose effects would be DESIGN BY SPLINTER [ Marquardt01 ] if you drive the separation to an unbalanced extreme, or a micro-architecture similar to micro-management violating your DEFINED NEGLECTION LEVEL. [ Marquardt02b ].|
|PERFORMANCE-CRITICAL COMPONENTS is the preventive strategy to later ARCHITECTURE TUNING. It can be one of your risk reduction measures applied with VISIBLE QUALITIES.|
Start with a useful separation that is most likely to support your actual needs. Keep the performance critical components as small as possible by iteratively repeating the division.
From the experience the team has gathered in the domain, both application and technology domain, you already know which parts of the system are likely to become the bottlenecks. Run a retrospective to uncover which areas have been performance relevant in previous projects. You will experience a fair amount of support when you separate these from the remainder of the system, and apply EXPLICIT DEPENDENCY MANAGEMENT [ Marquardt02b ].
In a real time patient information system, all transient data was stored in shared memory. This was hidden from the clients to this data; some opaque access classes provided an interface independent of implementation issues. Tuning the performance of data access and locking granularity was located within the access layer classes.
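The idea of such an opaque access class can be sketched as follows. This is an assumption-laden illustration, not the system described: `TransientDataAccess` is a hypothetical name, a dictionary stands in for the shared memory, and a single coarse lock stands in for whatever locking granularity the real access layer used.

```python
import threading
from typing import Any, Dict, Optional


class TransientDataAccess:
    """Clients see only put/get; storage and locking stay private.

    Both can be retuned inside this class without touching any client.
    """

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}    # stand-in for shared memory
        self._lock = threading.Lock()      # one coarse lock; granularity is a tuning knob

    def put(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        with self._lock:
            return self._data.get(key, default)


def writer(store: TransientDataAccess, prefix: str) -> None:
    """Hypothetical client: writes 100 values through the public interface."""
    for index in range(100):
        store.put(f"{prefix}-{index}", index)


store = TransientDataAccess()
threads = [threading.Thread(target=writer, args=(store, name))
           for name in ("a", "b", "c", "d")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Replacing the single lock by finer-grained locks, or the dictionary by another storage mechanism, would change only the inside of `TransientDataAccess`.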
Applies to projects whose team focuses all work and thoughts on a few essential ideas, but ignores all other issues that might also be or become important to the project's success.
In a development team that focuses on a particular quality of the product, you need to address further important system qualities that are essential to adequately manage the system architecture.
Neglecting internal qualities can cause a large system to break under its own weight,
...but the value they add to the software is hidden and becomes visible only in the long term.
Internal qualities can be crafted intentionally into the software,
...but they are hardly visible from a bird's eye perspective.
Therefore, make your system's internal qualities visible. Similar to sound risk management practice, maintain a list of your top five qualities. Define measures to achieve them, and determine frequently to what extent you have reached your goal.
The key issue is to raise awareness for the existence of these qualities and their relative importance in the team and in management. Especially when the internal system qualities are unbalanced, ask the team for a list of possible qualities and discuss their value and advantages. The team should order them according to their priority. Do not mind if your favourites are not the topmost - you will go through the list every week or two and re-evaluate.
Follow the same process with management, and make both lists visible. While it is often not possible to resolve every conflict and come to consensus, the fact that all qualities are there and considered important leads to awareness, to a more careful balancing, and to an architecture and design that addresses different qualities explicitly.
You need to maintain the lists, find criteria how to evaluate whether a specific quality has been achieved, and define appropriate actions [ Weinberg92 ]. This could become a part of a periodically scheduled team meeting. Especially the evaluation criteria would be a tough job, as most qualities show only indirect effects. Try to define goals that appear reasonable to the project. If you or the team fails to define criteria, leave that quality at the end of the list for the time being.
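Maintaining such a list can be sketched in a few lines. Everything here is hypothetical: the quality names, the priorities, the numbers and the helpers `review_order` and `status_report` merely illustrate the rules stated above, namely a top-five list, an evaluation criterion per quality, and qualities without a criterion moved to the end of the list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Quality:
    name: str
    priority: int                                     # 1 = most important to the team
    criterion: Optional[Callable[[], float]] = None   # achieved share, 0.0 to 1.0


def review_order(qualities: List[Quality], top: int = 5) -> List[Quality]:
    """Order by priority, but put qualities without a criterion last; keep the top entries."""
    ordered = sorted(qualities, key=lambda q: (q.criterion is None, q.priority))
    return ordered[:top]


def status_report(qualities: List[Quality]) -> List[Tuple[str, Optional[float]]]:
    """(name, achieved share or None) for each quality, in review order."""
    return [(q.name, q.criterion() if q.criterion else None)
            for q in review_order(qualities)]


# Assumed numbers for illustration: 45 of 60 classes have at least one unit test.
classes_total, classes_with_tests = 60, 45

qualities = [
    Quality("maintainability", priority=1),   # no criterion defined yet
    Quality("testability", priority=2,
            criterion=lambda: classes_with_tests / classes_total),
    Quality("performance", priority=3,
            criterion=lambda: 1.0),           # assumed: all budgets currently met
]
report = status_report(qualities)
```

The report can be read out in the periodic team meeting; a quality that still shows `None` is a reminder that its criterion has yet to be defined.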
|VISIBLE QUALITIES is effective through creating attention and a positive attitude. The attention achieved by the top-five list causes second thoughts, awareness, and potentially actions, while the measurable achievement fosters a positive attitude that in itself already could improve the quality of work.|
|The work and initial costs are with the architect, but VISIBLE QUALITIES requires involvement of the entire team. In the mid term, the effort required is comparable to mentoring or coaching, while in the long term it pays off through improved development practices.|
|There are no real counter indications to VISIBLE QUALITIES, but if your team is resistant to learning, other therapies might be more cost effective for your project at hand. You might experience negative side effects if you fail to explain the importance of different qualities, and a continuous neglect of specific qualities might finally break a large system. Prevent this by establishing a veto right on certain priorities. An overdose could be injected if the team does not get the idea at all, or is disgusted by the somewhat formal process. Use the drive for discussion to come to an adequate dosage.|
|Bringing a mentor to the project could reduce the ceremony level introduced by VISIBLE QUALITIES. Otherwise, they are successfully accompanied by ARCHITECT ALSO IMPLEMENTS [ CoHa04 ].|
For motivation of the team and management, the testability quality often is a good starter. Its benefits towards risk reduction and customer satisfaction are obvious, and it can be verified with concrete actions, namely implementing the tests. For testability, the achievement criterion could be 'all classes are accompanied by at least one unit test' or, if you introduce unit tests late in the project, 'every fixed defect has to be accompanied by at least one new test case'. If for some reason unit testability is hard to achieve, this is a potential hint of a design fault. To get away with a rule violation, a developer should need to convince the architect. There are situations, e.g. in GUI development, that are hard to unit test, but improvement suggestions may make it possible to test at least parts of the functionality, e.g. after a class has been split into distinct parts.
It is not important to maintain the list for a long time. If you introduce it, and hold it up often enough so that the developers know that you are serious about it, you might neglect the list and only check it at the start of a new iteration or release period. The check to what degree you have reached the internal qualities never becomes obsolete, but can be reduced to one check with each iteration or release.
Some qualities are hard to measure by numbers, but for others there are commercial tools available. As an example, the software tomograph [ Roock06 ] supports a quantitative evaluation of the internal software structure.
The team was new to object-oriented design, so we discussed a lot about the promised qualities it should deliver. We started to do joint design at the white board, and explored some examples how a high extensibility could be reached, how testability could be increased, and what amount of decoupling required what effort.
When the team size increased, design reviews became an essential part of the project. Initially I participated in most, and we established an ordered catalogue of criteria to check. With this catalogue, the process was accepted and carried by the team. Closer to the end of the project, the team decided to focus on other issues and reduce the formality of the design reviews. By that time, the project lasted for more than two years; all team members had significant expertise and shared a common sense.
Applies to projects that have no dedicated architect and experience trouble with their architecture, either in quality of the architecture itself, in incoherent visions, or in uncovered effort.
In a development team that has an informal design and architecture process without a dedicated role assignment, the lack of a dedicated architect can cause one or more of the following problems:
- The development focus is on management goals only. A technical focus is not present, or is randomly selected by individual developers or managers.
- Important internal system qualities that are essential to adequately manage the system are not addressed.
- Developers who take over significant parts of the architectural tasks fall behind their schedule.
- Different people address different expert developers for technical issues.
- Questions concerning the architecture are not consistently answered.
These forces are present in the situation:
An acknowledged architect has less time available for real programming, and is potentially expensive,
...but dealing with inconsistencies and the subsequent system failures is even more expensive.
Small projects can come along without much effort on architecture,
...but building a large or reusable system requires attention to issues that hardly matter in smaller systems; neglecting internal qualities can cause a system to break under its own weight.
Any architect's experience and view on the software world is limited,
...but an architect has the broadest view among the developers, and he can still delegate.
Newly assigning an architect within a given team might cause personal conflicts,
...but each such conflict would be present anyway, and would otherwise express itself in technical inconsistencies.
Therefore, ensure that an architect's role becomes defined and assigned to a key developer. The architect becomes responsible for creating a common vision of the system, ensuring technical consistency, broadening the architectural view, re-balancing the different forces on the architecture, and coaching the development team on the internal qualities (the '~ilities') that are essential to crafting large software systems. In turn, all developers, managers and technical leads pass their decision competency in these areas to the dedicated architect, and provide sufficient resources - namely the working time of the architect.
An architect who is expected to initiate significant change will need dedication, explicit empowerment, and time to become accepted among his peers. Much less time can be devoted to 'real work' such as coding, and this needs to be reflected in the project schedule and the performance review criteria. While most project situations can live with one or more developers informally taking parts of the architect's role, as advocated by agile development methods [ Beck99 ] [ Agile01 ], architectural tasks may require significant effort and time. You can compromise on what the role is called, but an architect needs to be able to spend significant effort without troubling his boss or his career.
The obvious candidate for the assignment is the informal architect. Convince your manager to establish the architect's role by indicating the risk reduction it could bring to the project, and by comparing the consequences of not having a consistent architecture against the costs of having an architect. This process will likely take some time since the problem needs to be perceivable to management. Try to get support from other developers in advance, including external contractors in case they join the project team. Their opinions might be considered more significant than those of employees.
When the team has several informal architects, the one of them who is most frequently asked is the right candidate. A team of architects can also work when each member has a distinct key area. However, one person must have the final decision.
If there is no informal architect, this means that the team creates the architecture by consensus or accident, though with the best of intentions. In this case you should consider asking for an outsider to join. The same applies if there are too many architects in the team, and picking one of them would break the project.
|The mechanism behind DEDICATED ARCHITECT is acknowledgement by management. Only an acknowledged architect is able to devote sufficient time, and receive sufficient respect from the team.|
|DEDICATED ARCHITECT involves management, the architect, and potentially all team members. The costs can become significant because you need to dedicate time to architectural issues as well as to establishing the new role in the first place. Contracted architects are even more expensive than internal ones. However, the costs for a good architect will reduce the project risk and likely pay off several times, while the costs for a bad one will lead to further cost explosion.|
|DEDICATED ARCHITECT has one counter indication: when the team would not accept any architect. Its side effects are on the workload that the team can manage. It will decrease in the short term, but eventually increase in the mid to long term. Another side effect is the positive influence on the career of the assigned architect.|
|The trust in the architect is likely fostered by ARCHITECT ALSO IMPLEMENTS [ Coplien04 ], when the developers perceive that the architect still knows how to express ideas in code.|
The doctor wants to see you again
For a system suffering from PERFORMITIS, the major engineering practices have been introduced. They should reduce the problems to an acceptable level, and can be applied after considering their effects and costs. Other therapies, VISIBLE QUALITIES and DEDICATED ARCHITECT, facilitate the introduction of technical practices.
While this is sound practice and probably useful advice, it doesn't address the heart of the problem. Your team is likely to fall back into PERFORMITIS as soon as these therapies are no longer applied strictly. And poor compliance with prescribed therapies is something that all doctors suspect of their patients...
By next time, the therapies should have become effective. We will then reflect on the successes and see what we can do about the non-technical issues. We can then avoid PERFORMITIS in your next project.