The Edge of C++

By Deák Ferenc

Overload, 28(159):24-34, October 2020


Everything has limits. Deák Ferenc explores the bounds of various C++ constructs.

Everything we interact with in our daily lives (well, except the universe) has a boundary that constrains its existence to within well-defined limits. There are borders delimiting a country, there are walls keeping out bluer than white walkers, there is a finite number of bits that a variable can reach and, of course, there is the maximum quantity of source a C++ compiler can swallow without choking.

In our daily usage of C++, we rarely reach these limits, but regardless, the almighty Standard covers these fringe situations too. This article, in which we will explore the outer edges of some of the most common compilers, is based on ‘Annex B – (informative) Implementation quantities’ [ANNEX-B] of the (current) C++ Standard.

In this article, I will walk you through these limitations and what they mean for your daily life. I will present a tool for generating small test source files for testing each of the specific limits from Annex B and that will push the compilers to their limits.

Annex B

No, I am not going to include Annex B in the article. It would be a waste of paper and we are trying to be as environmentally conscious as possible, so anyone interested can fetch it (from [ANNEX-B]), and I will just give a short overview of what it is.

In the C++ Standard, Annex B lists, for a range of language constructs, the minimum quantities that the Standard's authors recommend a compiler should support. From the Standard:

However, these quantities are only guidelines and do not determine compliance.

For example, it is recommended that the supported number of arguments in one function call should be at least 256. Certainly, this sounds like a pretty big number and no-one would be expected to type in 256 arguments by hand, but consider that today a lot of the code that is being compiled is first generated by code generators (think about Google’s protobuf compiler for example, or just the unreadable output of a software modeling/CASE application, Qt’s resource compiler or any other applications out there which generate code for you). You may find yourself in a situation where code being generated is heading towards this limit.

The actual limits imposed by the compiler

All the current compilers I have tested have a page ([GCC], [CLANG], [MSVC]) where they present the limitations actually imposed by their implementations. However, not all the available limits presented in Annex B were to be found in all the documented limitations, and not all the compilers have identical values.

The test suite

As mentioned before, the main purpose of this article is to provide a set of tests that exercise the edge situations a compiler supports. The code is generated by an application, written in Go for fun's sake, and it is available at the [GITHUB] location. Everyone is free to download it, modify it and extend it to fit their needs.

The test suite is contained in one big JSON file in which each entry has the format shown in Listing 1. Most of the fields are self-explanatory; however, testName must map to one of the functions in the Go program, which parses this JSON and calls the matching function once for each value in the count field.

{
  "run": true,
  "testName": "parameterCountInFunctionDefinition",
  "count": ["256"],
  "minimum": "256",
  "description":
    "Parameters in one function definition ([dcl.fct.def.general])"
}
			
Listing 1

As a side note, some of these test cases were intended to test a specific feature of the compilers, but unwittingly highlighted an error somewhere else in the product. I took the decision to leave them as they are because highlighting these errors might be useful for compiler writers on the quest to continuously improve their products.

The compilers

I performed all the tests on a dual-boot computer. The first operating system was a brand new, shiny Ubuntu 20.04, freshly downloaded from Canonical, with the following compilers:

  • g++ 9.3.0 (installed via apt)
  • clang 10.0.0 (installed again via apt)
  • icc (ICC) 19.1.2.254 20200623 installed as a by-product from a trial version of Intel Parallel Studio

And secondly, under Windows 10:

  • msvc from Visual Studio 2019 (Version 16.4.5)

I deliberately chose not to use a locally compiled version of any of those compilers. I tend to stick to the mainstream Linux distributions, and use what is available for the largest communities of programmers right out of the box, so making a highly personalized compiler would not have been an ideal comparison ground for everyone who uses default compilers on their OS.

Some of these test cases required the activation of C++17 features; however, I consider that in 2020 this should not be such a big issue.

Timing issues

In the test results, I intentionally did not include a precise measurement of the time it took each test to compile. That would have only made sense for my computer, and if someone repeats the test on a much slower or faster computer, the results they obtain will be significantly different.

Where I have observed a noticeable difference between the various compilers, I have added my comments regarding the specific case.

The compilers’ own test suites

Before digging deeper into the subject, I have to mention that both gcc and clang come with exhaustive test suites meant to verify the correct functionality and compliance (with the Standard) of the compilers. However, I did not find a dedicated test suite for the edge situations I am researching through this article, so I thought that providing a unified set of tests for all the C++ compilers would be beneficial.

Unfortunately, I did not find any test suite for the Microsoft compiler nor for Intel’s compiler, considering the closed source nature of the product, but I would love to hear from developers who actually work(ed) on Microsoft’s C++ compiler to see whether they have considered these test cases too.

Numbers

For the test cases, I intentionally used numbers that are powers of two. Only for very special cases did I dig deeper and identify a number outside of this family. For most of the test cases, I specifically tested against the Standard-recommended values, and where for some test cases I have pushed the compilers a bit further, there is a note in the test case.

The tests

Most of the tests are represented as a single generated C++ file; however, some cases required that some of the tests are joined together. For example, testing the maximum number of arguments really makes sense with the maximum number of parameters a function can have.

These small applications were carefully engineered to cover all the required edge cases and are all compilable independently of each other. Setting generateMakefile to true in the JSON will generate a Makefile beside the CPP files when the test generator is run.

Furthermore, you can request measurements of execution time (and other important data) using the timeCompilation flag and the timeFlags in the JSON. I currently use "-f '%E,%M'" to record the elapsed time (%E) and the maximum amount of memory used (%M) by the process.

So as not to depend on data from only one invocation of the compiler, you can gather an average execution time for the compiler compiling the same source by specifying the "compilationTimes" property to be the number of compiler invocations you want.
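The averaging step can be sketched as follows. This is only an illustration, not code from the test suite: the RunSummary type and the summarize function are hypothetical names, and each sample is assumed to be a line of the form seconds,kilobytes (which GNU time can produce with a format such as '%e,%M').

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of several compiler runs; not part of the test suite.
struct RunSummary {
    double mean_seconds;    // arithmetic mean of the elapsed times
    long peak_kilobytes;    // largest memory figure seen in any run
};

// Each sample is assumed to look like "1.50,204800": seconds, comma, kilobytes.
RunSummary summarize(const std::vector<std::string>& samples) {
    assert(!samples.empty());
    double total = 0.0;
    long peak = 0;
    for (const std::string& line : samples) {
        std::istringstream in(line);
        double seconds = 0.0;
        char comma = 0;
        long kilobytes = 0;
        in >> seconds >> comma >> kilobytes;
        total += seconds;
        if (kilobytes > peak) {
            peak = kilobytes;
        }
    }
    return {total / static_cast<double>(samples.size()), peak};
}
```

Reporting the peak rather than the mean for memory is a design choice of this sketch; averaging both columns would work just as well.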

There are places in the tests where local (global) variables are initialized. For ease and in order to get a consistent and reproducible behaviour between test runs, all of them are initialized to 1. I found no difference in the compilers’ performance whether I used a set of random numbers or just used 1.

All of the tests require the output of some values to the screen, so I used the standard iostream header with std::cout to print out all necessary values.

Nesting level of iteration, selection, compound statements – nestingOfStatements

For this test, I generated a simple source file containing alternating for and if statements, like the sequence in Listing 2.

int main() {
  for (int f0 = 1; f0<256; f0++ )
    if (f0 % 2 == 0)
      for (int f1 = f0; f1<256; f1++ )
        if (f1 % 3 == 0)
          for (int f2 = f1; f2<256; f2++ )
            if (f2 % 4 == 0)
              ...
			
Listing 2

This seemed to be complex enough to prevent the compiler from optimizing out everything while still generating assembly code that was not itself overly complex.
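The real generator is written in Go, but the idea is easy to sketch in C++ as well. The function below is a hypothetical stand-in (its name, the hits variable and the exact layout are assumptions, not the author's code) that emits the alternating for/if pattern of Listing 2 for a given depth.

```cpp
#include <cassert>
#include <string>

// Builds a source file with `depth` for-loops, each guarded by an if,
// following the pattern of Listing 2. Names and layout are illustrative.
std::string nested_statements_source(int depth) {
    assert(depth > 0);
    std::string src = "#include <iostream>\nint main() {\n  long hits = 0;\n";
    for (int level = 0; level < depth; ++level) {
        const std::string var = "f" + std::to_string(level);
        // The outermost loop starts at 1, every inner loop at the outer variable.
        const std::string start =
            level == 0 ? std::string("1") : "f" + std::to_string(level - 1);
        src += "  for (int " + var + " = " + start + "; " + var + " < 256; " +
               var + "++)\n";
        src += "  if (" + var + " % " + std::to_string(level + 2) + " == 0)\n";
    }
    // The innermost statement is the body of the last if.
    src += "  ++hits;\n  std::cout << hits << std::endl;\n}\n";
    return src;
}
```

Writing the returned string to a file named after the depth gives one test source per entry in the count field.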

Regardless, no-one in this life should be required to handle applications in which the nesting level reaches even half of the Standard-recommended maximum value to support, which is 256. (Maybe this is why clang, being a pragmatic compiler, actually got stuck at 128 and was killed after five hours of struggling with the generated source consisting of 256 nested statements.) gcc, though, had no problem compiling applications which contained up to 1024 nested levels. More, I did not dare try.

icc had no problems generating code for up to 256 nested levels, but msvc does not support the depth of 256. It actually gives an error at 166:

  nestingOfStatements-166.cpp(169): fatal error
  C1061: compiler limit: blocks nested too deeply

but compiles fine for 164 levels.

gcc clang msvc intel
1024 128 164 256

Nesting levels of conditional inclusion – nestingLevelOfConditionalInclusion

This test required the definition of a specific number of identifiers, all of which could be used as tests in a conditional check. If all of them evaluated to true, the proper header file for writing out the actual number for this test was included. The code generated was like:

  #define COND_0 1
  #define COND_1 1
  ...
  #if defined COND_0
   #if defined COND_1
    ...
     #include <iostream>
    ...
   #endif
  #endif

No compilers had any issue compiling the code up to 512 identifiers, which is double the Standard-recommended value to support; however, clang was much slower than gcc.

gcc clang msvc intel
512 512 512 512

Pointer, array, and function declarators modifying something – pointerAndArrayDeclaratorsModifyingSomething

I have to admit, this was one of the trickiest cases I had to generate code for, as the optimizers in today’s compilers are simply too clever. They instantly see through your intentions and, throwing out all your efforts to generate code for calculating values, instead do the calculations themselves and substitute the results in the generated code. I really had to use a lot of trickery.

For example, Listing 3 is the code generated for 4.

#include <iostream>

constexpr int z() {
  return 0;
}
int main() {
  volatile int i = 0;
  volatile int *volatile p1=&i;
  volatile int *volatile *p2 = &p1;

  * & z()[* & z()[* & z()[&p2] ] ] = 4;
  std::cout << i << std::endl;
}
			
Listing 3

As expected, it prints out 4. Some explanations: firstly, without volatile, the compiler simply ignores all the code and just generates the required assignment. The weird-looking expression * & z()[* & z()[* & z()[&p2] ] ] = 4; is actually equivalent to **&p2[0] = 4; but I wanted to use pointer arithmetic, array indexing and a function call in the same expression, and thus ended up with this monstrosity.
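The expression relies on the fact that a[b] means *(a + b), so an integer may appear in front of the brackets and the pointer inside them. The following sketch checks this at compile time; it drops the volatile qualifiers (volatile accesses are not allowed in constant expressions), and the names zero, first_element and store_through_chain are made up for the illustration.

```cpp
// Same role as z() in Listing 3: a function whose result is the integer 0.
constexpr int zero() { return 0; }

// Reads the first element through the reversed subscript form, i.e. *(0 + p).
constexpr int first_element(const int* p) { return zero()[p]; }

// Mirrors the assignment from Listing 3, minus the volatile qualifiers.
constexpr int store_through_chain(int value) {
    int i = 0;
    int* p1 = &i;
    int** p2 = &p1;
    // zero()[&p2] is p2, zero()[p2] is p1, zero()[p1] is i.
    *&zero()[*&zero()[*&zero()[&p2]]] = value;
    return i;
}

constexpr int sample[] = {7, 8, 9};
static_assert(first_element(sample) == 7, "0[p] is the same as p[0]");
static_assert(store_through_chain(4) == 4, "the chain ends at i");
```

Each * & pair cancels out, so every level of the expression contributes exactly one dereference.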

gcc had no problems compiling up to 1024 modifiers. clang complained at a certain point with fatal error: bracket nesting level exceeded maximum of 256; however, once I specified -fbracket-depth=1024, it compiled without any issues.

msvc and icc again had no problems compiling up to 1024.

gcc clang msvc intel
1024 1024 1024 1024

Nesting levels of parenthesized expressions – nestingLevelsOfParenthesizedExpressionsInAFullExpression

Because the compiler can be very effective at optimizing code by pre-calculating values in the compilation phase, this test generated a complex parenthesized expression to calculate the summation and multiplication of various numbers. gcc had no issues with the depth of the expression up to 1024 (four times the Standard-recommended number); however, clang gave a very clear error message in the form of fatal error: bracket nesting level exceeded maximum of 256 and I also appreciated the suggestion on the next line on how to fix it: use -fbracket-depth=N to increase maximum nesting level. After using this parameter, clang compiled without problems.

msvc and icc had no problems compiling nesting parentheses up to 1024, which is a pretty large value for this purpose, so I concluded that this was an acceptable value for this test case as some compilers (well, all except gcc) started showing error messages for 2048.

gcc clang msvc intel
2048 1024 2048 2048

Number of characters in an internal identifier or macro name – identifierOrMacroNameLength

This was an easy run: just define a macro with a long random name, then a function with a different long random name containing a variable with a third long random name being assigned to the macro. Then, call this function. Most of the tested compilers had no errors when compiling code with identifier names as long as 8192 characters, except msvc, which conjured up the message:

  identifierOrMacroNameLength-8192.cpp(3): fatal
  error C1064: compiler limit: token overflowed
  internal buffer

msvc proved to be successful for 2048.

gcc clang msvc intel
8192 8192 2048 8192

Number of characters in an external identifier – externIdentifierNameLength

Almost as easy as the previous test, I just had to use a small trick. To avoid multiple compilation units for the extern variable, I defined it just after main as follows:

  #include <iostream>

  int main() {
    extern int vxvl;
    std::cout << vxvl << std::endl;
  }
  int vxvl = 4;

Most of the tested compilers had no issues in compiling code with variable names up to 8192 characters, which I consider to be more than enough, except msvc which gave up with a similar error message to the previous case, but it succeeded for 2048.

gcc clang msvc intel
8192 8192 2048 8192

External identifiers in one translation unit – externIdentifiersInOneTranslationUnit

The code generated for this case pretty much follows the recipe for the previous case, just varying the number of identifiers. Here, to my surprise, clang crashed during something that – in the printed stack trace – looked like a recursive call when it tried to compile the Standard-suggested value of 65536. gcc also had its fair share of struggles with this value, taking several minutes; however, it completed its task successfully. icc gave up with the following error:

  externIdentifiersInOneTranslationUnit-
  65536.cpp(65540): internal error: bad pointer

so I had to lower my expectations. clang successfully managed to compile 8192 external identifiers, and icc managed 4096.

msvc really had no problems compiling the test case with 65536 values.

gcc clang msvc intel
65536 8192 65536 4096

Identifiers with block scope declared in one block – identifiersWithBlockScopeDeclaredInOneBlock

This test just consisted of generating a long list of variables in a block and seeing when the compiler complained, but all the tested compilers successfully compiled up to 8192 local variables.

gcc clang msvc intel
8192 8192 8192 8192

Parameters in one function definition – parameterCountInFunctionDefinition

This is one of the test cases which was joined together with another, namely ‘Arguments in one function call’, because it just made sense. The application generates a function with the required number of parameters, generates a list of variables of different types, and calls the function with the required number of arguments. None of the tested compilers had issues compiling functions with up to 4096 parameters, which is 16 times the recommended amount, so I consider that to be a fair number.

gcc clang msvc intel
4096 4096 4096 4096

Structured bindings introduced in one declaration – structuredBindingsInOneDeclaration

The code generated for this is a larger scale of the one in Listing 4.

#include <iostream>

int main() {
  int arr[] = {1, 1, 1, 1};
  auto volatile [v0, v1, v2, v3] = arr;
  int i = v0 + v1 + v2 + v3;
  std::cout << i << std::endl;
}
			
Listing 4

Most of the tested compilers (well, all except icc) had no issues compiling code with up to 8192 values in the structured binding declaration. To my huge surprise, however, this is one of the tests where gcc proved to be slower than clang, but both compiled the test files nicely.

My other surprise came from icc, which gave a core dump upon compiling 8192 (see Listing 5) but in the end, 4096 seemed like a good number for icc.

structuredBindingsInOneDeclaration-8192
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for structuredBindingsInOneDeclaration-8192.cpp (code 1)
			
Listing 5

gcc clang msvc intel
8192 8192 8192 4096

Macro identifiers simultaneously defined in one translation unit – macroCountInOneTranslationUnit

This was one of the easiest tests to come up with: just generate a file with enough macros and let the compilers go wild on them. gcc and icc had no problems sorting out files (at blazing speeds) containing up to 65536 macros; however, clang started choking after 8192 with a coredump. A similar fate awaited msvc:

  macroCountInOneTranslationUnit-8192.cpp(8197):
  fatal error C1009: compiler limit: macros nested
  too deeply

so I had to lower my expectations and the number of generated macros to 256.

When even this number gave a compiler error (not a compile error), I started thinking that maybe my test case is simply wrong, maybe I expect too much from the macro engine of msvc, or that the test with the following logic is simply not good:

  #include <iostream>
  
  #define V0 1
  #define V1 V0 + 1
  #define V2 V1 + 1
  #define V3 V2 + 1
  #define V4 V3 + 1
  
  int main() {
    std::cout << V4<< std::endl;
  }

From the error message, I felt that this specific test case must have stepped on the toes of the msvc compiler. No (decent) compiler should have problems with 256 macros defined in a source file, so the problem must lie in the recursive substitution, and I concluded that the test uses the wrong approach for this situation. However, since it managed to annoy two of the compilers to the point of breaking, I decided to leave it in here; maybe someone in one of the development teams will have a look at these cases.
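The difference between counting macros and nesting them can be made concrete with a small sketch. The V macros repeat the chained pattern from the test; the W macros are a hypothetical flat alternative that is not part of the test suite and was not run against any of the compilers.

```cpp
// Chained definitions, as in the generated test: expanding V4 has to
// rescan through V3, V2, V1 and V0 before any number appears.
#define V0 1
#define V1 V0 + 1
#define V2 V1 + 1
#define V3 V2 + 1
#define V4 V3 + 1

// Independent definitions: the same number of macros, but expanding any
// one of them never triggers the expansion of another.
#define W0 1
#define W1 1
#define W2 1
#define W3 1
#define W4 1

static_assert(V4 == 5, "V4 expands to 1 + 1 + 1 + 1 + 1");
static_assert(W0 + W1 + W2 + W3 + W4 == 5, "five independent macros");
```

Both variants keep five macros defined at the same time, which is what the Annex B quantity counts; only the chained one also stresses the depth of the expansion.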

gcc clang msvc intel
65536 8192 128 65536

Parameters in one macro definition – parametersInMacroDefinition

This test was very simple to construct, involving a macro, similar to the parameterCountInFunctionDefinition. This test case was implemented together with the ‘Arguments in one macro invocation’ test, since it sort of made sense to have both run together.

All the compilers did very well with the code with up to 4096 parameters (except the Microsoft compiler, see below), which is several times the number the Standard recommends supporting. On [GCC], it is mentioned that gcc allows up to USHRT_MAX arguments, which should be at least 65535. 65535 worked nicely, but the gremlin woke up somewhere inside and I had to try to run with 65536.

gcc provided a cute (but weird) error:

parametersInMacroDefinition-65536.cpp:5: error: macro "M" passed 65536 arguments, but takes just 0
    5 |   int v = M(1, 1, 1, 

Seemingly there was an overflow somewhere deep inside gcc. clang threw a tantrum in the form of a coredump for the same number. icc compiled without any complaints.
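If the parameter count is kept in a 16-bit unsigned field (an assumption suggested by the USHRT_MAX remark, not something verified against gcc's source), the 'takes just 0' in the message is exactly what a wrap-around would produce. The function name stored_count below is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical model of a 16-bit parameter counter; gcc's real data
// structures are not shown in the article and may differ.
constexpr std::uint16_t stored_count(unsigned long declared) {
    return static_cast<std::uint16_t>(declared);
}

static_assert(std::numeric_limits<std::uint16_t>::max() == 65535,
              "largest value a 16-bit counter can hold");
static_assert(stored_count(65535) == 65535, "still representable");
static_assert(stored_count(65536) == 0, "wraps around to zero");
```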

Microsoft’s own compiler was very consistent with the amendments mentioned in [MSVC]. It accurately gave a warning that 127 is the maximum number of parameters supported for these situations.

clang successfully compiled for 9216 parameters but failed for 10240, so I decided that the maximum supported value must be somewhere between the two.

gcc clang msvc intel
65535 9216 127 65536

Characters in one logical source line – charactersInOneLogicalSourceLine

This test case was just about generating a long list of summations, that in the end will print out the number of characters in the test case. The following listing, for example, gives the source for 15.

  #include <iostream>
  int main() {
  int a=9+2+2+2 ;
    std::cout << a << std::endl;
  }

gcc struggled with the Standard-recommended value (65536), but after a while it completed the operation successfully. clang, to my big surprise, crashed again; however, I am not sure whether it was due to the very long sequence of operations handled in a peculiar mode by clang or due to the line length. Since I personally don’t consider this to be the most important test case, I just let it lie. This test case will not work correctly for values under 10, but lines with length under 10 should not be a struggle for any compiler.
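A line of this kind can be produced with very little code. The sketch below follows the layout of the example for 15 shown above; the function name summation_line and the handling of even lengths (starting from 10 instead of 9) are assumptions, not the author's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds one statement whose length in characters equals `length` and whose
// initializer sums to the same number. "int a=" and " ;" take 8 characters,
// so the first term is 9 (one digit) for odd lengths and 10 (two digits)
// for even ones; every "+2" then adds two characters and two to the sum.
std::string summation_line(std::size_t length) {
    assert(length >= 9);
    const std::size_t base = (length % 2 == 1) ? 9 : 10;
    std::string line = "int a=" + std::to_string(base);
    for (std::size_t used = base; used < length; used += 2) {
        line += "+2";
    }
    line += " ;";
    return line;
}
```

For 15 this returns the same statement as in the listing above, and the fixed prefix explains why very small values cannot be generated.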

msvc and icc had no problems compiling lines with the required length.

gcc clang msvc intel
65536 16384 65536 65536

Characters in a string literal after concatenation – charactersInAStringLiteral

This was again one of the easiest test cases: just generate a long enough string and run strlen on it. This case might be useful for tools that generate source code for embedding resources into C++ applications (such as the aforementioned Qt resource compiler). None of the compilers I tested (except msvc, which has a well-defined limit documented in [MSVC]) had any problems with strings as long as 131072 characters, double the Standard-recommended value.

gcc clang msvc intel
131072 131072 65535 131072

Size of an object – sizeOfAnObject

This required some tricks in order to beat the optimizer, leading to the source in Listing 6 for 262144 (which is, by the way, the value recommended by the Standard) being generated.

#include <iostream>
#include <iterator>
#include <numeric>

class A {
public:
  A() {
    std::iota(std::begin(c), std::end(c), 0);
  }
  void printer() {
    for(auto i=0ULL; i<sizeof(c); i++) {
      if(c[i] * 256 == i && i > 0) {
        std::cout << i ;
      }
    }
    volatile auto x = sizeof(*this);
    std::cout << x << std::endl;
  }
private:
  unsigned char c[262144];
};
int main() {
  static A a;
  a.printer();
}
			
Listing 6

It actually surprised me how far the optimizer can go in order to save memory, time and space for you. Unless you place some complex calculations and constraints on the values it has to manage, it will simply precalculate all the values for you without leaving a trace of their origins in the generated binary. Of course, I am talking about release builds with optimization turned on.

No compilers had problems in generating code (and running the generated executable) for class sizes up to 2097152, which is 8 times the Standard-recommended size.

gcc clang msvc intel
2097152 2097152 2097152 2097152

Nesting levels for #include files – nestingLevelsForIncludes

For this test, I created the required number of header files and placed them in the inc directory, with each header including the next one. The current iteration of the Standard suggests a supported nesting level of 256 here; [GCC] also mentions this limit, but with a smaller value, specifically 200. Both gcc and clang stick to this 200, and we get a very specific error in the form of error: #include nested too deeply.

icc and msvc, on the other hand, managed up to 256, which is considered a success.

gcc clang msvc intel
200 200 256 256

Case labels for a switch statement – caseLabelsForSwitch

For this test, I implemented a simple random generator, which could pick values between 1 and the required value, and – in a long switch statement – printed out the square of that number. No compiler had problems compiling the code up to 16384, the value recommended by the Standard.

gcc clang msvc intel
16384 16384 16384 16384

Non-static data members in a single class – nonStaticDataMembersOfClass

This was also one of the more wood-cutting types of work: just generate a class with the required number of data members (and for simplicity’s sake, all in one class) and sum those up. msvc, gcc and clang had no problems generating code for classes which contained 65536 data members, which is more than double the recommended amount.

icc choked on that value (and for 32768, 16384, 8192 and 4096 too), but nicely compiled for 2048, which I found a bit strange (see Listing 7) because an application of the form in Listing 8 (generated for 4) does not possess a huge level of complexity, so theoretically should not be a huge problem for a compiler.

nonStaticDataMembersOfClass-4096
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for nonStaticDataMembersOfClass-4096.cpp (code 1)
Command exited with non-zero status 1
			
Listing 7

#include <iostream>
class TestClass {
public:
  short int m_member0 = 1;
  unsigned short int m_member1 = 1;
  unsigned int m_member2 = 1;
  int m_member3 = 1;
};
int main() {
  TestClass tc; int v = 0;v += tc.m_member0;
  v += tc.m_member1;
  v += tc.m_member2;
  v += tc.m_member3;
  std::cout << v << std::endl;
}
			
Listing 8

gcc clang msvc intel
65536 65536 65536 2048

Lambda-captures in one lambda-expression – lambdaCapturesInOneLambdaExpression

Again, one of the easiest test cases: just generate the required number of variables, and a lambda trying to capture them. msvc, gcc and clang had no issues compiling lambdas capturing 8192 values, which I considered enough for even the most evil code generated by any code generator.

icc core-dumped for that value, but successfully compiled code generated for 4096 variables.

gcc clang msvc intel
8192 8192 8192 4096

Enumeration constants in a single enumeration – enumerationConstantsInEnum

This was again one of the easiest cases: just generate an enum with enough members and let the compiler pick out a random value from them. No compiler had issues compiling code generated with up to 8192 values, which is double the number the Standard recommends supporting.

gcc clang msvc intel
8192 8192 8192 8192

Levels of nested class definitions – nestingOfClasses

Nested classes, when used to encapsulate information, should provide a better overview of what the class is about and what information to keep apart. However, too deep a nesting of inner classes will (after a while) produce unreadable code (personal opinion) and will possibly lead to a maintenance nightmare. This may be why Microsoft reduced the nesting level to a humanly manageable number (16) while other compilers keep their value at 256, the value recommended by the Standard.

gcc clang msvc intel
256 256 16 256

Functions registered by atexit() – functionsRegisteredByatexit

The Standard recommends 32 here, but no compiler had problems generating code (which worked as expected) for sane values, although this was all actually dependent on my OS. On systems conforming to POSIX, the correct method for finding out the number of functions that can be registered for atexit is using the sysconf function with _SC_ATEXIT_MAX as a parameter.

The Windows SDK had a remark in the form that the number of functions that can be registered is limited by the available heap space.
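On a POSIX system, the query mentioned above can be sketched like this. The wrapper name atexit_limit is made up for the illustration; sysconf and _SC_ATEXIT_MAX come from <unistd.h>, and sysconf returns -1 when the system reports no definite limit.

```cpp
#include <unistd.h>

// Returns the number of functions that can be registered with atexit(),
// as reported by the system, or -1 when there is no definite limit
// (or the query is not supported).
long atexit_limit() {
    return sysconf(_SC_ATEXIT_MAX);
}
```

A test generator could use this value to cap the count it generates, falling back to the Standard-recommended 32 when -1 is returned.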

gcc clang msvc intel
64 64 64 64

Functions registered by at_quick_exit() – functionsRegisteredByat_quick_exit

According to the documentation, the difference between std::exit and std::quick_exit is the amount of cleanup done when the application exits (for example, calling static objects’ destructors, or other fine nuances). The Standard recommends at least 32 functions, but I have found that registering 8192 is also alright with gcc, icc and clang. And because, sadly, there is no POSIX-defined way to retrieve this count, unlike for atexit, I concluded that 64, the same value as for atexit, should be a good value for this situation.

gcc clang msvc intel
64 64 64 64

Direct and indirect base classes for a class – directAndIndirectBaseClassesOfClass

Making this test would have been much easier if I had opted just to generate a bunch of classes as I did for directBaseClassesOfClass. However, what I did was to create a full binary tree with a number of nodes as close as possible to that required and generate a class hierarchy from this tree. A binary tree with 13 levels already contains a huge number of nodes and this pretty much covers the classes for the main part of our test. In cases where the requested number was not exactly one less than a power of two, I generated a set of additional classes that were added to the inheritance list for the tests’ target Derived class, bumping the number of classes up to that required.
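The arithmetic behind this split is short enough to check at compile time. The sketch assumes that the generator picks the largest full tree that still fits into the requested number and tops it up with extra classes; the function names are hypothetical, and whether the Derived class itself is counted is left open.

```cpp
// Number of classes in a full binary tree with the given number of levels.
constexpr unsigned long full_tree_nodes(unsigned levels) {
    return (1UL << levels) - 1;
}

// Largest number of levels whose full tree still fits into `requested`.
constexpr unsigned levels_for(unsigned long requested) {
    unsigned levels = 0;
    while (full_tree_nodes(levels + 1) <= requested) {
        ++levels;
    }
    return levels;
}

// Classes that have to be added outside the tree to reach `requested`.
constexpr unsigned long extra_classes(unsigned long requested) {
    return requested - full_tree_nodes(levels_for(requested));
}

static_assert(full_tree_nodes(13) == 8191, "a 13-level tree");
static_assert(levels_for(65535) == 16 && extra_classes(65535) == 0,
              "65535 is exactly a 16-level tree");
static_assert(levels_for(1000) == 9 && extra_classes(1000) == 489,
              "511 classes in the tree, 489 added on top");
```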

No compiler had any problems compiling code with values up to 65535.

gcc clang msvc intel
65535 65535 65535 65535

Direct base classes for a single class – directBaseClassesOfClass

This test was also a straightforward one: I just had to generate a long list of base classes and a derived one from them. To prevent the compiler optimizing away the classes, for each class I stored a global static value in a class member (and also printed it out in the constructor), which incremented with every constructor call, and I also summed the values at the end.
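A three-base version of this pattern might look like the sketch below. The class and member names are assumptions for the illustration; the generated code uses its own naming and many more bases.

```cpp
#include <iostream>

// Global counter bumped by every base-class constructor.
static int construction_counter = 0;

struct Base0 {
    int value0;
    Base0() : value0(++construction_counter) { std::cout << value0 << '\n'; }
};

struct Base1 {
    int value1;
    Base1() : value1(++construction_counter) { std::cout << value1 << '\n'; }
};

struct Base2 {
    int value2;
    Base2() : value2(++construction_counter) { std::cout << value2 << '\n'; }
};

// Direct bases are constructed in the order in which they are listed here.
struct Derived : Base0, Base1, Base2 {
    int sum() const { return value0 + value1 + value2; }
};
```

Because the stored values depend on a counter that changes at run time and are both printed and summed, the compiler cannot discard the base classes as unused.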

No compilers had problems compiling code with up to 4096 generated direct base classes, which is 4 times the number recommended by the Standard.

gcc clang msvc intel
4096 4096 4096 4096

Class members declared in a single member-specification – classMembersDeclaredInASingleMemberSpecification

The code generated for a value of 5 is something like:

  #include <iostream>
  class A {
  public:
    int v1 = 1, v2 = v1 + 1, v3 = v2 + 1, 
      v4 = v3 + 1, v5 = v4 + 1;
  };
  int main() {
    A a;
    std::cout << a.v5 << std::endl;
  }

so for the Standard-recommended 4096, the same logic is used. No compiler had problems compiling code for up to 16384 class members, which I considered enough for this purpose as I strongly advocate the principles of clean code, and recommend everyone to have at most one, or (in the worst case) a small group of members that logically belong together (and of course, don’t forget to add comments to explain their purpose).

gcc clang msvc intel
16384 16384 16384 16384

Final overriding virtual functions in a class – finalOverridingVirtualFunctions

This test case also requires a class hierarchy, just as for directAndIndirectBaseClassesOfClass, but also introduces virtual functions to make the generation more fun. Since, even for small numbers, the code tends to be long and repetitive, I will not put any example code here, but feel free to check out the code generated for this case by the test application. The Standard recommends 16384 as the magic limit for this case, and I think that is indeed a very good number.

gcc and clang had no problems generating code for values up to 32768; however, msvc didn’t manage to compile code generated for this value (neither the 32 bit compiler, nor the 64 bit one). Both failed with the error:

  finalOverridingVirtualFunctions-32768.cpp(262149):
  fatal error C1060: compiler is out of heap space

I found this a bit strange, since neither of those compilers consumed too much memory while running. The real surprise came when I tried to compile with 16384 functions, and I was greeted by an internal error from the compiler:

  finalOverridingVirtualFunctions-16384.cpp(196618):
  fatal error C1001: An internal error has occurred
  in the compiler.
  (compiler file 'msc1.cpp', line 1528)

Finally, 8192 gave a result for msvc in the form of a compiled executable. Intel’s icc could not even manage 8192; its final value is 4096.

gcc clang msvc intel
32768 32768 8192 4096

Direct and indirect virtual bases of a class – directAndIndirectVirtualBaseClassesOfClass

This test case is also very similar to the one for directAndIndirectBaseClassesOfClass, except that the inheritance must be virtual. The same language mechanisms were used to generate the required number of base classes, and the result is also very similar. Compiling with the limit set at 65535 took forever, but completed successfully for both gcc and clang. I did not dare try with a larger value. Interestingly, the code generated by clang is only 62 MB, while the one generated by gcc is 72 MB.

Sadly, after half an hour of struggle msvc gave up with the following error message:

  directAndIndirectVirtualBaseClassesOfClass-
  65535.cpp(65527): fatal error C1060: compiler is
  out of heap space

and the same result was produced for 32768, 16384 and 8192 too, so I came to the conclusion that the C++ compiler of Visual Studio 2019 can’t handle such large applications, and I therefore reduced the maximum supported number to 4096.

icc really struggled with 4096. It took more than 30 minutes to compile the source file, so again, I have decided that this should be enough for it, and the same value applies for msvc too.

gcc clang msvc intel
65535 65535 4096 4096

Static data members of a class – staticDataMemberOfClass

This test case consisted of generating a class with the specified number of public static members of various numeric types, followed by code that initializes each of them to 1 and a main function that sums up all the members.

None of the tested compilers had any issues with generated code containing up to 16384 static members except icc, which core-dumped again (see Listing 9).

  staticDataMemberOfClass-16384
  ": internal error: ** The compiler has
   encountered an unexpected problem.
  ** Segmentation violation signal raised. **
  Access violation or stack overflow. Please
  contact Intel Support for assistance.
  
  icc: error #10105: /home/fld/intel/
  compilers_and_libraries_2020.2.254/linux/bin/
  intel64/mcpcom: core dumped
  icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for staticDataMemberOfClass-16384.cpp (code 1)
Command exited with non-zero status 1
			
Listing 9

gcc clang msvc intel
16384 16384 16384 2048

Friend declarations in a class – friendsOfAClass

Friends of a class provide a useful back door into the internals of a class, but too many back doors rarely lead to good application design, so you should not overuse them. The Standard indicates a value of 4096 and the compilers had no problems compiling with values up to 8192.

For this test, I generated a class and a combination of friend classes and functions, and then counted the number of private members via friend functions and classes.

gcc clang msvc intel
8192 8192 8192 8192

Access control declarations in a class – accessControlDeclarationsInClass

I interpreted this test case as alternating protected, public and private sections for various data members. The generated test is therefore nothing more than a long list of data members with alternating visibility, a set of public getter functions for the private and protected members (the test application will print out the required number - 1, because this last set is public), and a main function that simply sums all the members (which were set to one).

gcc, clang and msvc had no problems compiling code that alternates the visibility of data members up to 16384 times. To my surprise, this was one of the test cases where clang outperformed gcc in terms of speed.

icc sadly gave up at 16384 with the following error message:

 accessControlDeclarationsInClass-16384.cpp(43698):
 internal error: bad pointer

and crashed with a segmentation violation for 8192 (shown in Listing 10), but compiled nicely for 4096.

: internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icc: error #10105: /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom: core dumped
icc: warning #10102: unknown signal(0)
icc: error #10106: Fatal error in /home/fld/intel/compilers_and_libraries_2020.2.254/linux/bin/intel64/mcpcom, terminated by unknown
compilation aborted for accessControlDeclarationsInClass-8192.cpp (code 1)
			
Listing 10

gcc clang msvc intel
16384 16384 16384 4096

Member initializers in a constructor definition – memberInitializersInAConstructorDefinition

A not overly complicated test case: I just generated the required number of members in a class, generated code for the constructor and printed their sum in order to have some confirmation. Except for icc, no compiler had problems in compiling code generated for values up to 16384, and again, this is one of the test cases where clang was faster than gcc. However, for this high value icc came up with the following error:

  memberInitializersInAConstructorDefinition-
  16384.cpp(16720): internal error: bad pointer

Finally, icc settled at the value of 4096 (not being able to compile the Standard-recommended 6144 either), and I have a feeling that there is a connection with the previous test case.

gcc clang msvc intel
16384 16384 16384 4096

Initializer-clauses in one braced-init-list – initializerClauseInBracedInitList

Another simple test case: it just required generating an array with the required number of elements and then iterating over it to sum the values, to get the required number for the test case.

Although the Standard recommends 16384 clauses, I found that no compiler had problems generating code for initializer lists of lengths up to 262144, which is 16 times the Standard-recommended value.

gcc clang msvc intel
262144 262144 262144 262144

Scope qualifications of one identifier – scopeQualificationOfOneIdentifier

Although this seemed to be one of the more banal test cases, the higher values turned out to be fatal for clang and msvc once I increased the bracket depth (via -fbracket-depth=4096 for clang), while gcc was happy even with 4096 (16 times the Standard-recommended value).

clang gives up somewhere at a value between 1024 and 2048 in a seemingly infinite recursive call between:

  EmitTopLevelDecl(clang::Decl*)

and:

  EmitDeclContext(clang::DeclContext const*)

but I’d rather say that 1024 scopes for a variable is more than enough.

msvc does not even support the Standard-recommended 256; it gives up at 128 with the error message:

  scopeQualificationOfOneIdentifier-128.cpp(130):
  fatal error C1061: compiler limit: blocks nested
  too deeply

but with a depth set to 127 there were no problems.

icc had no problems with compiling to depths of 2048, but failed with 4096.

gcc clang msvc intel
4096 1024 127 2048

Nested linkage-specifications – nestedLinkageSpecifiers

Personally, I think that nesting linkage specifications can, in the long term, lead to highly unmaintainable code. But if the Standard allows it, and there is even a recommended depth, who am I to protest? So, code along the lines of Listing 11 (example shown for a depth of 4) was generated, and I simply accepted that 1024 sounds like a good number for this purpose, unless you really want to be the source of future headaches.

#include <iostream>

// Four nested linkage-specifications, alternating "C" and "C++".
extern "C" { int fC() { return 1; }
 extern "C++" { int fCx() { return 1; }
  extern "C" { int fCxC() { return 1; }
   extern "C++" { int fCxCx() { return 1; }
    // Sums one function from every nesting level.
    int fun() {
      return 0+fC()+fCx()+fCxC()+fCxCx();
    }
   }
  }
 }
}
int main() {
  std::cout << fun() << std::endl;
}
			
Listing 11

gcc and clang had no issues compiling code with the aforementioned depth; however, the msvc compiler worked for 736 but gave up somewhere between 736 and 752 with the following error:

  nestedLinkageSpecifiers-752.cpp(748): fatal error
  C1026: parser stack overflow, program too complex

icc accepted the Standard-recommended 1024 for this test case.

gcc clang msvc intel
1024 1024 736 1024

Recursive constexpr function invocations – recursiveConstexpr

Recursive constexpr functions are not the most frequent ones; however, they can come in very handy from time to time. This test case required the application in Listing 12, where 512 is the actual required depth of the recursion. Running this test case gave the results that follow.

#include <iostream>
constexpr unsigned long long sum(unsigned long long n, unsigned long long s=0) {
  return n ? sum(n-1,s+n) : s;
}
constexpr unsigned long long k = sum(512);

int main() {
  std::cout << k<<std::endl;
}
			
Listing 12

clang gave an accurate assessment of the situation:

  recursiveConstexpr-512.cpp:5:30: error: constexpr
  variable 'k' must be initialized by a constant
  expression
  constexpr unsigned long long k = sum(512);
                               ^   ~~~~~~~~
  recursiveConstexpr-512.cpp:3:13: note: constexpr
  evaluation exceeded maximum depth of 512 calls

gcc also recognized the situation, with an error message like:

  recursiveConstexpr-512.cpp:5:41: error:
  ‘constexpr’ evaluation depth exceeds maximum of
  512 (use ‘-fconstexpr-depth=’ to increase the
  maximum)
  5 | constexpr unsigned long long k = sum(512);

However, after applying the suggested -fconstexpr-depth=513, it actually managed to compile the code, and by making that change we can bring the level of recursion up to 16384. At 32768, gcc also decided it was time to give up:

  g++: internal compiler error: Segmentation fault
  signal terminated program cc1plus

icc did not like 512 but worked nicely with 256.

msvc correctly recognized the scenario:

  recursiveConstexpr-512.cpp(5): error C2131:
  expression did not evaluate to a constant

  recursiveConstexpr-512.cpp(3): note: failure was
  caused by evaluation exceeding call depth limit
  of 512 (/constexpr:depth<NUMBER>)

After specifying the required depth, the msvc compiler choked on 16384, 8192 and 4096 with an Internal Compiler Error, but successfully compiled 2048.

gcc clang msvc intel
16384 16384 2048 256

Full-expressions evaluated within a core constant expression – fullExpressionInAConst

The value recommended for this situation is simply so huge (1048576) that I did not consider it necessary to increase it. The application generated for this case is just a simple addition of ones, being assigned to a constant value. Compiling a test case takes a long time, but the only tested compiler that had any problems with it was msvc, which gave up somewhere between 65536 and 131072.

gcc clang msvc intel
1048576 1048576 65536 1048576

Template parameters in a template declaration – templateParametersInTemplateDeclaration

This test case consisted of creating a source file along the lines of the one in Listing 13.

#include <iostream>

template<int N0,int N1,int N2,int N3>
struct C {
  static const int v = N0 + N1 + N2 + N3;
};
int main() {
  C<1,1,1,1> c;
  std::cout << c.v << std::endl;
}
			
Listing 13

No compiler, except msvc, had problems compiling code with values up to 16384, which is 16 times the value recommended by the Standard. One small interesting observation: while gcc generally outperformed all the other compilers (that were run on the same platform) in terms of speed, this test case was aced by clang, which easily outperformed all the others.

msvc failed at 16384 with fatal error C1111: too many template parameters, but in the end it managed to compile the Standard-recommended 1024.

gcc clang msvc intel
16384 16384 1024 16384

Recursively nested template instantiations – recursivelyNestedTemplateInstantiations

The application in Listing 14 actually gave a headache to a few compilers.

#include <iostream>
template<typename T>
struct B {
  typedef T BT;
};
template<int N>
struct C {
  typedef typename B<typename C<N-1>::T>::BT T;
};
template<>
struct C<0> {
  typedef int T;
};

int main()
{
  C<1024>::T c = 1024;
  std::cout << c << std::endl;
}

Listing 14

It seems that icc can handle recursively nested templates in a very predictable way:

  recursivelyNestedTemplateInstantiations-1024.cpp(9):
  error: excessive recursion at instantiation of
  class "C<524>"

The troubles for icc were not over: after a period of experimentation, I discovered that the maximum value it supports is 500. For values above 500, I got the same strange error, with the reported value always being the test value minus 500. So for 501 the error is: error: excessive recursion at instantiation of class "C<1>". Strange, but interesting.

gcc also had its troubles:

  recursivelyNestedTemplateInstantiations-
  1024.cpp:9:45: fatal error: template
  instantiation depth exceeds maximum of 900 (use
  ‘-ftemplate-depth=’ to increase the maximum)

but after specifying -ftemplate-depth=1025 as an extra parameter, gcc succeeded. Interestingly, gcc expects a limit that is one higher than the actual depth.

clang aced this test, and without complaining compiled the entire 1024 iterations of template madness. An interesting side-note for clang: for 16384 it gave me the hint to use -ftemplate-depth=16384 and then it gave me the warning in Listing 15. I had never seen this before, and my admiration for compiler writers has just gone up. But gcc compiled 16384 too, without this warning (I just had to specify -ftemplate-depth=16385 as an extra parameter).

warning: stack nearly exhausted; compilation time may suffer, and crashes due to stack overflow are likely
[-Wstack-exhausted]
    typedef typename B<typename C<N-1>::T>::BT T;
                                ^
recursivelyNestedTemplateInstantiations-1024.cpp:9:30: note: in instantiation of template class 'C<15286>'
			
Listing 15

1024 proved to be fatal for msvc:

  recursivelyNestedTemplateInstantiations-
  1024.cpp(9): fatal error C1202: recursive type or
  function dependency context too complex

Finally, it managed to compile 128.

gcc clang msvc intel
16384 16384 128 500

Handlers per try block – handlersPerTryBlock

This test case consisted of generating a number of classes derived from std::exception that act as objects to be thrown, then throwing an object of one of those classes in a try block, followed by a long list of catch handlers, one for each class. No compiler had problems with code that contained 256 different handlers for a try block, as per the Standard-recommended value.

gcc clang msvc intel
256 256 256 256

Number of placeholders – numberOfPlaceholders

This is not specifically a compiler limit but more of a library feature (the number of placeholders, such as _1 and _2, that the standard library provides for std::bind). All the compilers tested had an upper limit of 29, except msvc, which draws the upper limit at 20.

gcc clang msvc intel
29 29 20 29

Conclusion

Before you jump ship and decide that, based on these results, it’s time to ditch your current compiler and switch to a different one, a big warning for you: don’t. These test cases were specifically engineered for a unique purpose, and they are not real life situations. If they were, then maybe it would be time to rethink your source strategy.

Each of these compilers is able to perform adequately for any project you can find on the market today, and the purpose of this test was not to find a winner, but to see which does what well, and what improvements should be made for future releases.

Each of the tested compilers shines in some areas and performs poorly in others, and what follows are just a few (personal) observations. If you run the test cases, you will possibly reach a different conclusion.

gcc and msvc are the oldest of the tested bunch, and their age has positively affected their performance: both of them are blazingly fast in all areas of compilation. msvc has a set of limitations that you will not notice in your average daily programming routine unless you specifically look for them, while gcc can compile basically everything that you throw at it, assuming you have the patience to wait for the compilation time of a large code base, and your computer can cope with the expectations of the compiler.

icc, which came more than 20 years after msvc, promises faster-than-average code targeting its own processors, good C++17 support and also decent speed. Sadly, it is packaged into a suite downloadable on a trial basis from Intel’s homepage, and this possibly makes hobbyist programmers or the advocates of open source stay away unless forced by some specific requirements.

clang is the newcomer and the youngest of the tested compilers. It outperforms all the others when it comes to more recent C++ features but seems to struggle with notions and constructs that other compilers have had a few extra decades to polish to perfection. But the speed at which the community picked it up and made it into one of the most used compilers today hints at a bright future for this product.

References

[ANNEX-B] ‘Annex B: (informative) Implementation quantities’ of the C++ Standard: https://eel.is/c++draft/implimits

[CLANG] Clang limits: https://clang.llvm.org/docs/UsersManual.html#controlling-implementation-limits

[GCC] gcc limits: https://gcc.gnu.org/onlinedocs/gcc-9.2.0/cpp/Implementation-limits.html

[GITHUB] The test code: https://github.com/fritzone/cpp-stresstest

[MSVC] Microsoft compiler limits: https://docs.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019

Deák Ferenc

Ferenc has wanted to be a better programmer for the last 15 years. Right now he tries to accomplish this goal by working at Maritime Robotics as a system programmer, and in his free time, by exploring the hidden corners of the C++ language in search of new quests.





