
Yesterday at work, my colleague claimed that preprocessor macros were slower than writing the variables and functions out by hand. The context is that we have a class to which member variables are sometimes added, and for each of these member variables, three different methods have to be created following exactly the same pattern. We had these generated automatically using macros, as shown below.

#include <cstdint>
#include <iostream>
#include <vector>
#include <windows.h>

struct Bar
{
    long long a;
    long long b;
    long long c;
    long long d;
};

struct Foo
{
    Bar var[1300];
};

typedef std::vector<Foo> TEST_TYPE;

class A
{
private:
    TEST_TYPE container;

public:
    TEST_TYPE& getcontainer()
    {
        return container;
    }
};

#define createBMember(TYPE, NAME)         \
private:                                  \
    TYPE NAME;                            \
                                          \
public:                                   \
    TYPE& get##NAME()                     \
    {                                     \
        return NAME;                      \
    }

class B
{
    createBMember(TEST_TYPE, container);
};

double testA()
{
    A a;
    LARGE_INTEGER frequency;
    LARGE_INTEGER startA, endA;

    if (!QueryPerformanceFrequency(&frequency)) {
        std::cerr << "High-resolution timer not supported." << std::endl;
        return 1;
    }

    QueryPerformanceCounter(&startA);
    for(size_t i = 0; i < 10000; ++i)
    {
        a.getcontainer().push_back(Foo());
    }

    QueryPerformanceCounter(&endA);

    return static_cast<double>(endA.QuadPart - startA.QuadPart) / frequency.QuadPart;
}

double testB()
{
    B b;
    LARGE_INTEGER frequency;
    LARGE_INTEGER startB, endB;

    if (!QueryPerformanceFrequency(&frequency)) {
        std::cerr << "High-resolution timer not supported." << std::endl;
        return 1;
    }

    QueryPerformanceCounter(&startB);

    for(size_t i = 0; i < 10000; ++i)
    {
        b.getcontainer().push_back(Foo());
    }

    QueryPerformanceCounter(&endB);

    return static_cast<double>(endB.QuadPart - startB.QuadPart) / frequency.QuadPart;
}

//----------------------------------------------------[main]
int main()
{
    double Atest = 0;
    double Btest = 0;

    double AHigh = 0;
    double BHigh = 0;

    double ALow = 10000;
    double BLow = 10000;

    double a;
    double b;

    const uint16_t amount = 30;

    for(uint16_t i = 0; i < amount; ++i)
    {   
        a = testA();

        AHigh = a > AHigh ? a : AHigh;
        ALow = a < ALow ? a : ALow;

        Atest += a;
    }

    for(uint16_t i = 0; i < amount; ++i)
    {   
        b = testB();

        BHigh = b > BHigh ? b : BHigh;
        BLow = b < BLow ? b : BLow;

        Btest += b;
    }

    Atest /= amount; 
    Btest /= amount; 

    std::cout << "A: " << Atest << std::endl;
    std::cout << "B: " << Btest << std::endl;

    auto size = sizeof(Foo);

    return 0;
}

I tried to refute his statement with this test by using a fairly large struct, which I simply append to a vector in each test run.
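For scale (my own arithmetic, not part of the original measurements, and assuming an 8-byte long long as on MSVC): sizeof(Foo) is 1300 * 4 * 8 = 41,600 bytes, so each timed loop pushes roughly 10,000 * 41,600 bytes ≈ 416 MB into the vector, not counting the copies made whenever the vector reallocates.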

The strange thing, however, was that although the preprocessor runs before compilation and both classes should therefore be identical, I measured some speed differences. I made the following observations:

  • In debug mode without any optimization, the class that is tested first is faster
  • In release mode with "whole-program-optimization" and other settings, B is faster. The last times were: A: 0.47695, B: 0.430825

This confuses me, because as I said, both classes are identical.

I should also mention that, unfortunately, our development environment ties us to what is essentially an early snapshot of C++11 (Visual Studio 2010). That's why I can't use std::chrono for benchmarking, for example.
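For reference, here is a minimal sketch (not from our code base) of how the QueryPerformanceCounter boilerplate in testA/testB could be factored into a small helper that still compiles under Visual Studio 2010:

#include <windows.h>

// Sketch only: a small stopwatch around QueryPerformanceCounter, so the
// frequency/start/end boilerplate is not duplicated in every test function.
class Stopwatch
{
public:
    Stopwatch()
    {
        QueryPerformanceFrequency(&frequency_);  // ticks per second
        QueryPerformanceCounter(&start_);
    }

    // Seconds elapsed since construction.
    double elapsed() const
    {
        LARGE_INTEGER now;
        QueryPerformanceCounter(&now);
        return static_cast<double>(now.QuadPart - start_.QuadPart) / frequency_.QuadPart;
    }

private:
    LARGE_INTEGER frequency_;
    LARGE_INTEGER start_;
};

With that, each test function would reduce to constructing the object, starting a Stopwatch, running the push_back loop, and returning elapsed().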

I haven't been able to test it with other compilers yet. I also looked at the assembly code on godbolt.org, but didn't find anything that could make such a big difference.

Admittedly, I'm still a trainee and would classify my skills as more of an amateur. Does anyone have any idea what could be causing this difference in speed?

8
  • Did you try using the same class in both tests? Commented Aug 16, 2024 at 7:13
  • I hadn't tested this yet. But the result is as follows: I did 30 runs each in the release configuration when testing. TestA is faster with A and B as classes. Then I swapped which test is executed first - i.e. swapped the for loops. In this case, TestA is still faster in both cases. Now I swapped the positions of the two functions again and tested them. When testing with class A, TestB is faster. When testing with class B, however, TestA is faster again. Commented Aug 16, 2024 at 8:25
  • Have you tried swapping the order of the tests? The first one to run may suffer a slowdown because its memory hasn't been used recently. Test A,B,A,B to be sure that memory being allocated the first time around isn't skewing your benchmarks (a sketch of such an interleaved run follows after these comments). I can't see how the preprocessor-generated source code should be any different when compiled. I'm sure it adds something to the compile time, but it shouldn't affect runtime at all. Commented Aug 16, 2024 at 8:27
  • Yes, I tested that as well. In general the first test to run is usually faster, not slower. Commented Aug 16, 2024 at 8:43
  • The vectors will do lots of heap allocations, and of course the state of the heap is different for the first run. So, are you testing the heap or the macros? :-) Commented Aug 16, 2024 at 9:05
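A minimal sketch of the interleaved measurement suggested in these comments, reusing testA/testB and the types from the question; calling reserve(10000) on the container inside the test functions before the timed loop would additionally take vector reallocations out of the comparison:

// Sketch only: run A and B alternately (A,B,A,B,...) instead of 30x A followed
// by 30x B, so warm-up and heap-state effects hit both classes equally.
int main()
{
    const int runs = 30;
    double totalA = 0.0;
    double totalB = 0.0;

    for (int i = 0; i < runs; ++i)
    {
        totalA += testA();  // testA/testB as defined in the question
        totalB += testB();
    }

    std::cout << "A: " << totalA / runs << std::endl;
    std::cout << "B: " << totalB / runs << std::endl;
    return 0;
}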

1 Answer


Your colleague doesn't know what they're talking about.

Macros are a textual replacement performed by the preprocessor, one of the earliest phases of compilation. The actual compiler sees identical code, so any speed differences will be due to other factors. As noted in the comments, flawed test methodology is almost certainly the explanation (especially given the lack of knowledge shown by both the claim and the use of VS 2010).
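To illustrate (expansion worked out by hand; it can be verified with MSVC's /E or /P preprocess-only switches): createBMember(TEST_TYPE, container) expands to exactly the members and getter that class A spells out manually, the only textual difference being a leftover semicolon from the invocation site:

class B
{
private:
    TEST_TYPE container;

public:
    TEST_TYPE& getcontainer()
    {
        return container;
    };  // the trailing ';' comes from the macro invocation and is just an empty declaration
};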


1 Comment

I concur. I'm also puzzled why they are stuck on such a geriatric compiler version.
