I am not very familiar with static variables inside a function in C++. I know they are initialized only once, e.g.

void func(map<int, int>& m){
   static int a = m[0];
}

I would expect static int a = m[0] to execute only once, the first time I call the function, but it seems that every call still spends some time executing m[0]. Here is a test I ran. The program itself does not make sense (it will always return the same number); it is only meant to show the performance difference.

    double getDirect(int i, map<int, double>& m){
        static double res = i;
        return res;
    }

    double getFromMap(int i, map<int, double>& m){
        static double res = m[i];
        return res;
    }

In the main function:

clock_t t;
t = clock();
long long i = 0;
map<int, double> m;
for(int i = 0; i < 10; i++){
    m.insert(make_pair(i, i));
}
#pragma omp parallel for private(i)
for(i = 0; i < size; i++){
    for(int j = 0; j < 10; j++)
        double res = getDirect(j, m);
}

t = clock()-t;
cout << " It cost " << t << " clicks (" << ((float)t) / CLOCKS_PER_SEC << " sec) to run." << endl;

t = clock();
#pragma omp parallel for private(i)
for(i = 0; i < size; i++){
    for(int j = 0; j < 10; j++)
        double res = getFromMap(j, m);        
}
t = clock()-t;
cout << " It cost " << t << " clicks (" << ((float)t) / CLOCKS_PER_SEC << " sec) to run." << endl;

I would expect the two times to be very similar, but the result is:

It cost 14055 clicks (14.055 sec) to run.
It cost 150636 clicks (150.636 sec) to run.

getFromMap is much slower. Is this because m[i] is still executed every time? If not, what is the reason? If so, what is a good way to bypass this performance cost? Thanks.

Here is a follow-up. I looked at the generated assembly code:

    static double res = m[i];
000000013F83D454  mov         eax,104h  
000000013F83D459  mov         eax,eax  
000000013F83D45B  mov         ecx,dword ptr [_tls_index (013F8503C8h)]  
000000013F83D461  mov         rdx,qword ptr gs:[58h]  
000000013F83D46A  mov         rcx,qword ptr [rdx+rcx*8]  
000000013F83D46E  mov         eax,dword ptr [rax+rcx]  
000000013F83D471  cmp         dword ptr [res+0Ch (013F85035Ch)],eax  
000000013F83D477  jle         getFromMap+0A9h (013F83D4B9h)  
000000013F83D479  lea         rcx,[res+0Ch (013F85035Ch)]  
000000013F83D480  call        _Init_thread_header (013F831640h)  
000000013F83D485  cmp         dword ptr [res+0Ch (013F85035Ch)],0FFFFFFFFh  
000000013F83D48C  jne         getFromMap+0A9h (013F83D4B9h)  
000000013F83D48E  lea         rdx,[i]  
000000013F83D495  mov         rcx,qword ptr [m]  
000000013F83D49C  call        std::map<int,double,std::less<int>,std::allocator<std::pair<int const ,double> > >::operator[] (013F831320h)  
000000013F83D4A1  movsd       xmm0,mmword ptr [rax]  
000000013F83D4A5  movsd       mmword ptr [res (013F850360h)],xmm0  
000000013F83D4AD  lea         rcx,[res+0Ch (013F85035Ch)]  
000000013F83D4B4  call        _Init_thread_footer (013F8314C4h)

It does seem that m[i] is still called after the first time. I also ran a single-threaded test with a smaller size, and the difference was even larger (a ratio of about 100).

Any idea how to set up Visual Studio 2012 so that it stops calling m[i] after the first time? Thanks so much for your help.

  • Take a look at the generated code to see what really happens. Commented Jun 5, 2019 at 20:06
  • Remember that a great number of things can be going on in a modern multiprocessing system at the same time, so it can be very hard to get precise time measurements from a clock. The program gets the start time, starts performing the computation, waits a while for a shared resource used by another process, then completes the task. Next time it might not have to wait. It might wait longer. It may be forced to wait multiple times. Timing a program is a very finicky business. Commented Jun 5, 2019 at 20:10
  • What compiler options did you use? Were optimizations turned on? Commented Jun 5, 2019 at 20:15
  • That said, my above comment does not justify the order of magnitude difference reported by the Asker. Commented Jun 5, 2019 at 20:16
  • I use Visual Studio 2012. I usually use multithreading so that the program reaches 100% CPU usage, and I thought this would make for a better comparison. Optimization is set to Maximize Speed (/O2). Commented Jun 5, 2019 at 20:20

2 Answers

Function-local static variables penalize you ever so slightly. Since a static variable must be initialized once, and only once, there has to be a flag that is checked on every execution of the function. It is set to 'yes, please initialize' before the function is entered for the first time and reset to 'no, please no longer initialize' after the initialization has happened.

Otherwise, the compiler would have no way to guarantee a single initialization.
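To make the flag idea concrete, here is a rough hand-written sketch of what such a guarded initialization amounts to. It is only an approximation: the names lookup, lookup_count, getFromMapManual and demoManualGuard are made up for illustration, and unlike the code a C++11 compiler emits, this version is not thread-safe.

```cpp
#include <cassert>
#include <map>

// Hypothetical counter, used only to observe how often the lookup runs.
static int lookup_count = 0;

static double lookup(std::map<int, double>& m, int i) {
    ++lookup_count;
    return m[i];
}

// Hand-written approximation of: static double res = lookup(m, i);
// Assumption: single-threaded use. Real compilers add synchronization.
double getFromMapManual(int i, std::map<int, double>& m) {
    static bool initialized = false;  // stands in for the hidden guard flag
    static double res = 0.0;          // constant-initialized, needs no guard
    if (!initialized) {               // this check runs on every call
        res = lookup(m, i);           // this runs on the first call only
        initialized = true;
    }
    return res;
}

// Small self-check that can be called from main().
void demoManualGuard() {
    std::map<int, double> m{{0, 1.5}, {1, 2.5}};
    assert(getFromMapManual(0, m) == 1.5);  // first call performs the lookup
    assert(getFromMapManual(1, m) == 1.5);  // later calls reuse the stored value
    assert(lookup_count == 1);              // the map was consulted once
}
```

The per-call cost is the flag check and a branch, not the map lookup.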

However, the compiler guarantees that once the variable has been initialized, no further initialization takes place and the declaration is skipped: https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables

We can look at sample codegen to confirm this with the compiler: https://gcc.godbolt.org/z/Q-Dni_ It is clear that if the variable is already initialized, the declaration is skipped and k is not called. Conclusion: your difference in time comes from somewhere else.
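If reading assembly is not convenient, the same thing can be checked at run time by counting how often the initializer expression is evaluated. This is a minimal sketch; init_calls, countedLookup, getFromMapCounted and demoCountedInit are hypothetical names introduced only for this test, not part of the question's code.

```cpp
#include <cassert>
#include <map>

// Hypothetical instrumentation: counts evaluations of the initializer.
static int init_calls = 0;

static double countedLookup(std::map<int, double>& m, int i) {
    ++init_calls;
    return m[i];
}

// Same shape as getFromMap from the question, with a counted lookup.
double getFromMapCounted(int i, std::map<int, double>& m) {
    static double res = countedLookup(m, i);
    return res;
}

// Small self-check that can be called from main().
void demoCountedInit() {
    std::map<int, double> m;
    for (int k = 0; k < 10; ++k)
        m[k] = k + 0.5;

    double sum = 0.0;
    for (int n = 0; n < 1000; ++n)
        sum += getFromMapCounted(n % 10, m);

    assert(init_calls == 1);  // the lookup ran on the first call only
    assert(sum == 500.0);     // every call returned m[0] == 0.5
}
```

On a conforming compiler the counter should stay at 1 no matter how many times the function is called.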

4 Comments

While this is good information, it doesn't explain how two functions, both using static initialization, have such a difference.
@Evg thread safety normally has no effect on performance (at least on hardware I am familiar with) thanks to double-checked locking.
@NathanOliver good point, let me edit to answer the question better.
Thanks so much for the answer. I looked at the assembly code, but it seems that the map lookup still gets called. Maybe I got something wrong in the optimization settings for Visual Studio 2012. Any ideas on this?

Judging from quick-bench, it seems that getDirect is inlined whereas getFromMap is not.

This is probably due to the size of the code needed to call m[i] in the initialization case.

(The version with at is even faster than your getDirect: Demo :) )
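For reference, a variant using at might look like the sketch below. I am assuming this is roughly the shape of the linked demo; getFromMapAt and demoAt are hypothetical names, and whether it actually ends up faster depends on the compiler's inlining decisions, so measure it on your own setup.

```cpp
#include <cassert>
#include <map>

// at() never inserts, so the map can be taken by const reference.
// It throws std::out_of_range if the key is missing.
double getFromMapAt(int i, const std::map<int, double>& m) {
    static double res = m.at(i);
    return res;
}

// Small self-check that can be called from main().
void demoAt() {
    const std::map<int, double> m{{3, 7.0}};
    assert(getFromMapAt(3, m) == 7.0);   // first call initializes res
    assert(getFromMapAt(99, m) == 7.0);  // initializer is skipped, so no throw
}
```

Unlike operator[], at contains no code path that inserts a new element, which leaves less code in the initialization branch.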

1 Comment

That's a fantastic find!
