Performance-wise, is there any difference between the following functions in modern C++ compilers?

std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}

std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}
  • How about looping over each one 10,000 times and comparing the timings? Commented Apr 23, 2014 at 3:47
  • Although ConcatA constructs more than one string, it's probably faster than ConcatB because of the potential extra realloc and copy in the second. Commented Apr 23, 2014 at 3:51
  • When producing such concatenations, a "join view" into the source ranges can provide a decent performance advantage (or even make it possible to avoid creating the joined string at all). One example of such a join is boost::range::join (boost.org/doc/libs/1_55_0/libs/range/doc/html/range/reference/…; it only takes 2 source ranges, but it is possible to make a truly multi-range join with C++11). Commented Apr 23, 2014 at 4:22
  • Another option, when plenty of concatenations are required, is to use a concatenation-friendly string class, such as the venerable __gnu_cxx::crope. Commented Apr 23, 2014 at 4:24

2 Answers

1

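ConcatB builds a single string, while ConcatA also creates an intermediate temporary for a + b, so it constructs two. In the benchmark below, compiled without any optimization flag, ConcatB is roughly twice as fast (about 0.60 s versus 1.12 s for 10,000,000 calls).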

$ cat cata.cpp

#include <string>
#include <iostream>
std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}
int main(){
  std::string aa="aa";
  std::string bb="bb";
  std::string cc="cc";
  int count = 0;
  for(int ii = 0; ii < 10000000; ++ii) {
    count += ConcatA(aa, bb, cc).size();
  }
  std::cout << count << std::endl;
}

$ cat catb.cpp

#include <string>
#include <iostream>
std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}
int main(){
  std::string aa="aa";
  std::string bb="bb";
  std::string cc="cc";
  int count = 0;
  for(int ii = 0; ii < 10000000; ++ii) {
    count += ConcatB(aa, bb, cc).size();
  }
  std::cout << count << std::endl;
}

$ clang++ -v

Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix

$ clang++ cata.cpp
$ time ./a.out

60000000

real    0m1.122s
user    0m1.118s
sys     0m0.003s

$ clang++ catb.cpp
$ time ./a.out
60000000

real    0m0.599s
user    0m0.596s
sys     0m0.002s
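
The two runs above can also be folded into one program that times both functions with std::chrono, which makes it easier to repeat the measurement at other optimization levels (the build above passes no -O flag). This is only a sketch: TimeMs and its parameters are hypothetical names introduced for illustration, not part of the original benchmark, and the numbers will vary with compiler and standard library.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}

std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}

// Hypothetical helper: calls f(a, b, c) `iterations` times, adds the result
// sizes to `total` so the calls have an observable effect, and returns the
// elapsed wall-clock time in milliseconds.
template <typename F>
double TimeMs(F f, int iterations, std::size_t& total)
{
    const std::string a = "aa", b = "bb", c = "cc";
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        total += f(a, b, c).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling TimeMs(ConcatA, 10000000, total) and TimeMs(ConcatB, 10000000, total) from main and printing both results reproduces the comparison in a single binary. steady_clock is used because it is monotonic, so the measured interval cannot go negative if the system clock is adjusted.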
$

1

I compiled both with MinGW (TDM) 4.8.1 using -fdump-tree-optimized, first without -O2.

The first one works like this:

string tmp = a + b; // operator+ creates a new temporary string holding a followed by b
return tmp + c;     // a second operator+ builds the result in the return slot, then tmp is destroyed

The second does it differently:

string tmp = a; // copy-construct tmp from a, directly in the return slot
tmp += b;
tmp += c;
return tmp; // nothing left to copy or destroy on the normal path

The dumps look like this:

  void * D.20477;
  struct basic_string D.20179;

  <bb 2>:
  D.20179 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_1(D), b_2(D)); [return slot optimization]
  *_3(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20179, c_4(D)); [return slot optimization]

  <bb 3>:

  <bb 4>:
  std::basic_string<char>::~basic_string (&D.20179);
  D.20179 ={v} {CLOBBER};

<L1>:
  return _3(D);

<L2>:
  std::basic_string<char>::~basic_string (&D.20179);
  _5 = __builtin_eh_pointer (1);
  __builtin_unwind_resume (_5);

and

  void * D.20482;
  struct string r [value-expr: *<retval>];

  <bb 2>:
  std::basic_string<char>::basic_string (r_1(D), a_2(D));
  std::basic_string<char>::operator+= (r_1(D), b_3(D));

  <bb 3>:
  std::basic_string<char>::operator+= (r_1(D), c_4(D));

  <bb 4>:

<L0>:
  return r_1(D);

<L1>:
  std::basic_string<char>::~basic_string (r_1(D));
  _5 = __builtin_eh_pointer (1);
  __builtin_unwind_resume (_5);

With -O2, the compiler leaves ConcatB almost unchanged and transforms ConcatA more heavily (inlining functions, folding constants into the memory allocation code, emitting new helper functions), but the essential calls stay the same.

ConcatA:

  D.20292 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_2(D), b_3(D)); [return slot optimization]
  *_5(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20292, c_6(D));

ConcatB:

  std::basic_string<char>::basic_string (r_3(D), a_4(D));
  std::basic_string<char>::append (r_3(D), b_6(D));
  std::basic_string<char>::append (r_3(D), c_8(D));

So ConcatB comes out ahead of ConcatA: it performs fewer allocations, and allocation is expensive relative to everything else when you are optimizing such a small piece of code.
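
If the number of allocations is what dominates, one further variant is to reserve the final length before appending, so the buffer is sized once instead of possibly growing on each +=. The sketch below is an illustration and was not measured above; ConcatC is a hypothetical name, and whether it helps in practice depends on the standard library (very short strings, for example, may not allocate at all when a small-string optimization is in use).

```cpp
#include <string>

// Hypothetical variant, not from the question: request room for the whole
// result up front so the appends below should not need to reallocate.
std::string ConcatC(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r;
    r.reserve(a.size() + b.size() + c.size());
    r += a;
    r += b;
    r += c;
    return r;
}
```

ConcatC returns the same contents as ConcatA and ConcatB for the same arguments. Only the allocation pattern is intended to differ, so it should be timed the same way before drawing conclusions.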
