std string concatenation performance

Question

Performance-wise, is there any difference between the following functions in modern C++ compilers?

std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}

std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}

How about looping over each one 10,000 times and comparing the difference?
– digit plumber, Commented Apr 23, 2014 at 3:47
Although ConcatA constructs more than one string, it's probably faster than ConcatB due to the potential extra realloc and copy in the second.
– Captain Obvlious, Commented Apr 23, 2014 at 3:51
When producing such concatenations, a "join view" into the source ranges can provide a decent performance advantage (or even avoid creating the joined string at all). One example of such a join is boost::range::join (boost.org/doc/libs/1_55_0/libs/range/doc/html/range/reference/…; it only takes 2 source ranges, but it is possible to make a truly multi-range join with C++11).
– oakad, Commented Apr 23, 2014 at 4:22
Another option, when plenty of concatenations are required, is to use a concatenation-friendly string class, such as the venerable __gnu_cxx::crope.
– oakad, Commented Apr 23, 2014 at 4:24
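
The "join view" idea from the comment above can be illustrated with a small non-owning wrapper. This is only a sketch: JoinView3 and Join are hypothetical names invented here, not Boost's API, and the view is valid only while the three source strings are alive.

```cpp
#include <cstddef>
#include <string>

// Hypothetical non-owning view over three strings (not Boost's API).
// It stores pointers only, so building it performs no allocation and no copy.
struct JoinView3 {
    const std::string* parts[3];

    // Total length of the joined sequence.
    std::size_t size() const {
        return parts[0]->size() + parts[1]->size() + parts[2]->size();
    }

    // Character at position i of the joined sequence.
    // Like std::string::operator[], the caller must keep i within range.
    char operator[](std::size_t i) const {
        for (const std::string* p : parts) {
            if (i < p->size()) return (*p)[i];
            i -= p->size();
        }
        return '\0';  // only reached when i is out of range
    }

    // Materialize the joined string only if a real std::string is needed.
    std::string str() const {
        std::string r;
        r.reserve(size());
        for (const std::string* p : parts) r += *p;
        return r;
    }
};

// The view dangles if a, b or c is destroyed before the view is.
inline JoinView3 Join(const std::string& a, const std::string& b, const std::string& c) {
    return JoinView3{{&a, &b, &c}};
}
```

A caller that only needs the length or individual characters can work on the view directly and never call str().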

digit plumber · Accepted Answer · 2014-04-23 05:20:09Z

ConcatB creates one temporary string while ConcatA creates two, so ConcatB is about twice as fast in the measurement below (built without optimization flags).

$ cat cata.cpp

#include <string>
#include <iostream>
std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}
int main(){
  std::string aa="aa";
  std::string bb="bb";
  std::string cc="cc";
  int count = 0;
  for(int ii = 0; ii < 10000000; ++ii) {
    count += ConcatA(aa, bb, cc).size();
  }
  std::cout << count << std::endl;
}

$ cat catb.cpp

#include <string>
#include <iostream>
std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}
int main(){
  std::string aa="aa";
  std::string bb="bb";
  std::string cc="cc";
  int count = 0;
  for(int ii = 0; ii < 10000000; ++ii) {
    count += ConcatB(aa, bb, cc).size();
  }
  std::cout << count << std::endl;
}

$ clang++ -v

Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix

$ clang++ cata.cpp
$ time ./a.out

60000000

real    0m1.122s
user    0m1.118s
sys     0m0.003s

$ clang++ catb.cpp
$ time ./a.out
60000000

real    0m0.599s
user    0m0.596s
sys     0m0.002s
$
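
The two programs above can also be folded into one translation unit so both functions are timed in the same build. The following is a minimal sketch, assuming std::chrono::steady_clock is an acceptable timer; TimeMs and its sink parameter are hypothetical helpers introduced here, not part of the original test.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

std::string ConcatA(const std::string& a, const std::string& b, const std::string& c)
{
    return a + b + c;
}

std::string ConcatB(const std::string& a, const std::string& b, const std::string& c)
{
    std::string r = a;
    r += b;
    r += c;
    return r;
}

// Hypothetical helper: calls fn(aa, bb, cc) `iterations` times and returns the
// elapsed wall-clock time in milliseconds. The result sizes are accumulated
// into `sink` so the calls have an observable effect.
template <class Fn>
double TimeMs(Fn fn, int iterations, std::size_t& sink)
{
    const std::string aa = "aa";
    const std::string bb = "bb";
    const std::string cc = "cc";
    const auto start = std::chrono::steady_clock::now();
    for (int ii = 0; ii < iterations; ++ii) {
        sink += fn(aa, bb, cc).size();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling TimeMs(ConcatA, 10000000, sink) and TimeMs(ConcatB, 10000000, sink) from main and printing both values reproduces the comparison. It is worth building once without flags and once with -O2, since the numbers above come from an unoptimized build and the ratio may change.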

Ralor · Accepted Answer · 2014-04-23 06:03:46Z

I compiled it with MinGW (TDM) 4.8.1 using the option -fdump-tree-optimized, without -O2.

The first one is lowered to something like

string tmp = a+b; // i.e. create a new string g, g += b, tmp = g (then dispose of g)
tmp += c;
return tmp; // and dispose tmp

The second does it another way

string tmp = a; // just copy a to tmp
tmp += b;
tmp += c;
return tmp; // and dispose tmp

The dumps look like this

  void * D.20477;
  struct basic_string D.20179;

  <bb 2>:
  D.20179 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_1(D), b_2(D)); [return slot optimization]
  *_3(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20179, c_4(D)); [return slot optimization]

  <bb 3>:

  <bb 4>:
  std::basic_string<char>::~basic_string (&D.20179);
  D.20179 ={v} {CLOBBER};

<L1>:
  return _3(D);

<L2>:
  std::basic_string<char>::~basic_string (&D.20179);
  _5 = __builtin_eh_pointer (1);
  __builtin_unwind_resume (_5);

and

  void * D.20482;
  struct string r [value-expr: *<retval>];

  <bb 2>:
  std::basic_string<char>::basic_string (r_1(D), a_2(D));
  std::basic_string<char>::operator+= (r_1(D), b_3(D));

  <bb 3>:
  std::basic_string<char>::operator+= (r_1(D), c_4(D));

  <bb 4>:

<L0>:
  return r_1(D);

<L1>:
  std::basic_string<char>::~basic_string (r_1(D));
  _5 = __builtin_eh_pointer (1);
  __builtin_unwind_resume (_5);

After applying -O2, the compiler keeps ConcatB in almost the same form and transforms ConcatA more heavily (inlining functions, folding constant values into the memory allocation parts, emitting new functions), but the most important calls stay the same.

ConcatA:

  D.20292 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_2(D), b_3(D)); [return slot optimization]
  *_5(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20292, c_6(D));

ConcatB:

  std::basic_string<char>::basic_string (r_3(D), a_4(D));
  std::basic_string<char>::append (r_3(D), b_6(D));
  std::basic_string<char>::append (r_3(D), c_8(D));

So ConcatB comes out ahead of ConcatA here because it performs fewer allocations, and allocations are very expensive relative to everything else when you're trying to optimize such small pieces of code.
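
The allocation claim can be checked directly rather than read off the dump. Below is a rough sketch that counts allocations with a custom allocator; g_allocs, CountingAlloc, CStr and CountAllocs are hypothetical names introduced for illustration. The counts depend on the standard library implementation, on small-string optimization and on the string lengths, so a current toolchain may not match the GCC 4.8.1 output above.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical global counter, bumped on every allocation made via CountingAlloc.
static std::size_t g_allocs = 0;

// Minimal allocator that forwards to std::allocator and counts allocate() calls.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() noexcept = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocs;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }
};

// All CountingAlloc instances are interchangeable.
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// String type whose heap allocations are counted.
using CStr = std::basic_string<char, std::char_traits<char>, CountingAlloc<char>>;

CStr ConcatA(const CStr& a, const CStr& b, const CStr& c)
{
    return a + b + c;
}

CStr ConcatB(const CStr& a, const CStr& b, const CStr& c)
{
    CStr r = a;
    r += b;
    r += c;
    return r;
}

// Returns how many allocations a single call fn(a, b, c) performed.
template <class Fn>
std::size_t CountAllocs(Fn fn, const CStr& a, const CStr& b, const CStr& c)
{
    const std::size_t before = g_allocs;
    const CStr out = fn(a, b, c);
    (void)out;  // keep the result alive until after the call has finished
    return g_allocs - before;
}
```

Comparing CountAllocs(ConcatA, a, b, c) with CountAllocs(ConcatB, a, b, c) for both short strings such as "aa" and long ones such as CStr(64, 'a') shows whether the difference holds on a given compiler and library.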
