1

I'm fan STL algorithms, so I frequently use many STL algorithms in my job. But,...

Consider following simple example: // Compiler: Visual Studio 2010 Sp1. Cpu: i5 3300MG.

struct FileInfo
{
   std::string filename_;
   std::string filename()const { return filename_;}
};

//works with 1000 FileInfos,  Elapsed: 127.947 microseconds
std::string sumof_filenames_1(std::vector<FileInfo> const& fv )
{
   std::string s;
   for( std::size_t ix = 0u; ix < fv.size(); ++ix) s += fv[ix].filename();
   return s;
}

//Elapsed: 7430.138 microseconds
std::string sumof_filenames_2(std::vector<FileInfo> const& fv )
{
   struct appender{  
     std::string operator()(std::string const& s, FileInfo const& f)const 
      { return s + f.filename(); } 
   };

   return std::accumulate( std::begin(fv), std::end(fv), std::string(), appender() );
}

//Elapsed: 10729.381 microseconds
std::string sumof_filenames_3(std::vector<FileInfo> const& fv )
{
      struct appender{
           std::string operator()(std::string & s, FileInfo const& f) const
           { return s+= f.filename(); }
      };
      std::accumulate( std::begin(fv), std::end(fv), std::string(), appender() );
}

Q: How optimize sum_of_filenames using STL algorithms, such std::accumulate, or any others, and how to implement appender functor ?

test: main function :

int main()
   {
enum{ N = 1000* 1 };
srand(14324);
std::vector<FileInfo> fv;
fv.reserve(N);
for(std::size_t ix = 0; ix < N; ++ix)
{
    FileInfo f;
    f.m_Filename.resize(  static_cast< int > ( rand() *  256 / 32768 ) + 15 , 'A');
    //for( std::size_t iy = 0; iy < f.m_Filename.size(); ++iy)
    //  f.m_Filename[iy] = static_cast<char>( rand() * 100 / 32768 + 28 );

    fv.push_back( f );
}

LARGE_INTEGER freq, start, stop;
QueryPerformanceFrequency(&freq);

{
    QueryPerformanceCounter(&start);

    std::string s = sumof_filenames_1(fv); 


    QueryPerformanceCounter(&stop);
    double elapsed = (stop.QuadPart - start.QuadPart)* 1000.0 / (double)(freq.QuadPart) * 1000.0;
    printf("Elapsed: %.3lf microseconds\n", elapsed);



    printf("%u\n", s.size());
}


{
    QueryPerformanceCounter(&start);

    std::string s = sumof_filenames_2(fv); 


    QueryPerformanceCounter(&stop);
    double elapsed = (stop.QuadPart - start.QuadPart)* 1000.0 / (double)(freq.QuadPart) * 1000.0;
    printf("Elapsed: %.3lf microseconds\n", elapsed);



    printf("%u\n", s.size());
}




{
    QueryPerformanceCounter(&start);

    std::string s = sumof_filenames_3(fv); 


    QueryPerformanceCounter(&stop);
    double elapsed = (stop.QuadPart - start.QuadPart)* 1000.0 / (double)(freq.QuadPart) * 1000.0;
    printf("Elapsed: %.3lf microseconds\n", elapsed);



    printf("%u\n", s.size());
}
4
  • Is this a question out of general interest or do you ask, because you have exactly this problem? Because, if a simple one-liner is faster and shorter than the version with the std algorithm I don't seee the point in using it, even if there are faster solutions (as long as they are not faster than the naive solution of course). Commented May 17, 2014 at 12:53
  • 2
    In fact, you could even shorten it to for (const auto& f: fv) s += f.filename(); Commented May 17, 2014 at 13:11
  • 1
    See this answer. Commented May 17, 2014 at 13:14
  • FYI: Unless I'm missing something, it looks like your times are being reported it milliseconds, not microseconds. (You're reporting to microsecond precision, but the unit is milliseconds.) Commented May 17, 2014 at 15:06

2 Answers 2

2

Try to estimate the following function

std::string sumof_filenames_3(std::vector<FileInfo> const& fv )
{
      struct appender{
           std::string & operator()(std::string & s, FileInfo const& f) const
           { return s+= f.filename(); }
      };

      return std::accumulate( std::begin(fv), std::end(fv), std::string(), appender() );
}

and with using a lambda expression

std::string sumof_filenames_3(std::vector<FileInfo> const& fv )
{
      return std::accumulate( std::begin(fv), std::end(fv), std::string(),
                              []( std::string &s, FileInfo const& f ) -> std::string &
                              {
                                 return s += f.filename();
                              } ); 
}

Also estimate the following loop

std::string sumof_filenames_1(std::vector<FileInfo> const& fv )
{
   std::string::size_type n = 0;
   for each ( FileInfo const& f in fv ) n +=  f.filename().size();

   std::string s;
   s.reserve( n );

   for( std::size_t ix = 0u; ix < fv.size(); ++ix) s += fv[ix].filename();       

   return s;
}

Also do the same estimations for the structure defined the following way

struct FileInfo
{
   std::string filename_;
   const std::string & filename()const { return filename_;}
};
Sign up to request clarification or add additional context in comments.

8 Comments

The accumulate function does init = BinaryOperator(init, *first);, so I think it is better to use + rather than += in your operator.
@Matt McNabb If you will use operator + then the compiler will create a temporary object while if you will use operator += neither temporary object wilol be created.
@Matt McNabb Also you edited my post incorrectly. Using for each is not a typo. MS VC++ 2010 does not support the standard range based for statement.
How thoughtful of MS to make up their own syntax instead of checking what was about to be published as standard.
I would bet the "two loops, reserve first" would be the fastest answer. The multiple reallocations and copies of string buffers are the most probable cause of slowdowns in this example.
|
2

A for_each comes to my mind like this:

std::string sumof_filenames(std::vector<FileInfo> const& fv )
{
    std::string s;
    std::for_each(std::begin(fv), std::end(fv), [&s](const FileInfo& f){s += f.filename();});
    return s;
}

or without the lamba

struct Appender
{
    std::string& m_s;
    Appender(std::string& s):m_s(s){};
    void operator()(const FileInfo& f) const 
        { s += f.filename(); } 
};
std::string sumof_filenames(std::vector<FileInfo> const& fv )
{
    std::string s;
    std::for_each(std::begin(fv), std::end(fv), Appender(s)});
    return s;
}

2 Comments

Shouldn't void operator()(const FileInfo f) const take its argument by reference?
@Ali: Yes they definitely should. I changed my post

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.