15

What is the best way to compare std::strings? The obvious way would be with if/else:

std::string input;
std::cin >> input;

if ( input == "blahblahblah" )
{
    // do something.
}

else if ( input == "blahblah" )
{
    // do something else.
}

else if ( input == "blah" )
{
    // do something else yet.
}

// etc. etc. etc.

Another possibility is to use an std::map and a switch/case. What is the best way when doing lots (like 8, 10, 12+) of these comparisons?

2
  • 1
    Yeah, just use a map from string to function. Commented Jan 23, 2011 at 4:45
  • @Ben could you post an example as an answer? Commented Jan 23, 2011 at 4:53

7 Answers 7

27

Here's an example using std::map.

#include <map>
#include <string>
#include <iostream>
#include <utility>

void first()
{
  std::cout << "first\n";
}

void second()
{
  std::cout << "second\n";
}

void third()
{
  std::cout << "third\n";
}


int main()
{
  typedef void(*StringFunc)();
  std::map<std::string, StringFunc> stringToFuncMap;

  stringToFuncMap.insert(std::make_pair("blah", &first));
  stringToFuncMap.insert(std::make_pair("blahblah", &second));
  stringToFuncMap.insert(std::make_pair("blahblahblah", &third));

  stringToFuncMap["blahblah"]();
  stringToFuncMap["blahblahblah"]();
  stringToFuncMap["blah"]();
}

Output is:

second
third
first

The benefits of this approach are:

  • It's easily extensible.
  • It forces you to break out the string-handling routines into separate functions (programming by intention).
  • Function lookup is O(log n), whereas your example is O(n)

Look into using boost::function to make the syntax a bit nicer, especially with class member functions.

Sign up to request clarification or add additional context in comments.

2 Comments

this seems good. but one question, why did you do stringToFuncMap.insert(std::make_pair("blah", &first)); instead of stringToFuncMap["blah"] = &first?
operator[] would work fine to initialize the map. I'm just used to initializing a map with the insert method. It's marginally more efficient, as operator[] will first create an std::pair with the "second" default initialized. I think Meyers' "Effective STL" covers this in depth.
3

using operator== is pretty good, but if performance is really critical, you can improve it depending on your use case. If the goal is to pick one of a few choices and perform a specific action, you can use a TRIE. Also if the strings are different enough, you could do something like this:

switch(s[0]) {
case 'a':
    // only compare to strings which start with an 'a'
    if(s == "abcd") {

    } else if (s == "abcde") {

    }
    break;
case 'b':
    // only compare to strings which start with an 'b'
    if(s == "bcd") {

    } else if (s == "bcde") {

    }
    break;
default:
    // we know right away it doesn't match any of the above 4 choices...
}

basically use a certain character in the string which good uniqueness (doesn't have to be the first if all strings are at least N in length any character before N will do!) to do a switch then do a series of compares on a subset of the strings which match that unique characteristic

Comments

1

"12" isn't a lot... but anyway.

You can only use a switch for integral types (char, int, etc.), so that's out of the question for std::string. Using a map would probably be more readable.

Then again, it all depends on how you define "best".

3 Comments

Did they ever fix c++0x so that hash<std::string>("foo") is a constexpr? You could then switch on the hash of the string.
@KitsuneYMG Even then, what if there is a collision? You would end up with multiple labels with the same value. I don't think a hash value would make a good label for a switch.
@EtiennedeMartel You can check with a static_assert or assert for that.
1

You could make a switch using a compile-time hash and with a 64 bit hash not have to worry about collisions. A nice side effect of this is your compiled program will not have the string constants in it, potentially being smaller.

// compile-time FNV-1a hash
// see (https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function)
static constexpr std::uint64_t basis = 14695981039346656037ULL;
static constexpr std::uint64_t prime = 1099511628211ULL;
constexpr std::uint64_t c_hash(const char *str) {
    std::uint64_t hash{basis};
    while (*str != 0) {
        hash ^= str[0];
        hash *= prime;
        ++str;
    }
    return hash;
}
switch (c_hash(input)) {
    case c_hash("blah"): cout << "first"; break;
    case c_hash("blahblah"): cout << "second"; break;
    case c_hash("blahblahblah"): cout << "third"; break;
    default: cout < "what?"; break;
}

Comments

0

The answer to this question is all too dependent upon the problem. You've named two examples. You could add to your options things like hash tables, regular expressions, etc...

Comments

0

With 8, 10 and even 12 comparisons you can still use if ... else if ... scheme, nothing bad. If you want 100 or something, I'd recommend writing a function which would calculate a hash of string (even by simple xoring all the characters, but some other good method would be preferable for better distribution) and then switching over its result as Evan proposed. If function returns unique numbers for all the possible input strings - that's even better and doesn't require additional comparisons.

Comments

0

If you mean "most efficient" by "the best", read ahead.

I'm suggesting using the following method if there really is a lot.
String in Switch is actually something will be in Java 7. (As part of Project Coin)

And according to the proposal, this is the way Java language will implement it.
First, hash value of each of the strings is calculated. This problem is then a "switch int" problem, which is available in most currently language, and is efficient. In each of the case statement, you then check if this is really the string (in very rare cases different strings could hash to the same int).
I personally don't do the last step in practice sometimes as it's necessity depends on the situation you specific program is in, i.e. whether the strings possible are under the programmer's control and how robust the program need to be.

A sample pseudocode the corresponds

String s = ...
switch(s) {
 case "quux":
    processQuux(s);
    // fall-through

  case "foo":
  case "bar":
    processFooOrBar(s);
    break;

  case "baz":
     processBaz(s);
    // fall-through

  default:
    processDefault(s);
    break;
}

from the fore-mentioned proposal to help you understand.

// Advanced example
{  // new scope for synthetic variables
  boolean $take_default = false;
  boolean $fallthrough = false;
  $default_label: {
      switch(s.hashCode()) { // cause NPE if s is null
      case 3482567: // "quux".hashCode()
          if (!s.equals("quux")) {
              $take_default = true;
              break $default_label;
          }
          processQuux(s);
          $fallthrough = true;
                case 101574: // "foo".hashCode()
          if (!$fallthrough && !s.equals("foo")) {
              $take_default = true;
              break $default_label;
          }
          $fallthrough = true;
      case 97299:  // "bar".hashCode()
          if (!$fallthrough && !s.equals("bar")) {
              $take_default = true;
              break $default_label;
          }
          processFooOrBar(s);
          break;

      case 97307: // "baz".hashCode()
          if (!s.equals("baz")) {
              $take_default = true;
              break $default_label;
          }
          processBaz(s);
          $fallthrough = true;

      default:
          $take_default = true;
          break $default_label;
      }
  }
  if($take_default)
      processDefault(s);
}

1 Comment

The question was for c++ (std::string) :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.