72

The strings topic in the SO Documentation used to say, in the Remarks section:

Since C++14, instead of using "foo", it is recommended to use "foo"s, as s is a string literal, which converts the const char * "foo" to std::string "foo".

The only advantage I see using

std::string str = "foo"s;

instead of

std::string str = "foo";

is that in the first case the compiler can perform copy-elision (I think), which would be faster than the constructor call in the second case.

Nonetheless, this is (not yet) guaranteed, so the first one might also call a constructor, the copy constructor.

Ignoring cases where it is required to use std::string literals like

std::string str = "Hello "s + "World!"s;

Is there any benefit of using std::string literals instead of const char[] literals?

8
  • 3
    Errr... Does auto type deduction count? The almost-always-auto advice has some controversy after all. Commented Jul 28, 2016 at 2:57
  • 1
    A lot of things in C++ are about semantics. The ideal is that you describe what you want done as well as possible, and let the compiler figure out everything else. However, don't overdo it, so that the compiler has room to breathe (and optimize). Commented Jul 28, 2016 at 3:07
  • 3
    Consider the case where you pass a string literal for a parameter with a type that is constructible from std::string, but not from a C string. Commented Jul 28, 2016 at 3:08
  • 1
    For c++17 in some cases string_view may be preferable to store literals, as it does not 'touch heap'. Commented Jul 28, 2016 at 3:55
  • 2
    @PaulRooney, string_view is extremely useful, but keep in mind that most implementations of std::string don't either for short strings. Commented Jul 28, 2016 at 4:01
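
The comment above about a parameter type that is constructible from std::string but not from a C string can be sketched as follows. Label and label_length are hypothetical names invented for this illustration; the point is only that an implicit conversion may use at most one user-defined conversion, so a bare literal cannot reach such a type but a "..."s literal can.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

using namespace std::string_literals;

// Hypothetical type for illustration: constructible from std::string,
// with no constructor that takes a C string.
struct Label {
    Label(std::string value) : text(std::move(value)) {}
    std::string text;
};

// Takes the hypothetical Label by value, so the argument must convert implicitly.
std::size_t label_length(Label label) { return label.text.size(); }

// const char* -> std::string -> Label would need two user-defined conversions,
// and an implicit conversion sequence may contain only one.
static_assert(!std::is_convertible<const char*, Label>::value,
              "a C string does not convert implicitly to Label");
static_assert(std::is_convertible<std::string, Label>::value,
              "a std::string converts implicitly to Label");

std::size_t label_demo() {
    // label_length("Hello");               // would not compile
    std::size_t n = label_length("Hello"s); // OK: the argument is already a std::string
    assert(n == 5);
    return n;
}
```

Without the suffix, the call site would have to spell out std::string("Hello") or Label("Hello") explicitly.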

5 Answers

67

If you're part of the "Almost Always Auto" crowd, then the UDL is very important. It lets you do this:

auto str = "Foo"s;

And thus, str will be a genuine std::string, not a const char*. It therefore permits you to decide when to do which.

This is also important for auto return type deduction:

[]() {return "Foo"s;}

Or any form of type deduction, really:

template<typename T>
void foo(T &&t) {...}

foo("Foo"s);
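
As a rough check of the three deduction cases above, the following sketch uses static_assert to show which types come out. deduces_std_string is a hypothetical helper written only for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

using namespace std::string_literals;

// Hypothetical helper: reports whether a forwarding reference was
// deduced from a std::string argument.
template <typename T>
bool deduces_std_string(T&&) {
    return std::is_same<std::decay_t<T>, std::string>::value;
}

bool deduction_demo() {
    auto plain = "Foo";      // deduced as const char* (the array decays)
    auto suffixed = "Foo"s;  // deduced as std::string
    static_assert(std::is_same<decltype(plain), const char*>::value,
                  "no suffix: pointer");
    static_assert(std::is_same<decltype(suffixed), std::string>::value,
                  "suffix: std::string");

    // Return type deduction in a lambda follows the same rule.
    auto make = [] { return "Foo"s; };
    static_assert(std::is_same<decltype(make()), std::string>::value,
                  "lambda returns std::string");

    assert(make() == suffixed);
    return !deduces_std_string(plain) && deduces_std_string(suffixed);
}
```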

The only advantage I see using [...] instead of [...] is that in the first case the compiler can perform copy-elision (I think), which would be faster than the constructor call in the second case.

Copy-elision is not faster than the constructor call. Either way, you're calling one of the object's constructors. The question is which one:

std::string str = "foo";

This will provoke a call to the constructor of std::string which takes a const char*. But since std::string has to copy the string into its own storage, it must get the length of the string to do so. And since it doesn't know the length, this constructor is forced to use strlen to get it (technically, char_traits<char>::length, but that's probably not going to be much faster).

By contrast:

std::string str = "foo"s;

This will use the UDL operator overload that has this prototype:

string operator "" s(const char* str, size_t len);

See, the compiler knows the length of a string literal. So the UDL code is passed a pointer to the string and a size. And thus, it can call the std::string constructor that takes a const char* and a size_t. So there's no need for computing the string's length.
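
To make the mechanism concrete, here is a minimal sketch of a literal operator with the same shape. The _str suffix is hypothetical and exists only for this illustration; the standard library's own operator may be implemented differently, but it receives the same pointer and length pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using namespace std::string_literals;

// Hypothetical suffix, for illustration only: the compiler passes the
// literal's length, so the sized constructor can be used directly.
std::string operator""_str(const char* str, std::size_t len) {
    return std::string(str, len);  // no scan for the terminating null
}

bool length_demo() {
    std::string from_c_string = "foo";  // length found via char_traits<char>::length
    std::string from_literal = "foo"s;  // length supplied by the compiler
    std::string from_sketch = "foo"_str;  // the hypothetical operator above

    assert(from_literal.size() == 3);
    return from_c_string == from_literal && from_literal == from_sketch;
}
```

All three strings compare equal here; the difference is only in how the length is obtained.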

The advice in question is not for you to go around and convert every use of a literal into the s version. If you're fine with the limitations of an array of chars, use it. The advice is that, if you're going to store that literal in a std::string, it's best to get that done while it's still a literal and not a nebulous const char*.


14 Comments

Template function type deduction might be a good example as well.
−1 There is no problem doing auto str = "Foo";, so the claim that "Foo"s lets you do that is just nonsense.
@Cheersandhth.-Alf: If you want a std::string rather than an array, then doing auto str = "Foo"; is problematic.
1) Without the UDL it's a pointer, not an array. auto decays. 2) (ultra-pedantic): it calls char_traits<char>::length, which may but isn't required to call strlen.
Does UDL mean user-defined literal?
25

The advice to use "blah"s has nothing to do with efficiency and all to do with correctness for novice code.

C++ novices who don't have a background in C tend to assume that "blah" results in an object of some reasonable string type. For example, so that one can write things like "blah" + 42, which works in many scripting languages. With "blah" + 42 in C++, however, one just incurs Undefined Behavior, addressing beyond the end of the character array.

But if that string literal is written as "blah"s then one instead gets a compilation error, which is much preferable.
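
One way to see the difference without triggering either the Undefined Behavior or the compilation error is a small detection trait. can_add is a hypothetical helper written for this sketch; it only reports whether the expression T + U would compile.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical detection helper: true when `T + U` is a valid expression.
template <typename T, typename U, typename = void>
struct can_add : std::false_type {};

template <typename T, typename U>
struct can_add<T, U, decltype(void(std::declval<T>() + std::declval<U>()))>
    : std::true_type {};

// A plain literal decays to const char*, so adding an int is pointer
// arithmetic and compiles silently.
static_assert(can_add<const char*, int>::value,
              "pointer + int compiles");

// std::string has no operator+ that accepts an int, so "blah"s + 42 is rejected.
static_assert(!can_add<std::string, int>::value,
              "std::string + int does not compile");

// In-bounds pointer arithmetic just skips characters; it never appends a number.
std::string tail_demo() {
    const char* blah = "blah";
    const char* tail = blah + 2;  // points at "ah"
    assert(tail[0] == 'a');
    return tail;
}
```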


22

In addition, UDL makes it easier to have \0 in the string

std::string s = "foo\0bar"s; // s contains a \0 in its middle.
std::string s2 = "foo\0bar"; // equivalent to "foo"s
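
A small sketch that checks the resulting sizes (the function names are made up for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using namespace std::string_literals;

// With the suffix, the whole literal (7 characters) is kept.
std::size_t size_with_suffix() {
    std::string s = "foo\0bar"s;
    return s.size();
}

// Without it, the const char* constructor stops at the first \0.
std::size_t size_without_suffix() {
    std::string s2 = "foo\0bar";
    return s2.size();
}

void embedded_null_demo() {
    assert(size_with_suffix() == 7);
    assert(size_without_suffix() == 3);
    assert("foo\0bar"s[3] == '\0');  // the null sits in the middle
}
```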

1 Comment

Regarding the link to presumed authority (since there's no more detailed info there): the SO documentation is not an authority, it's at the opposite end: completely untrustworthy, at least as of late July 2016. cppreference.com is a good resource to link to.
3

This is old and already has wonderful answers. I just want to add another use case for the UDL for std::string:

for (char cur : "abcdefghijklR") {
    // this loops from 'a' to 'R', then runs one extra iteration with cur == 0 (the terminating null)
}
for (char cur : "abcdefghijklR"s) {
    // this will loop from 'a' to 'R'
}
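
Counting the iterations makes the difference visible. count_iterations is a hypothetical helper for this sketch; it accepts either the character array or the std::string.

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;

// Hypothetical helper: counts how many times a range-for body would run.
template <typename Range>
int count_iterations(const Range& range) {
    int count = 0;
    for (char cur : range) {
        (void)cur;  // only the number of iterations matters here
        ++count;
    }
    return count;
}

void loop_demo() {
    // The array type is const char[14]: 13 letters plus the terminating null.
    assert(count_iterations("abcdefghijklR") == 14);
    // The std::string has size 13; the terminator is not part of the range.
    assert(count_iterations("abcdefghijklR"s) == 13);
}
```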

Just my two cents.


2
  1. Using a C++ string literal means we do not need to call strlen to compute the length. The compiler already knows it.
  2. Might allow library implementations where the string data points to memory in global space, whereas using C literals must always force a copy of the data to heap memory on construction.

11 Comments

COW strings are no longer allowed (since C++11), so I can't imagine how #2 would be possible. If std::string contains a string, it must own it, and do so uniquely.
This is not COW, just using global space as the initial storage.
If you're using global space as the initial storage, then it's very possible that more than one std::string is using that same global storage. In pretty much every compiler, if you use the same literal twice, it'll only be in the string table once. As such, if you try to modify that string, the object will have to copy it out to object-specific storage first. That is the essence of COW.
@NicolBolas: I remember implementing a good part of std::string's functionality as COW, for a question here on SO, just to prove that the statement that they're no longer allowed is incorrect. As I recall C++11 rules reduce the gains from COW so that it's now not practically desirable.
@NicolBolas: I've done the proof job for you, because I found I had promised to update my (self-deleted for that reason) answer in the thread you linked. I just posted a new answer. As it turns out, checking my own old arguments, C++11 indeed prohibits COW implementation of basic_string, via constant time requirements on item access. For some reason the other respondents failed to notice this. Also, 75 upvoters failed to notice that the logic in Dave's answer was entirely bogus, possibly because he arrived at the correct conclusion?
