scanf is supposed to consume and discard any character in the format string that is not part of a conversion specifier. But its behavior seems different when a non-whitespace, non-conversion character comes first in the format string. Why?

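#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int num, num2;
    printf("> ");
    /* The loop runs while scanf returns nonzero; the format begins with a literal '>'. */
    while (scanf("> %d %d", &num, &num2)) {
        printf("You entered the number %d %d.\n", num, num2);
        printf("> ");
    }
    return EXIT_SUCCESS;
}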

If you build and run this and enter

> 3 4

at the prompt, it prints the message and the repeated prompt and then quits immediately.

So that means that scanf returns 2 the first time and then returns 0 before the user can enter another set of tokens. If you remove the > from the format string, the loop will run until the user enters something not a numeral, which then causes scanf to return 0 - the behavior I would expect.

Also, if I put that same symbol after the first conversion specifier, the loop continues to run as expected. That is, if the format string has, say, "%d > %d", and the user enters

3 > 4

the loop will run again and accept another round of input.

I have not seen any documentation on this behavior.

  • Instead of scanf, why not walk through the string with a pointer and just figure this out yourself? Figuring out simple syntax like this is pretty close to trivial. Commented Oct 21, 2023 at 16:47
  • On the second loop, you asked it to match '>' with the newline remaining in the input, so it failed. Commented Oct 21, 2023 at 16:54
  • Use " > %d %d" to avoid that issue. Your read-loop test must be while (scanf("> %d %d", &num, &num2) == 2); otherwise, it will fail (spectacularly) if a manual EOF is entered. (See @pmg's comment below for the proper solution.) Read it all into a buffer and then call sscanf() on the buffer instead of reading with scanf() directly. Commented Oct 21, 2023 at 16:55
  • Unrelated: do not use scanf() for user input. scanf() was not designed for user input. Use fgets() instead. Commented Oct 21, 2023 at 16:56
  • I think that is why he began his comment with "Unrelated:" ... Bottom line is the '\n' is generated by you pressing [Enter] following input. '>' is NOT a conversion specifier and has no ability to read/discard whitespace. Adding a space before it allows zero or more whitespace characters to be discarded. Failing to check == 2 invites undefined behavior. His point is that using fgets() and a sufficiently sized buffer avoids that pitfall in scanf() use completely. Commented Oct 21, 2023 at 17:03

2 Answers

From some documentation on fscanf:

The format string consists of

  • non-whitespace multibyte characters except %: each such character in the format string consumes exactly one identical character from the input stream, or causes the function to fail if the next character on the stream does not compare equal.

While the fscanf specifier %d consumes any and all leading whitespace [1], it does not consume the line feed that follows the number. On subsequent iterations that newline character ('\n') is therefore the next character in the input, and '>' does not match it.

From the same documentation:

  • whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from the input (determined as if by calling isspace in a loop). Note that there is no difference between "\n", " ", "\t\t", or other whitespace in the format string.

So a leading whitespace in your format specifier will consume the trailing newline character from the input:

#include <stdio.h>

int main(void)
{
    int num, num2;

    while (1) {
        printf("Enter \"> NUM NUM2\": ");

        if (2 != scanf(" > %d %d", &num, &num2))
            break;

        printf("You entered the number %d %d.\n", num, num2);
    }
}

Enter "> NUM NUM2": > 1 2
You entered the number 1 2.
Enter "> NUM NUM2": > 3 4
You entered the number 3 4.
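
The difference between the two format strings can also be checked without a terminal by running sscanf on a string that starts with the leftover newline. This is only a sketch: parse_strict and parse_lenient are hypothetical helper names, not part of the original code.

```c
#include <stdio.h>

/* Format begins with the literal '>': the very first input character
   must be '>', so a leftover '\n' causes a matching failure. */
int parse_strict(const char *input, int *a, int *b)
{
    return sscanf(input, "> %d %d", a, b);
}

/* Format begins with a space: any leading whitespace, including a
   leftover '\n', is skipped before '>' is matched. */
int parse_lenient(const char *input, int *a, int *b)
{
    return sscanf(input, " > %d %d", a, b);
}
```

Given the input "\n> 5 6\n", parse_strict returns 0 because '>' is compared against '\n', while parse_lenient returns 2.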

Aside: the following loop never terminates once end of input is reached, because scanf then returns EOF (a negative int, almost universally -1), which is truthy:

while (scanf("> %d %d", &num, &num2))

You should explicitly check that the return value of scanf equals the expected number of successful conversions (i.e., 2).
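
As a rough illustration of why the explicit comparison matters, the sketch below sorts the possible return values into three cases. The names parse_status, classify, and scan_pair are made up for this example, and sscanf stands in for scanf so that end of input can be simulated with an empty string.

```c
#include <stdio.h>

/* Illustrative status codes; not part of the original post. */
enum parse_status { PARSE_OK, PARSE_MISMATCH, PARSE_END };

/* Interpret the return value of a two-conversion scanf-family call. */
enum parse_status classify(int rc)
{
    if (rc == 2)
        return PARSE_OK;       /* both %d conversions succeeded */
    if (rc == EOF)
        return PARSE_END;      /* input ended before any conversion */
    return PARSE_MISMATCH;     /* 0 or 1: input did not match the format */
}

/* Same format as the answer's loop, applied to a string. */
int scan_pair(const char *input, int *a, int *b)
{
    return sscanf(input, " > %d %d", a, b);
}
```

A test such as while (scanf(...)) treats PARSE_END and a partial match (a return value of 1) the same as success, whereas comparing against 2 stops the loop in both cases.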


1. As do all format specifiers except %c, %[, and %n (assuming no errors occur).

5 Comments

Worth explaining that all conversion specifiers except "%c", "%[..]" (and "%n") discard leading whitespace. May also want to explain why " > %d %d" would work.
Thank you. The point I was missing was how leading whitespace is discarded before conversion specifiers but not literal characters. The point about checking the actual return value is good — I wanted to keep my example simple to illustrate the specific question.
@DavidC.Rankin Should be covered now, cheers.
Yep, you got it and the nod.
@AmittaiAviram Correct is simple, though! Less incorrectness means there's less to focus on. That said, the fact that it is rather difficult to be correct when using scanf is the reason why so many people will advise you to avoid the function (usually in favour of line-based processing) - even if its behaviour can be described.

I generally agree with all the comments, especially the ones which emphasize the fact that scanf is not really a good idea for user input.

The answer to your question has also mostly been given. Pressing Enter after the first input adds a trailing newline character to the input stream. scanf does not parse it, so it stays there. On the second iteration scanf tries to match that newline against the first character of the format string, > in this case. The match fails, scanf returns 0, and the loop ends.

As suggested, adding a leading whitespace character to the format string makes scanf consume that newline, so the next call waits for fresh input and the loop behaves as you expected.
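
For the line-based approach recommended in the comments (fgets followed by sscanf), a minimal sketch might look like the following. The helper names parse_line and read_pair and the 256-byte buffer size are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: parse one already-read line of the form
   "> NUM NUM2". Returns 1 if both numbers converted, 0 otherwise. */
int parse_line(const char *line, int *a, int *b)
{
    return sscanf(line, " > %d %d", a, b) == 2;
}

/* Hypothetical helper: read one line from `in` and parse it.
   Returns 1 on success, 0 on a malformed line, EOF at end of input. */
int read_pair(FILE *in, int *a, int *b)
{
    char line[256];   /* assumed large enough for one line of input */

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;   /* end of input or read error */
    return parse_line(line, a, b);
}
```

A caller could loop with while ((rc = read_pair(stdin, &num, &num2)) != EOF) and print an error when rc is 0. Because fgets has already consumed the line, including its newline, a malformed line does not linger in the input the way it does with scanf (assuming the line fits in the buffer).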
