4

I am trying to correctly handle user input in C, particularly when reading a file path from the user.

However, I have some concerns:

  1. How do I refactor this code to use dynamic memory allocation?
  2. How should I properly handle dynamic memory allocation for user input when the length is unknown?
  3. What is the safest way to accept user input for file paths?

My current code:

#include <stdio.h>

int main() {
    char filePath[1024];

    printf("Please provide file to encrypt (File Path): ");
    scanf("%1024s", filePath); // Is this safe?

    printf("You entered: %s\n", filePath);
    return 0;
}

I found a source on Microsoft's website suggesting that I specify a width for the %s format specifier in scanf (e.g., %1024s instead of %s), but it's still fixed size and I want it dynamically allocated.

6
  • 3
    The call scanf("%1024s", filePath); is unsafe because it can overflow the buffer. The correct call is scanf("%1023s", filePath); Commented Feb 10 at 15:23
  • 1
    0) don't scanf(), use fgets() 1) Use PATH_MAX (if in POSIX land) and assume user doesn't type "../a/../a/../a/../a/../a/../a/../a/../a/../a/../a/../a/../a/../a/many_more_of_these/file.txt" Commented Feb 10 at 15:31
  • 1
    There is no convenient standard library function that dynamically allocates a buffer and reads user input into it. You may want to write your own. Note you cannot do the allocation in one go because you don't know the size of the input before you read it, so you need to allocate, read some, reallocate, read some more, etc. Repeat until done. Alternatively, use a third party function, such as GNU readline (which incidentally can do about a zillion more useful things than just allocating a buffer). Commented Feb 10 at 15:40
  • 4
    Small note, %1024s for a 1024 character buffer is wrong. It should be %1023s to leave room for a null-terminator. Commented Feb 10 at 18:17
  • 1
    "What is the safest way to accept user input for file paths?" --> Include a limit to possible input to say a small factor of MAX_PATH. No need to allow user input to consume memory resources for insanely long input. Commented Feb 11 at 5:44

2 Answers

4

Fixed buffer size

As noted in comments, you'd be better off utilizing the standard library fgets function which allows you to specify the size of the buffer to avoid buffer overflow issues.

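#include <stdio.h>
#include <string.h>

int main() {
    char filePath[1024];

    printf("Please provide file to encrypt (File Path): ");

    // fgets returns NULL on end-of-file or a read error
    if (fgets(filePath, sizeof(filePath), stdin) == NULL) {
        fprintf(stderr, "Failed to read input\n");
        return 1;
    }

    // fgets keeps the trailing newline; remove it if present
    filePath[strcspn(filePath, "\n")] = '\0';

    printf("You entered: %s\n", filePath);
    return 0;
}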

Using scanf with %s with or without a width specifier will also restrict you to reading one whitespace delimited string. This prohibits spaces in your input.
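
If you do want to stay with the scanf family, a scanset can accept spaces where %s cannot. The following is only a sketch: read_path_scanf is a hypothetical helper name, and the 1024-byte buffer with a width of 1023 mirrors the question's code rather than any recommended limit.

```c
#include <stdio.h>

// Hypothetical helper: reads at most 1023 characters of one line from fp,
// spaces included. Returns 1 if at least one character was stored,
// 0 on an empty line or end-of-file.
int read_path_scanf(FILE *fp, char buf[1024]) {
    // The width is one less than the buffer size to leave room for '\0'.
    // %[^\n] matches everything up to, but not including, the newline.
    int ok = (fscanf(fp, "%1023[^\n]", buf) == 1);
    int ch;

    if (!ok) {
        buf[0] = '\0';
    }

    // Discard the rest of the line, including the newline, so the next
    // read starts on a fresh line.
    while ((ch = fgetc(fp)) != '\n' && ch != EOF) {
        /* nothing */
    }

    return ok;
}
```

Input longer than 1023 characters is silently truncated here, so fgets remains the simpler choice for a fixed buffer.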

Unknown input size

If you want to read an entire line of unknown length, you'll need to either go outside of the standard library or reinvent the wheel yourself. It shouldn't be difficult. I whipped the following up in a few minutes. There is room for improvement on this, but it should give you an idea of what's possible.

The key process is to allocate an initial buffer, then read into it character by character until you hit EOF or a newline. If the length hits the limits of your buffer, reallocate.

Two common memory management pitfalls with this:

  • Growing your buffer by a constant amount (for example, adding 1 each time). If you do this, you're going to make a linear number of calls to realloc, each of which may need to copy your buffer. This is expensive. Grow by a factor of 2 (multiplied; see footnote 1) and the number of calls to realloc will be logarithmic. Starting with a larger initial buffer size is also an option; 8 was used for demonstration purposes only in the following code.
  • When calling realloc, test that it succeeded before assigning to the original buffer pointer. If you don't, and realloc fails, you'll lose your only pointer to the originally allocated memory, leading to a memory leak.

#include <stdlib.h>
#include <stdio.h>

char *read_line(FILE *fp) {
    size_t sz = 8;
    size_t len = 0;
    char *buf = malloc(sz);
    int ch;

    // malloc can fail; don't write through a NULL pointer
    if (!buf) {
        return NULL;
    }

    while ((ch = fgetc(fp)) != EOF) {
        // Grow the buffer if necessary by a factor of 2
        // to avoid extraneous calls to realloc
        if (len >= sz - 1) {
            // Don't immediately overwrite the buf pointer 
            // in case realloc fails
            char *temp = realloc(buf, sz * 2);
            if (!temp) {
                free(buf);
                return NULL;
            }

            buf = temp;
            sz *= 2;
        }

        // Terminate on a newline
        if (ch == '\n') {
            buf[len] = '\0';
            return buf;
        }

        buf[len++] = ch;
    }

    buf[len] = '\0';

    return buf;
}

Opportunities for refinement:

  • Take a size_t pointer as an argument and allow the function to write the length of the read string to a variable, so that the size of the input string does not need to subsequently be calculated with strlen.
  • Return NULL if the first fgetc returns EOF rather than just returning an empty string.
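
Both refinements could look roughly like the sketch below. The name read_line_len, the out_len parameter, and the initial size of 64 are assumptions made for illustration; they are not part of the code above.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical variant of read_line: stores the line length in *out_len
// (when out_len is not NULL) and returns NULL if EOF arrives before any
// character has been read. The caller frees the returned buffer.
char *read_line_len(FILE *fp, size_t *out_len) {
    size_t sz = 64;
    size_t len = 0;
    int ch;
    char *buf = malloc(sz);

    if (!buf) {
        return NULL;
    }

    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        // Keep one byte free for the terminator; double when full
        if (len + 1 >= sz) {
            char *temp = realloc(buf, sz * 2);
            if (!temp) {
                free(buf);
                return NULL;
            }
            buf = temp;
            sz *= 2;
        }
        buf[len++] = (char)ch;
    }

    // Nothing was read at all: report end of input rather than ""
    if (ch == EOF && len == 0) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (out_len) {
        *out_len = len;
    }
    return buf;
}
```

With this shape, an empty line still comes back as an empty string, while NULL signals either end of input or an allocation failure.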

1 Ideal memory growth rate is a matter of some debate. See: What is the ideal growth rate for a dynamically allocated array?
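
To make the linear-versus-logarithmic point concrete, here is a small sketch that only counts how many times a buffer would have to grow under each strategy. The function names are made up for this illustration, and no memory is actually allocated.

```c
#include <stddef.h>

// Number of growth steps needed to get from `initial` bytes to at least
// `needed` bytes when every step adds `step` bytes (step must be > 0).
size_t grows_additive(size_t initial, size_t step, size_t needed) {
    size_t count = 0;
    for (size_t sz = initial; sz < needed; sz += step) {
        count++;
    }
    return count;
}

// Same count when every step doubles the size (initial must be > 0).
size_t grows_doubling(size_t initial, size_t needed) {
    size_t count = 0;
    for (size_t sz = initial; sz < needed; sz *= 2) {
        count++;
    }
    return count;
}
```

Starting from 8 bytes, reaching 1024 bytes takes 1016 steps when adding one byte at a time, but only 7 steps when doubling.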

2 Comments

Hey @Chris, thanks for the explanation! Just wondering, why do you use FILE *fp as a parameter? Is it mainly for reusability, or are there specific cases where it's better than just using stdin inside the function, as in while ((ch = fgetc(stdin)) != EOF)?
@tr41z It makes it more general, there's no reason why a function like this should be limited to stdin.

3

Try using getline(&buffer,&size,stdin); since it dynamically allocates memory for the input.

A minimal working example looks something like this:

#include <stdio.h>
#include <stdlib.h>

int main() {
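    char *buffer = NULL;
    size_t len = 0;   // capacity of buffer, managed by getline
    ssize_t nread = getline(&buffer, &len, stdin);

    if (nread != -1) {
        // nread is the number of characters read, newline included;
        // len is only the size of the allocation
        printf("Entered Line: %s\nLength: %zd\n", buffer, nread);
    } else {
        perror("getline failed");
    }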
 
    // Remember to free the memory after the string is used
    free(buffer);
    return 0; 
}
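
For a file path you will usually want the trailing newline removed before passing the string to fopen. A minimal sketch, assuming a POSIX system where getline is available (it is not part of standard C); read_path_getline is a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

// Hypothetical helper: reads one line with POSIX getline and removes the
// trailing newline. Returns NULL on end-of-file or error; the caller
// frees the returned buffer.
char *read_path_getline(FILE *fp) {
    char *line = NULL;
    size_t cap = 0;
    ssize_t nread = getline(&line, &cap, fp);

    if (nread == -1) {
        free(line);  // getline may have allocated even though it failed
        return NULL;
    }

    // getline keeps the newline when one was read; strip it
    if (nread > 0 && line[nread - 1] == '\n') {
        line[nread - 1] = '\0';
    }

    return line;
}
```

Taking a FILE * instead of hard-coding stdin keeps the helper usable for files and easier to test.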

3 Comments

getline is not a standard C function. Unix provides it, but OP mentions Windows, so this code may not work for them. Additionally, the behavior of printing the size_t len using %d is not defined.
Follow-up to @EricPostpischil's comment: the correct specifier for size_t is %zu
If you're interested, I wrote a porting of getline for MSVC: github.com/CostantinoGrana/msvc_getline
