5
\$\begingroup\$

I have written my first C program that renames spaces in filenames to underscores. I wonder if there is an easier or more efficient way:

#include <stdio.h>
#include <string.h>

int main( int argc, char *argv[] ) {

    if ( argc != 2 ) {
        printf("Format is: NoSpace <filename>\n");
        return(0);
    }
    else if ( strstr( argv[1], " ") == NULL) {
        printf("No space character in file: %s\n",argv[1]);
        return(0);
    }

    int filename_size = strlen(argv[1]);
    char file_name_old[filename_size], file_name_new[filename_size];

    strcpy(file_name_old, argv[1]);
    strcpy(file_name_new, argv[1]);

    // ASCII 32 = space; ACII 95 = underscore
    for( int pos = 0; pos < filename_size; pos++) {
        if ( file_name_new[pos] == 32)  {
            file_name_new[pos] = 95;
        }
    }

    rename(file_name_old,file_name_new);
    return(0);
}

Thanks!

\$\endgroup\$
6
  • \$\begingroup\$ @RoiHatam you are not really making sense here... your comment seems more about using files with spaces in some specific console environment rather than about renaming files. \$\endgroup\$ Commented May 17, 2017 at 14:26
  • 3
    \$\begingroup\$ Regarding the code: why use 32 and 95 instead of ' ' and '_'? Also, your program will not play nice with full qualified path input like "/user/home/My Awesome Folder/My File with Space to be replaced" \$\endgroup\$ Commented May 17, 2017 at 14:29
  • 3
    \$\begingroup\$ You can use strchr to search for a particular character in a string to make the code slightly simpler, but you've got it about as efficient as possible. \$\endgroup\$ Commented May 17, 2017 at 14:30
  • 3
    \$\begingroup\$ The length of a C string is strlen() + 1 to account for the terminating '\0' character. These are too short: char file_name_old[filename_size], file_name_new[filename_size]; And you need to check the return value from rename() and not ignore it. \$\endgroup\$ Commented May 17, 2017 at 14:33
  • 1
    \$\begingroup\$ If you like shorthand (or don't like to type), you can replace strstr( argv[1], " ") == NULL with simply !strstr( argv[1], " "). Entirely up to you. \$\endgroup\$ Commented May 17, 2017 at 18:03

4 Answers 4

2
\$\begingroup\$

Report errors to error output stream

Error messages should be written to stderr, not stdout. We should also return a non-zero value when we fail - ideally EXIT_FAILURE, defined in <stdlib.h>:

if (argc < 2) {
    fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
    return EXIT_FAILURE;
}

Note also that I've interpolated argv[0], so as to print the actual name we were invoked as, rather than having to match the code and the build system.

Accept many arguments

Users would like to be able to operate on multiple files at once:

NoSpace *

We can support this (and I've hinted at this above, if you spotted the test argc < 2) by looping over the arguments:

for (int i = 1;  i < argc;  ++i) {
    const char *const src = argv[i];
    unspace(src);
}

No-action-required is not an error

Possibly an opinion point, but I would suggest that a file that doesn't need any replacement shouldn't be considered a failure.

Avoid overwriting existing files

We don't want to accidentally destroy data if the target filename already exists. For Linux, we can avoid this without risk of a race between checking and acting by using the renameat2() system call:

    renameat2(AT_FDCWD, file_name_old, AT_FDCWD, file_name_new, RENAME_NOREPLACE);

Buffer size (BUG)

We allocate strlen(src) characters as buffer, but we need one more, to account for the terminating NUL character. Note that strlen returns a size_t, not an int (although it's unlikely to make a practical difference here).

Replace during copy

We can perform the substitution as we copy:

    char *const src = argv[i];
    size_t filename_size = strlen(argv[i]);
    char dest[filename_size+1];
    for (char *p = src, *q = dest;  *p;  ++p, ++q)
        if (*p == ' ')
            *q = '_';
        else
            *q = *p;
    dest[filename_size] = '\0';

More compactly:

    for (char *p = src, *q = dest;  *p;  ++p, ++q)
        *q = *p == ' ' ? '_' : *p;

Note that we use character constants ' ' and '_', so our code works in non-ASCII environments and so that its intention is clear without needing comments.

Check the result of the rename() call

The library call may fail - perhaps the destination name exists (and we're using the RENAME_NOREPLACE flag), or the directory is not writable.

    if (renameat2(AT_FDCWD, src, AT_FDCWD, dest, RENAME_NOREPLACE)) {
        perror(src);
    }

We can make this more portable, by testing whether it's supported:

#if _POSIX_C_SOURCE >= 200809L
#define rename(src, dest)  renameat2(AT_FDCWD, src, AT_FDCWD, dest, RENAME_NOREPLACE);
#endif

Complete program

#include <fcntl.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if _POSIX_C_SOURCE >= 200809L
#define rename(src, dest)  renameat2(AT_FDCWD, src, AT_FDCWD, dest, RENAME_NOREPLACE);
#endif

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        return EXIT_FAILURE;
    }

    int ret_val = EXIT_SUCCESS;
    for (int i = 1;  i < argc;  ++i) {
        char *const src = argv[i];
        size_t filename_size = strlen(argv[i]);
        char dest[filename_size+1];
        for (char *p = src, *q = dest;  *p;  ++p, ++q)
            *q = *p == ' ' ? '_' : *p;
        dest[filename_size] = '\0';

        if (rename(src, dest)) {
            perror(src);
            ret_val = EXIT_FAILURE;
        }
    }

    return ret_val;
}

Enhancement suggestions

You might want to look into accepting some option flags to adjust behaviour. I suggest the following (modelled on the mv command):

  • -v for verbose output: print a line for every file successfully moved
  • -f to overwrite existing files without checking
  • -i to ask before overwriting an existing file
\$\endgroup\$
1
  • \$\begingroup\$ Thanks for the detailed and complex answer! I will work on it and read the documentation about these improvements. Cheers! \$\endgroup\$ Commented May 21, 2017 at 1:03
4
\$\begingroup\$

Although this indeed does something similar to your goal, here are some criticisms / improvements:

  1. There's absolutely no need to take a copy of the original filename.
  2. If you meet an error condition, return 1 (or better EXIT_FAILURE from stdlib.h), so scripts calling your tool get a chance to detect the error.
  3. Use functions; this helps your code to keep structure
  4. Your tool will not work correctly for filenames with paths, so better search from the end and stop replacing when you hit a path separator character.
  5. When iterating through a string, it's often easier to just use pointers instead of array indexing.

I'd suggest something like this:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#define isPathSep(x) ((x) == '\\' || (x) == '/')
#else
#define isPathSep(x) ((x) == '/')
#endif

char *sanitizeFileName(const char *filename)
{
    size_t namelen = strlen(filename);

    char *sanitized = malloc(namelen + 1);
    if (!sanitized)
    {
        perror("malloc");
        exit(EXIT_FAILURE);
    }

    strcpy(sanitized, filename);

    char *p = sanitized + namelen - 1;
    while (p >= sanitized)
    {
        if (isPathSep(*p)) break;
        if (*p == ' ') *p = '_';
        --p;
    }

    return sanitized;
}

int main(int argc, char **argv)
{
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s [filename]\n", argv[0]);
        return EXIT_FAILURE;
    }

    char *safename = sanitizeFileName(argv[1]);

    // instead of rename(), for demonstration:
    printf("%s -> %s\n", argv[1], safename);

    free(safename);

    return EXIT_SUCCESS;
}
\$\endgroup\$
6
  • 1
    \$\begingroup\$ char *sanitized = strdup(filename); might be a better choice than strlen() and strcpy(). I'd also use strrchr('/') to find the last slash in the file name, increment the pointer returned by strrchr('/') and then go forward rather than backwards with pointer p. The algorithm you have is a good one. \$\endgroup\$ Commented May 17, 2017 at 18:16
  • 2
    \$\begingroup\$ @pacmaninbw - true, where strdup() is available - it's not a standard C function. \$\endgroup\$ Commented May 17, 2017 at 19:42
  • 1
    \$\begingroup\$ @TobySpeight I guess I've always been lucky, I've never had a problem with it in 30 years, but you're correct that's what my search just turned up. \$\endgroup\$ Commented May 17, 2017 at 23:34
  • 1
    \$\begingroup\$ @pacmaninbw with the GNU toolchain, you can compile with e.g. -std=c11 to only get declarations of standard c by default, helps a lot for writing portable code \$\endgroup\$ Commented May 18, 2017 at 6:03
  • \$\begingroup\$ Actually, for user input in windows, both separators / and `` should be expected. \$\endgroup\$ Commented May 18, 2017 at 7:05
3
\$\begingroup\$

This code is correct. Here some style tips for you:

  1. I recommend to get rid of the file_name_old. In this program, you never edit it anyway and rename accepts const chars (never will be edited inside this function), so you can replace this:

    rename(file_name_old,file_name_new);
    

    With this:

    rename(argv[1], file_name_new);
    
  2. Replace "magic numbers" in this scope:

    if ( file_name_new[pos] == 32)  {
        file_name_new[pos] = 95;
    }
    

    With this:

    if ( file_name_new[pos] == ' ')  {
        file_name_new[pos] = '_';
    }
    
  3. As commenters pointed out, use strlen(argv[1]) + 1 when declaring new arrays.

\$\endgroup\$
3
\$\begingroup\$

Buffer overrun

You allocated one too few bytes in each of your filenames, because you forgot about the terminating null character:

int filename_size = strlen(argv[1]);
char file_name_old[filename_size], file_name_new[filename_size];

strcpy(file_name_old, argv[1]);
strcpy(file_name_new, argv[1]);

You could either change the size to include that extra byte:

 int filename_size = strlen(argv[1]) + 1;

Or you could allocate one more byte here:

 char file_name_new[filename_size+1];

(Note: I removed file_name_old because it was a useless copy)

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.