I'm trying to write a simple split function in C, where you supply a string and a char to split on, and it returns a list of split strings:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char ** split(char * tosplit, char delim){
    int amount = 0;
    for (int i=0; i<strlen(tosplit); i++) {
        if (tosplit[i] == delim) {
            amount++;
        }
    }
    char ** split_ar = malloc(0);
    int counter = 0;
    char *token = strtok(tosplit, &delim);
    while (token){
        split_ar[counter] = malloc(0);
        split_ar[counter] = token;
        token = strtok(NULL, &delim);
        counter++;
    }
    split_ar[counter] = 0;
    return split_ar;
}

int main(int argc, char *argv[]){
    if (argc == 2){
        char *tosplit = argv[1];
        char delim = *argv[2];
        char ** split_ar = split(tosplit, delim);
        while (*split_ar){
            printf("%s\n", *split_ar);
            split_ar++;
        }
    } else {
        puts("Please enter words and a delimiter.");
    }
}
I use malloc twice: once to allocate space for the pointers to strings, and once to allocate space for the actual strings themselves. The strange thing is: during testing I found that the code still worked when I malloc'ed no space at all.
When I removed the malloc lines I got segfaults or malloc assertion errors, so the lines do seem to be necessary, even though they don't seem to do anything. Can someone please explain why?
I expect it has something to do with strtok; the string being tokenized is initialized outside the function scope, and strtok returns pointers into the original string, so maybe malloc isn't even necessary. I have looked at many old SO threads but could find nothing similar enough to answer my question.
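The guess about strtok can be checked directly. The following is a minimal sketch, not part of the original program: `token_offsets` is a hypothetical helper that records where each token starts relative to the buffer being tokenized. If strtok returns pointers into the original string, every offset falls inside that buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place and stores the offset of each
 * token relative to buf. Returns the number of tokens recorded (at most max).
 * buf must be writable, because strtok overwrites delimiters with '\0'. */
static size_t token_offsets(char *buf, const char *delims,
                            size_t *offsets, size_t max)
{
    assert(buf != NULL && delims != NULL && offsets != NULL);

    size_t n = 0;
    for (char *tok = strtok(buf, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        /* tok is an address inside buf, so the difference is a valid offset. */
        offsets[n++] = (size_t)(tok - buf);
    }
    return n;
}
```

For a buffer holding "one,two,three" and the delimiter string ",", the recorded offsets are 0, 4 and 8, and the commas in the buffer have been replaced by '\0'. No separate storage for the token text is involved.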
malloc(0) returns a null pointer, or a valid pointer to 0 bytes of memory. So you need to either (a) take care not to try to allocate 0 bytes of memory, or (b) if you do, don't print an error message if malloc(0) returns NULL.

When malloc(0) returns a non-null pointer, it points to a memory zone of length 0, and dereferencing that pointer is undefined behaviour, which here merely appears to work.

You call char ** split_ar = malloc(0);, and then start filling in split_ar[counter], which writes past the end of the allocation. For simplicity, try calling split_ar = malloc(50 * sizeof(char *)), where 50 is a guess of how many strings you might need. (That's not a good long-term solution, but it's a start.)

When you write split_ar[counter] = malloc(…);, immediately followed by split_ar[counter] = token;, you're throwing away (failing to use) the memory you just allocated, and instead filling in split_ar with a pointer value, of dubious longevity, from token.
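Putting these points together, one possible shape for the function is sketched below. It is an illustration under stated assumptions rather than the only correct fix: it sizes the pointer array from the delimiter count instead of guessing 50, drops the per-string malloc because the tokens already live inside tosplit, and passes strtok a null-terminated delimiter string (the original &delim points at a single char with no terminator). The local names `delims`, `slots` and `parts` are introduced here for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of a split that allocates the pointer array properly.
 * Returns a NULL-terminated array whose entries point into tosplit,
 * which strtok modifies in place. Returns NULL if allocation fails. */
char **split(char *tosplit, char delim)
{
    assert(tosplit != NULL);

    /* strtok expects a null-terminated delimiter string, not a lone char. */
    char delims[2] = { delim, '\0' };

    /* At most (number of delimiters + 1) tokens, plus one slot for the
     * terminating NULL pointer. */
    size_t slots = 2;
    for (const char *p = tosplit; *p != '\0'; p++) {
        if (*p == delim)
            slots++;
    }

    char **parts = malloc(slots * sizeof *parts);
    if (parts == NULL)
        return NULL;

    size_t count = 0;
    for (char *tok = strtok(tosplit, delims);
         tok != NULL;
         tok = strtok(NULL, delims)) {
        parts[count++] = tok;   /* no per-string malloc: tok lives in tosplit */
    }
    parts[count] = NULL;
    return parts;
}
```

With this layout the caller releases the result with a single free on the returned pointer, so it should iterate with an index or a second pointer rather than incrementing the returned pointer itself. The strings stay valid only as long as the buffer passed as tosplit does, and that buffer must be writable (an argv entry or a char array, not a string literal).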