0

To learn more of C, I'm trying to recreate basic data structures. Here's a minimal example of my attempt at an array, which compiles and runs but has a problem detected by valgrind:

#include <stdlib.h>
#include <stdio.h>

typedef void * vp_t;

typedef struct {
  int len;
  vp_t *start;
} arr_t;

arr_t * array_new(int len) {
  arr_t *arr = malloc(sizeof(arr_t));
  arr->start = malloc(len * sizeof(vp_t));
  arr->len = len;
  return arr;
}

void array_set(arr_t *arr, int i, vp_t vp) {
  vp_t *dest = arr->start + i * sizeof(vp_t);  
  *dest = vp;
}

int array_get(arr_t *arr, int i) {
  int *p = *(arr->start + i * sizeof(vp_t));
  return *p;
}

void array_delete(arr_t *arr) {
  free(arr->start);
  free(arr);
}

int main() {
  int x=0, y=1, z=2;
  arr_t *arr = array_new(3);
  array_set(arr, 0, &x);
  array_set(arr, 1, &y);
  array_set(arr, 2, &z);

  for (int i = 0; i < 3; ++i) printf("%i ", array_get(arr, i));
  putchar('\n');

  array_delete(arr);
  
  return 0;
}

The program outputs 1 2 3 as expected. However, valgrind is detecting a problem the second and third time I call the array_set function. Running valgrind against the example code here, I get:

==91933== Invalid write of size 8
==91933==    at 0x109244: array_set (min.c:22)
==91933==    by 0x109312: main (min.c:39)
==91933==  Address 0x4a990d0 is 32 bytes before an unallocated block of size 4,194,032 in arena "client"
==91933== 
==91933== 
==91933== Process terminating with default action of signal 11 (SIGSEGV)
==91933==  Access not within mapped region at address 0x2003A98F4C
==91933==    at 0x109244: array_set (min.c:22)
==91933==    by 0x109327: main (min.c:40)

min.c:22 refers to *dest = vp in the array_set function. min.c:39 refers to array_set(arr, 1, &y). Valgrind does not complain about line 38, array_set(arr, 0, &x).

I've been poking around with gdb, but I've not figured it out yet. Thanks for taking a look.

4
  • 2
    Both set() and get(), don't multiply by the size of the object. Eg: vp_t *dest = arr->start+i; You deal with objects. The compiler knows how many bytes each object occupies. Commented Sep 12, 2023 at 4:51
  • Try printing the address like: printf("%p\n", arr->start + i * sizeof(vp_t)); Is that what you expect? Then print it like printf("%p\n", arr->start + i );Is that what you expect? Commented Sep 12, 2023 at 5:13
  • Read stackoverflow.com/a/67848109/4386427 Commented Sep 12, 2023 at 5:15
  • You're mixing vp_t pointers with int's... What are you trying to achieve? The compiler should be giving you all sorts of warnings and errors... Commented Sep 12, 2023 at 7:09

4 Answers 4

0

The is wrong

  vp_t *dest = arr->start + i * sizeof(vp_t);  

In C, when you do pointer arithmetic, (i.e. add a number to a pointer), the compiler will take care of multiplying the number by the size of the objects pointed to. For example, if you have

int64_t a[50];
int *b = a;
int *c = &(a[21]);

b + 8 points to a[8] not a[1] - the compiler knows b points to objects of size 8 bytes and multiplies the number added on to b by 8. Similarly c - b will be 21, not 168, because the compiler knows to divide the difference in addresses by the object size.

In my example, b + 8 * sizeof(int64_t) would be the same as b + 64 and the compiler would then multiply that 64 by sizeof(int64_t) to get the number of bytes to add to b. This would clearly be outside of the array, which is what Valgrind is detecting in your case.

Another way to look at this is that a[i] and *(a + i) are functionally identical in C. You never see people writing a[i * (sizeof *a)] do you? You don't need to do the multiplying for pointer arithmetic either.

The reason why you do need the sizeof in malloc() is because malloc() can't infer the type of the objects that will be contained in the block.


There is another problem with your code. Your array stores pointers to the objects you want to store. This is presumably because you want to be able to store objects of any type. But if you do this, you have to be careful to make sure the objects don't disappear leaving you with dangling pointers.

For example, x, y and z are all of automatic scope. Their storage will disappear when the function exits. This isn't a problem with your code because exiting the function is the same as exiting the program. However, if you have something like this:

int populateArray(arr_t *array)
{
    int x = 1, y = 2, z = 3;
    array_set(arr, 0, &x);
    array_set(arr, 1, &y);
    array_set(arr, 2, &z);
}

int main()
{
  arr_t *arr = array_new(3);
  populateArray(arr);
  // At this point your array contains 3 dangling pointers.
  // Valgrind will complain if you try to access any of them.
}

that's broken.

Sign up to request clarification or add additional context in comments.

Comments

0

When you need to address an array, just add how many cells(beside size of cell) to the pointer.

void array_set(arr_t *arr, int i, vp_t vp) {
  vp_t *dest = arr->start + i;  
  *dest = vp;
}

int array_get(arr_t *arr, int i) {
  int *p = *(arr->start + i);
  return *p;
}

arr->start is of type vp_t so the compiler will increment as much as it needs (sizeof(vp_t)) to go to next cell when you add just 1 to the pointer.

1 Comment

set() is saving struct's, while get() is returning ints... Curious...
0

here its the solution, but first: why you need an void ** array?, if you want to create an integer array , create with int*, an void array its only recomended when you need an array of any types. anyway, the mistakes its you were creating an void* (one dimension), while in fact, should be an bidimensional array void **

#include <stdlib.h>
#include <stdio.h>


typedef struct {
  int len;
  void  **start;
} arr_t;

arr_t * array_new(int len) {
  arr_t *arr = malloc(sizeof(arr_t));
  arr->start = malloc(len * sizeof(void*));
  arr->len = len;
  return arr;
}

void array_set(arr_t *arr, int i, void  *vp) {
    arr->start[i] = vp;
}

int array_get(arr_t *arr, int i) {
    return *(int*)arr->start[i];
}

void array_delete(arr_t *arr) {
  free(arr->start);
  free(arr);
}

int main() {
  int x=0, y=1, z=2;
  arr_t *arr = array_new(3);
  array_set(arr, 0, &x);
  array_set(arr, 1, &y);
  array_set(arr, 2, &z);
  for (int i = 0; i < 3; ++i) printf("%i ", array_get(arr, i));
  putchar('\n');

  array_delete(arr);
  
  return 0;
}

But if you need an any array (wich I think its the only reason to use a void *), here is the "correct" implementation


#include <stdlib.h>
#include <string.h>
#include <stdio.h>

enum {
    UNDEFINED,
    STRING,
    INTEGER 
};

typedef struct {
  int len;
  void  **start;
  int *types;
} arr_t;

arr_t * array_new(int len) {
  arr_t *arr = malloc(sizeof(arr_t));
  arr->start = malloc(len * sizeof(void*));
  arr->types = malloc(len *sizeof(int));
  //start everything as undefinde
  memset(arr->types,UNDEFINED,len);
  arr->len = len;
  
  return arr;
}

void array_set_int(arr_t *arr, int i, int value) {
    int *element = malloc(sizeof(int));
    *element = value;
    arr->start[i] =element;
    arr->types[i] = INTEGER;
}

void array_set_str(arr_t *arr, int i, char * value) {
    arr->start[i] = strdup(value);
    arr->types[i] = STRING;
}

int array_get_type(arr_t *arr, int i){
    if(i > arr->len){
        return -1;
    }
    return arr->types[i];
}

int array_get_int(arr_t *arr, int i) {
    return *(int*)arr->start[i];
}

char * array_get_str(arr_t *arr, int i) {
    return (char*)arr->start[i];
}


void array_delete(arr_t *arr) {
  for(int i = 0; i < arr->len;i++){
    if(arr->start[i]){
        free(arr->start[i]);
    }
  }  
  free(arr->types);
  free(arr->start);
  free(arr);
}

int main() {

  arr_t *arr = array_new(3);
  array_set_int(arr, 0, 10);
  array_set_int(arr, 1, 20);
  array_set_str(arr, 2, "aaaa");

  for(int i = 0; i < arr->len; i++){
    int type = array_get_type(arr,i);
    if(type == STRING){
        char *value = array_get_str(arr,i);
        printf("value is %s\n",value);
    }
    if(type == INTEGER){
        int value = array_get_int(arr,i);
        printf("value is %d\n",value);
    }
  }

  array_delete(arr);
  
  return 0;
}

6 Comments

Notice that main() is calling set(), passing the address of an int variable... The function, however, thinks it's receiving a *vp... And, what about get()??? Casting is not for beginners, especially not casting to an unrelated datatype. (Plus, the **start is just plain wrong. Innocuous in this case, but still wrong.)
yeah , still wrong and the value will not survive outside scope, unless he alocates the required memory for these, but I soluted his problemns, and presented an code that works fines without any bug or leak memory, let him realize the future problemns generated by his implementations
"while in fact, should be an bidimensional array void **" Wrong. Very low quality answer to give to a beginner. Looks like nothing more than a buggy code dump. Looks like a question a beginner would submit...
he is using an void , the only justification to use a void pointer, its to deal with more than one data type dynamically. and a "any" array must be a bidimensional void pointer, what its incorrect with my answer ?, if he wants a int array, he should use an int instead of an void.
Bravo! Taken a beginner's question and turned it into a beginning of a "generic datatype storage facility". However, memset(arr->types,UNDEFINED,len); only zero's the first 3 bytes; not all 6... and array_get_type() will allow the caller to access out of bounds (UB!!)... and array_delete() may cause a SEGFAULT if only 2 values have been stored... And, there's confidence that malloc() never fails... Still not a good answer, imo...
|
0

You are using the pointer arthmetic incorrectly.

For example let's consider the function array_set

void array_set(arr_t *arr, int i, vp_t vp) {
  vp_t *dest = arr->start + i * sizeof(vp_t);  
  *dest = vp;
}

According to the C Standard an expression like that pointer + i points to the i-th element of an array relative to the element pointed to by the ponter

So if you need to get a pointer to the i-th element in your functon you need to write

  vp_t *dest = arr->start + i;

nstead of

  vp_t *dest = arr->start + i * sizeof(vp_t);  

So the functon will look like

void array_set(arr_t *arr, int i, vp_t vp) {
  vp_t *dest = arr->start + i;  
  *dest = vp;
}

Pay attention to that the subscript operator pointer[i] is evaluated as *( pointer + i ). So the function also can be written as

void array_set(arr_t *arr, int i, vp_t vp) {
  arr->start[i] = vp;
}

The same problem exists in other parts of your program that you should update accordingly.

For example the function array_get can look like

int array_get(arr_t *arr, int i) {
  int *p = arr->start[i];
  return *p;
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.