What is the memory difference between char *array and char array[]? [duplicate]

Question

When I run the following code:

#include <stdio.h>

int main(void)
{
    char tmp[] = "hello";
    /* %p expects a void *; "%lp" is not a valid conversion */
    printf("%p, %p\n", (void *)tmp, (void *)&tmp);
    return 0;
}

I get the same address twice. But with the following code, the two addresses are different:

#include <stdio.h>

int main(void)
{
    char *tmp = "hello";
    /* %p expects a void *; "%lp" is not a valid conversion */
    printf("%p, %p\n", (void *)tmp, (void *)&tmp);
    return 0;
}

Could you explain the memory differences between those examples?

char tmp[] = "hello"; is an array of 6 characters initialized to "hello\0" (it has automatic storage duration and resides within the program stack). char *tmp = "hello"; is a pointer initialized with the address of the string literal "hello\0" that resides in read-only memory (generally within the .rodata section of the executable; readonly on all but a few non-standard implementations). An array is converted to a pointer to its first element on access.
– David C. Rankin, Commented Jun 7, 2021 at 3:54
@David C. Rankin Re "readonly on all but a few non-standard implementations": I find it doubtful that C requires a machine to have virtual memory in order to have a standard implementation. One should always consider the memory to be read-only, but I challenge the claim that the memory has to be read-only for the implementation to be standard.
– ikegami, Commented Jun 7, 2021 at 4:00
@ikegami I concede that point. The standard doesn't require a conforming implementation to place string literals in read-only memory. The point I was making is that most do.
– David C. Rankin, Commented Jun 7, 2021 at 4:06
At the very least, the C standard states that modifying a string literal is undefined behaviour.
– Aconcagua, Commented Jun 7, 2021 at 4:51
While it is legal in C, you shouldn't assign string literals to non-const char pointers; always write char const* ptr = "some literal"; instead. Otherwise you will almost certainly end up modifying the literal at some point in the future, which is UB, as stated above. Being able to assign immutable literals to char* pointers is a legacy from the very first days of C, when const did not yet exist.
– Aconcagua, Commented Jun 8, 2021 at 13:46

Aconcagua · Accepted Answer · 2021-06-07 05:01:17Z

5

char tmp[] = "hello"; is an array of 6 characters initialized to "hello\0" (it has automatic storage duration and resides within the program stack).

char *tmp = "hello"; is a pointer to char initialized with the address for the string literal "hello\0" that resides in readonly memory (generally within the .rodata section of the executable, readonly on all but a few implementations).

When you have char tmp[] = "hello";, as stated above, on access the array is converted to a pointer to the first element of tmp, with type char *. When you take the address of tmp (e.g. &tmp), it resolves to the same address but has a completely different type: pointer to an array of 6 char, formally char (*)[6]. Since the type controls pointer arithmetic, iterating with the different types produces different offsets when you advance the pointer. Advancing tmp will advance to the next char. Advancing with the address of tmp will advance to the beginning of the next 6-character array.

When you have char *tmp = "hello"; you have a pointer to char. When you take the address, the result is a pointer to pointer to char. The formal type is char **, reflecting the two levels of indirection. Advancing tmp advances to the next char. Advancing with the address of tmp advances to the next pointer.

edited Jun 7, 2021 at 5:01

Aconcagua

answered Jun 7, 2021 at 4:05

David C. Rankin

4 Comments

Aconcagua Over a year ago

char tmp[]: Advance tmp is unlucky wording, as arrays cannot be incremented (like ++tmp). I wouldn't describe 'string literal "hello\0"' as that would imply a literal with two trailing null characters.

David C. Rankin Over a year ago

Yes, I was referring to the pointer that results from access. Obviously you cannot iterate with the array itself. The intent being char *p = tmp; or char (*p)[6] = &tmp; in the array case. Thanks for pointing that out.

Edw590 Over a year ago

So to clear up my mind, (supposing tmp1 and tmp2 in the order on your answer), from an Assembly point of view tmp1 == &tmp1, and tmp2 != &tmp2. tmp1 and &tmp1 are exactly the same except in C they have different types (both the same stack address); tmp2 is the pointer to the 1st string char (which might be on the stack or on read-only data or something), and &tmp2 is a pointer to tmp2, which in this case will be a stack address (because tmp2 is a local variable - or at least supposing it is). And these things are the same passed to a function as arguments. Is this correct?

David C. Rankin Over a year ago

@DADi590 - you are dead-on. That is the exact case. tmp1 and &tmp1 resolve to the same address. tmp2 is a pointer to the 1st char in the string, and &tmp2 is the address of that pointer, not the address of the 1st character in the string. In other words, the address of the 1st character in the string is the address held by tmp2, and &tmp2 is where that address is stored in memory.

arrowd · Accepted Answer · 2021-06-07 04:42:19Z

char a[] = "hello";

and

char *a = "hello";

are stored in different places.

char a[] = "hello"

In this case, a is an array of 6 characters, stored on the stack and initialized to "hello" plus the terminating '\0'. It is equivalent to:

char a[6];
a[0] = 'h';
a[1] = 'e';
a[2] = 'l';
a[3] = 'l';
a[4] = 'o';
a[5] = '\0';

char *a = "hello"

Inspect the assembly (this is only the relevant part, not the full listing):

    .file   "so.c"
    .text
    .section    .rodata
.LC0:
    .string "hello"    # look at this part
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movq    $.LC0, -8(%rbp)
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

See

.section    .rodata
.LC0:
    .string "hello"

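This is where the string is stored. With char a[], the characters themselves are stored on the stack. With char *a, only the pointer is on the stack (the movq $.LC0, -8(%rbp) above), and the string it points to is stored wherever the compiler likes, generally in .rodata.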

John Bode · Accepted Answer · 2021-06-07 04:42:49Z

With

char tmp[] = "hello";

you are setting aside an array of char large enough to store the string "hello" and copying the contents of the string to that array, such that you get this in memory:

     +–––+
tmp: |'h'| tmp[0]
     +–––+
     |'e'| tmp[1]
     +–––+
     |'l'| tmp[2]
     +–––+
     |'l'| tmp[3]
     +–––+
     |'o'| tmp[4]
     +–––+
     | 0 | tmp[5]
     +–––+

There is no tmp object separate from the array elements themselves, so the address of the array (&tmp) is the same as the address of its first element (&tmp[0]).

With

char *tmp = "hello";

you are creating a pointer to char and initializing it with the address of the first character in the string literal "hello", such that you get this in memory:

     +–––+       +–––+
tmp: |   | ––––> |'h'| tmp[0]
     +–––+       +–––+
                 |'e'| tmp[1]
                 +–––+
                 |'l'| tmp[2]
                 +–––+
                 |'l'| tmp[3]
                 +–––+
                 |'o'| tmp[4]
                 +–––+
                 | 0 | tmp[5]
                 +–––+

In this case tmp is a separate object from the array elements, so the address of tmp is different from the address of tmp[0].
