Given an array with 5 elements, it is well known that if you use scanf() to read in exactly 5 characters, then scanf() will fill the array and then clobber memory by writing a null character '\0' into a 6th element, without generating an error. (I'm calling it a 6th element, but I know it's memory that is not part of the array.) This is described here: Null termination of char array

However, when you try to read in 6 or more characters, an error is generated because the OS detects that memory is being clobbered and the kernel sends a signal. Can someone clear up why an error is not generated in the first case of memory clobbering above?

Example code:

// ex1.c
#include <stdio.h>
int main(void){
  char arr[5];
  scanf("%s", arr);
  printf("%s\n", arr);
  return 0;
}

Compile, run and enter four characters: 1234. This stores them in the array correctly and doesn't clobber memory. No error here.

$ ./ex1
1234
1234

Run again and enter five characters. This clobbers memory because scanf() stores an extra '\0' null character in memory just past the 5th element. No error is generated.

$ ./ex1
12345
12345

Now enter six characters, which we expect to clobber memory. The error that is generated looks like (I'm guessing) the result of a signal sent by the kernel saying that we just clobbered the stack (local memory) somehow. Why is an error being generated for this memory clobbering but not for the previous one above?

$ ./ex1
123456
123456
*** stack smashing detected ***: ./ex1 terminated
Aborted (core dumped)

This seems to happen no matter what size I make the array.

12
  • I'm guessing that scanf is compiled without stack smash protection while your compiler is compiling with it. Your standard C library may be built with different options from your code; stack smash protection requires compiler support. Commented Aug 1, 2015 at 17:34
  • 1
    Probably the memory is aligned on even bytes. After the 5th byte there is another unused byte. Commented Aug 1, 2015 at 17:35
  • Yeah, alignment is my guess too... does it work if the array has size 4? Commented Aug 1, 2015 at 17:36
  • 2
    Close voting because OP has done a bad thing, invoked UB, knows it, but still wants the consequences explained. 'I poured gasoline over myself and lit a match. Why, and how, am I seriously burned?' Commented Aug 1, 2015 at 17:42
  • 3
    I'm voting to close this question as off-topic because OP wants UB explained, (again). Commented Aug 1, 2015 at 17:43

4 Answers 4

1

The behaviour is undefined in both cases, because each time the input plus its terminating null character is more than the buffer can hold.

The stack smashing detection mechanism works by using canaries. When the canary value gets overwritten, SIGABRT is raised. The reason it isn't raised in the first case is probably that there is at least one extra byte of memory after the array (a pointer one past the end of an object is required to be valid, but it cannot legally be used to store values). In essence, the canary wasn't overwritten when the input overran the array by 1 byte, but it does get overwritten when the input overruns it by 2 bytes, for one reason or another, triggering SIGABRT.

If you have some other variables after arr such as:

#include <stdio.h>
int main(void){
  char arr[5];
  char var[128];
  scanf("%s", arr);
  printf("%s\n", arr);
  return 0;
}

Then the canary may not be overwritten when you input a few more bytes, as the overflow might simply be overwriting var, thus delaying detection of the buffer overflow. This is a plausible explanation. But in any case, your program is invalid if it overruns a buffer, and you should not rely on the compiler's stack smashing detection to save you.
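The canary idea described above can be simulated in plain C without invoking undefined behaviour. This is only a sketch: the layout (a 5-byte buffer, one slack byte, then a 4-byte guard value) and every name in it (fake_frame, frame_store, CANARY, and so on) are assumptions made for illustration, not what any particular compiler actually emits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, chosen for illustration only:
 * [ 5-byte buffer ][ 1 slack byte ][ 4-byte guard value ] */
#define BUF_SIZE   5
#define SLACK      1
#define CANARY_LEN 4

struct fake_frame {
    unsigned char bytes[BUF_SIZE + SLACK + CANARY_LEN];
};

/* Arbitrary known guard value (an assumption, not a real canary). */
static const unsigned char CANARY[CANARY_LEN] = {0xDE, 0xAD, 0xBE, 0xEF};

/* Zero the frame and place the guard value after the slack byte. */
void frame_init(struct fake_frame *f) {
    memset(f->bytes, 0, sizeof f->bytes);
    memcpy(f->bytes + BUF_SIZE + SLACK, CANARY, CANARY_LEN);
}

/* Mimics an unbounded "%s" store: copies the string and its '\0'
 * starting at the buffer, ignoring BUF_SIZE. The copy is capped at
 * the frame size so the simulation itself stays well-defined. */
void frame_store(struct fake_frame *f, const char *input) {
    size_t n = strlen(input) + 1;
    if (n > sizeof f->bytes) {
        n = sizeof f->bytes;
    }
    memcpy(f->bytes, input, n);
}

/* Returns 1 if the guard value is unchanged, 0 if it was overwritten. */
int frame_canary_intact(const struct fake_frame *f) {
    return memcmp(f->bytes + BUF_SIZE + SLACK, CANARY, CANARY_LEN) == 0;
}
```

With this assumed layout, storing "12345" (6 bytes including the terminator) lands in the slack byte and leaves the guard intact, while storing "123456" writes its '\0' over the first guard byte, which is the point where a real check would abort.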


11 Comments

Thanks a lot! I have been wondering why arrays acted this way for years but never asked because it's always taken for granted. Never heard of canaries before. You have probably cleared up a lot of future security flaws and misconceptions with this answer. Cheers. PS: I can't upvote your answer yet because I don't have enough reputation, but will as soon as I do.
"... probably because there's at least one extra byte of memory after the array (typically one-past-the-end of an object is required to be a valid pointer. But it can't be used to store to values -- legally)" what please (referring the text in parenthesis)?
@alk I was referring to the fact in int a[128]; int *p =&a[0]+128;, p is required to be a valid pointer even though it's outside the memory allocated for a. The same is true for int i; int *p=&i+1; too.
@BlueMoon: I see, but that is "a valid pointer" which may not be dereferenced. So there is definitely no need to reserve some spare memory following the array's memory.
1

Why is an error being generated for this memory clobbering but not for the previous one above?

Because the 1st test only seemed to work, through (bad) luck.

In both cases arr was accessed out of bounds, and by doing so the code invoked undefined behaviour. This means the code might do what you expect, or not, or anything else, like rebooting the machine, formatting the disk ...

C does not check memory accesses but leaves this to the programmer, who could have made the call to scanf() safe by doing:

char arr[5];
scanf("%4s", arr); /* Stop scanning after 4th character. */
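One way to keep the field width and the array size from drifting apart is to derive both from a single constant. The following is a minimal sketch of that idea; WORD_MAX, the STR/XSTR helper macros and read_word() are names invented for this illustration, and sscanf() stands in for scanf() only so the behaviour is easy to try on fixed strings.

```c
#include <stdio.h>

/* Hypothetical constant: the most characters one word may hold. */
#define WORD_MAX 4

/* Two-step stringification so WORD_MAX expands to "4" in the format. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Reads one whitespace-delimited word from `input` into `out`, storing
 * at most WORD_MAX characters plus the terminating '\0'.
 * Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *input, char out[WORD_MAX + 1]) {
    return sscanf(input, "%" XSTR(WORD_MAX) "s", out) == 1;
}
```

Passing "123456" leaves "1234" in the buffer and the remaining characters unread, so nothing is written past the end of the array.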

5 Comments

Yes, it is undefined in the spec, but the question is why this implementation is doing it this way? Understanding the implementation defined behavior can be as important as understanding spec defined behavior. Sometimes, you target a particular implementation and depend on it, and sometimes it is just for educational purposes, but either way, I'm not satisfied leaving it at "it is undefined in the C spec, sorry".
@AdamD.Ruppe: So we'd have as many answers (to this question) as we'd have implementations?
He or she is asking about a specific implementation: "The error that is generated looks like(ie. Im guessing) its the result of a signal sent by the kernel saying that we just clobbered the stack(local memory) somehow....Why is an error being generated for this memory clobbering but not for the previous one above?"
@AdamD.Ruppe: "is asking about a specific implementation" does s/he?
I agree with alk. Stack smashing detection is not part of the language, and the question has absolutely no information about the implementation being used by the OP. So the only reasonable answer is, "It's UB, don't do that."
0

Stack smashing here is actually reported by a protection mechanism the compiler uses to detect buffer overflow errors. The compiler adds protection variables (known as canaries) which have known values.

In your case, an input string of size greater than 5 corrupts this variable, which results in SIGABRT terminating the program.

You can read more about buffer overflow protection. But as @alk answered, you are invoking undefined behavior.

Comments

0

Actually, even if we declare an array of size 5, we can still store and access data beyond its end, because the memory past the array is unused. This keeps working as long as that memory stays free, but once it is assigned to another program we can no longer access the data there.

Comments
