I'm learning about shellcode execution in C and I've seen two different approaches. I understand the first one is for exploitation, but I'm confused about the type casting.

Approach 1: Stack Overflow Technique

int main() {
    char shellcode[] = "\x31\xc0\x50\x68..."; // shellcode bytes
    int *ret;
    ret = (int *)&ret + 4;  // Point to return address on stack
    (*ret) = (int)shellcode; // Overwrite return address
}

Approach 2: Function Pointer Technique

int main() {
    char shellcode[] = "\x31\xc0\x50\x68..."; // shellcode bytes
    int (*func_ptr)();
    func_ptr = (int (*)()) shellcode;  // Cast to function pointer
    func_ptr();
}

My questions:

In the first approach, why do we cast shellcode to (int) when overwriting the return address?

In the second approach, why do we need the complex cast (int (*)()) instead of just assigning the pointer directly?

What's the fundamental difference between these two methods in terms of how the shellcode gets executed?

I understand the first method is for exploitation (hijacking the return address), while the second is cleaner code. But I'm specifically confused about the type casting rationale.

  • This is strictly speaking all undefined behavior. Some systems do support object pointer to function pointer conversions though, as a non-standard extension. It depends on which compiler you are using and maybe on the ABI too. Commented Nov 3 at 14:56
  • ... and one fundamental difference between the two is that the second version actually calls the function embedded in shellcode while the first places the embedded function's address where the return address from main presumably lives, making the shell code execute when main is supposed to return (if I read it correctly). Commented Nov 3 at 15:02
  • You cannot assign a pointer directly if the type of the address is different from the type of the variable you want to assign it to. Therefore you need casts. Your code also relies on pointers and integers both being 4 bytes and on a specific stack layout. Commented Nov 3 at 15:10
  • 1
    @Lundin Most exploits depend on implementation-specific behavior (like datatype sizes and stack layout) and involve code with undefined behavior. Commented Nov 3 at 16:10

2 Answers

In the first approach, why do we cast shellcode to (int) when overwriting the return address?

I am not in the mind of the person who wrote that code, but standard C does require the cast. The identifier shellcode designates an object of array type. In that context, its value is automatically converted to a pointer to the first array element. C allows pointer types to be converted to integer types, but it does not specify any implicit conversions of that kind, so explicitly converting the pointer to type int is necessary for C conformance. C conformance seems an odd concern for code that intentionally contains (other) undefined behavior, so maybe the main point is to avoid the compiler rejecting or warning about the assignment.

In the second approach, why do we need the complex cast (int (*)()) instead of just assigning the pointer directly?

Similar reasons, most likely. C defines implicit conversions between some object pointer types, but most pointer conversions require explicit casts. The cast you ask about is to the type of the variable being assigned. It is complex(-ish) because the type in question is a function pointer type, and their type names are more complex than those of typical pointer-to-object types.

What's the fundamental difference between these two methods in terms of how the shellcode gets executed?

The first alternative intends to plug the address of the shellcode into the location where (the approach supposes) the containing function's epilog will look for a function return address. The idea is that when the function terminates normally, it will "return" to the shellcode instead of to its actual caller.

The second alternative intends to call the shellcode directly, as if it were a function.

I'm specifically confused about the type casting rationale.

In both cases, the casting seems to be only about matching the type of the value being assigned (the shellcode address) to the declared data type of the storage to which it is being assigned.

In the first case, whether that data type is fit for purpose and whether the cast does the wanted thing are functions of the particular platform and C implementation involved. The code has undefined behavior as far as C is concerned, as a result of an intentional bounds overrun. The language spec provides no reason to expect it to work for getting the shellcode executed. Among many other considerations, it probably wouldn't work in practice on platforms where function pointers are a different size from int (most x86_64 platforms, for instance), or on platforms where the implementation of pointer-to-integer casts is more complex than a mere reinterpretation of the bits, or on platforms where data addresses and code addresses have different size or representation.

In the second case, the data type is definitely fit for purpose, but whether the cast does the wanted thing or is even accepted at all is still a function of the particular platform and C implementation involved. This code, too, has UB as far as C is concerned, because C does not define behavior for converting object pointers to function pointers. The language spec provides no reason to expect it to work for getting the shellcode executed. Among many other considerations, it might not work on platforms where data addresses have different size or representation than code addresses, and of course, it will not work if the compiler rejects the cast, which is well within its rights to do.

4 Comments

Exploits are practically always targeted at a specific implementation and take advantage of how it behaves in particular undefined circumstances, so there's little point in pointing out that it's UB.
I know that exploits leverage UB, but the OP and other readers do not necessarily recognize that. In particular, the details of the question lead me to suppose that the OP probably didn't. I'm sure you appreciate that the whole concept of UB is difficult for many inexperienced C programmers.
I've never read any exploit tutorials, but I sure hope they make this extremely clear.
The questions we get about exploits have never given me the impression that the instructional materials on which askers have relied do any such thing. Or if they do, that they have successfully conveyed that message.
  1. First of all, I do not know why the author chose int as the return type.
  2. The cast tells the compiler what you intend and suppresses the warning. You do not need a separate function pointer variable; you can call the shellcode directly:
((void (*)(void))shellcode)();
