This is more out of curiosity than practical need, but I have been wondering whether it is possible to extract the C++ source from a binary so that it can be recompiled into a working clone of that binary.

I have tried the following:

  1. compiled the binary with "-g -Og" to include DWARF info,
  2. used objdump with "-S" and "--source-comment" to interleave the sources into the dump,
  3. grepped out all the commented source lines,
  4. removed the comment prefix, and
  5. formatted the result with clang-format.
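Steps 3 and 4 can be sketched in C++ instead of grep. This is only a minimal sketch: the function name `extractSourceLines` is made up for illustration, and it assumes the source lines carry the prefix "# ", which is what objdump uses when "--source-comment" is given without an argument.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: keeps only the lines of an objdump dump that start
// with the source-comment prefix and strips that prefix from them.
// Assumption: the prefix is "# " (objdump's default for --source-comment).
std::string extractSourceLines(std::istream& in, const std::string& prefix = "# ")
{
    std::string out;
    std::string line;
    while (std::getline(in, line)) {
        // compare() clamps the length, so lines shorter than the prefix are safe.
        if (line.compare(0, prefix.size(), prefix) == 0) {
            out += line.substr(prefix.size());
            out += '\n';
        }
    }
    return out;
}
```

Called with std::cin as the input stream, this could sit at the end of an objdump pipeline, with its output then passed on to clang-format.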

The output is pretty decent C++, but the order of the source lines is often confused, as is the placement of source lines that have no real effect (such as a function's closing "}"). Example:

 bool UsartHal1::isTransmitRegisterEmpty()
 {
     return USART1->SR & USART_SR_TXE;

     bool Usart1::write(uint8_t data)
     {
         if (UsartHal1::isTransmitRegisterEmpty())
         {
             USART1->DR = data;
             UsartHal1::write(data);
             return true;
         }
         else
         {
             return false;
         }
     }
     return USART1->SR & USART_SR_RXNE;
 }

 bool Usart1::read(uint8_t & data)
 {
     if (UsartHal1::isReceiveRegisterNotEmpty())
     {
         data = USART1->DR;
         UsartHal1::read(data);
         return true;
     }
     else
     {
         return false;
     }
 }
 return USART1->SR & USART_SR_RXNE;

I can of course imagine that what I am trying to do is simply not possible: not every source line has an effect that makes it into the binary, and the compiler has no real reason to guarantee that code placement follows the order of the lines in the sources.

Still I am wondering if there are perhaps some options/esoteric compiler-flags that will make this possible? After all, coverage analysis tools face the same problems.

  • objdump -S uses the original source files; the source is not read from the binary. Commented Jul 26, 2022 at 10:25
  • "Still I am wondering if there are perhaps some options/esoteric compiler-flags that will make this possible? After all, coverage analysis tools face the same problems." You could tell your linker / objcopy to simply include the original source code in the object file, as a blob. Commented Jul 26, 2022 at 19:37

1 Answer

Debug symbols contain a lot of information that allows you to map things in the binary back to the source code (assuming you have access to both), especially in unoptimized builds. But recreating the original source exactly from the compiled binary alone is simply not possible.
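As a rough illustration of why a dump ordered by address scrambles the source, here is a toy model of a line table. The `LineEntry` struct, the function name and any addresses fed into it are made up for this sketch and are not taken from a real binary or from the actual DWARF format. The point is only that the mapping goes from instruction addresses to source lines, so a line can show up several times, out of order, or not at all.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy stand-in for one row of a debug line table (hypothetical layout).
struct LineEntry {
    std::uint32_t address; // code address of an instruction
    int line;              // source line that instruction was generated from
};

// Returns the source line numbers in the order an address-ordered
// disassembly would visit them.
std::vector<int> linesInAddressOrder(std::vector<LineEntry> table)
{
    std::stable_sort(table.begin(), table.end(),
                     [](const LineEntry& a, const LineEntry& b) {
                         return a.address < b.address;
                     });
    std::vector<int> lines;
    for (const LineEntry& entry : table) {
        lines.push_back(entry.line);
    }
    return lines;
}
```

If a function body from line 3 is inlined into code from lines 10 and 11, the result can be a sequence such as 10, 3, 11, 3. Lines that produce no instructions, such as a closing "}", have no entry in the table to begin with.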
