Problem Statement

For several days, I have been trying to generate an Abstract Syntax Tree (AST) for part of the Linux kernel filesystem code, and I cannot get it to work. I am able to build the kernel, but when I try to generate the AST I am plagued by hundreds of errors and warnings complaining about redefined macros, missing typedefs, double underscores, missing semicolons, etc. All syntax-level complaints.

What I've Tried

  • I've tried cloning the linux repo at https://github.com/torvalds/linux as well as downloading the source code directly from https://www.kernel.org/.
  • I've tried using clang, gcc, astgen, pyparser, joern-frontend, sparse - the most promising one was clang.
  • I've tried running clang -Xclang -ast-dump -fsyntax-only linux/fs/fuse/xattr.c (for example) on the raw code as well as after configuring and running make, same errors.
  • I've tried this on my Mac as well as a Linux VM running Ubuntu 16.04.
  • I've tried this on the rpi version of Linux (https://github.com/raspberrypi).
  • I've tried adding include paths for the local linux/include directory, the linux-headers-$(uname -r)/include dir and the linux-headers-$(uname -r)-generic dir, separately and together.

Here are some reproducible steps to showcase what I mean:

  • Open up my Linux VM running Ubuntu with >100GB in storage
  • Run su to log in as root
  • Run sudo apt-get install vim lld clang
  • Run sudo apt-get install vim libncurses-dev flex bison clang-12 lld-12 lldb-12 libssl-dev libelf-dev clang-format clang-tidy clang-tools clangd libc++-dev libc++1 libc++abi-dev libc++abi1 libclang-dev libclang1 liblldb-dev libllvm-ocaml-dev libomp-dev libomp5 lld lldb llvm-dev llvm-runtime llvm python3-clang
  • Run apt remove clang lld, then ln -s /usr/bin/clang-12 /usr/bin/clang and ln -s /usr/bin/lld-12 /usr/bin/ld.lld (I got some errors about clang-10 being out of date, so I linked -12)
  • Download the latest stable kernel source from https://www.kernel.org and unzip it (for me, it's 6.8.3)
  • Run make allnoconfig && make -j16 LLVM=1 inside the linux directory (I tried make menuconfig, but the install took over a day, so I stopped it).

This builds without errors. I did not run make install, as I don't actually want to boot the kernel.

If I then try to run clang -I <my-path>/linux/include -I <my-path>/linux/include/uapi -I /usr/src/linux-headers-$(uname -r)/include -I /usr/src/linux-headers-$(uname -r)-generic/include ... < a lot more includes w/ arch/x86, generated/uapi, etc > ... -Xclang -ast-dump fs/fuse/xattr.c, I get hundreds of type and syntax errors.

I've read tons of SO posts and clang documentation, but none of it helped. I saw the question "Clang and Linux kernel", where the OP seems to resolve the issue, but I could not replicate that on my machine (it's a bit dated). I also tried including kconfig.h at the top of the file, no dice. I'm feeling pretty stuck.

Can anyone offer me any pointers (pun intended) on what I'm missing? I don't know what else to try. Also, this is outside my area of expertise, so if I missed a step or did something stupid, please let me know.

  • Linux builds best with gcc, and gcc has debug options to dump an AST. AST is a rather generic concept, i.e., there is no standard, so I assume you are willing to custom code whatever you want. You can also make plug-ins that will dump the AST in whatever format you want. See: stackoverflow.com/questions/15800230/… (How is this not the same?). You will need to use a Kconfig option or edit some Makefile infrastructure. You can also wrap the compiler/tools in a script/symlinks. Commented Apr 4, 2024 at 14:04
  • @artlessnoise, not better than clang nowadays. Actually, I even have a case where clang produces bootable code while gcc does not (okay, it's very likely not a compiler bug :-)). Commented Apr 6, 2024 at 21:21

1 Answer

For future people:

The answer is that I forgot the letter K: the kernel build takes extra compiler flags through KCFLAGS, not CFLAGS. Here is what ended up working:

  1. Download the Linux source from scratch.
  2. Run make allnoconfig (you can choose the best config for you from make help).
  3. Run make KCFLAGS="-fdump-tree-all-graph" in the top-level directory of the source tree. This creates .dot files that can be viewed later in external tools like GraphViz. DON'T FORGET THE K! Other dump flags are listed at https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html.
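
For anyone who wants to script this, here is a minimal sketch of steps 2 and 3, assuming bash and that make picks up GCC as the compiler. The names run, build_with_dumps and DRY_RUN are hypothetical helpers made up for this example; they are not part of the kernel build system.

```shell
#!/usr/bin/env bash
# Sketch only: run, build_with_dumps and DRY_RUN are hypothetical names,
# not part of the kernel build system.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure and build the tree at $1 with GCC dump flags, using $2 jobs.
build_with_dumps() {
    local src_dir="$1"
    local jobs="${2:-4}"
    # Minimal config, as in step 2; pick another target from `make help` if needed.
    run make -C "$src_dir" allnoconfig || return 1
    # KCFLAGS (with the K) adds options on top of the kernel's own compiler flags.
    run make -C "$src_dir" -j"$jobs" KCFLAGS="-fdump-tree-all-graph"
}
```

Setting DRY_RUN=1 before calling build_with_dumps <path-to-linux> 8 prints the two make commands without running them, which is a cheap way to check the flags before starting a long build.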

To make sure the .dot files were written correctly (there can be hundreds of .dot files per original source file, from different stages of compilation), you can try converting them to PDF by running dot -Tpdf <dot_file_name>.dot -o <output_file_name>.pdf. Then you can open the PDF and look at the result. You can also convert to PNG, JSON, etc.; see https://graphviz.org/docs/outputs/ for more options. No one .dot file is "better" to use than the others, but you can pick lower-numbered ones (earlier in the compilation process), the smallest file, the biggest file, etc.
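
Since there can be a lot of these files, converting them in bulk may be easier than doing it one by one. This is a rough sketch, assuming bash and that GraphViz's dot is on the PATH; pdf_name and convert_dumps are hypothetical helper names, and the directory you pass in is whatever part of the tree you care about.

```shell
#!/usr/bin/env bash
# Sketch only: pdf_name and convert_dumps are hypothetical helper names.

# Map a .dot path to a .pdf path next to it.
pdf_name() {
    printf '%s\n' "${1%.dot}.pdf"
}

# Convert every .dot file under directory $1 to a PDF alongside it.
convert_dumps() {
    local dir="$1"
    find "$dir" -type f -name '*.dot' | while IFS= read -r f; do
        # Same command as above: dot -Tpdf <in>.dot -o <out>.pdf
        dot -Tpdf "$f" -o "$(pdf_name "$f")"
    done
}
```

For example, convert_dumps fs/fuse would only touch the dumps under that directory. Running it on the whole tree can take a while and produce a very large number of PDFs, so starting with one subdirectory is probably the safer choice.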
