I have a book in markdown format. I want to split it into separate files at the chapter headings. How can I do this?
So far I've used pandoc to convert a docx file into a markdown file. I've not tried anything else. I usually use PHP, but I would imagine that using regex to match the chapter headings isn't the most reliable approach. – weaveoftheride, Nov 24, 2015 at 10:22
4 Answers
"I have a book in markdown format. I want to split it into separate files at the chapter headings. How can I do this?"
If you are using Pandoc, you can convert your Markdown file to EPUB, unzip the EPUB file and convert the HTML files inside it back into Markdown. This works because pandoc by default starts a new file in the EPUB at each level-1 heading. It's not a perfect solution, but you can accomplish it with a few lines of Bash like
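# Build an EPUB; pandoc writes one (X)HTML file per chapter inside it
pandoc -f markdown -t epub -o my-book.epub my-book.md
unzip my-book.epub
# Convert each chapter file back to Markdown
for chapter in *.html
do
  # Strip only the file extension, and quote names in case they contain spaces
  pandoc -f html -t markdown -o "${chapter%.*}.md" "$chapter"
done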
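You need to adjust the path in the loop to wherever unzip puts the chapter files. Recent pandoc versions write them as .xhtml files under EPUB/text/ inside the archive.
If you want to program something and you have some experience, it shouldn't be hard to write a Python/... script to split the file.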
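For what it's worth, here is a minimal sketch of such a script. It assumes chapters are marked by level-1 ATX headings (`# Title`) and it does not look inside code blocks, so a line starting with `# ` in a code sample would also be treated as a chapter. The names `split_chapters` and `write_chapters` and the `000-slug.md` file naming are made up for this example.

```python
import re
from pathlib import Path

# Assumption: a chapter starts at a level-1 ATX heading such as "# Title".
CHAPTER = re.compile(r"^# +(.*\S)\s*$")


def split_chapters(text):
    """Return (title, body) pairs; text before the first heading is 'front-matter'."""
    chapters = []
    title, lines = "front-matter", []
    for line in text.splitlines(keepends=True):
        match = CHAPTER.match(line)
        if match:
            # A new chapter begins: store what was collected so far.
            if lines:
                chapters.append((title, "".join(lines)))
            title, lines = match.group(1), [line]
        else:
            lines.append(line)
    if lines:
        chapters.append((title, "".join(lines)))
    return chapters


def write_chapters(source, out_dir):
    """Write each chapter to out_dir as 000-slug.md, 001-slug.md, and so on."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    text = Path(source).read_text(encoding="utf-8")
    paths = []
    for index, (title, body) in enumerate(split_chapters(text)):
        # Turn the heading text into a file-name-safe slug.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"
        path = out / f"{index:03d}-{slug}.md"
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_chapters("my-book.md", "chapters")` would then create one file per chapter in a `chapters` directory. Change the pattern to `^## +` if your chapters are level-2 headings.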
I stumbled upon a straightforward solution. Credit goes to Christian Tietze and mediapathic!
`gcsplit --prefix='novelname' --suffix-format='%03d.md' novel-file.md /##/ "{*}"`
https://christiantietze.de/posts/2019/12/markdown-split-by-chapter/
I needed exactly that functionality and was not content with the solutions in the other answers, mostly because they treat heading markers inside code blocks as real headings, which led to problems with my documents.
So I went ahead and wrote a small Python tool named mdsplit to do the job.
Install it via pip (pip install mdsplit) and then run this to split, for example, at level 2 headings:
mdsplit input.md --max-level 2
Only later did I find out that there is already a C++-based tool, also named mdsplit, that does about the same:
mdsplit -i input.md -l 2
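To illustrate the code block problem, here is a rough sketch of the idea of skipping fenced code while looking for headings. This is not the actual implementation of either mdsplit tool. The function name `split_outside_fences` and its `max_level` parameter are invented for the example, and it only handles fenced blocks (backticks or tildes), not indented code blocks.

```python
import re

# Opening or closing fence: up to 3 spaces, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def split_outside_fences(text, max_level=2):
    """Split before each ATX heading of level <= max_level, ignoring fenced code."""
    heading = re.compile(r"^#{1,%d} +\S" % max_level)
    sections, current = [], []
    open_fence = None  # marker that opened the code block we are inside, if any
    for line in text.splitlines(keepends=True):
        fence = FENCE.match(line)
        if open_fence is None:
            if fence:
                open_fence = fence.group(1)
            elif heading.match(line) and current:
                # A real heading outside code: close the previous section.
                sections.append("".join(current))
                current = []
        elif (fence
              and fence.group(1)[0] == open_fence[0]
              and len(fence.group(1)) >= len(open_fence)
              and not fence.group(2).strip()):
            # Same fence character, at least as long, nothing after it: block ends.
            open_fence = None
        current.append(line)
    if current:
        sections.append("".join(current))
    return sections
```

A line such as `## not a heading` inside a fenced block stays in the section it belongs to, instead of starting a new file.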
I also needed this functionality, but in an environment without Python or any additional downloadable tools.
I ended up with the following solution using awk only.
awk '/^## /{++count} count>1{exit} {print}' myfile.md
It stops printing as soon as it reaches the second level-2 header (##), so the output is everything up to the end of the first ## section.