Writing a Multilingual Book in Markdown

submitted by
Style Pass
2022-06-22 08:30:07

Since last summer I've been working on a book about Japanese Natural Language Processing. I'm writing it in Markdown, and publishing chapters simultaneously in Japanese and English. I was surprised not to find many tools that support this kind of workflow, so I ended up writing a preprocessing script to help me publish the book.

To give a little background, the book is about NLP, the field of machine learning concerned with processing language, in this case text documents. As such, the book contains significant amounts of code. My co-author writes his chapters mainly in Jupyter notebooks. I considered doing the same, but many of my chapters are more prose than code, so I have stuck with my usual system for writing on the computer - Markdown in vim.

Our needs are already a little unusual - we have to write a document with significant amounts of code, in two languages, where the code will be shared between the Japanese and English versions, and, oh, we need to turn it into a book too. We also want the texts to be parallel, with roughly paragraph-to-paragraph correspondence, so if possible we want to see both languages at the same time. That's a pretty specialized need, and I didn't immediately turn up any tools designed for this kind of usage.
