Sunday, December 06, 2020

Open Standards mean independence, freedom

   Source: Wikimedia.

Open Standards, or Open Specifications as I personally prefer, remove another hurdle in locked-in science. They allow others to understand your language, the nuances of your message. They mean independence, freedom. One important international standard is the Open Document Format (ODF). It is supported by all major editors (yes, MS Word too). Still, scholars have been quite persistent in insisting on closed source and semi-closed solutions like .docx files. Often this is because of track changes. We have plenty of alternatives for that too, but old habits die hard.

Anyway, I am trying to use open formats as much as possible (LaTeX, Markdown), but I still have collaborators whose computers do not have open solutions (correlated with some vendor lock-in), so I also end up sending around Word files, or using Google Docs, for tracking changes. If you search my GitHub repositories, you will undoubtedly find the LaTeX sources of journal articles, with the changes tracked with git.

So, yesterday I was wondering if I couldn't mix the two worlds. I've done this before. Markdown converts to quite reasonable .docx files. It's just that applying the changes to the original takes a bit more effort. Not that much, really, as I have to check them one by one anyway.

But what if I could automate this? The Word files are semi-closed, but that also means they are semi-open. A Word file, in fact, is just a ZIP archive. This also works great for extracting the images from Word files, did you know that? Well, now you do.
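To make the ZIP point concrete, here is a minimal Python sketch of the image trick. It assumes the usual layout in which Word keeps embedded images under word/media/ inside the archive; the function name extract_docx_images and the file names in the usage note are my own, purely for illustration.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir):
    """Copy every file under word/media/ in a .docx into out_dir.

    Assumes the usual layout where embedded images live in word/media/.
    Returns the list of written paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # A .docx is an ordinary ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Skip directory entries; keep only files in the media folder.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out_dir / Path(name).name
                target.write_bytes(archive.read(name))
                extracted.append(target)
    return extracted
```

Something like extract_docx_images("manuscript.docx", "images") would then drop the pictures into an images folder. The text itself sits in word/document.xml in the same archive, which is presumably where any track changes would have to be dug out.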

I asked on Twitter and got replies in seconds. I have yet to explore them, but thanks to Simon and Chris, I now have these two leads on whether I can convert a Word file into a Markdown/Git patch:

I thought I'd just drop them here. Think of it as open notebook science (at the very least, for my future me).

