Parsing Rich Text Files

Thank you for the amazing application that has changed my learning forever.

I am trying to manipulate rich text files with some of the methods exposed by the Python module anki.storage.

I can successfully wire up my collection and create new notes programmatically, but I'm hitting a roadblock in how Anki sees the RTF. For example, here's a definition as viewed in my rich text editor.


[Figure 1]

If I manually copy the text with formatting from my text editor and paste it into an Anki note, it is rendered correctly:


[Figure 2]

But when I put the same clipboard data into a Python string and programmatically feed it into the note, this is what the note looks like:


[Figure 3]

I've read through the Anki source code looking for how it parses formatting, but I cannot find the method that takes the RTF from my clipboard (Figure 1), strips the RTF formatting codes, and replaces them with the HTML tags shown in Figure 2 when I paste manually in the editor.

Is there a library or method that I can call that parses the RTF string as shown in Figure 3 and produces the HTML in Figure 2? Or in other words, how does Anki strip RTF codes and make the HTML representation?

Thank you for reading and considering my question.

I don't think Anki handles RTF directly. When you paste, it simply reads the preprocessed HTML that is already stored in the clipboard. If you really need to handle RTF in an add-on, look into a parsing library such as rtfparse.
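
To give a feel for what such a conversion involves, here is a rough sketch in plain Python. It is not Anki's code and not rtfparse's API: the function name rtf_to_html and the SKIPPED set are made up for illustration. It only understands bold, italic, paragraph breaks, and \'hh byte escapes (assumed to be Windows-1252), and it ignores fonts, colors, and \u Unicode escapes. For real documents a proper parser is the safer choice.

```python
import re
from html import escape

# One token is a hex escape, a control word, a control symbol, a brace, or text.
TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})"      # 1: hex-escaped byte, e.g. \'e9
    r"|\\([a-z]+)(-?\d+)? ?"    # 2, 3: control word with optional number
    r"|\\(.)"                   # 4: control symbol, e.g. \{ or \*
    r"|([{}])"                  # 5: group open or close
    r"|([^\\{}]+)",             # 6: plain text
    re.DOTALL,
)

# Groups whose contents are metadata, not visible text (an incomplete list).
SKIPPED = {"fonttbl", "colortbl", "stylesheet", "info"}


def rtf_to_html(rtf: str) -> str:
    """Hypothetical helper: convert a very small subset of RTF to HTML."""
    out = []
    state = {"b": False, "i": False, "skip": False}
    stack = []  # formatting saved at each "{", restored at the matching "}"
    for match in TOKEN.finditer(rtf):
        hex_byte, word, num, symbol, brace, text = match.groups()
        if brace == "{":
            stack.append(dict(state))
        elif brace == "}":
            if stack:
                state = stack.pop()
        elif word is not None:
            if word in SKIPPED:
                state["skip"] = True
            elif word in ("b", "i"):
                state[word] = num != "0"  # \b turns bold on, \b0 turns it off
            elif word == "par" and not state["skip"]:
                out.append("<br>")
        elif symbol == "*":
            state["skip"] = True  # \* marks an optional destination group
        else:
            if hex_byte is not None:
                text = bytes([int(hex_byte, 16)]).decode("cp1252", "replace")
            elif symbol is not None:
                text = symbol if symbol in "\\{}" else ""
            else:
                # Raw line breaks in the RTF source carry no meaning.
                text = text.replace("\r", "").replace("\n", "")
            if text and not state["skip"]:
                html = escape(text)
                if state["i"]:
                    html = f"<i>{html}</i>"
                if state["b"]:
                    html = f"<b>{html}</b>"
                out.append(html)
    return "".join(out)


sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0 {\b serendipity}: a {\i happy} accident\par}"
print(rtf_to_html(sample))
# <b>serendipity</b>: a <i>happy</i> accident<br>
```

The resulting string can then be assigned to a note field the same way as any other HTML text.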


Thank you abdo for helping me understand this. The parser you led me to has helped a ton!
