Some Japanese Characters are Saved Incorrectly

When I use certain characters in fields of notes, they are not saved correctly. For instance, when entering 社, it gets saved as 社 instead. This happens when adding new notes, as well as when editing fields of existing notes. It also happens when importing notes from a tsv file. When importing notes from a tsv file containing two notes with their sort fields being 社 and 社, respectively, Anki complains about duplicate notes on import.

When editing a note field in the card browser by adding 社, it shows up correctly in the text box while simultaneously displaying 社 in the card list above the field editor. When switching away from the card and then back to the card, it now also displays 社 in the text box.

For all intents and purposes, it seems like Anki treats 社 as if 社 was entered instead. Maybe there is an encoding problem? As far as I can tell, this problem affects none of the jōyō kanji and 57 of the jinmeiyō kanji.

I have tried this on Anki 2.1.49 on Arch Linux (from the AUR), Anki 2.1.50 on Arch Linux (anki-2.1.50-linux-qt6 from the GitHub release), and Anki 2.1.49 on Windows (on a friend’s computer, an entirely different machine and configuration). All three show the same behavior.

This is probably relevant: Adding/Editing - Anki Manual


Thank you for your reply. This did indeed solve my issue.

I was quite surprised to learn that this feature exists. I could not find any other application on my Linux desktop that applies Unicode normalization. Because of this, I am used to characters being processed verbatim in plain-text contexts, so seeing this behavior in Anki was very unexpected.
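For anyone who wants to check whether a particular character is affected, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the normalization form involved is NFC, and it uses U+FA4C (a CJK compatibility ideograph) and U+793E (the unified ideograph 社) as an illustrative pair. The helper name `changed_by_nfc` is my own and is not part of Anki.

```python
import unicodedata


def changed_by_nfc(text: str) -> list[tuple[str, str]]:
    """List (original, normalized) code points for characters that NFC alters.

    Each character is normalized in isolation, which is enough to spot
    compatibility ideographs that map to a single unified ideograph.
    """
    changed = []
    for ch in text:
        normalized = unicodedata.normalize("NFC", ch)
        if normalized != ch:
            original_cp = f"U+{ord(ch):04X}"
            normalized_cp = "+".join(f"U+{ord(c):04X}" for c in normalized)
            changed.append((original_cp, normalized_cp))
    return changed


# Illustrative pair (an assumption, written as escapes so the two stay distinct)
compat = "\uFA4C"   # CJK compatibility ideograph
unified = "\u793E"  # unified ideograph

# The two strings differ before normalization but compare equal afterwards,
# which would explain a duplicate warning when both appear as sort fields.
print(compat == unified)
print(unicodedata.normalize("NFC", compat) == unified)
print(changed_by_nfc(compat + unified))
```

Running the helper over a list of kanji shows which ones a normalizing application would rewrite, without having to compare the glyphs by eye.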

I was even more surprised to see this feature enabled by default in Anki, given that it is often used to learn non-English characters and words, where such differences might be important.