3 years later, AnkiDroid still doesn't support Anki 2.1.20 TTS tag

I’m very invested in making sure audio and text-to-speech work correctly across the whole Anki ecosystem. Anki 2.1.20 came out in February 2020 and introduced the new TTS tag, which was immediately supported on AnkiMobile.

I’m completely aware that AnkiDroid is developed by volunteers, so I am not entitled to anything whatsoever, but this lack of standardization is causing some serious headaches for users of add-ons such as AwesomeTTS and HyperTTS.

Recently I had to help a blind user set up their Anki with audio for language learning. Attempting to set up realtime TTS across 3 platforms (Anki desktop, AnkiDroid, AnkiWeb) proved to be almost impossible, due to the interaction with screen readers.

This issue is causing me to rethink how HyperTTS realtime TTS tag integration should work. Before the built-in support in Anki, AwesomeTTS rolled its own TTS. I’m thinking of doing the same thing with HyperTTS, i.e. introducing a solution which would only work on Anki desktop. This will cause some further fragmentation in the ecosystem.

Here are the problems with the current Anki TTS setup:

  1. Biggest issue: AnkiDroid doesn’t support it.
  2. The TTS tag is inserted in the card template. This causes an issue with HyperTTS: the TTS request during review is “routed” back to HyperTTS using the voice attribute. However, the voice attribute doesn’t contain enough information, so HyperTTS has to store some metadata separately to remember how the audio should be generated (which voice(s), settings, speed, etc.). Here’s the problem: if the user moves card templates around, this stops working, or works in a weird way.
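
To make the second point concrete, here is a rough Python sketch of what a TTS tag in a card template carries. The `{{tts lang voices=... speed=...:Field}}` syntax is the one documented in the Anki manual, but everything else is an assumption for illustration only: the regex is a simplification, and `PRESETS` with its (note type, template name) key is a made-up stand-in for side metadata, not how HyperTTS actually stores it.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Simplified pattern for {{tts LANG key=value ...:FieldName}} (an approximation,
# not Anki's real template parser).
TTS_TAG = re.compile(
    r"\{\{tts\s+(?P<lang>[^\s:}]+)(?P<opts>[^:}]*):(?P<field>[^}]+)\}\}"
)


@dataclass
class TtsTag:
    lang: str
    field_name: str
    options: Dict[str, str] = field(default_factory=dict)


def parse_tts_tags(template: str) -> List[TtsTag]:
    """Return every TTS tag found in a card template string."""
    tags = []
    for match in TTS_TAG.finditer(template):
        # Options are space-separated key=value pairs such as voices=... or speed=...
        options = dict(
            opt.split("=", 1) for opt in match.group("opts").split() if "=" in opt
        )
        tags.append(TtsTag(match.group("lang"), match.group("field").strip(), options))
    return tags


# Hypothetical side store: anything the tag cannot express has to live somewhere
# else, keyed by something tied to the template (an invented scheme).
PRESETS = {
    ("Basic", "Card 1"): {"service": "ExampleService", "voice": "example-voice"},
}


def resolve_preset(note_type: str, template_name: str) -> Optional[dict]:
    """Look up the extra settings; returns None once the key no longer matches."""
    return PRESETS.get((note_type, template_name))
```

Parsing `{{tts ja_JP voices=HyperTTS speed=0.8:Expression}}` with this sketch yields only a language, a voice name, a speed and a field name. Under the invented keying above, `resolve_preset("Basic", "Card 1")` finds the stored settings, while the same tag copied into a template called "Card 2" resolves to nothing, which is roughly the kind of breakage described in point 2.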

I’d love AnkiDroid to support the Anki 2.1.20 TTS tag, and I’m ready to put some money behind this, if anyone would be interested in taking up development.

Separately, I wanted to see whether anyone is interested in brainstorming how Anki TTS support could be more user friendly. These issues are causing me to daydream about re-implementing a whole solution from scratch, but from experience, if we start to fragment, the ecosystem will not be better off.

AnkiDroid has been struggling to get a new release out for some time, and they still have their hands full with that. A new release is currently in beta, and hopefully not far off at this point. Once the dust settles from the new release, I gather their plan with 2.17 is to fully switch to Anki’s Rust backend, and I imagine this will get tackled as part of that.

The new TTS parsing code is actually already available in AnkiDroid when the new backend is enabled, but it’s not currently hooked up - so the tags are identified, but not fed to the TTS playback engine. If you wish to speed along that process, you could either dive into the code yourself, or perhaps offer a bounty to one of the AnkiDroid devs who knows the code well.


The Rust backend integration sounds like a good idea if the goal is to unify the platforms. I’ve contacted one of their developers to offer a bounty.
If I wanted to learn more about the Rust backend, where would be a good place to start reading?

Aside from https://github.com/ankitects/anki/blob/main/docs/architecture.md and https://github.com/ankidroid/Anki-Android-Backend/blob/main/README.md, it’ll require exploring the source code, I’m afraid.
