This post is based on a Reddit post I wrote recently on the same topic. It is intended both to share my experience of language learning with Anki and TTS, and to highlight a couple of issues.
My requirements : I am learning Arabic, currently at an intermediate level and working toward advanced. That means a bit less than 10,000 notes to practice (words and sentences, with most cards designed to make me produce the foreign language).
I want to practice hearing the foreign language. I practice my Anki cards any time I get a chance, on any platform : Win11, AnkiWeb, AnkiDroid, AnkiMobile.
Complication : Add-ons that produce sound files (like Awesome TTS) are not an adequate solution, because of the number and size of media files to generate, and because regenerating them every time I add or modify a card is impractical.
Idea : Since native TTS voices are becoming quite good on all those platforms, can I teach Anki to read cards on-the-fly exactly as I want?
Solution 1 and its challenge : the “new” native Anki tts {{tts}}
That looks like the best (easiest) solution. Unfortunately, I can only make it work on one platform at a time (iOS or Win11), and I could not make it work at all on Android (tested on a couple of Samsung phones and tablets).
The issue seems to be language-specific (so you may have more luck than me with languages other than Arabic). The language code for Arabic varies : on Windows it’s ar_SA, while on iOS it’s ar-001.
It seems there is no way to give Anki more than one language in the tts tag, so that it can fall back on a second or third choice when the first one doesn’t work, like :
{{tts ar_SA,ar-001:Front}}
or {{tts lang:ar-001,ar_SA:Front}}
As for Android, I have no clue why it doesn’t work.
Solution 2 : tinkering with the Web Speech API (i.e. JavaScript)
I managed to build a script that works on Anki Win11, AnkiWeb on Win11, AnkiWeb on iOS, and AnkiMobile (iOS). No luck with Android (both AnkiWeb and AnkiDroid), even though I have tried several TTS engines (Samsung, Google, and a purchased one : Acapela).
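One idea for the language-code mismatch described above, offered as a minimal sketch rather than a tested fix: normalise the tags before comparing them, and try an ordered list of preferred codes before falling back to any voice in the same language. The names `normaliseLang` and `pickVoice` are hypothetical (they are not part of the Web Speech API), and it is not confirmed that tag formatting is what goes wrong on Android.

```javascript
// Hypothetical helper: make "ar_SA" and "ar-SA" compare equal.
function normaliseLang(tag) {
  return String(tag || "").replace(/_/g, "-").toLowerCase();
}

// Hypothetical helper: return the first voice matching an ordered list of
// preferred language tags, then any voice sharing the primary language
// (for example "ar"), or null when nothing matches.
function pickVoice(voices, preferredLangs) {
  const wanted = preferredLangs.map(normaliseLang);
  for (const lang of wanted) {
    const exact = voices.find(v => normaliseLang(v.lang) === lang);
    if (exact) return exact;
  }
  const primaries = wanted.map(lang => lang.split("-")[0]);
  const loose = voices.find(v =>
    primaries.includes(normaliseLang(v.lang).split("-")[0]));
  return loose || null;
}
```

In a card script, the result of `pickVoice(window.speechSynthesis.getVoices(), ["ar-SA", "ar-001"])` could be assigned to `utterance.voice` when it is not null. The voices only need `name` and `lang` properties, so the function can also be tried with plain objects outside Anki.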
Your thoughts / ideas ?
For those who are interested, the details below contain an excerpt of the back of my main card template. The template shows 2 fields on the back : the word in ArabicMSA, which is spoken automatically (and repeated if I click on the word), and the sentences in Example, which are spoken only if I click on them.
{{FrontSide}}
<div style='padding-right:5%;padding-left:5%; background-color:lightgreen;color:black;' onclick='speakWordA(); ' >
  <hr >
  <span style="font-weight: bold; direction: rtl; ">{{ArabicMSA}}
  </span>
  <div style="font-size: xx-small; font-weight: normal; direction: ltr;">
    Audio:
    <span id="TTSmethod"> FILL-IN WITH SCRIPT </span>
    <span id="wordA" style="display: none;">
      {{ArabicMSA}}
    </span>
    <hr>
  </div>
</div>
<div style="padding-right:5%;padding-left:5%;font-size: small; font-weight: normal; direction: ltr;background-color:lightgreen;color:black;" onclick="speakExmple();" >
  <hr>
  <div id='exmple' style="text-align: justify ; font-size:large; font-weight: normal; direction: rtl">
    {{Example}}
  </div>
  <hr>
</div>
<script type="text/javascript">
  // The TTS label may be replaced by something platform-specific at some point.
  document.getElementById('TTSmethod').textContent = "TTS";
  var w = document.getElementById("wordA");
  var w3 = document.getElementById("exmple");
  // Speak the word automatically shortly after the back of the card is shown.
  // A function is passed instead of a string so nothing has to be evaluated.
  window.setTimeout(() => speakAR(w.innerText).catch(console.error), 500);

  // Speak the given text in Arabic; the returned promise resolves when speech ends.
  function speakAR(word) {
    return new Promise((resolve, reject) => {
      // Check if speech synthesis is supported
      if (!('speechSynthesis' in window)) {
        console.error("Speech synthesis not supported");
        reject("Speech synthesis not supported");
        return;
      }
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = word;
      utterance.volume = 0.8;
      utterance.rate = 1;
      utterance.pitch = 1;
      utterance.lang = "ar-SA";
      // Set up event handlers for the utterance
      utterance.onend = () => resolve();
      utterance.onerror = (event) => reject(`Speech synthesis error: ${event.error}`);

      // Find the best Arabic voice
      const findArabicVoice = () => {
        const voices = window.speechSynthesis.getVoices();
        // Try to find the Laila voice first
        let voice = voices.find(v => v.name === 'Laila');
        // If Laila isn't available, look for an exact ar-SA voice
        if (!voice) {
          voice = voices.find(v => v.lang === 'ar-SA');
        }
        // If no exact match, try any voice whose language starts with 'ar'
        if (!voice) {
          voice = voices.find(v => v.lang.startsWith('ar'));
        }
        return voice;
      };

      // Start speaking with the best available voice
      const startSpeaking = () => {
        const voice = findArabicVoice();
        if (voice) {
          utterance.voice = voice;
        }
        // Cancel any ongoing speech
        window.speechSynthesis.cancel();
        // Start speaking
        window.speechSynthesis.speak(utterance);
      };

      // Get voices and handle browser differences
      const voices = window.speechSynthesis.getVoices();
      if (voices.length > 0) {
        // Voices already loaded (Safari and some other browsers)
        startSpeaking();
      } else if (typeof speechSynthesis.onvoiceschanged !== 'undefined') {
        // Wait for voices to load (Chrome and some other browsers)
        speechSynthesis.onvoiceschanged = () => {
          // Only execute once
          speechSynthesis.onvoiceschanged = null;
          startSpeaking();
        };
      } else {
        // For browsers that don't support onvoiceschanged (like Safari)
        // Try with a delay as a fallback
        setTimeout(startSpeaking, 100);
      }
    });
  }

  // Click handlers; errors (including interrupted speech) are only logged.
  function speakWordA() {
    speakAR(w.innerText).catch(console.error);
  }

  function speakExmple() {
    speakAR(w3.innerText).catch(console.error);
  }
</script>
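To narrow down the Android failure, it may help to show on the card which voices the platform actually reports. The following is a minimal sketch, assuming the `TTSmethod` span from the template above; `describeVoices` is a hypothetical helper name, and the output format is just one possible choice.

```javascript
// Hypothetical helper: summarise the voices whose language tag starts
// with the given prefix (for example "ar").
function describeVoices(voices, langPrefix) {
  const prefix = langPrefix.toLowerCase();
  const matches = voices.filter(v =>
    String(v.lang || "").toLowerCase().startsWith(prefix));
  if (voices.length === 0) return "TTS: no voices reported";
  if (matches.length === 0) {
    return "TTS: " + voices.length + " voices, none for '" + langPrefix + "'";
  }
  return "TTS: " + matches.map(v => v.name + " (" + v.lang + ")").join(", ");
}

// Only touch the page when running inside a card (browser or Anki webview).
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  const label = document.getElementById("TTSmethod");
  const update = () => {
    if (label) {
      label.textContent = describeVoices(window.speechSynthesis.getVoices(), "ar");
    }
  };
  update();
  // Some browsers fill the voice list asynchronously, so refresh when it changes.
  if (typeof window.speechSynthesis.addEventListener === "function") {
    window.speechSynthesis.addEventListener("voiceschanged", update);
  }
}
```

If the label reads "no voices reported" on a device, the webview is probably not exposing any voices to the page, which would point away from the language code as the cause there.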