I’d like to introduce Anki Lang Deck Builder (ALDB), a project I developed entirely in Python. The program integrates edge-tts (for very high-quality neural voices that stay true to native accents) and genanki (for automatic deck generation) to create ready-to-use Anki decks for language learning.
It’s a 100% free and open-source tool, designed to save time for learners by automating the creation of .apkg decks with authentic audio. The project also provides a complete list of supported languages and voices, so you can easily explore the available options.
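For anyone curious how the two libraries fit together, here is a minimal sketch. It is not ALDB's actual code: the function names, the model and deck IDs, the note layout, and the file-naming scheme are all assumptions made for illustration. It uses only `edge_tts.Communicate(...).save(...)` and the basic genanki `Model`/`Deck`/`Note`/`Package` classes.

```python
import asyncio
import hashlib
from pathlib import Path


def audio_filename(text: str, voice: str) -> str:
    """Build a stable, filesystem-safe MP3 name for a (text, voice) pair."""
    digest = hashlib.sha1(f"{voice}|{text}".encode("utf-8")).hexdigest()[:12]
    return f"tts_{digest}.mp3"


async def synthesize(text: str, voice: str, out_dir: Path) -> Path:
    """Ask the Edge TTS service to speak `text` and save the result as an MP3."""
    import edge_tts  # third-party: pip install edge-tts

    out_path = out_dir / audio_filename(text, voice)
    await edge_tts.Communicate(text, voice).save(str(out_path))
    return out_path


def build_deck(cards, voice, out_file="demo.apkg", out_dir=Path(".")):
    """cards: iterable of (front, back) pairs; audio is generated for the front."""
    import genanki  # third-party: pip install genanki

    model = genanki.Model(
        1607392319,  # arbitrary fixed ID, hypothetical value for this sketch
        "Basic with audio (sketch)",
        fields=[{"name": "Front"}, {"name": "Back"}, {"name": "Audio"}],
        templates=[{
            "name": "Card 1",
            "qfmt": "{{Front}}<br>{{Audio}}",
            "afmt": "{{FrontSide}}<hr id=answer>{{Back}}",
        }],
    )
    deck = genanki.Deck(2059400110, "Demo deck (sketch)")  # hypothetical ID and name
    media = []
    for front, back in cards:
        mp3 = asyncio.run(synthesize(front, voice, out_dir))
        media.append(str(mp3))
        # Anki plays the clip referenced by the [sound:...] tag in the field.
        deck.add_note(genanki.Note(model=model,
                                   fields=[front, back, f"[sound:{mp3.name}]"]))
    package = genanki.Package(deck)
    package.media_files = media  # the MP3s are bundled inside the .apkg
    package.write_to_file(out_file)
```

Calling `build_deck([("Bonjour", "Hello")], "fr-FR-DeniseNeural")` would need an internet connection, since the audio comes from the online Edge TTS service.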
I’d like to share my GitHub repository and the link to my project presentation page, which includes a video and a screenshot. However, since I’m a new member I can’t post links yet. Could an admin let me know the proper way to share my project here?
How is it possible to generate them in bulk? I have around 30,000 words. Is it possible to upload a spreadsheet or something similar?
Do you have plans to make an add-on for Anki? Would that even be possible? I think it could become the most popular add-on and surpass other TTS add-ons. Currently, those have the advantage of being easier to use since they’re integrated into Anki itself. Personally, I think their main convenience is generating audio directly inside a user’s existing decks.
Your idea of a directly integrated Anki add-on is excellent! I’m going to look into adding a system to automatically scan spreadsheets so that voices can be generated automatically from your files.
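In the meantime, here is a rough sketch of what a spreadsheet import could look like, assuming the sheet is exported as CSV. The column names `word` and `translation`, the function names, and the batch size are hypothetical, and this is not how ALDB currently works.

```python
import csv
import io


def read_word_list(csv_text, front_col="word", back_col="translation"):
    """Parse spreadsheet rows (exported as CSV) into (front, back) pairs.

    Rows without a front value and exact duplicates are skipped, so each
    audio clip only has to be generated once.
    """
    seen = set()
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        front = (row.get(front_col) or "").strip()
        back = (row.get(back_col) or "").strip()
        if not front or (front, back) in seen:
            continue
        seen.add((front, back))
        pairs.append((front, back))
    return pairs


def batches(items, size=500):
    """Yield fixed-size chunks so a very large list can be processed in stages."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

With around 30,000 words, working in batches means a failed network request only costs one chunk rather than the whole run; each batch of pairs could then be passed to whatever function generates the audio and the notes.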
Unless you know a person who can generate these voices in the language you’re learning, or you prefer to use Google’s basic voice, I think my alternative is an excellent solution, as the voices are very close to real human voices.
That said, you’re absolutely right about the presentation video — I should have shown more demonstrations with long sentences, female/male voices, multiple languages, etc.
If you want, you can still give it a try and provide feedback; you’ll see for yourself that even with long (or very long) sentences, the realism is impressive. The edge-tts module is really powerful.
UPDATE:
For each language, I added an audio clip (male/female) to my GitHub presentation page: the sentence “Hello everyone, I am an artificial voice but I try to be as realistic as possible!”, translated and pronounced in every language available in ALDB.
You can directly preview the pronunciations from the website, without even downloading the program—it will already give you an idea!
Bro, this is a free project I created for you to enjoy. The voices come from the Microsoft Edge TTS service.
I don’t provide these neural voices myself. I just did my best to build a lightweight tool that helps language learners create quality audio cards.
To be fair, you did ask for feedback. And you are touting how “close to real human voices” they are. If those aren’t languages your ear is accustomed to, you might value the knowledge that the service you’ve chosen isn’t universally great (so your tool might not be of the same quality for every language …).
You are absolutely right on this point.
However, my comparison is made exclusively with the commonly used service, namely Google’s TTS, which is particularly flat and sounds very “synthetic” in most languages.
I suspect that many folks who are experienced with trying to fit TTS into their language learning have found there are more fish in the sea than just those two!
Sounds good to me actually. It has a weird vocal fry, but probably slightly better than Google TTS. It also does sound a bit more “human” probably because of the intonation.
(on a different note, I wonder how come we have awesome LLM coders nowadays and yet TTS voices suck so much…)
Here, all the Edge TTS voices are listed. Can you check whether there are any better Japanese or Arabic voices on this list? You can listen to these voices on their website. If you find better ones, I’ll update my software to replace them.
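If you'd rather check from Python, the edge-tts package can fetch the voice list itself through `edge_tts.list_voices()`. Below is a small sketch that filters the list by language; the helper names are my own, and it assumes each voice entry carries the `ShortName`, `Locale`, and `Gender` keys.

```python
import asyncio


def voices_for_language(voices, lang_prefix):
    """Return {short_name: gender}, sorted by name, for one language.

    `lang_prefix` is a language code such as "ja" or "ar"; it is matched
    against the part of the locale before the hyphen ("ja-JP", "ar-EG", ...).
    """
    prefix = lang_prefix.lower()
    matches = {}
    for voice in voices:
        locale = voice["Locale"].lower()
        if locale == prefix or locale.startswith(prefix + "-"):
            matches[voice["ShortName"]] = voice["Gender"]
    return dict(sorted(matches.items()))


async def print_voices(lang_prefix):
    """Fetch the live voice list (needs internet) and print one language."""
    import edge_tts  # third-party: pip install edge-tts

    all_voices = await edge_tts.list_voices()
    for name, gender in voices_for_language(all_voices, lang_prefix).items():
        print(name, gender)
```

Running `asyncio.run(print_voices("ja"))` or `asyncio.run(print_voices("ar"))` prints the candidates, and any of those short names can be passed to edge-tts as the voice to try on a longer sentence.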
The first time I tried that link, it had ONLY Spanish and English. The second time, it added several other languages, but did not include Japanese or Arabic. Nor any languages (other than Spanish/English) that I have enough experience with to judge. I didn’t think the Chinese sounded good, but I haven’t done enough with Chinese to trust my judgment. I did think the “Hong Kong English” sounded more like an American!
Aha—there’s a trick to it. If you don’t see what you want on the “language” menu, go to the “Country” menu and click on whatever country is checked (ticked/selected) to unselect it. The Italian sounded pretty good. But the single example sentence is not adequate to really evaluate the quality.
Apple has had some pretty good voices for years, but even those are noticeably fake even today. And some of their voices are terrible, but haven’t changed (as far as I can tell) in all those years.
I’m actually thinking about making a similar tool that would be available directly from a website, with a backend to generate the audio files.
It would also offer a choice of several voices for a given language (I’m talking about all the voices available here: https://tts.travisvn.com/)