Speech Recognition for Anki

abdo · October 25, 2022, 7:52pm

This is the support thread for the Speech Recognition for Anki add-on on AnkiWeb.

I mainly wrote this to transcribe mpvacious cards with no subs.

The add-on is experimental. I appreciate testing and feedback!

GitHub: abdnh/anki-asr (Anki add-on for speech recognition)

fkacam · July 17, 2024, 6:23pm

Hey, can you add Thai support? Deepgram supports Thai, but your extension’s configuration does not. It’s simply not in the dropdown.

abdo · July 19, 2024, 9:25am

Hi. I’ll try to take a look this week.

c909 · October 29, 2024, 9:11pm

I followed the instructions but got this error:

Failed to transcribe testaudio.mp3: invalid syntax (client.py, line 233)

abdo · October 30, 2024, 12:38pm

Which service are you using, Deepgram or Whisper? If you’re willing to share a temporary API key with me for testing, I can look into a fix for this. (I’ve been using a local Whisper model with this add-on for a while).
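For anyone hitting a similar "invalid syntax (client.py, line N)" message: it usually means a Python file bundled with a service's client library failed to parse under the interpreter Anki ships. The following is a minimal diagnostic sketch, not the add-on's code; the helper name `syntax_error_location` is hypothetical, and the root cause of this particular report is not stated in the thread.

```python
import ast
from typing import Optional


def syntax_error_location(source: str, filename: str = "<bundled module>") -> Optional[str]:
    """Return 'message (filename, line N)' if the source fails to parse, else None.

    Hypothetical helper for illustration: it parses the text with the
    interpreter that is currently running, so syntax that this Python
    version does not understand is reported here.
    """
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{exc.msg} ({filename}, line {exc.lineno})"
    return None
```

Running it on the text of the file named in the error, from the same Python that Anki uses, shows whether the file itself is the problem or whether the failure comes from somewhere else.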

abdo · November 1, 2024, 2:05pm

@c909 Fix released. Thank you for the help!

c909 · November 1, 2024, 2:45pm

Thanks a lot for the quick fix!

DiveShallow · November 5, 2024, 4:10pm

Hi, I understand the batch MP3 speech generation for the cards, but is there a way to answer the front of a card with natural speech by holding a button to activate my mic, then submit my spoken answer? Ideally, the system would check if my answer is correct, provide feedback, read the back of the card, and even let me ask follow-up questions on the topic using the OpenAI API.

abdo · November 5, 2024, 4:53pm

I plan to add something similar in the future (not sure when though).

RGR · January 12, 2025, 8:41pm

Hello,
Thanks for the add-on! I’ve used it before, but now I’m getting the following errors:

Deepgram:
Failed to transcribe Audio.m4a: 'NoneType' object is not subscriptable

Whisper:
Failed to transcribe Audio.m4a: __init__() got an unexpected keyword argument 'follow_redirects'

Anki:
Version 24.11 (87ccd24e)
Python 3.9.18 Qt 6.6.2 PyQt 6.6.1

abdo · January 13, 2025, 1:29am

Hi,

Can you use Help > About > Copy Debug Info and paste the text here?

RGR · January 13, 2025, 12:00pm

Oh, there it is!

Anki 24.11 (87ccd24e) (ao)
Python 3.9.18 Qt 6.6.2 PyQt 6.6.1
Platform: Windows-10-10.0.19045

===Add-ons (active)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
AJT Browser Play Button [‘182970692’, 2024-09-17T17:18, ‘None’, mod]
AnkiConnect [‘2055492159’, 2024-11-06T22:34, ‘None’, mod]
Auto-refresh browser [‘746398558’, 2024-10-01T09:50, ‘None’, mod]
Customize Keyboard Shortcuts [‘24411424’, 2023-11-01T06:17, ‘None’, mod]
DeepL Translator [‘972129549’, 2024-04-26T17:11, ‘1.0.0’, mod]
Google Translate [‘1536291224’, 2024-10-21T11:10, ‘None’, mod]
Quick Colour Changing [‘2491935955’, 2023-10-19T00:15, ‘None’, mod]
Speech Recognition for Anki [‘411601849’, 2024-11-01T11:04, ‘None’, mod]
Web Browser - Search terms Import texts and images automatically [‘864545277’, 2023-11-17T16:28, ‘None’, mod]

===IDs of active AnkiWeb add-ons===
1536291224 182970692 2055492159 24411424 2491935955 411601849 746398558 864545277 972129549

===Add-ons (inactive)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
Symbols As You Type [‘2040501954’, 2022-06-08T01:09, ‘None’, ‘’]

abdo · January 13, 2025, 11:52pm

Regarding the Deepgram error: I was able to reproduce this with an .ogg file. It apparently occurs only with some files. Interestingly, Whisper also failed to process the same file. I’m still looking into it. Can you send me that file to test with?
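A "'NoneType' object is not subscriptable" error typically appears when code indexes into a response (or part of one) that turned out to be None. Below is a minimal sketch of a defensive extractor that replaces it with a clearer message. It assumes a response shaped like `results -> channels[0] -> alternatives[0] -> transcript`; both that shape and the function name `extract_transcript` are assumptions for illustration, not the add-on's actual code.

```python
from typing import Optional


def extract_transcript(response: Optional[dict]) -> str:
    """Return the first transcript from a response dict, or raise ValueError.

    The nested key path used here is an assumed response shape,
    chosen only to illustrate the failure mode.
    """
    if response is None:
        raise ValueError("service returned no response body")
    try:
        return response["results"]["channels"][0]["alternatives"][0]["transcript"]
    except (KeyError, IndexError, TypeError) as exc:
        # TypeError covers the "'NoneType' object is not subscriptable" case,
        # where some intermediate value is None instead of a dict or list.
        raise ValueError(f"unexpected response shape: {exc!r}") from exc
```

With a check like this, a file the service cannot process would surface as "unexpected response shape" rather than a bare TypeError, which makes reports easier to triage.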

The Whisper error is due to an incompatibility with the Google Translate add-on.

@kelciour if you’re still maintaining this add-on, can you maybe consider updating the bundled httpx version to a more recent one? The openai module depends on a more recent httpx.
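For readers debugging a similar conflict: when two add-ons bundle different versions of the same library, whichever copy gets imported first is the one every add-on ends up using. The sketch below, using only the standard library, shows one way to check which file a module resolves to and whether a callable accepts a given keyword. The helper names `module_origin` and `accepts_keyword` are hypothetical and not part of either add-on.

```python
import importlib.util
import inspect
from typing import Callable, Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module resolves to, or None if it is not found.

    If the module is already imported, this reports the loaded copy,
    which is the one that matters when add-ons bundle their own versions.
    """
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def accepts_keyword(func: Callable, keyword: str) -> bool:
    """Report whether func can be called with the given keyword argument."""
    params = inspect.signature(func).parameters
    if keyword in params:
        return params[keyword].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
```

As a usage idea (assuming httpx is importable in the environment), `module_origin("httpx")` would show which add-on folder the loaded copy lives in, and `accepts_keyword(httpx.Client, "follow_redirects")` would indicate whether that copy is recent enough for the openai module.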

keIciour · January 14, 2025, 12:27pm

@abdo

Thank you for bringing this to my attention!

I’ve updated the bundled httpx version to the latest version.

@RGR

The Google Translate add-on was updated and should now be compatible with the Speech Recognition add-on and OpenAI Whisper.

chrislg · January 14, 2025, 12:47pm

Is there a speech-to-text solution like Dragon? Recording my voice into a file and then using Whisper AI can be a long process. Do any of you know what technology Dragon uses? It also allows adding vocabulary. Is there a solution that lets you add vocabulary?

RGR · January 14, 2025, 5:52pm

@abdo Thank you so much for your help. You were right—it worked perfectly with the incompatible add-on disabled and using Whisper. But I’m still having trouble with Deepgram. No matter what file I use, whether it’s .mp3, .m4a, .mp4 or .webm, I always get the same error. I tried to attach an audio sample, but new users can’t upload attachments… Anyway, thanks, it has already solved the problem for me.

@kelciour Thanks for the update, it’s a really useful add-on!
