Speech Recognition for Anki

This is the support thread for the Speech Recognition for Anki add-on on AnkiWeb.

I mainly wrote this to transcribe mpvacious cards with no subs.

The add-on is experimental. I appreciate testing and feedback!

GitHub: abdnh/anki-asr (Anki add-on for speech recognition)

Hey, can you add Thai support? Deepgram supports Thai, but your add-on's configuration does not. It's simply not in the dropdown.

Hi. I’ll try to take a look this week.

I followed the instructions but got this error:

Failed to transcribe testaudio.mp3: invalid syntax (client.py, line 233)

Which service are you using, Deepgram or Whisper? If you’re willing to share a temporary API key with me for testing, I can look into a fix for this. (I’ve been using a local Whisper model with this add-on for a while).

@c909 Fix released. Thank you for the help!

Thanks a lot for the quick fix!

Hi, I understand the batch MP3 speech generation for the cards, but is there a way to answer the front of a card with natural speech by holding a button to activate my mic, then submit my spoken answer? Ideally, the system would check if my answer is correct, provide feedback, read the back of the card, and even let me ask follow-up questions on the topic using the OpenAI API.

I plan to add something similar in the future (not sure when though).

Hello,
Thanks for the add-on! I’ve used it before, but now I’m getting the following errors:

Deepgram:
Failed to transcribe Audio.m4a: 'NoneType' object is not subscriptable

Whisper:
Failed to transcribe Audio.m4a: __init__() got an unexpected keyword argument 'follow_redirects'

Anki:
Version 24.11 (87ccd24e)
Python 3.9.18 Qt 6.6.2 PyQt 6.6.1


Hi,

Can you use Help > About > Copy Debug Info and paste the text here?

Oh, here it is!

Anki 24.11 (87ccd24e) (ao)
Python 3.9.18 Qt 6.6.2 PyQt 6.6.1
Platform: Windows-10-10.0.19045

===Add-ons (active)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
AJT Browser Play Button ['182970692', 2024-09-17T17:18, 'None', mod]
AnkiConnect ['2055492159', 2024-11-06T22:34, 'None', mod]
Auto-refresh browser ['746398558', 2024-10-01T09:50, 'None', mod]
Customize Keyboard Shortcuts ['24411424', 2023-11-01T06:17, 'None', mod]
DeepL Translator ['972129549', 2024-04-26T17:11, '1.0.0', mod]
Google Translate ['1536291224', 2024-10-21T11:10, 'None', mod]
Quick Colour Changing ['2491935955', 2023-10-19T00:15, 'None', mod]
Speech Recognition for Anki ['411601849', 2024-11-01T11:04, 'None', mod]
Web Browser - Search terms Import texts and images automatically ['864545277', 2023-11-17T16:28, 'None', mod]

===IDs of active AnkiWeb add-ons===
1536291224 182970692 2055492159 24411424 2491935955 411601849 746398558 864545277 972129549

===Add-ons (inactive)===
(add-on provided name [Add-on folder, installed at, version, is config changed])
Symbols As You Type ['2040501954', 2022-06-08T01:09, 'None', '']

I was able to reproduce this with an .ogg file. It only occurs with some files apparently. Interestingly, Whisper also failed to process the same file. Still looking into it. Can you maybe send me that file to test with?

This is due to an incompatibility with the Google Translate add-on.

@kelciour if you're still maintaining this add-on, could you consider updating the bundled httpx to a more recent version? The openai module depends on a newer httpx.
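
For anyone debugging a similar clash between add-ons, here is a minimal sketch of how to check whether a client class accepts a given keyword argument. `OldClient` and `NewClient` are hypothetical stand-ins for an older and a newer HTTP client, not real httpx classes, and `accepts_keyword` is just an illustrative helper. My assumption about the cause is that add-ons share one Python interpreter, so whichever bundled copy of a library gets imported first is the one every add-on ends up using.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldClient:
    """Hypothetical stand-in for a client from an older library version."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout


class NewClient:
    """Hypothetical stand-in for a client from a newer library version."""

    def __init__(self, timeout=5.0, follow_redirects=False):
        self.timeout = timeout
        self.follow_redirects = follow_redirects


# Calling the old client with the newer keyword raises the same kind of
# TypeError as the Whisper error reported above.
try:
    OldClient(follow_redirects=True)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(accepts_keyword(OldClient, "follow_redirects"))
print(accepts_keyword(NewClient, "follow_redirects"))
print(error_message)
```

With the real library, the same helper could be pointed at `httpx.Client` from Anki's debug console, and printing `httpx.__version__` and `httpx.__file__` should show which add-on's bundled copy was actually loaded.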

@abdo

Thank you for bringing this to my attention!

I’ve updated the bundled httpx version to the latest version.

@RGR

The Google Translate add-on has been updated and should now be compatible with the Speech Recognition add-on and OpenAI Whisper.

Is there a speech-to-text solution like Dragon? Recording my voice into a file and then running Whisper on it can be a long process. Does anyone know what technology Dragon uses? It also allows adding custom vocabulary. Is there a solution that lets you add vocabulary?

@abdo Thank you so much for your help. You were right: with the incompatible add-on disabled, Whisper worked perfectly. But I'm still having trouble with Deepgram. No matter what file I use, whether it's .mp3, .m4a, .mp4 or .webm, I always get the same error. I tried to attach an audio sample, but new users can't upload attachments… Anyway, thanks, Whisper already solves the problem for me.

@kelciour Thanks for the update, it’s a really useful add-on!