Microphone Access via JavaScript on Cards (Desktop and Mobile)

Dear Dae and the Anki team,


Is your feature request related to a problem? Please describe

TLDR: I can’t access the microphone on cards using JavaScript.

Yes. I’m interested in using Anki, ideally on both desktop (without add-ons) and mobile, for language learning tasks that involve speaking. This includes activities like shadowing, pronunciation practice, and pitch or timing analysis. The idea is to avoid traditional shadowing, where you repeat after a recording and hope you’re close.

With microphone access, I could give myself immediate visual or quantitative feedback, making practice more focused, measurable, and engaging. Currently, I’m unable to access the microphone from JavaScript within a card, which prevents me from building these kinds of features myself.

Somewhat related, I recently used Anki to improve my perception of Japanese pitch accent, and found that showing spectrograms on the back of the card helped me tune in to intonation differences much more effectively. In general, I’ve found that a simple task on the front of the card, with lots of supporting information on the back, is the most effective way for me to learn. That is to say, more features might not just be a “nice-to-have”: they could change the difficulty of learning something from prohibitive to possible, or from too time-consuming to time-efficient.

I also have some ideas outside of language learning (e.g. singing), but shadowing seems like the most straightforward use case.

Describe the solution you’d like

TLDR: Allow JavaScript on cards to access the microphone.

I’d like to be able to access the microphone using JavaScript on a card, so I can capture and process the audio myself. This would allow me to create interactive cards that include real-time or post-recording analysis, such as comparing my speech to a model, showing spectrograms, or analysing pitch. I’m not asking for Anki to include these features - just for JavaScript-based access to be possible.
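To make “process the audio myself” a bit more concrete, here is a minimal sketch of the kind of analysis a card could run once it has raw samples. It estimates pitch with plain autocorrelation, which is a simple stand-in and not a robust pitch tracker. The function name and the 80-500 Hz search range are my own assumptions; the samples would come from the microphone, for example through an AnalyserNode’s getFloatTimeDomainData.

```javascript
// Hypothetical helper: estimate the fundamental frequency of a mono buffer
// using plain (unnormalised) autocorrelation.
// minHz/maxHz are assumed bounds for a speaking voice, not tuned values.
function estimatePitchHz(samples, sampleRate, minHz = 80, maxHz = 500) {
  // Convert the frequency bounds into a range of lags (in samples) to search.
  const minLag = Math.floor(sampleRate / maxHz);
  const maxLag = Math.min(Math.ceil(sampleRate / minHz), samples.length - 1);
  let bestLag = -1;
  let bestScore = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    // Correlate the signal with a copy of itself shifted by `lag` samples.
    let score = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      score += samples[i] * samples[i + lag];
    }
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  // No positive correlation found (e.g. silence): report no pitch.
  return bestLag > 0 ? sampleRate / bestLag : null;
}
```

The result could then be compared against the pitch of the model recording or plotted on the back of the card.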

To understand what’s feasible, I’d like to ask:

  1. Is microphone access via JavaScript (e.g., using getUserMedia or otherwise) currently possible in AnkiDesktop or AnkiMobile?
  2. If not, would it be technically possible with changes to Anki?
  3. If it is technically possible but not yet supported:
    3.1 What technical hurdles exist that may prevent this?
    3.2 What non-technical hurdles exist that may prevent this?
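For reference, this is roughly how a card could check question 1 at runtime. It is only a sketch: probeMicrophone is a name I made up, and I’m assuming the card’s WebView either exposes the standard navigator.mediaDevices.getUserMedia API or doesn’t. The navigator is passed in as a parameter so the probe can also be exercised outside a WebView.

```javascript
// Hypothetical helper: reports whether a card's JavaScript can reach the microphone.
// `nav` defaults to the WebView's navigator object.
async function probeMicrophone(nav = globalThis.navigator) {
  const devices = nav && nav.mediaDevices;
  if (!devices || typeof devices.getUserMedia !== "function") {
    return { available: false, reason: "getUserMedia is not exposed" };
  }
  try {
    const stream = await devices.getUserMedia({ audio: true });
    // This is only a probe, so release the device straight away.
    stream.getTracks().forEach((track) => track.stop());
    return { available: true, reason: "granted" };
  } catch (err) {
    // Typically NotAllowedError (denied) or NotFoundError (no input device).
    return { available: false, reason: err && err.name ? err.name : String(err) };
  }
}
```

On a card, something like `probeMicrophone().then((r) => console.log(r))` would show which of the three cases applies on a given client.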

Describe alternatives you’ve considered

TLDR: The alternatives are just workarounds.

  • Using Anki for scheduling and another application that allows mic access for everything else. This could work, but requires switching away from Anki and disrupts the experience.
  • Using add-ons or local tools on desktop, though these don’t work cross-platform and aren’t portable.
  • Using Anki’s built-in recording tools, though these don’t integrate with JS or allow real-time access.

These options are workarounds rather than true solutions for interactive cards.


Thank you for your time and for making Anki. It has had an outsize impact on my life and I’m very grateful for your work.

What if a shared deck I download from AnkiWeb gets access to my input devices without notifying me as a user?

I think this is probably the primary concern as well, cheers for getting straight to it.

TLDR: Mic access is riskier than typed input, but with opt-in, prompts, and a visible indicator, we can sufficiently minimise the risk.

The risk with microphone access is twofold: it can be activated passively, without the user realising, and the data it captures is richer than typed input - including background speech, tone, and voice characteristics.

To handle that, I think the following safeguards are appropriate:

  1. A global opt-in setting (disabled by default)
  2. A prompt at the start of every review session if any cards may use the microphone
  3. A clearly worded warning in the session prompt, directly informing users of the risks and advising them to proceed only if they trust the deck and their recording environment
  4. A persistent, visible indicator (top bar or bottom bar) whenever the mic is active
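As a rough illustration of how these four safeguards could combine into a single decision, here is a minimal sketch. Every name in it (decideMicAccess and its fields) is hypothetical and not an existing Anki setting or API; it only shows the order of the checks.

```javascript
// Hypothetical sketch: none of these names are existing Anki settings or APIs.
function decideMicAccess({
  globalOptIn = false,           // safeguard 1: disabled by default
  sessionHasMicCards = false,    // does any card in this session request the mic?
  sessionPromptAccepted = false, // safeguards 2 and 3: warning prompt accepted this session
} = {}) {
  if (!globalOptIn) {
    return { allowed: false, showIndicator: false, reason: "global opt-in is off" };
  }
  if (!sessionHasMicCards) {
    return { allowed: false, showIndicator: false, reason: "no cards in this session use the microphone" };
  }
  if (!sessionPromptAccepted) {
    return { allowed: false, showIndicator: false, reason: "session prompt not accepted" };
  }
  // Safeguard 4: access is only ever granted together with the visible indicator.
  return { allowed: true, showIndicator: true, reason: "opted in and prompt accepted" };
}
```

The point of returning showIndicator alongside allowed is that the client never ends up with an active microphone and no indicator.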

This wouldn’t prevent someone from recording or sending data if a user downloads something malicious, explicitly enables mic access, and then agrees to the prompt - but the same is already true for typed input. A malicious deck today could include a form field, ask for input, and send that data off without the user realising. The key difference is that mic input doesn’t require deliberate user interaction.

With these measures in place, mic access becomes an informed and visible choice. This is broadly similar to how browsers and communication apps handle microphone access. In those cases, the user is assumed to be aware that their audio may be transmitted to another party. The difference with Anki is that users might not expect that possibility in a deck - so instead of relying on implicit awareness, we’d make that risk explicit through the prompt, warning, and indicator.

Of course, these measures don’t eliminate the risk entirely, but by reducing how often users are caught unaware and by adding friction, they should make this a less attractive attack vector. That said, it’s possible Apple may still view any dynamic mic access as too broad a surface, regardless of how well-contained it is in practice.

To answer your question directly:

Without precautions, something nefarious could potentially occur, but with the safeguards above, the user is clearly notified and the risk effectively mitigated.

I believe there is work planned to make things more secure, and perhaps mic access could come to fruition once the risks are adequately mitigated. But doing it now would be a step back from a security perspective, especially as a lot of Anki users aren’t very tech-savvy.

You might be interested to look at the discussion here:

Thanks for linking that thread - I hadn’t seen it before, and from what I read, a lot of the concerns raised there apply here too: the worry about shared decks running unexpected code, the solutions (opt-in etc.) that were proposed, and the attempt to support advanced use cases without exposing casual users to risk.

Having said that, I think microphone access might be worth considering separately.

Unlike proposals to expose Anki’s internal APIs to JavaScript - like getting or modifying deck data - this wouldn’t touch any of Anki’s core logic or collection data. It would just expose a standard browser API (getUserMedia) inside the review screen. So in terms of scope, it may be a little more self-contained.

And unlike the broader add-on ecosystem discussion, this might not require shared infrastructure changes across clients. If I’ve understood correctly, on desktop and AnkiMobile it could potentially be implemented by adjusting the WebView permissions and adding UI prompts, without needing a full plugin system or internal API bridge.

That said, here are a few of the concerns raised in that thread and how I think they relate here:

  1. Shared decks are implicitly trusted, and users may not realise when risk is introduced
    I think the mitigations I proposed would prevent this.

  2. Users might enable a feature once and forget about it
    The mitigations I proposed would almost definitely prevent this.

  3. Enabling new JS capabilities in shared decks sets a precedent that’s hard to contain
    That makes sense in the context of Anki’s internal APIs, where each new endpoint could open up more of the collection or review logic. This case might be a bit different as it’s a very narrow feature, and one that in a way seems aimed at the user and not at Anki. It seems pretty all-or-nothing as well. If it’s decided afterwards that removing this feature is necessary, I think it only breaks that feature and not the overall experience for the user (i.e. only the cards with behaviour that’s been determined to be risky).

  4. It’s important not to loosen the current direction toward stricter isolation
    Probably the biggest one here. I don’t think mic access is compatible with staying just as strict - by nature, it introduces risk as new information is exposed. But maybe this is one of those areas where a well-contained exception could be worth it. JavaScript in templates already increases the risk, but the trade-off has proven worthwhile for many use cases. If anything, the trade-off here would at least be more visible.

  5. New features could cause unforeseen issues or maintenance burdens and be difficult to reverse
    Because this is a narrow, browser-level feature that doesn’t interact with Anki’s internals, it should be easier to trial in a contained way. If problems arise, it could be cleanly disabled without requiring broad architectural changes or affecting unrelated parts of Anki. Since no current cards or templates rely on mic access, the feature could remain opt-in without breaking anything unexpectedly, making it well-suited to a long-term beta toggle. If a problem is identified, we’d just disable the feature and break those cards. That might seem like a problem, but if we already have the security notice, we can also add that the feature is experimental and could be removed at any time. If the user reads “The use of the microphone is an experimental feature - it has serious security risks and may be disabled in future: …details…” every time they use a deck, they’re not going to be surprised by anything.

All this should make it more feasible to scope out clearly, and potentially easier to fund a one-off review from someone experienced with security issues like this. If it turns out to be safe with the right safeguards, great. If not, at least we’ll know where the boundary is. Either way, I’d be happy to contribute (funding, testing, etc.) toward that process if it helps move things forward.