Safer addons that work on mobile using WebAssembly

I have been having a little play with WebAssembly Components over the last couple of days and think they could be a good basis for an alternative addon system for Anki.

What do you think?

I would be interested in having a go at implementing it if the Anki developers think it would be a good idea.

Potential Upsides

Better safety

Addons run inside a sandbox so it is harder for them to do naughty things.

They should only be able to interact with Anki through the interface we give
them.
No direct system access or ability to monkey patch the Anki application.

This also allows you to implement a permission system a la browser extensions /
mobile apps where the user has control over what the addon can access.
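
To make the permission idea a bit more concrete, here is a minimal Python sketch of a host-side gate. Everything in it (`AddonManifest`, `PermissionGate`, the `"notes:read"` permission string, `read_notes`) is hypothetical and invented for illustration; none of it is an existing Anki or Wasm runtime API. The only point is that every function the host exposes to the sandbox can check what the user approved before doing anything.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when an addon calls a host function it was not granted."""


@dataclass(frozen=True)
class AddonManifest:
    # Hypothetical manifest: the permissions the addon asks for up front.
    addon_id: str
    requested: frozenset[str]


@dataclass
class PermissionGate:
    # Permissions the user has actually approved, keyed by addon id.
    granted: dict[str, set[str]] = field(default_factory=dict)

    def approve(self, manifest: AddonManifest, allowed: set[str]) -> None:
        # The user may approve only a subset of what was requested.
        self.granted[manifest.addon_id] = set(manifest.requested) & allowed

    def check(self, addon_id: str, permission: str) -> None:
        if permission not in self.granted.get(addon_id, set()):
            raise PermissionDenied(f"{addon_id} lacks {permission!r}")


def read_notes(gate: PermissionGate, addon_id: str) -> list[str]:
    # Every host function exposed to the sandbox checks the gate first.
    gate.check(addon_id, "notes:read")
    return ["note 1", "note 2"]  # placeholder data for the sketch


gate = PermissionGate()
manifest = AddonManifest("demo-addon", frozenset({"notes:read", "network"}))
gate.approve(manifest, allowed={"notes:read"})  # the user declines "network"
read_notes(gate, "demo-addon")  # permitted
# gate.check("demo-addon", "network") would raise PermissionDenied
```

Because the addon cannot reach anything except these host functions, the gate is the single place where access decisions are enforced.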

Mobile/web app support

Addons are run using a Wasm runtime which can be embedded in rslib, so there is no need for a Python interpreter.

If the app is based on rslib it should be relatively easy to enable addon support.

You should only need to reimplement the “glue” for each platform, not the
entire addon host architecture.

Addons can be written in languages other than Python

Many different languages can compile to Wasm Components:

  • C/C++
  • C#
  • Go
  • JavaScript
  • Python
  • Rust

Potential Downsides

The only downside I can think of with this approach is that the functionality of addons would be strictly limited to what Anki explicitly exposes.

The creativity of addon authors could be stunted without the “escape hatch” of directly twiddling with the database, monkey patching etc.

Demo

I have created a little proof-of-concept demo that you can have a look at here (scroll down to see instructions for how to use it).

Cross-platform add-ons would be nice. FYI, there have been discussions about cross-platform JS add-ons in the past, but not much has transpired since then. If you want to read the relevant discussions, see Cross-platform JS addons for the reviewer.

For reference I also found these older topics that talk about JS based addons.

and

Your linked topic seems to be about giving card JavaScript access to an API that it can use to interact directly with Anki.

e.g. to find out how many cards are left to review, or allow editing in-reviewer.

It seems to me that dae’s biggest objection to that is that it makes previously “safe” shared decks inherently dangerous by giving them deeper access to your system.

This is not what I am proposing here.

My proposal is just a way to make the existing kind of addons we have work in a different way that has some benefits over what the current Python addons offer.

e.g. you could have something like FSRS Helper that can be used on AnkiMobile/AnkiDroid.

To me this seems like it should be much less objectionable, and should in fact make things safer because of the sandboxing.

There would be nothing stopping you from later creating some kind of mechanism for letting Decks “recommend” Addons or letting Cards talk to Anki/Addons, but that is not the main point of this proposal.

I don’t think running a WASM host is a good idea. It means we have to copy everything into and out of WASM memory, and most add-ons still need some way to present or modify the UI.

Isn’t this a problem that almost all addon solutions will have?

Unless you distribute addons as native libraries, it is going to be necessary to do some form of marshalling.
e.g. from what I understand, the current Python solution does not share memory with rslib; it uses Protobuf to send messages back and forth.

Admittedly I do not know how much this will affect speed and memory usage compared to the current Python solution.

I assumed this would be solved by providing extension points in the Addon Host e.g.:

  • Show a window which the Addon can populate with HTML + JS
  • Add an entry to the Deck context menu (cog menu)

The Addon Host then forwards these requests to the frontend which would need to decide how to best implement that on each platform.

I assume this is the only way this could be achieved in a cross-platform manner, because the UI could be implemented in very different ways on different platforms e.g.:

  • Qt (legacy desktop)
  • Svelte (new desktop)
  • Native iOS
  • Native Android
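
A rough Python sketch of that forwarding step, assuming just the two extension points listed above. The request types, the `Frontend` protocol and `RecordingFrontend` are made-up names and not part of Anki; a real frontend (Qt, Svelte, iOS, Android) would implement the same two methods in whatever way suits the platform.

```python
from dataclasses import dataclass
from typing import Protocol, Union


@dataclass(frozen=True)
class ShowWindow:
    # Hypothetical request: open a window and fill it with addon HTML + JS.
    addon_id: str
    html: str


@dataclass(frozen=True)
class AddDeckMenuEntry:
    # Hypothetical request: add an entry to the deck context (cog) menu.
    addon_id: str
    label: str


UiRequest = Union[ShowWindow, AddDeckMenuEntry]


class Frontend(Protocol):
    """What each platform would have to implement (the "glue")."""

    def show_window(self, request: ShowWindow) -> None: ...

    def add_deck_menu_entry(self, request: AddDeckMenuEntry) -> None: ...


class RecordingFrontend:
    """Stand-in frontend that only records what it was asked to do."""

    def __init__(self) -> None:
        self.log: list = []

    def show_window(self, request: ShowWindow) -> None:
        self.log.append(f"window:{request.addon_id}")

    def add_deck_menu_entry(self, request: AddDeckMenuEntry) -> None:
        self.log.append(f"menu:{request.label}")


def forward(frontend: Frontend, request: UiRequest) -> None:
    # The Addon Host does not draw anything; it hands the request over.
    if isinstance(request, ShowWindow):
        frontend.show_window(request)
    elif isinstance(request, AddDeckMenuEntry):
        frontend.add_deck_menu_entry(request)
    else:
        raise TypeError(f"unsupported UI request: {request!r}")
```

The host side stays identical on every platform; only the class implementing `Frontend` changes.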

I’ve updated my demo with an example of UI interaction.

You can now add simple actions to the Tools menu from the Wasm Addon.

I see this as a big improvement. Ideally, add-ons should communicate with Anki only through exposed APIs and should not perform direct “hacks”. This approach would minimize situations where add-ons stop working after Anki is upgraded to a new version.

I agree that in an ideal world this would reduce addon breakage, but there is a reason why I listed it as a potential downside.

Each “official” API you add creates a maintenance burden.
Sometimes you want to change things internally to make improvements, but you don’t want to break things after telling people “this is the correct way to do this”.
Obviously this leads to being quite conservative with adding new “official” APIs.

If you want to do something Anki does not currently have an official API for, Python addons have the “escape hatch” of being able to meddle with the Anki internals.
The addon might break in the future if those internals change, but you get to have those added features right now.

The sandboxing of Wasm means you would just not be able to have that functionality at all (unless you have a yucky API equivalent to exec(...), which would just let you escape the sandbox in a really awkward way).

I’m afraid I’m not convinced by your UI example. What happens when the add-on author wants to do more than print something to the console? How is this going to scale to complicated UIs? It seems like it would be a huge amount of work for us, and awkward to use. I think our time is better invested in moving more of our UI into the shared web code and having add-ons modify that, instead of trying to come up with some multi-UI add-on solution.

When I came up with the current architecture, I made a deliberate decision to keep the Rust layer free of any foreign interference, such as callbacks into Python. It greatly reduces the complexity we need to deal with, keeps the core code fast, and makes it easier to refactor things in the future.

You are right that the data from rslib is marshalled so it can be transferred to the Python/TS layer. But that happens only at the end of a potentially large operation - we wouldn’t need to do it for every note separately in a bulk update for example. And when that data is in the web layer, it can potentially be passed to add-on code running in the same context without any extra copying.

Thanks for taking the time to look at my little demo, I know you are very busy with the Beta / Svelte migration.

I agree that your work on consolidating the UI into the shared Svelte code is more important. This is just something I personally want to play with.

If this is annoying feel free to tell me it is a hard no, and you do not want to discuss it further.

I am not totally married to this Wasm solution. I agree that other solutions, like JS addons that run in the frontend / Web Workers, have advantages and may be a better overall solution.
I brought up this topic because I recently had a look at Wasm Components, thought they were neat, and wanted to experiment to see if they could work well for Anki addons. (I’ve done a little more in the last few days, but thought it would be best to keep the demo I pushed to GitHub simple.)

I wasn’t trying to argue that UI interaction would be done exactly like that, just that it is possible to do UI interaction from Wasm.
I’m new to the Anki codebase and there is a lot to figure out. I have not yet dug into how the Python/Qt WebView ↔ Svelte UI is all hooked up.
I had got as far as the rslib ↔ Python part so I just made something simple with what I understood.

For UI hooks beyond just “I want a button in this menu”, I envisioned the addon author providing HTML+CSS+JS that Anki would inject into its UI in a dedicated IFrame / WebView at the relevant point, i.e. we would give them specific extension points, not direct access to the main UI, so there would be less chance of conflicts with other addons or breakage on future UI updates.

We would then give them a JS API something like:

  • on_message_from_addon(callback: fn(message: JSON) -> ());
  • post_message_to_addon(message: JSON);

Then for the rslib ↔ frontend side of things it would be something like:

  • poll_message_from_addon() -> (addon_id: GUID, message: JSON)
  • post_message_to_addon(addon_id: GUID, message: JSON)

The frontend code would then have a thread/coroutine calling the poll method and dispatching the messages from rslib in a loop. (N.B. no rslib callback into Python).
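
Here is a small Python sketch of that polling loop, under some assumptions of mine: `AddonBus` is a stand-in for the rslib side built from two in-memory queues, messages are JSON strings, a `uuid.UUID` plays the role of the GUID, and `poll_message_from_addon` takes a `timeout` (not in the signatures above) so that the loop can be shut down cleanly. `emit_from_addon` is a helper that exists only to simulate an addon sending something.

```python
import json
import queue
import threading
import uuid


class AddonBus:
    """Stand-in for the rslib side: queues only, no callbacks into the frontend."""

    def __init__(self) -> None:
        self._to_frontend: queue.Queue = queue.Queue()
        self._to_addon: queue.Queue = queue.Queue()

    def poll_message_from_addon(self, timeout: float):
        """Return (addon_id, message) or None if nothing arrived in time."""
        try:
            return self._to_frontend.get(timeout=timeout)
        except queue.Empty:
            return None

    def post_message_to_addon(self, addon_id: uuid.UUID, message: str) -> None:
        self._to_addon.put((addon_id, message))

    def emit_from_addon(self, addon_id: uuid.UUID, message: str) -> None:
        # Simulation helper: pretend the Wasm addon produced a message.
        self._to_frontend.put((addon_id, message))


def dispatch_loop(bus: AddonBus, handlers: dict, stop: threading.Event) -> None:
    # The frontend pulls messages; rslib never calls into frontend code.
    while not stop.is_set():
        item = bus.poll_message_from_addon(timeout=0.05)
        if item is None:
            continue
        addon_id, raw = item
        handler = handlers.get(addon_id)
        if handler is not None:
            handler(json.loads(raw))


if __name__ == "__main__":
    bus = AddonBus()
    addon_id = uuid.uuid4()
    received = threading.Event()
    stop = threading.Event()
    handlers = {addon_id: lambda msg: (print("got", msg), received.set())}
    worker = threading.Thread(target=dispatch_loop, args=(bus, handlers, stop))
    worker.start()
    bus.emit_from_addon(addon_id, json.dumps({"kind": "ping"}))
    received.wait(timeout=2)
    stop.set()
    worker.join()
```

The same shape would work with a coroutine instead of a thread; the important property is that all calls go from the frontend into the bus, never the other way round.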

There would still be a lot to figure out if we did go down this route.
e.g.:

  • Do we let addons dynamically edit UI, or do they have to statically declare it in a manifest file?
  • Do we provide convenience methods to addon UI to let them do common things directly from JS, or require all the business logic to be in the Wasm code?

Marshalling

I agree this may be a problem, but I’m not sure how much of one it would be.
It would depend greatly on how fast the Wasm marshalling is and what we are doing.

For your bulk note update example:

It could be horrible, e.g. if Wasm marshalling is very slow and we want a full copy of the objects on the frontend right now. We would then waste a lot of extra time marshalling back and forth between rslib, Wasm and the frontend.

It could be better than it is currently, e.g. if Wasm marshalling is faster than protobuf and we do not need the objects in the frontend right now. We would then spend less time marshalling between rslib and Wasm than we would have between rslib and the frontend.

N.B. WIT allows you to have resources (handles), which could reduce the amount of copying.

If the before-add-note part of my demo was refactored to use resources (handles) instead of records (structs), you would only pass the deck-id and a Note handle into Wasm memory on each call. There could then be no further copying (e.g. we ignore notes in that deck) or only limited copying (only the fields we access).
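
The difference can be sketched in plain Python. This is only a conceptual model of the handle idea, not real component-model bindings: `NoteTable`, its `bytes_copied` counter, the `"Front"` field name and the hook signature are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Note:
    fields: dict


class NoteTable:
    """Host-side table of notes; the addon only ever sees integer handles."""

    def __init__(self) -> None:
        self._notes: dict = {}
        self._next_handle = 1
        self.bytes_copied = 0  # how much data crossed into "addon memory"

    def register(self, note: Note) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._notes[handle] = note
        return handle

    def get_field(self, handle: int, name: str) -> str:
        # Only the field that is actually requested gets copied across.
        value = self._notes[handle].fields[name]
        self.bytes_copied += len(value.encode("utf-8"))
        return value

    def drop(self, handle: int) -> None:
        del self._notes[handle]


def before_add_note(table: NoteTable, deck_id: int, note: int, watched_deck: int) -> bool:
    """Addon-side hook: return True to allow the note to be added."""
    if deck_id != watched_deck:
        return True  # the note is never touched, so nothing is copied
    return table.get_field(note, "Front").strip() != ""


table = NoteTable()
handle = table.register(Note({"Front": "hello", "Back": "x" * 10_000}))
before_add_note(table, deck_id=2, note=handle, watched_deck=1)  # copies nothing
before_add_note(table, deck_id=1, note=handle, watched_deck=1)  # copies "Front" only
table.drop(handle)
```

With a record, the large `Back` field would be copied on every call; with a handle, it is never copied unless the addon asks for it.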

This is true, but JS / Web Worker based addons come with their own disadvantages as well.
It seems to me that JS addons would require you to do one of:

  • Re-implement the entire addon infrastructure on each platform.
  • Put a lot of non-UI addon logic into shared JS that each platform needs to run somehow.
  • Throw away the existing desktop / iOS / Android apps and replace them with a single cross-platform toolkit like Tauri.