Suggestion to Damien: Migrating to an NN (Neural Net)

Any estimates on how buggy it will be on old devices (phones especially)?

Right now it’s slow as hell. Alex said RWKV can currently process about 13 reviews/second. But there are many ways to speed it up. It’s not impossible, but it does require someone who knows this stuff and is willing to put in the work.

We could probably get to 1000 reviews/second + use a smart way to prioritize some cards to run RWKV on, to save some time.
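For a rough sense of scale, here is a back-of-the-envelope sketch of what those two throughput figures would mean in wall-clock time. The 13 and 1000 reviews/second rates come from the posts above; the review count of 100,000 is purely an assumed example, not a measured figure.

```python
def seconds_to_process(num_reviews: int, reviews_per_second: float) -> float:
    """Time needed to run the model over num_reviews at a given throughput."""
    if reviews_per_second <= 0:
        raise ValueError("reviews_per_second must be positive")
    return num_reviews / reviews_per_second


# Hypothetical review history size, chosen only for illustration.
HYPOTHETICAL_REVIEWS = 100_000

for label, rate in (("current (13/s)", 13), ("hoped-for (1000/s)", 1000)):
    secs = seconds_to_process(HYPOTHETICAL_REVIEWS, rate)
    print(f"{label}: {secs:.0f} s, about {secs / 60:.1f} min")
```

Under that assumption the gap is roughly two hours versus under two minutes, which is why prioritizing which cards get the NN treatment matters so much at the current speed.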

Given you seem somewhat positive about RWKV, what do you think of all the FSRS-related work? Feels like that was for nothing if we now just ditch FSRS and do RWKV.

Well, that’s just the nature of progress. New stuff replaces old stuff. Plus, maybe some users would prefer FSRS (we would keep both).

Anki using SM2, FSRS, RWKV all at the same time…

What about implementing RWKV as an add-on at first? Some users would trade speed for precision.

As far as I know, the neural network seems to be too big to be used as a plugin

Alex isn’t interested in maintaining add-ons, so we wouldn’t have anyone to do the work either.

If the file is too large, it is actually easier to use an add-on. Uploads to AnkiWeb are limited to about 150MB, but add-ons can download files from outside after installation, so there is effectively no limit (e.g. Ankimon, Ankihub). I don’t know much about NNs, but the RWKV that Expertium mentions looks very small.

The weights for RWKV are like 10 MB, so that’s not a problem. But I’m not sure if using an add-on is possible. I asked Alex in the Anki Discord server, I’ll copy his reply…if he replies.

Do you have experience yourself making an add-on like this to host the NN, if the plan goes through? Doesn’t have to be RWKV, but we are still here talking about it.

I don’t know how it works because I haven’t studied anything about NNs. If the program already works and needs no adjustment, it can probably be embedded in an add-on, but in that case maybe other developers could build it. AI is popular among developers, so there are relatively many developers of AI-related add-ons.

Let’s wait a few weeks/months/years for Alex to make an account to post on the forum :laughing:
In the meantime I will be a messenger, lol

It does not sound practical to integrate in the near future, and we still have our hands full dealing with FSRS.

Fair enough :+1: Once FSRS reaches its full potential, then maybe this could be revisited.

Do you have a link to the pretrained weights? I wonder how difficult it would be to make an add-on. Might even contribute a way to extend the plugin hooks if necessary.

Well, consider the following scenario (which isn’t reality, but seems possible to me):

  1. we can run some RWKV or similar NN on 10,000 cards every 10 minutes on a smartphone

  2. that’s way fewer cards than the average user has in their collection

  3. but it’s way more new cards than anyone is going to add in a day

  4. the NN performs better than FSRS at scheduling new cards.

Then it would be reasonable to turn over management of cards with short intervals to the NN and leave the rest to FSRS. As for “more or less confusing”, I guess it is subjective. We did have a similar thing not too long ago where we turned over management of short intervals from FSRS to a different algorithm, namely the manually-specified “learning steps”. Actually, I still use them occasionally.
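To make the split concrete, here is a minimal sketch of how cards could be routed between the two schedulers. Everything in it is hypothetical: the `Card` type, the `split_by_interval` helper, and the 7-day cutoff are illustrative assumptions, and only the 10,000-card budget comes from the scenario above. It is not how Anki stores or schedules cards.

```python
from dataclasses import dataclass


@dataclass
class Card:
    card_id: int
    interval_days: float  # current interval; hypothetical field name


def split_by_interval(cards, max_nn_interval_days=7.0, nn_budget=10_000):
    """Return (nn_cards, fsrs_cards).

    Cards with short intervals go to the NN, shortest first, until the
    budget is used up. Everything else stays with FSRS.
    """
    short = sorted(
        (c for c in cards if c.interval_days <= max_nn_interval_days),
        key=lambda c: c.interval_days,
    )
    nn_cards = short[:nn_budget]
    nn_ids = {c.card_id for c in nn_cards}
    fsrs_cards = [c for c in cards if c.card_id not in nn_ids]
    return nn_cards, fsrs_cards


cards = [Card(1, 0.5), Card(2, 30.0), Card(3, 2.0), Card(4, 1.0)]
nn_cards, fsrs_cards = split_by_interval(cards, nn_budget=2)
print([c.card_id for c in nn_cards], [c.card_id for c in fsrs_cards])
```

Sorting by interval before applying the budget is just one possible priority rule. Any other ordering (newest cards first, most lapses first) would slot into the same place.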

This doesn’t address any of the issues with developing this feature, just seems like it might be worth keeping in mind.

Unrelated thought on this topic: the currently-widespread solution to “I don’t want to run an NN on my smartphone” is to outsource the scheduling computations to the cloud (or home servers). That seems like it would be very difficult to adapt Anki for, for about a thousand reasons. But it would be pretty cool :slight_smile:

Hello @dae,

What if the way scheduling algorithms integrate into Anki were reworked later on?

I mean, I can understand that making both SM2 and FSRS coexist is already challenging, but I think that with a more mature understanding of what it means to schedule something, we could maybe find a way to abstract those schedulers so that new ones can be implemented more easily.

When you think about it, having Anki define concepts like the revlog, which would be fed to a separate scheduling layer that outputs whether a card is due or not, could make the adoption of future algorithms far easier. Anki could even free itself from concepts like “Interval” and simply be told whether a card is due (for example, if some next-gen scheduling technique were rescheduling other cards based on what is being reviewed right now).
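As a rough illustration of that idea, here is a sketch of what such a scheduling layer could look like. The `Review` record, the `Scheduler` protocol, and the toy `FixedDelayScheduler` are all made-up names for this example, not existing Anki APIs. The point is only that the core app would pass in the revlog and get back a set of due card IDs, without knowing anything about intervals.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Review:
    card_id: int
    timestamp: float  # seconds since some epoch
    grade: int        # e.g. 1 (Again) .. 4 (Easy)


class Scheduler(Protocol):
    """Hypothetical interface: revlog in, due card IDs out."""

    def due_cards(self, revlog: Sequence[Review], now: float) -> set[int]: ...


class FixedDelayScheduler:
    """Toy implementation: a card is due once a fixed delay has passed
    since its most recent review. Stands in for SM2, FSRS, or an NN."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds

    def due_cards(self, revlog: Sequence[Review], now: float) -> set[int]:
        last_seen: dict[int, float] = {}
        for review in revlog:
            previous = last_seen.get(review.card_id, review.timestamp)
            last_seen[review.card_id] = max(previous, review.timestamp)
        return {
            card_id
            for card_id, ts in last_seen.items()
            if now - ts >= self.delay_seconds
        }


revlog = [Review(1, 0.0, 3), Review(2, 50.0, 3), Review(1, 90.0, 4)]
scheduler: Scheduler = FixedDelayScheduler(delay_seconds=60.0)
print(scheduler.due_cards(revlog, now=120.0))
```

Because the interface returns due cards rather than intervals, a scheduler that reshuffles other cards after every review would fit without changes to the caller.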

Unless we run into technical difficulties, FSRS-7 will have an additional option to use the GPU to optimize an LSTM neural net first, and then train FSRS-7 to mimic it (knowledge distillation). With that + the improvement from FSRS-6 to FSRS-7, I think the case for replacing FSRS with a neural net is even less clear.

  1. With knowledge distillation you can make FSRS-7, a model with 35 parameters, become almost as accurate as a neural net with almost 9000 parameters. You get 99% of the accuracy for a tiny fraction of the cost to run it.
  2. FSRS-7 will be accurate enough that I don’t think pursuing predictive accuracy any further will yield substantial improvements, unless we implement a massive neural network that has tons of different input features like I described here. And that would be A LOT of effort. Anki isn’t made for using anything other than interval lengths and grades for SRS.
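For anyone unfamiliar with knowledge distillation, here is a toy sketch of the general idea: a small "student" model is fitted to the predictions of a bigger "teacher" rather than to the raw data. Nothing here is the real FSRS-7 or LSTM. The teacher is a fixed formula standing in for a trained network, the student is a one-parameter exponential forgetting curve, and the learning rate and step count are arbitrary assumptions.

```python
import math


def teacher(t_days: float) -> float:
    """Stand-in for a trained neural net: predicted recall after t_days."""
    return 1.0 / (1.0 + t_days / 90.0)


def student(t_days: float, decay: float) -> float:
    """Tiny student model with a single trainable parameter."""
    return math.exp(-decay * t_days)


def mse(decay: float, ts: list[float]) -> float:
    """Mean squared gap between student and teacher predictions."""
    return sum((student(t, decay) - teacher(t)) ** 2 for t in ts) / len(ts)


def distill(ts: list[float], decay: float = 0.05,
            lr: float = 0.001, steps: int = 500) -> float:
    """Plain gradient descent on the MSE against the teacher's outputs."""
    for _ in range(steps):
        grad = sum(
            2.0 * (student(t, decay) - teacher(t)) * (-t) * student(t, decay)
            for t in ts
        ) / len(ts)
        decay -= lr * grad
    return decay


ts = [float(t) for t in range(1, 31)]  # elapsed days 1..30
initial_loss = mse(0.05, ts)
fitted_decay = distill(ts)
final_loss = mse(fitted_decay, ts)
print(f"decay={fitted_decay:.4f}, loss {initial_loss:.4f} -> {final_loss:.6f}")
```

The student never sees any review data here, only the teacher's outputs. At run time you only need the cheap student, which is the whole appeal.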

What I’m getting at is that all the low-hanging fruit will have been picked once FSRS-7 is integrated into Anki. At that point, in order to significantly improve predictive accuracy, you’d need a neural net with millions of parameters and tons of input features. That would require changing a whole lot about how Anki handles scheduling, plus lots of clever tricks to keep everything scheduling-related from becoming painfully slow.
So while I proposed this just because why not (and because DerIshmaelite wanted to discuss it), the more I think about it, the more I think it would be either too little gain (replacing FSRS with LSTM instead of using LSTM as a teacher) or a big gain with even bigger development costs (using Alex’s RWKV, the “everything is an input feature, including twitching of the user’s left toe on Tuesday” neural net).

  1. Is this accurate? And the only thing you need to do for 99% accuracy is a 2-step optimization (first LSTM, then FSRS)?
  2. What does that mean for the user? Will the intervals that FSRS provides then be basically 100% correct (assuming good grading habits), meaning it will know precisely when to present the card to hit the DR exactly? I’m having a hard time understanding the practical impact here.