Any estimates on how buggy it will be on old devices (phones especially)?
Right now it's slow as hell. Alex said right now RWKV can process like 13 reviews/second. But there are many ways to speed it up. It's not impossible, but it does require someone who knows this stuff and is willing to put in the work.
We could probably get to 1000 reviews/second + use a smart way to prioritize some cards to run RWKV on, to save some time.
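For a rough sense of scale, here is a back-of-the-envelope sketch comparing the two rates mentioned above. The review count of 100,000 is a made-up number for illustration only, not a measurement of any real collection, and `seconds_to_process` is a hypothetical helper.

```python
def seconds_to_process(num_reviews: int, reviews_per_second: float) -> float:
    """Wall-clock seconds needed to run the model over num_reviews reviews."""
    if reviews_per_second <= 0:
        raise ValueError("reviews_per_second must be positive")
    return num_reviews / reviews_per_second


# Assumption: a hypothetical review history size, chosen only for illustration.
HYPOTHETICAL_REVIEWS = 100_000

for rate in (13, 1000):  # the two rates mentioned in the thread
    secs = seconds_to_process(HYPOTHETICAL_REVIEWS, rate)
    print(f"{rate:>5} reviews/s -> {secs / 60:.1f} minutes")
```

Under that assumption the difference is roughly two hours versus under two minutes, which is why prioritizing which cards get the expensive model matters so much at the current speed.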
Given you seem somewhat positive about RWKV, what do you think of all the FSRS related work? Feels like that was for nothing if now we just ditch FSRS and do RWKV.
Well, that's just the nature of progress. New stuff replaces old stuff. Plus, maybe some users would prefer FSRS (we would keep both).
Anki using SM2, FSRS, RWKV all at the same time...
What about implementing RWKV as an addon at first? Some users would trade speed for precision.
As far as I know, the neural network seems to be too big to be used as a plugin
Alex isn't interested in maintaining add-ons, so we won't have anyone to do the work either.
If the file is too large, it is actually easier to use an add-on: uploads to AnkiWeb are limited to about 150 MB, but add-ons can download files from elsewhere after installation, so in practice there is no limit (e.g. Ankimon, Ankihub). I don't know much about NNs, but the RWKV that Expertium mentions looks very small.
The weights for RWKV are like 10 MB, so that's not a problem. But I'm not sure if using an add-on is possible. I asked Alex in the Anki Discord server, I'll copy his reply... if he replies.
Do you have experience yourself making an addon like this to host the NN, if the plan goes through? It doesn't have to be RWKV, but we are still here talking about it.
I don't know how it works because I haven't studied anything about NNs. If the program already works and does not need adjustment, it can probably be embedded in an add-on. In that case maybe other developers could build it: AI is popular among developers, so there are relatively many developers of AI-related add-ons.
It does not sound practical to integrate in the near future, and we still have our hands full dealing with FSRS.
Fair enough
Once FSRS reaches its full potential, then maybe this could be revisited.
Do you have a link to the pretrained weights? I wonder how difficult it would be to make an add-on. Might even contribute a way to extend the plugin hooks if necessary.
Well, consider the following scenario (which isn't reality, but seems possible to me):
- we can run some RWKV or similar NN on 10,000 cards every 10 minutes on a smartphone
- that's way fewer cards than the average user has in their collection
- but it's way more new cards than anyone is going to add in a day
- the NN performs better than FSRS at scheduling new cards.
Then it would be reasonable to turn over management of cards with short intervals to the NN and leave the rest to FSRS. As for "more or less confusing", I guess it is subjective. We did have a similar thing not too long ago where we turned over management of short intervals from FSRS to a different algorithm, namely the manually-specified "learning steps". Actually, I still use them occasionally.
This doesn't address any of the issues with developing this feature, just seems like it might be worth keeping in mind.
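To make the split concrete, here is a minimal sketch of how cards might be routed between the two schedulers. Everything in it is hypothetical: the `Card` class, the `split_by_interval` function and the 7-day threshold are invented for illustration and are not Anki's data model or API. Only the 10,000-card budget comes from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Card:
    card_id: int
    interval_days: float  # current interval of the card


def split_by_interval(cards, threshold_days=7.0, nn_budget=10_000):
    """Send the shortest-interval cards to the NN (up to nn_budget); the rest go to FSRS."""
    # Cards below the threshold are NN candidates, shortest intervals first.
    short = sorted(
        (c for c in cards if c.interval_days < threshold_days),
        key=lambda c: c.interval_days,
    )
    nn_cards = short[:nn_budget]
    nn_ids = {c.card_id for c in nn_cards}
    # Anything that did not fit into the NN budget stays with FSRS.
    fsrs_cards = [c for c in cards if c.card_id not in nn_ids]
    return nn_cards, fsrs_cards


cards = [Card(1, 0.5), Card(2, 30.0), Card(3, 2.0), Card(4, 6.0)]
nn_cards, fsrs_cards = split_by_interval(cards, threshold_days=7.0, nn_budget=2)
print([c.card_id for c in nn_cards], [c.card_id for c in fsrs_cards])
```

The budget cap is the important part: if more cards qualify than the phone can process in time, the overflow simply falls back to FSRS rather than blocking anything.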
Unrelated thought on this topic: the currently-widespread solution to "I don't want to run an NN on my smartphone" is to outsource the scheduling computations to the cloud (or home servers). Which seems like it would be very difficult to adapt Anki for, for about a thousand reasons. But it would be pretty cool.
Hello @dae,
What if the way the scheduling algorithm integrates into Anki was reworked later on?
I mean, I can understand that making both SM2 and FSRS coexist is already challenging, but I think with more maturity on what it means to schedule something, we could maybe find a way to abstract those schedulers so that new ones can be implemented more easily.
When you think about it, having Anki define concepts like the revlog, which would be fed to a separate scheduling layer that outputs whether a card is due or not, could make the adoption of future algorithms far easier. Anki could even free itself from concepts like "Interval" and simply be told when a card is due (for example, if some next-gen scheduling technique were rescheduling other cards based on what is being reviewed right now).
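As a rough sketch of what such an abstraction could look like: the scheduler only sees review history and answers "is this card due?". All names here (`RevlogEntry`, `Scheduler`, `FixedDelayScheduler`, `due_cards`) are hypothetical and do not correspond to Anki's actual code; the toy scheduler is just a placeholder for SM2, FSRS or a neural net.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class RevlogEntry:
    card_id: int
    timestamp: float  # seconds since epoch
    grade: int        # 1 = Again ... 4 = Easy


class Scheduler(Protocol):
    """Anything that can decide from a card's history whether it is due."""

    def is_due(self, history: Sequence[RevlogEntry], now: float) -> bool: ...


class FixedDelayScheduler:
    """Toy scheduler: a card is due once delay_seconds have passed since its last review."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds

    def is_due(self, history: Sequence[RevlogEntry], now: float) -> bool:
        if not history:
            return True  # never reviewed, so show it
        return now - history[-1].timestamp >= self.delay_seconds


def due_cards(scheduler: Scheduler, histories: dict[int, list[RevlogEntry]], now: float) -> list[int]:
    """The host application only asks 'due or not'; it never sees an interval."""
    return [cid for cid, hist in histories.items() if scheduler.is_due(hist, now)]


histories = {
    1: [RevlogEntry(1, 0.0, 3)],
    2: [RevlogEntry(2, 90.0, 3)],
    3: [],
}
print(due_cards(FixedDelayScheduler(60.0), histories, now=100.0))
```

Swapping in a different algorithm would then mean writing another class with an `is_due` method, without touching `due_cards`.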
Unless we run into technical difficulties, FSRS-7 will have an additional option to use the GPU to optimize an LSTM neural net first, and then train FSRS-7 to mimic it (knowledge distillation). With that + the improvement from FSRS-6 to FSRS-7, I think the case for replacing FSRS with a neural net is even less clear.
- With knowledge distillation you can make FSRS-7, a model with 35 parameters, become almost as accurate as a neural net with almost 9000 parameters. You get 99% of the accuracy for a tiny fraction of the cost to run it.
- FSRS-7 will be accurate enough that I don't think pursuing predictive accuracy any further will yield substantial improvements, unless we implement a massive neural network that has tons of different input features like I described here. And that would be A LOT of effort. Anki isn't made for using anything other than interval lengths and grades for SRS.
What I'm getting at is that all the low-hanging fruit will have been picked once FSRS-7 is integrated into Anki, and at that point in order to significantly improve predictive accuracy you'd need a neural net with millions of parameters and tons of input features that would require changing a whole lot about how Anki handles scheduling and would require lots of clever tricks to not make everything scheduling-related painfully slow.
So while I proposed this just because why not (and because DerIshmaelite wanted to discuss it), the more I think about it, the more I think it would be either too little gain (replacing FSRS with LSTM instead of using LSTM as a teacher) or a big gain with even bigger development costs (using Alex's RWKV, the "everything is an input feature, including twitching of the user's left toe on Tuesday" neural net).
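For readers unfamiliar with the teacher/student idea, here is a deliberately tiny sketch of knowledge distillation. It is not FSRS-7 or an LSTM: the "teacher" is a stand-in formula pretending to be a trained neural net, and the "student" is a one-parameter exponential forgetting curve. All functions and constants are assumptions made up for illustration. The point is only that the student is fitted to the teacher's predicted recall probabilities rather than to raw review outcomes.

```python
import math


def teacher_recall(t_days: float) -> float:
    """Stand-in for a trained neural net's predicted recall (hypothetical formula)."""
    return 1.0 / (1.0 + t_days / 90.0)


def student_recall(t_days: float, stability: float) -> float:
    """Tiny student model: exponential forgetting curve with a single parameter."""
    return math.exp(-t_days / stability)


def mse(delays, stability: float) -> float:
    """Mean squared gap between the student's and the teacher's predictions."""
    return sum((student_recall(t, stability) - teacher_recall(t)) ** 2 for t in delays) / len(delays)


def distill(delays, steps: int = 5000, lr: float = 0.5, init_stability: float = 10.0) -> float:
    """Fit the student's parameter to the teacher's soft predictions by gradient descent."""
    targets = [teacher_recall(t) for t in delays]
    theta = math.log(init_stability)  # optimise log-stability so stability stays positive
    for _ in range(steps):
        s = math.exp(theta)
        grad = 0.0
        for t, y in zip(delays, targets):
            p = student_recall(t, s)
            # d/dtheta of (p - y)^2, using dp/dtheta = p * t / s
            grad += 2.0 * (p - y) * p * (t / s)
        theta -= lr * grad / len(delays)
    return math.exp(theta)


delays = [1, 2, 5, 10, 20, 40, 80]
fitted = distill(delays)
print(f"fitted stability: {fitted:.1f} days, remaining error: {mse(delays, fitted):.5f}")
```

The student cannot match the teacher exactly because its curve has a different shape, which mirrors the "almost as accurate" claim: a small model trained this way inherits most, not all, of the teacher's behaviour.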
- Is this accurate? And the only thing you need to do for 99% accuracy is a 2-step optimization (first LSTM, then FSRS)?
- What does that mean for the user? Will the intervals that FSRS provides then be basically 100% correct (assuming good grading habits), meaning it will precisely know when to present the card to actually reach the DR exactly? I'm having a hard time understanding the practical impact here.
