I feel like this is deserving of its own topic.
- Do you know why FSRS-rs is slightly worse than the Python implementation?
- I’m being annoying at this point, but please try my idea with seed=n reviews. Or at least find your old code that you used to plot this, maybe I’ll be able to adapt it to test my idea (tbh, probably not). I get that you are burned out, but if there is a chance that we could improve the optimizer by simply making the seed depend on the number of reviews, that would be awesome.
This matter has troubled me for quite some time. I intend to resolve it, but I have not yet identified the cause. It may be due to some underlying Rust libraries, which makes the issue exceedingly difficult to pinpoint.
Finally, I located the bug. The Adam optimizer doesn’t work as expected after the parameter clipper is applied in FSRS-rs.
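To illustrate the kind of interaction involved, here is a minimal PyTorch sketch of where a parameter clipper sits relative to Adam in a training loop. This is not the fsrs-rs/burn code, just an illustration under assumed names and values.

```python
# Illustration only: a parameter clipper applied after each Adam step.
import torch

w = torch.nn.Parameter(torch.tensor([0.5, 2.0]))
opt = torch.optim.Adam([w], lr=0.1)

for _ in range(100):
    loss = ((w - torch.tensor([3.0, -1.0])) ** 2).sum()  # toy objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Parameter clipper: keep each weight inside its allowed range.
    # If the clipping mechanism replaces or re-registers the parameters
    # instead of modifying them in place, the optimizer can lose the Adam
    # state (momentum/variance) attached to them -- the kind of mismatch
    # described above.
    with torch.no_grad():
        w.clamp_(min=0.01, max=10.0)
```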
It cost me 8 hours… And I cannot fix it by myself. This bug requires a patch from the upstream lib:
Is there a chance this will be fixed before Dae releases the final stable 24.11 release?
It’s highly unlikely. This bug is caused by a library that fsrs-rs depends on, and a fix requires a new release of that library. Moreover, fsrs-rs relies on an older version of it, and upgrading a dependency with breaking changes would involve an unpredictable amount of work.
Regarding seeds, I have a crappy idea.
We can add a new toggle - slow vs fast optimization. Fast - only one seed. Slow - 10 different seeds, keep the parameters that result in the lowest RMSE among all ten.
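For the sake of concreteness, here is a rough sketch of what “slow” would look like; `optimize_fn` and `rmse_fn` are placeholders for whatever the real optimizer and evaluator would be, not existing functions.

```python
# Rough sketch of the "slow" mode: run the optimizer with several seeds and
# keep whichever parameters give the lowest RMSE.
def optimize_slow(reviews, optimize_fn, rmse_fn, n_seeds=10):
    best_params, best_rmse = None, float("inf")
    for seed in range(n_seeds):
        params = optimize_fn(reviews, seed)   # placeholder optimizer call
        rmse = rmse_fn(params, reviews)       # placeholder evaluation
        if rmse < best_rmse:
            best_params, best_rmse = params, rmse
    return best_params
```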
The problem is that “slow” would be only marginally better than “fast” in most cases, at the cost of making the optimizer 10x slower.
Also
Upside: the user can now make a choice
Downside: the user now has to make a choice
The old flexibility vs simplicity dilemma
@sorata @vaibhav thoughts?
I am team simplicity down the drain.
I don’t think adding such a toggle would be worth it. People would feel that they are missing out if they don’t enable slow optimization, but the improvement in RMSE won’t be proportional to the extra time spent.
By the way, does the greatest improvement in RMSE from different seeds occur only in small collections? If so, we could try multiple seeds for smaller collections but only a single seed for larger ones.
Good question, I haven’t measured it. I’ll do it, but it will take a while.
Not to be a jerk about it, but that was my idea
Did you test it and find a positive effect? I tested it in the thread that I just linked to and I did not find that it improved anything.
I’d say we came up with it simultaneously.
Anyway, I couldn’t figure out a way to test it. I’m not that great at coding.
Maybe instead of having the toggle in the regular deck options it could be somewhere in the Anki preferences, like in the Review tab. That way people who care about the possible marginally better parameters can toggle it, and it wouldn’t add any complexity in the deck preferences for regular users.
And if/when automatic parameter optimization gets added to Anki, you can have settings related to that in the same area in preferences to keep things consistent.
Can you do something like you did for CMRR? Say, if the slow method takes 5 seconds here, it’s probably worth trying; if it takes 20 seconds, then do the optimisation with one seed.
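Something like this sketch, where a single run is timed first and the remaining seeds are only tried if that run was fast enough; the 5-second threshold and the callable names are illustrative, not a decided policy.

```python
import time

# Time one run, then decide whether trying more seeds is worth it.
def optimize_adaptive(reviews, optimize_fn, rmse_fn, n_seeds=10, fast_threshold_s=5.0):
    start = time.monotonic()
    best_params = optimize_fn(reviews, 0)      # placeholder optimizer call
    best_rmse = rmse_fn(best_params, reviews)  # placeholder evaluation
    if time.monotonic() - start > fast_threshold_s:
        return best_params  # one run is already slow; stick with a single seed
    for seed in range(1, n_seeds):
        params = optimize_fn(reviews, seed)
        rmse = rmse_fn(params, reviews)
        if rmse < best_rmse:
            best_params, best_rmse = params, rmse
    return best_params
```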
I’ll bet that’s not happening.
That’s what I will probably do, if it turns out that there is a significant (both statistically and practically) difference between using 1 and 10 seeds.
I’ll make the number of “runs” depend on the number of reviews.
(well, I’ll ask Jarrett to implement it)
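Roughly like this sketch; the thresholds are made up for illustration, nothing here is a decided policy.

```python
# Make the number of optimization runs depend on the review count.
def number_of_runs(review_count: int) -> int:
    if review_count < 1_000:
        return 10  # small collections: runs are cheap and seed choice matters most
    if review_count < 10_000:
        return 5
    return 1       # large collections: a single run is already slow
```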
What about the number of presets? Wouldn’t it be extremely painful if you have to optimise params for 240+ different presets?
Don’t make 240 presets, buddy
But seriously though, it’s the number of reviews that matters. The amount of time spent on optimization depends on the number of reviews, not on the number of presets.
A bit unrelated… Is there any progress on this?
No, but we will get a new dataset soon, and then we’ll measure how much metrics improve if parameters are set for every preset/deck individually.
However, it will be difficult to figure out the optimal number of decks/presets. We will be able to calculate how much of a difference it makes on average (across 10k users with all kinds of decks and presets), but not what number gives you the biggest difference (or at least right now I can’t think of a way to do that).
We’ll also be able to identify siblings, but that doesn’t matter because:
- Even if we can incorporate sibling reviews into the optimization “on paper”, we can’t actually do that in Anki
- I can guarantee you Jarrett will be like “I’m taking a break, f*** your siblings and their reviews, call me again after 10 years”