Currently with FSRS 5 I get the following statistics on my Profile1:
Log loss: 0.4479, RMSE(bins): 2.76%. Smaller numbers indicate a better fit to your review history.
I also have a Profile2 that I used 2–3 years ago with SM-2. Although SM-2 is not based on recall probability, is there a way to compare, or to generate RMSE for, that collection?
No, you can’t generate RMSE on an SM-2 scheduling history. SM-2 wasn’t doing anything to predict your recall, so there’s nothing to compare it to.
RMSE (bins) can be interpreted as the average difference between the predicted probability of recalling a card (R) and the measured (from the review history) probability. For example, RMSE=0.05 means that, on average, FSRS is off by 5% when predicting R.
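To make that concrete, here is a simplified sketch of a binned RMSE in plain Python: reviews are grouped into bins by predicted R, and each bin's mean prediction is compared with its observed recall rate, weighted by bin size. The real FSRS metric uses a more elaborate binning scheme, so treat this as illustrative only.

```python
import math

def rmse_bins(predicted, actual, n_bins=10):
    """Simplified RMSE(bins) sketch.

    predicted: predicted probabilities of recall (R), one per review.
    actual: review outcomes, 1.0 for recalled and 0.0 for forgotten.
    """
    # Group reviews into equal-width bins by predicted R.
    buckets = [[] for _ in range(n_bins)]
    for p, a in zip(predicted, actual):
        i = min(int(p * n_bins), n_bins - 1)
        buckets[i].append((p, a))

    # Squared gap between each bin's mean prediction and its observed
    # recall rate, weighted by the number of reviews in the bin.
    se, total = 0.0, len(predicted)
    for bucket in buckets:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        mean_a = sum(a for _, a in bucket) / len(bucket)
        se += len(bucket) * (mean_p - mean_a) ** 2
    return math.sqrt(se / total)
```

For example, if every review is predicted at R = 0.9 and 9 out of 10 are recalled, the prediction matches the observed rate and the metric is 0.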
That probability-predicting version of SM-2 was made just for benchmarking purposes, so you'd have to copy-paste the Python code to use it on your own. The original SM-2 doesn't predict probabilities, so in the benchmark LMSherlock added some extra formulas on top of it.
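For illustration, one simple way to bolt a recall probability onto SM-2's intervals is to assume an exponential forgetting curve anchored at 90% recall exactly at the scheduled interval. This is an assumption for the sketch, not necessarily the exact formula used in the benchmark:

```python
def sm2_predict_recall(elapsed_days, scheduled_interval):
    """Hypothetical forgetting curve for SM-2 benchmarking.

    Assumes recall probability is exactly 0.9 when the elapsed time
    equals the scheduled interval, decaying exponentially otherwise.
    Illustrative only; the benchmark's actual formulas may differ.
    """
    return 0.9 ** (elapsed_days / scheduled_interval)
```

With predicted probabilities like these, SM-2's review history can be scored with log loss or RMSE(bins) just like FSRS.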
Theoretically, it’s possible to add the same probability-related formulas to Anki’s version of SM-2, hook it up to the optimizer and run optimization the same way as we do with FSRS…but why?
Btw, according to the benchmark, FSRS-5 outperforms the trainable variant of SM-2 in 97.4% of cases, so even if you level the playing field by making both optimizable and using the same optimizer, FSRS is still clearly better.
Actually, on second thought, it would be fun and it would make it clear that SM-2 isn’t as good as FSRS. We could show both FSRS and SM-2 metrics side-by-side in “Evaluate”, so that people will be like “SM-2 has higher numbers, so it’s worse, ok, got it”.
Is it worth the effort to implement? Debatable. Will Jarrett do it? I wouldn’t bet on it.