Has anyone done a live comparison of FSRS and SM-2 as implemented in Anki? (It looks like no, so can anyone help me set it up?)

Oh, hey, I guess this is unlocked now, which I only noticed after contributing to a discussion on GitHub instead.

It has been 729 days since October 19, 2023, so I’ll be looking at cards introduced since then (if full-collection history is useful, let me know, but I have a lot more cards that had many SM-2 reviews before I started this experiment).

Also, the stats page has more info than it used to! I don’t seem to be able to attach the PDF so I guess you’re getting a bunch of screenshots.

prop:cdn:expcat=0 introduced:729 (FSRS with default weights: 5691 cards, 68 of those suspended, 57 tagged as leech though I may have deleted some of those tags, 4470 mature)

(I’m not including a screenshot of the “what button did you press” one, since there aren’t any numbers on it any more, but these are the numbers I see when I hover over it.)

learning: 88.62% correct, 32,578/36,963; young: 90.82% correct, 18,396/20,256; mature: 90.69% correct, 7,591/8,370

prop:cdn:expcat=1 introduced:729 (SM-2, 5718 cards, 52 of those suspended, 45 of those tagged as leech, same caveats, 4391 mature)

learning: 88.83% correct, 33,194/37,369; young: 92.07% correct, 22,200/22,411; mature: 89.68% correct, 7,210/8,040

prop:cdn:expcat=2 introduced:729 (FSRS with calibrated weights, 5661 cards, 70 of those suspended, 60 of those tagged as leech, same caveats, 4708 mature)

learning: 88.57% correct, 32,663/36,877; young: 89.14% correct, 14,918/16,736; mature: 89.36% correct, 6,894/7,715

I’d appreciate any help with analyzing this data, since statistics was never my strong suit. (We can dig deeper into other data if desired; this is just what I could grab quickly from the stats page.) A 4-8% reduction in study time in exchange for a 1-2 percentage point drop in retention seems meaningful to me, unless that 1-2 percentage point drop is itself significant.
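In case it helps anyone get started, here is a minimal sketch of one way to check whether a retention gap like these is bigger than chance: a two-proportion z-test on the correct/total counts. The function name `two_proportion_z_test` is just something I made up for illustration, and the example plugs in the mature counts quoted above for FSRS with default weights and SM-2. One big assumption: it treats every review as independent, which isn't really true (the same card gets reviewed many times), so take the p-value as a rough guide rather than a verdict.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Compare two success rates x1/n1 and x2/n2.

    Returns (difference in proportions, z statistic, two-sided p-value).
    Assumes independent trials, which is only approximately true for reviews.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, z, p_value


# Mature reviews: FSRS with default weights vs. SM-2 (counts from the post).
diff, z, p = two_proportion_z_test(7591, 8370, 7210, 8040)
print(f"difference = {diff:.4f}, z = {z:.2f}, p = {p:.4f}")
```

The same call works for the learning and young rows, or for the calibrated-weights group, by swapping in the other counts. A test that accounts for repeated reviews of the same card would be more trustworthy if someone knows how to set that up.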