Comparing anecdotal actual retention to FSRS average predicted retention

I have a deck of about 10,000 cards. I made a custom session to review all of them at once. Before I started, the deck's average predicted retention was 98.2% (desired retention was set to 95%).

Once I had finished reviewing and done the math, my ACTUAL retention was 94%. Is a discrepancy of this size normal? And what are the error ranges of the average predicted retention, so that I can take them into account the next time I read one?


Note: The reviewing was carried out over an extended period of time (3 days), so my performance may have been influenced by stamina, nutrition and time of day (though I doubt this alone would cause so large a difference).

Can you explain how exactly you got this number?

Also, make sure to read this post: Reddit - Dive into anything.

what are the error ranges of the average predicted retention

That’s a really good question. Frankly, I have no idea.

Can you explain how exactly you got this number?

I simply tested myself on ALL the cards in my deck (9226 cards in this case). According to the average predicted retention, Anki suggests that I should have 98.2% retention. I tested myself and failed to retrieve 550 of those cards, meaning I passed only 8676 cards, hence a true retention (and in this case, my actual and real retention of the entire deck) of 94%.

98% is not 94%, even with desired retention set at 95%, so this gave me a false sense of security (nothing too dramatic, but still).

Now I do feel like this is much lower for some reason. Whenever I look away from Anki I feel like I know much less than even what True Retention suggests, which is weird (but that is another problem).
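The arithmetic above can be checked, and given a rough sampling-error range, with a short sketch. It uses a Wilson score interval, which is a swapped-in textbook technique and not something Anki or FSRS reports; the function name `wilson_interval` and the 95% level (`z=1.96`) are assumptions for illustration. The interval only reflects sampling noise in the observed pass rate, treating each review as independent; it says nothing about the error of the FSRS prediction itself.

```python
import math

def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / total
    denom = 1 + z**2 / total
    center = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width

# Numbers taken from the post above.
reviewed = 9226
failed = 550
passed = reviewed - failed          # 8676
predicted = 0.982                   # average predicted retention shown by Anki

observed = passed / reviewed
low, high = wilson_interval(passed, reviewed)

print(f"observed retention: {observed:.2%}")
print(f"approx. 95% interval: {low:.2%} to {high:.2%}")
print(f"prediction inside interval: {low <= predicted <= high}")
```

With a sample this large the interval is narrow (roughly half a percentage point either side), so a gap of about four points is unlikely to be sampling noise alone under these assumptions.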

I see. Unfortunately, this likely means that FSRS just isn’t accurate enough.


I don’t know how much my input helps, if at all. Maybe Sherlock could use it as a tiny probe into the performance of FSRS in the current build.

If it helps, I also had virtually no backlog for that deck at all.

Out of curiosity, what’s your logloss and RMSE for that deck (well, for the preset that is applied to the deck, but you get what I mean)?

FSRS will never be precise in the moment. After all, your memory can be affected by many factors: you haven’t had enough sleep, you’re stressed, you’re not eating well, you’re annoyed, you’re tired. Or vice versa, everything is fine with you. You can be more productive at certain hours of the day and less at others. Or it’s just that FSRS didn’t have enough reviews to model your memory more accurately.

I will assume that 10,000 cards in 3 days is your non-standard load, which could also affect memory.


The deck consists of many subdecks with their own presets, so I can only give ranges. From what I have seen, log loss is around 0.15-0.21 and RMSE averages around 5% (I think 6.8% is the highest I have seen).

Could very well be. I was just wondering whether the difference between 94% and 98% is too large to be explained by fatigue or monotony alone, which would suggest that something else is behind it.

(I generally sleep about 6 hours, drink <750ml of water, eat only at midnight before I sleep, take sulpiride as an antidepressant and barely move at all), so I am open to that suggestion.

The example was not meant to say that there is something wrong with you. It was about the fact that your memory today may be very different from yesterday's for many reasons.

Your check lasted 3 days. If you had enough cards with low stability and did not review them first, their “R” could simply have fallen during the check. Also, 98% is too extreme. Imagine that you were able to bring your retention to 99.999% and immediately afterwards decided to check the FSRS score. Do you think the result would be correct?
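The point about “R” falling during a multi-day check can be illustrated with a minimal sketch. It assumes the FSRS-4.5 forgetting curve, R(t, S) = (1 + 19/81 · t/S)^(-0.5), where S is stability in days; the thread does not say which FSRS version was in use, and other versions use a different curve. The stability values and the 98% starting point are hypothetical, chosen only to show the effect, and the helper names are made up for this example.

```python
# Assumption: FSRS-4.5 forgetting curve, R(t, S) = (1 + FACTOR * t / S) ** DECAY.
# Other FSRS versions use a different shape, so treat the numbers as illustrative.
FACTOR = 19 / 81
DECAY = -0.5

def retrievability(t_days: float, stability: float) -> float:
    """Predicted probability of recall t_days after the last review."""
    return (1 + FACTOR * t_days / stability) ** DECAY

def elapsed_for(r: float, stability: float) -> float:
    """Days since the last review at which retrievability equals r (inverse of the curve)."""
    return stability * (r ** (1 / DECAY) - 1) / FACTOR

def retrievability_after_delay(r_start: float, stability: float, delay_days: float) -> float:
    """Retrievability of a card that was at r_start, after waiting delay_days more."""
    t0 = elapsed_for(r_start, stability)
    return retrievability(t0 + delay_days, stability)

# Hypothetical cards that all show R = 98% when the check starts,
# but are only reached 3 days later.
for stability in (2, 10, 50, 365):
    r_later = retrievability_after_delay(0.98, stability, delay_days=3)
    print(f"S = {stability:>3} days: R after 3 more days = {r_later:.1%}")
```

Under this curve, low-stability cards lose noticeably more retrievability over the three days than high-stability ones, so the snapshot taken before the check overstates what is left by the time those cards are reached.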

Speaking of that… Here is what it thinks of my deck now:
image

@Expertium So do I trust it or not? Or should I just mentally deduct 5% of it to get my actual retention? Or is there a certain point where FSRS can’t predict retention accurately anymore?

I am nowhere near 100% confident that I have mastered the content that well.

That’s about average. FSRS probably just isn’t accurate enough.

Or should I just mentally deduct 5% of it to get my actual retention?

That’s very crude, idk if that’s a good idea.

Or is there a certain point where FSRS can’t predict retention accurately anymore?

In general, FSRS’s ability to accurately predict the probability of recall is worse for very high (>95%) and very low (<60%) probabilities. Around 80%-90% is where FSRS performs the best. We’re still trying to figure out how to fix that. We might change the shape of the forgetting curve again, for the third time, but there are reasons why that’s not necessarily a good idea. At least if our goal is more than just minimizing RMSE at any cost.


In general, FSRS’s ability to accurately predict the probability of recall is worse for very high (>95%) and very low (<60%) probabilities. Around 80%-90% is where FSRS performs the best.

Huh. Well, that explains it a bit. But this also worries me, because if I understood correctly, this means that FSRS believes that it’s doing a good job at higher retention rates when it actually isn’t.

Is there a paper-over-crack solution for this?

Is there a paper-over-crack solution for this?

Nope.
