I keep trying to wrap my head around this, and I'm not convinced this should be the thing we're using as the main metric. For one, the name is misleading: if you actually want a "Total Remembered"-type stat, you'd want the code to be something like
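(sketching it in Python; the actual code isn't quoted here, so `total_remembered` and the list of per-card retrievabilities are hypothetical names for the idea described below, i.e. summing R values instead of counting cards):

```python
def total_remembered(retrievabilities):
    """Expected number of cards currently remembered.

    Each card contributes its retrievability R (a probability in [0, 1])
    rather than a hard remembered/forgotten tally, so the result is an
    expected value, like expected heads in a series of coin tosses.
    """
    return sum(retrievabilities)

# Three cards with R = 0.9, 0.8, 0.95 -> an expected count, not a tally
total_remembered([0.9, 0.8, 0.95])
```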
So what do you suggest the new criteria for the simulation should be, to ensure its robustness, so that Sherlock and Expertium get an intelligible idea of what is to be done? This discussion has digressed a lot, I feel, and I think Sherlock is having a meltdown.
How is that not? FSRS cannot tell you whether you've forgotten something or not; it only gives you a probability. It's similar to asking a person how many tails they got in a series of coin tosses: you take the number of tosses and multiply it by 50%, the probability of tails.
The way we define it elsewhere is different. It's taking each flashcard, weighting it by its R value, and summing over all cards (or, equivalently, average R × total cards), similar to how you would handle the coin-toss situation.
I guess the reason this feels more right to me is that if your entire deck is at Retrievability > DR, then you have zero cards to study today. Sure, that doesn't technically mean you "remember" or "don't remember," but we do kind of treat it that way. We aren't studying cards above the DR at all, ostensibly because that's the level of R we're satisfied with. That's the level at which we consider ourselves to "know" the information.
Well, for starters, I thought that backlog size might affect how different sorting methods perform, because of how long it takes to work through the backlog.
Imagine a 10000-card backlog.
Say you can only do 1000 cards a day,
and you have a true retention of 90% on those 1000 cards. That means 900 cards pass and, for simplicity's sake, the remaining 100 cards are sent to the next day.
This means the backlog has been reduced to 9100. But the new cards that come due the next day are another factor of influence: the backlog may end up larger than 10000 again, or smaller. It depends.
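To make that arithmetic concrete, here's a toy sketch of the backlog dynamic (the function and its parameters are illustrative assumptions, not anyone's actual simulator):

```python
def simulate_backlog(backlog, daily_limit, true_retention, new_due_per_day, days):
    """Toy backlog model: each day, review up to `daily_limit` cards;
    the failed fraction (1 - true_retention) returns to tomorrow's
    backlog, and `new_due_per_day` freshly due cards arrive."""
    history = [backlog]
    for _ in range(days):
        reviewed = min(backlog, daily_limit)
        failed = round(reviewed * (1 - true_retention))
        backlog = backlog - reviewed + failed + new_due_per_day
        history.append(backlog)
    return history

# The example above: 10000 backlog, 1000/day, 90% retention, no new dues
# -> after one day the backlog is 10000 - 1000 + 100 = 9100.
simulate_backlog(10000, 1000, 0.9, 0, 1)
```

Whether the backlog shrinks or grows then depends on how `new_due_per_day` compares with the 900 cards cleared per day.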
I think it is hard to get a completely robust simulation out of this, covering all the variables:
1- Backlog size
2- Number of new due reviews per day
3- Average true retention per day
4- New cards per day
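For what it's worth, those four knobs could be written down as an explicit parameter set so simulation scenarios stay reproducible (a sketch; the class and field names are just suggestions):

```python
from dataclasses import dataclass

@dataclass
class BacklogScenario:
    """One simulation scenario, covering the four variables above."""
    backlog_size: int          # 1- size of the initial backlog
    due_reviews_per_day: int   # 2- newly due reviews arriving each day
    true_retention: float      # 3- average true retention per day
    new_cards_per_day: int     # 4- new cards introduced per day

# e.g. the 10000-card backlog case discussed earlier
scenario = BacklogScenario(10000, 500, 0.9, 20)
```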
This should be the exact same thing as
Potential Retrievability Loss (PRL) = R(Today) - R(Tomorrow)
R^3/S would give you the continuous version (it's proportional to the instantaneous rate of loss, dR/dt), but cards aren't scheduled continuously; they're scheduled in discrete 1-day increments. So R(Today) - R(Tomorrow) gives you the discrete version of that calculation.
@rich70521 I don’t think your formula is very robust either. I’m having a very hard time thinking it through, but let me explain one pitfall.
Say you review four sets of due cards, A, B, C and D, on four different dates. All of them were due today, but you cannot review them all at once with the set review limit.
Now, if I try to apply your PRL formula, say I review the cards in the order A, B, C, D. That is, each day I compare PRL relative to tomorrow, so I study A first. Then B, C and D remain; I compare PRL the next day and study B. Then I study C, and then D.
At each step, the decisions make sense. My question, though: what if the PRL over four days is greater for D than for A? Then studying A on the first day doesn’t make sense.
This can happen if A is a high-S card, no? The first day’s R drop is high, but on later days it isn’t so much. @expertium Thoughts? This was bugging me all day, but I’m hopefully worrying about nothing.
Edit: Maybe I should’ve taken Math. My Biology isn’t helping me today.
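One way to at least probe this numerically, assuming an FSRS-4.5-style curve R(t) = (1 + 19t/(81S))^(-0.5) (an assumption, and the two stabilities below are made up) is to compare the one-day and four-day losses for a high-S and a low-S card that are both due today:

```python
FACTOR, DECAY = 19 / 81, -0.5  # FSRS-4.5-style constants (assumption)

def retrievability(t, s):
    """Power forgetting curve: R after t days for a card with stability s."""
    return (1 + FACTOR * t / s) ** DECAY

def loss(t, s, horizon):
    """Retrievability lost between day t and day t + horizon."""
    return retrievability(t, s) - retrievability(t + horizon, s)

# Two illustrative cards, both due today (t = 0)
high_s, low_s = 60.0, 3.0
for s in (high_s, low_s):
    print(f"S={s}: 1-day loss={loss(0, s, 1):.4f}, "
          f"4-day loss={loss(0, s, 4):.4f}")
```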
This could never happen because of the shape of the forgetting curve. If it’s losing more retrievability in the next one day, it’s losing more in any future time span.