Ordering Request: Reverse Relative Overdueness

I keep trying to wrap my head around this, and I’m not convinced this should be the main metric. For one, the name is misleading: if you actually want a “Total Remembered” type of stat, you’d want the code to be something like

total_remembered = (card["retrievability"] > desired_retention).sum()

I’m not sure that just adding up the retrievability scores is easy to understand, or that it’s all that useful.

Edit:

It’s workload:knowledge which is also what we use elsewhere like in CMRR or “knowledge acquisition rate” in Stats.

So, what do you suggest the new criteria for the simulation should be, to ensure its robustness, so that Sherlock and Expertium get an intelligible idea of what needs to be done? I feel this discussion has digressed a lot, and I think Sherlock is having a meltdown.

How is that not? FSRS cannot tell you whether you’ve forgotten something or not; it only gives you a percentage. It’s similar to asking a person how many tails they got in a series of coin tosses: you take the number of tosses and multiply it by 50%, the probability of tails.
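As a sketch of that coin-toss analogy (the toss count and seed here are arbitrary): the expected number of tails is just the sum of the per-toss probabilities, the same way summing per-card R gives an expected count of remembered cards.

```python
import random

# Expected number of tails = sum of per-toss probabilities.
n_tosses = 10_000
p_tails = 0.5
expected_tails = n_tosses * p_tails  # 5000.0

# A single experiment will scatter around that expectation:
random.seed(42)  # arbitrary seed, for reproducibility
observed_tails = sum(random.random() < p_tails for _ in range(n_tosses))
print(expected_tails, observed_tails)
```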

Hopefully they don’t feel obligated to follow it too closely. I don’t expect them to carry out any of my requests.

True. Yeah, you all might be right.

This is what I was suggesting:

But I’m not sure that I’m right. Total R may be the best.

The way we define it elsewhere is different. It’s taking a flashcard and multiplying it by its R value, doing that for every card, and summing them up (or average R × total cards). Similar to what you would do in a coin-toss situation.

I guess the reason this feels more right to me is that if your entire deck has retrievability > DR, then you have zero cards to study today. Sure, that doesn’t technically mean you “remember” or “don’t remember” them, but we do kind of treat it that way. We aren’t studying cards above the DR at all, ostensibly because that’s the level of R we’re satisfied with, the level at which we consider ourselves to “know” the information.

Are you sure something is being multiplied here? I think flashcards just have an R, they’re not being multiplied by R. Here’s the code for it:

total_remembered = int(card["retrievability"].sum())

It’s just summing up each card’s R value.

You also said “elsewhere”, so I’m not sure I’m following.

It’s equivalent to taking the average retrievability and multiplying it by the total number of cards.
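A quick sketch of that identity, with made-up retrievability values:

```python
import numpy as np

# Hypothetical per-card retrievabilities
retrievability = np.array([0.97, 0.90, 0.85, 0.70])

total_r = retrievability.sum()
avg_times_count = retrievability.mean() * len(retrievability)

# Identical by the definition of the mean
print(total_r, avg_times_count)
```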

Each flashcard is one card, so you’re multiplying 1 by the R value. By “elsewhere” I meant the FSRS wiki on GitHub.

Right

The sorting will be exactly the same.

Well, for starters, I thought that backlog size might affect how different sorting methods perform, because of how long it takes to work through the backlog.

You can imagine a 10000-card backlog.

  1. You can only do 1000 cards a day.
  2. You have a true retention of 90% on those 1000 cards. That means 900 cards have passed and, for simplicity’s sake, the remaining 100 cards are pushed to the next day.
  3. This means the backlog has been reduced to 9100. But the new cards that come due the next day are another factor: the backlog may end up larger than 10000 cards again, or smaller. It depends.
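The steps above can be sketched as a toy recurrence. The number of newly due cards per day is a made-up figure, since that’s exactly the factor that decides whether the backlog grows or shrinks:

```python
# Toy backlog model using the numbers from the example above.
backlog = 10_000
review_limit = 1_000
true_retention = 0.90   # 900 of 1000 pass; the 100 lapses stay in the pile
new_due_per_day = 950   # hypothetical; this decides growth vs. shrinkage

for day in range(5):
    reviewed = min(backlog, review_limit)
    passed = int(reviewed * true_retention)
    # Passed cards leave the backlog; newly due cards join it.
    backlog = backlog - passed + new_due_per_day
    print(day + 1, backlog)
```

With these particular numbers the backlog grows by 50 cards a day, illustrating the “it depends” above.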

I think it is hard to get a completely robust simulation out of this that covers all the variables.


  1. Backlog size
  2. Number of new due reviews per day
  3. Average true retention per day
  4. New cards per day

Those are the variables I could think of.

Gotcha. Agreed.

This should be exactly the same thing as:
Potential Retrievability Loss (PRL) = R(Today) - R(Tomorrow)

R^3/S would give you that for the continuous distribution, but cards aren’t scheduled continuously; they’re scheduled discretely, in 1-day increments. So R(Today) - R(Tomorrow) gives you the discrete version of that calculation.
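A sketch of that discrete calculation, assuming the FSRS-4.5 power forgetting curve R(t) = (1 + FACTOR * t / S) ** DECAY with DECAY = -0.5 and FACTOR = 19/81; the elapsed days and stabilities below are made up:

```python
# Assumed FSRS-4.5 forgetting-curve constants, calibrated so R(S) = 0.9
DECAY = -0.5
FACTOR = 19 / 81

def retrievability(t: float, s: float) -> float:
    """R after t elapsed days for a card with stability s."""
    return (1 + FACTOR * t / s) ** DECAY

def prl(t: float, s: float) -> float:
    """Potential Retrievability Loss over the next day: R(Today) - R(Tomorrow)."""
    return retrievability(t, s) - retrievability(t + 1, s)

print(prl(t=10, s=10))   # low-stability card: larger one-day drop
print(prl(t=10, s=100))  # high-stability card: smaller one-day drop
```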

In the simulation they have this at the beginning:

learn_limit_perday = 20
review_limit_perday = 80

I tried to do that in Anki, and I’d get this warning:

If adding 20 new cards each day, your review limit should be at least 200.

Is that warning obsolete, or should the sim be changed? Should it be 10/100 or something?

Edit: or was that intentional to keep a backlog maintained?

Normally, if you’re learning 1 new card every day, you should have a daily load of 10 cards.

It was intentional, yes.

It’s also kind of an obsolete guideline. These days, we should be using simulations to determine the expected daily load.

@rich70521 I don’t think your formula is very robust either. I’m having a very hard time thinking it through, but let me explain one pitfall.

Say you review four sets of due cards, A, B, C and D, on four different dates. Now, all of them were due today, but you cannot review them all at once with the set review limit.

Now, if I try to apply your PRL formula: say I review the cards in the order A, B, C, D. That is, each day I compare PRL in relation to tomorrow, so I study A first. Then B, C and D remain; I compare PRL the next day and study B. Then I study C, and then D.

At each step, the decision makes sense. My question, though: what if the PRL over four days is greater for D than for A? Then studying the A cards on the first day doesn’t make sense.

This can happen if A is a high-S card, no? The first day’s R drop is high, but on later days it’s not so much. @expertium Thoughts? This was bugging me all day, but hopefully I’m worrying about nothing.

Edit: Maybe I should’ve taken Math. My Biology isn’t helping me today.

This could never happen, because of the shape of the forgetting curve: if a card is losing more retrievability over the next one day, it’s losing more over any future time span.
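A quick numerical spot check of that claim (not a proof), again assuming the FSRS-4.5 power forgetting curve, with hypothetical stabilities and elapsed times:

```python
# Assumed FSRS-4.5 forgetting-curve constants
DECAY = -0.5
FACTOR = 19 / 81

def retrievability(t: float, s: float) -> float:
    return (1 + FACTOR * t / s) ** DECAY

def loss(t: float, s: float, horizon: int) -> float:
    """Retrievability lost between day t and day t + horizon."""
    return retrievability(t, s) - retrievability(t + horizon, s)

# Card A: low stability (bigger one-day drop); card D: high stability.
# Hypothetical values throughout.
for horizon in (1, 2, 4, 8):
    a = loss(t=5, s=5, horizon=horizon)
    d = loss(t=50, s=100, horizon=horizon)
    print(horizon, a > d)  # A keeps losing more at every horizon
```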