Ordering Request: Reverse Relative Overdueness

I keep trying to wrap my head around this, and I’m not convinced this should be the main metric. For one, the name is misleading: if you actually want a “Total Remembered” type of stat, you’d want the code to be something like

total_remembered = (card["retrievability"] > desired_retention).sum()

I’m not sure that just adding up the retrievability scores is easy to understand, or that it’s all that useful.

Edit:

It’s workload:knowledge which is also what we use elsewhere like in CMRR or “knowledge acquisition rate” in Stats.

So, what do you suggest the new criteria for the simulation should be, to ensure its robustness, so that Sherlock and Expertium get an intelligible idea of what needs to be done? I feel this discussion has digressed a lot, and I think Sherlock is having a meltdown.

How is that not? FSRS cannot tell you whether you’ve forgotten something or not; it only gives you a percentage. It’s similar to asking a person how many tails they got in a series of coin tosses: you take the number of tosses and multiply it by 50%, the probability of tails.
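As a sketch of that coin-toss analogy (the toss count and seed here are arbitrary): the expected number of tails is just the sum of the per-toss probabilities, the same way summing per-card R gives an expected count of remembered cards.

```python
import random

# Expected number of tails = sum of per-toss probabilities.
n_tosses = 10_000
p_tails = 0.5
expected_tails = n_tosses * p_tails  # 5000.0

# A single experiment will scatter around that expectation:
random.seed(42)  # arbitrary seed, for reproducibility
observed_tails = sum(random.random() < p_tails for _ in range(n_tosses))
print(expected_tails, observed_tails)
```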

Hopefully they don’t feel obligated to follow it too closely. I don’t expect them to carry out any of my requests.

True. Yeah, you all might be right.

This is what I was suggesting:

But I’m not sure that I’m right. Total R may be the best.

The way we define it elsewhere is different. It’s taking a flashcard and multiplying it by its R value, doing that for every card, and summing them up (or average R × total cards). Similar to what you would do in a coin-toss situation.

I guess the reason this feels more right to me is that if your entire deck has retrievability > DR, then you have zero cards to study today. Sure, that doesn’t technically mean you “remember” or “don’t remember” them, but we do kind of treat it that way. We aren’t studying cards above the DR at all, ostensibly because that’s the level of R we’re satisfied with, the level at which we consider ourselves to “know” the information.

Are you sure something is being multiplied here? I think flashcards just have an R, they’re not being multiplied by R. Here’s the code for it:

total_remembered = int(card["retrievability"].sum())

It’s just summing up each card’s R value.

You also said “elsewhere”, so I’m not sure I’m following.

It’s equivalent to taking the average retrievability and multiplying it by the total number of cards.
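A quick sketch of that identity, with made-up retrievability values:

```python
import numpy as np

# Hypothetical per-card retrievabilities
retrievability = np.array([0.97, 0.90, 0.85, 0.70])

total_r = retrievability.sum()
avg_times_count = retrievability.mean() * len(retrievability)

# Identical by the definition of the mean
print(total_r, avg_times_count)
```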

Each flashcard is one card, so you’re multiplying 1 by the R value. By “elsewhere” I meant the FSRS wiki on GitHub.

Right

The sorting will be exactly the same.

Well, for starters, I thought that backlog size might affect how different sorting methods perform, because of how long it takes to work through the backlog.

You can imagine a 10000-card backlog.

  1. You can only do 1000 cards a day.
  2. You have a true retention of 90% on those 1000 cards. That means 900 cards have passed and, for simplicity’s sake, the remaining 100 cards are pushed to the next day.
  3. This means the backlog has been reduced to 9100. But the new cards that come due the next day are another factor: the backlog may end up larger than 10000 cards again, or smaller. It depends.
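The steps above can be sketched as a toy recurrence. The number of newly due cards per day is a made-up figure, since that’s exactly the factor that decides whether the backlog grows or shrinks:

```python
# Toy backlog model using the numbers from the example above.
backlog = 10_000
review_limit = 1_000
true_retention = 0.90   # 900 of 1000 pass; the 100 lapses stay in the pile
new_due_per_day = 950   # hypothetical; this decides growth vs. shrinkage

for day in range(5):
    reviewed = min(backlog, review_limit)
    passed = int(reviewed * true_retention)
    # Passed cards leave the backlog; newly due cards join it.
    backlog = backlog - passed + new_due_per_day
    print(day + 1, backlog)
```

With these particular numbers the backlog grows by 50 cards a day, illustrating the “it depends” above.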

I think it is hard to get a completely robust simulation out of this that covers all the variables.


  1. Backlog size
  2. Number of new due reviews per day
  3. Average true retention per day
  4. New cards per day

Those are the variables I could think of.

Gotcha. Agreed.

This should be exactly the same thing as:
Potential Retrievability Loss (PRL) = R(Today) - R(Tomorrow)

R^3/S would give you that for the continuous distribution, but cards aren’t scheduled continuously; they’re scheduled discretely, in 1-day increments. So R(Today) - R(Tomorrow) gives you the discrete version of that calculation.
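A sketch of that discrete calculation, assuming the FSRS-4.5 power forgetting curve R(t) = (1 + FACTOR * t / S) ** DECAY with DECAY = -0.5 and FACTOR = 19/81; the elapsed days and stabilities below are made up:

```python
# Assumed FSRS-4.5 forgetting-curve constants, calibrated so R(S) = 0.9
DECAY = -0.5
FACTOR = 19 / 81

def retrievability(t: float, s: float) -> float:
    """R after t elapsed days for a card with stability s."""
    return (1 + FACTOR * t / s) ** DECAY

def prl(t: float, s: float) -> float:
    """Potential Retrievability Loss over the next day: R(Today) - R(Tomorrow)."""
    return retrievability(t, s) - retrievability(t + 1, s)

print(prl(t=10, s=10))   # low-stability card: larger one-day drop
print(prl(t=10, s=100))  # high-stability card: smaller one-day drop
```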

In the simulation they have this at the beginning:

learn_limit_perday = 20
review_limit_perday = 80

I tried to do that in Anki, and I’d get this warning:

If adding 20 new cards each day, your review limit should be at least 200.

Is that warning obsolete, or should the sim be changed? Should it be 10/100 or something?

Edit: or was that intentional to keep a backlog maintained?

Normally, if you’re learning 1 new card every day, you should have a daily load of 10 cards.

It was intentional, yes.

It’s also kind of an obsolete guideline. These days, we should be using simulations to determine the expected daily load.

@rich70521 I don’t think your formula is very robust either. I’m having a very hard time thinking it through, but let me explain one pitfall.

Say you review four sets of due cards, A, B, C and D, on four different dates. Now, all of them were due today, but you cannot review them all at once with the set review limit.

Now, if I try to apply your PRL formula: say I review the cards in the order A, B, C, D. That is, each day I compare PRL in relation to tomorrow, so I study A first. Then B, C and D remain; I compare PRL the next day and study B. Then I study C, and then D.

At each step, the decision makes sense. My question, though: what if the PRL over four days is greater for D than for A? Then studying the A cards on the first day doesn’t make sense.

This can happen if A is a high-S card, no? The first day’s R drop is high, but on later days it’s not so much. @expertium Thoughts? This was bugging me all day, but hopefully I’m worrying about nothing.

Edit: Maybe I should’ve taken Math. My Biology isn’t helping me today.

This could never happen, because of the shape of the forgetting curve: if a card is losing more retrievability over the next one day, it’s losing more over any future time span.
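A quick numerical spot check of that claim (not a proof), again assuming the FSRS-4.5 power forgetting curve, with hypothetical stabilities and elapsed times:

```python
# Assumed FSRS-4.5 forgetting-curve constants
DECAY = -0.5
FACTOR = 19 / 81

def retrievability(t: float, s: float) -> float:
    return (1 + FACTOR * t / s) ** DECAY

def loss(t: float, s: float, horizon: int) -> float:
    """Retrievability lost between day t and day t + horizon."""
    return retrievability(t, s) - retrievability(t + horizon, s)

# Card A: low stability (bigger one-day drop); card D: high stability.
# Hypothetical values throughout.
for horizon in (1, 2, 4, 8):
    a = loss(t=5, s=5, horizon=horizon)
    d = loss(t=50, s=100, horizon=horizon)
    print(horizon, a > d)  # A keeps losing more at every horizon
```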