This is probably the 3rd or 4th time I have brought this up. Ever since I started a new collection for my new phase of studies, I have consistently cleared my entire backlog, for about a month now.
And over this past month (since the arrival of FSRS 5 in the first beta release), my true retention has consistently been lower than my desired retention of 95%, so it is not just a one-off. I thought that having a backlog (especially one consistently 20000+ cards big) was the most probable reason for the lower scores in my previous collection.
I don’t know how to explain this. Could it be that the cards that are due are much harder than the rest of my collection, such that my true retention doesn’t reflect the actual overall retention of the collection?
I wouldn’t be surprised. The benefit of having several presets is that parameters are more individualized, but it also means less data per preset. I suspect in extreme cases - like yours - the latter is outweighing the former.
I suggest you try combining presets: put two decks that were previously under two different presets under the same preset, then Optimize, then Evaluate. If the RMSE after combining them is lower than the average RMSE before, keep the combined preset.
Example:
Before: preset A, RMSE=5%, preset B, RMSE=6%, average=5.5% (for simplicity, assume equal number of reviews in both)
After: preset A+B, RMSE=4%
This will be time-consuming, but I don’t have a better idea in mind.
On one hand, I am happy that I have found a probable cause.
On the other hand, I am distraught because now I think I have ***** myself over. And I don’t know how to fix this mess.
I have 240+ different presets and hundreds of decks.
I don’t know which preset combination is best, or how many cards a deck should have before it gets its own preset.
Well, I wish you patience, you’re gonna need it to combine 240 presets.
As for the optimal number, let me ask you: how many major categories of material do you have? Like, for example, I have:
English
Japanese
Geography
Science
Random stuff like birthday dates and some statistics and trivia
I further broke Japanese and English into several presets, so I actually have more.
But anyway, I doubt that you have more than 5-10 major categories.
If you are speaking about subjects, I am in my 5th semester and so far I have had about 20 subjects. But each subject is large enough that there is no uniform way to define a “category”. For me, I have selected a “chapter” in a reference book to be the “category”.
Btw, to be more rigorous, I suggest using a weighted average:
(N_reviews(preset 1) * RMSE(preset 1) + N_reviews(preset 2) * RMSE(preset 2)) / (N_reviews(preset 1) + N_reviews(preset 2))
If the RMSE of the combined preset were within the range of the RMSEs of deck 1 and deck 2 (8.22 - 9.3), it would make sense to calculate something.
Obviously, the average of the two presets cannot fall below that 8.22 - 9.3 range.
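For anyone who wants to check the numbers rather than eyeball them, here is a minimal Python sketch of the weighted-average comparison. It assumes you copy the review counts and RMSE values by hand from Anki; the function names are made up for illustration, and the review counts of 1000 are placeholders for the "equal number of reviews" example earlier in the thread.

```python
def weighted_rmse(presets):
    """Review-weighted average RMSE.

    presets: list of (n_reviews, rmse) pairs, one pair per preset.
    """
    total_reviews = sum(n for n, _ in presets)
    if total_reviews == 0:
        raise ValueError("need at least one review")
    return sum(n * rmse for n, rmse in presets) / total_reviews


def should_merge(presets, combined_rmse):
    """True if the combined preset beats the weighted average of the separate ones."""
    return combined_rmse < weighted_rmse(presets)


# Preset A: RMSE 5%, preset B: RMSE 6%, equal (placeholder) review counts.
before = [(1000, 5.0), (1000, 6.0)]
print(weighted_rmse(before))      # 5.5
print(should_merge(before, 4.0))  # True: combined RMSE of 4% is lower
```

With unequal review counts the result shifts toward the preset with more reviews, which is the whole point of weighting.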
@Expertium So given that I don’t have the time to sift through 240 different presets, trying and testing combinations, I have decided to give each subject one preset and apply that preset to all of its subdecks.
Giving the entire subject a uniform preset seems to have improved RMSE by a lot (though for some subjects the RMSE didn’t improve by much, or at all).
The problem is that some of these parameters really give me the creeps and make me cringe.
For example, in my Biology deck, I find 2.5867 as the first parameter to be extremely high.
The reason I assigned a separate preset to each chapter is that chapters have different difficulties.
But it seems that FSRS really needs a lot of reviews on hand for the RMSE to come down (well…duh).
But I don’t know how to balance that. This is a makeshift solution, but I really wish there were a more sophisticated way of assigning presets sensibly.
It seems that my average interval size has decreased heavily and now I am left with a huge backlog of cards (which makes sense given that my retention is consistently low, which means that I am consistently not doing reviews that I should have).
Average Difficulty stays the same, which is…weird…
It seems that subjects I started learning after the arrival of FSRS 5 have better RMSEs than those from before it. I don’t know if these are just outliers, but it is an interesting observation nonetheless.
Then based on this, it appears that I was on the very, very extreme end of preset insanity. Any combination of 2 presets that I try decreases the RMSE.
Ergo, I have artificially inflated my RMSE.
I am now more eager than ever to know the actual threshold of cards at which a deck deserves its own preset without artificially inflating RMSE (something like a preset for every 1000 cards, or 20 presets for 10000 cards), and even more eager for a mechanism that does this automatically.
In my case, calculating for different preset combinations is too time-consuming. I think I will stick with one preset per subject, though this may be the other extreme (too few presets for the number of cards).
There has to be a sweet spot between number of presets and total number of cards without it damaging the RMSE.
My main concern about presets is that they often overlook very difficult (outlier) cards and have a very hard time scheduling appropriate intervals for them, hence my habit of setting a preset for every single deck.
Maybe if you move the difficult cards to a separate deck and give it a preset optimized on the more difficult cards in the decks, while the deck with the other cards uses a preset optimized on both decks?
Although with 24.10rc1, I have:
-is:suspended -is:new prop:d>=0.95 = 1009 cards
-is:suspended -is:new = 59717 cards
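To put those two counts in proportion, a quick Python calculation (the variable names are just for illustration; the numbers are the ones from the searches in the post):

```python
# Card counts copied from the two browser searches quoted in the post.
very_difficult = 1009    # -is:suspended -is:new prop:d>=0.95
all_non_new = 59717      # -is:suspended -is:new

share = very_difficult / all_non_new
print(f"{share:.2%} of non-new, non-suspended cards have difficulty >= 0.95")
```

That works out to roughly 1.69% of the non-new, non-suspended cards.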