Continuing the discussion from Big update in FSRS4Anki v3.0.0:
This week, I updated FSRS4Anki from v3.0.0 to v3.6.0. Here is a summary of the new features.
Scheduler
- Add fuzz to prevent cards introduced simultaneously and given the same ratings from sticking together and always coming up for review on the same day (a rough sketch of the idea follows this list).
Optimizer
- Add analysis of review logs to count the true retention more accurately.
- Add suggested retention to minimize the number of repetitions per card.
Simulator
- Add a comparison between FSRS and Anki's built-in scheduler.
- Add leech threshold and leech action to suspend and count leech cards.
Helper
- Reschedule cards in the collection or a specific deck.
- Add support for easy bonus and hard interval.
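For illustration only, here is a minimal sketch of the idea behind fuzz; the function name, the ±5% range, and the 3-day cutoff are assumptions for the example, not the exact values FSRS4Anki uses.

import random

def apply_fuzz(interval_days: int, fuzz_range: float = 0.05) -> int:
    # Illustrative only: jitter an interval by roughly ±5% (at least ±1 day)
    # so that cards introduced together stop landing on the same review date.
    if interval_days < 3:
        return interval_days  # assume very short intervals are left untouched
    delta = max(1, round(interval_days * fuzz_range))
    return interval_days + random.randint(-delta, delta)

# Ten cards that would all get a 30-day interval now spread across nearby days.
print([apply_fuzz(30) for _ in range(10)])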
FSRS4Anki v3.6.0 has been released at:
8 Likes
This is cool! I’m looking at the new suggested retention graph you added:
terminal stability: 637.63
0%| | 0/10 [00:00<?, ?it/s]
expected_repetitions.csv saved.
-----suggested retention: 0.86-----
Your output says that the suggested retention is 0.86, but in the graph, isn't the line with the lowest number of repetitions d=1, r=0.88? Why isn't the suggested retention 0.88?
I also notice there are lines for d=1, r=0.88 and d=4, r=0.88; what is d? Is it the difficulty? I'm not sure how to interpret the graph and verify that 0.86 is the optimal value.
EDIT: I looked at the code and the exported expected_repetitions.csv file. I think I understand now. The graph shows the optimal retention for each of the difficulty values, and the suggested retention is just the average of those per-difficulty optimal retention values. Cool
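In code terms, the v3.6.0 suggestion appears to boil down to a plain mean; the retention values below are made up for illustration, only the averaging step is the point.

# One optimal retention per difficulty group d = 1..10 (made-up numbers).
optimal_retention_list = [0.88, 0.88, 0.87, 0.86, 0.86, 0.85, 0.85, 0.84, 0.84, 0.83]

# v3.6.0 simply averages the per-difficulty optima.
suggested_retention = sum(optimal_retention_list) / len(optimal_retention_list)
print(f"-----suggested retention: {suggested_retention:.2f}-----")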
3 Likes
Shouldn’t it also take into account the number of cards, as opposed to just assuming that there is an equal number of cards in each difficulty group? I imagine there are many more cards with d=1 than with d=10.
Edit: I see that the optimizer now(?) also takes the distribution of difficulty into account. Very nice, and my guess above was way off.
1 Like
It depends on the material and the learner's habits. I will improve it in the next few versions.
3 Likes
Having thought about it again and analyzed the nice graph that fsrs4anki_optimizer.ipynb generates at the end, I have an additional thought:
The overall time spent on difficult cards is dozens of times higher than the time spent on easy cards (considering that they require more repetitions as well as more time per repetition).
The goal is to minimize overall review time, right? You don’t process information about the time per repetition, but you can approximate the goal by minimizing the number of repetitions – and perhaps skewing the target retention a bit in favor of difficult cards, based on the assumption that they take more time per repetition.
Perhaps the mathematical way to find the point of minimum repetitions is to take the average difficulty of all cards and calculate the most efficient target retention for this average difficulty?
I hope my thoughts are not too obvious or useless, lol. I don’t understand how the suggested retention is currently calculated.
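To make the time-weighting idea concrete, here is a rough sketch; every number and array below is made up, and this is not what the optimizer currently does:

import numpy as np

# Hypothetical per-difficulty data for difficulty groups d = 1..10.
difficulty_distribution = np.array([0.30, 0.20, 0.15, 0.10, 0.08, 0.06, 0.05, 0.03, 0.02, 0.01])
expected_repetitions = np.array([8, 9, 10, 12, 14, 16, 19, 22, 26, 30])    # at some candidate retention
seconds_per_repetition = np.array([5, 6, 7, 8, 9, 10, 11, 12, 13, 14])     # assumed cost of one review

# Minimizing repetitions alone ignores that hard cards cost more time per review.
total_repetitions = np.inner(difficulty_distribution, expected_repetitions)
total_review_time = np.inner(difficulty_distribution, expected_repetitions * seconds_per_repetition)
print(f"expected repetitions per card: {total_repetitions:.1f}, expected seconds per card: {total_review_time:.1f}")

Repeating this for each candidate retention and picking the one that minimizes total_review_time instead of total_repetitions would skew the target retention toward the difficult cards, as suggested above.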
@kuroahna, it is actually not just the average of all ten values, at least in version fsrs4anki/blob/v3.7.0/fsrs4anki_optimizer.ipynb:
The average is 0.83, but the suggestion is 0.85 (unless the difference is due to rounding).
In v3.7.0, the optimizer calculates the distribution of difficulty and applies it as the weights in a weighted average.
3 Likes
it is actually not just the average of all ten values, at least in version fsrs4anki/blob/v3.7.0/fsrs4anki_optimizer.ipynb
Ya, you’re looking at v3.7.0. The version I was looking at was v3.6.0 which has the code
print(f"\n-----suggested retention: {sum(optimal_retention_list)/len(optimal_retention_list):.2f}-----")
In v3.7.1, this was changed to take the inner product (dot product), so that each difficulty group's optimal retention is weighted by the proportion of cards in that group:
print(f"\n-----suggested retention: {np.inner(np.array(difficulty_distribution), np.array(optimal_retention_list)):.2f}-----")
1 Like