Pass/Fail Grading as Default

Expertium · May 25, 2024, 5:31pm

Actually now that I think about it, this data only shows us that using Easy/Hard too much might be bad. Moderate use of those buttons still might have a positive effect on scheduling so dae maybe isn’t wrong.

I’m not sure what makes you say so. The graph shows that RMSE is lower for two-button users at almost any threshold, from 2.5% Hard+Easy all the way to 42.5% Hard+Easy.

sorata · May 25, 2024, 5:34pm

Oh sorry, I didn’t look at the graph. I just read the text you posted here. If what you say is true, then Again/Good being the default makes the most sense to me. (Funny how SuperMemo, after years of research, uses 6 buttons and we’re talking about 2 buttons.)

Edit: The meta was once “use 2 buttons because of ease hell”, then it became “you can use 4 buttons because ease hell is solved”, and now we’re back to 2 buttons.

guillempalausalva · May 25, 2024, 5:48pm

Recent versions of the SuperMemo algorithm have reduced the weight of user grades to combat user bias. Grades within pass or fail (3 each) are still taken into account, but other factors, such as the priority set during review, have a greater impact.

sorata · May 25, 2024, 5:49pm

If I’m not wrong, the graph is showing that the less Hard+Easy is used, the lower the RMSE is? Dang, that’s crazy.

suiyuan · May 25, 2024, 6:38pm

Sorry, but I think I don’t totally understand the graph. What exactly is meant by threshold here? Percentage of hard/easy usage? Does it apply only to 4-button users then?

Expertium · May 25, 2024, 6:45pm

I gave an example. Here’s a step-by-step explanation:

1. Calculate how often the user uses Hard, in %
2. Calculate how often the user uses Easy, in %
3. Add them together
4. If the sum exceeds the threshold, put the user into the “four button users” category, else put him into the “two button users” category
5. Repeat steps 1-4 for many different values of the threshold, to get the full picture
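
For readers who prefer code, the steps above could be sketched roughly like this in Python. This is only an illustration, not the script used to produce the graph: the function names and the input format (a dict mapping each user to a list of Anki ratings, where 1 = Again, 2 = Hard, 3 = Good, 4 = Easy) are assumptions.

```python
from collections import Counter

# Anki rating values; the input format below is an assumption for illustration.
AGAIN, HARD, GOOD, EASY = 1, 2, 3, 4


def hard_easy_share(ratings):
    """Steps 1-3: percentage of reviews graded Hard plus percentage graded Easy."""
    if not ratings:
        raise ValueError("need at least one review")
    counts = Counter(ratings)
    return 100.0 * (counts[HARD] + counts[EASY]) / len(ratings)


def classify(ratings, threshold_pct):
    """Step 4: a user counts as 'four-button' only if the sum exceeds the threshold."""
    if hard_easy_share(ratings) > threshold_pct:
        return "four-button"
    return "two-button"


def sweep(users, thresholds):
    """Step 5: repeat the split for every threshold.

    users: dict mapping a user name to that user's list of ratings.
    Returns {threshold: {"two-button": [names], "four-button": [names]}}.
    """
    result = {}
    for threshold in thresholds:
        groups = {"two-button": [], "four-button": []}
        for name, ratings in users.items():
            groups[classify(ratings, threshold)].append(name)
        result[threshold] = groups
    return result
```

With this split, the RMSE of each group would then be compared at every threshold; that part depends on the FSRS evaluation code and is not shown here.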

nmjkjm · May 25, 2024, 7:43pm

You guys are missing the point I made before. You cannot “put” ppl in 4 button vs. 2 button groups based on thresholds of use. These groups are systematically different, therefore the conclusion that 2 buttons are more accurate than 4 buttons does not hold. You would have to take users who are in the 4-button group, and randomize them into 2 groups, 1 that is forced to use 2 buttons and one that continues to have 4 buttons.

This is similar to what was done in SM when there was a change in buttons.

This current analysis will miss a degradation in the performance of the predictive algorithm due to differences between 4-button and 2-button users, which include not only experience (more advanced users), but also the complexity of the material they are using, etc.
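
The random split proposed in this post (take users who already rely on all 4 buttons and assign each one by chance to a forced 2-button arm or an unchanged 4-button arm) could be sketched as below. The function name, the arm labels, and the use of plain user IDs are assumptions made for illustration; nothing like this exists in Anki or FSRS.

```python
import random


def randomize_arms(user_ids, seed=0):
    """Randomly split 4-button users into two arms of (nearly) equal size.

    Because assignment is by chance, differences such as experience or
    material difficulty should be balanced across the arms on average.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "forced_two_button": shuffled[:half],
        "four_button_control": shuffled[half:],
    }
```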

I’m not sure what you mean by “priority set during review”, I believe the priority set for items has no impact on scheduling intervals.

Because that analysis was done correctly (and it now uses 5 buttons)

Expertium · May 25, 2024, 7:55pm

You would have to take users who are in the 4-button group, and randomize them into 2 groups, 1 that is forced to use 2 buttons and one that continues to have 4 buttons.

Well, I cannot do that, but I could take some collections, randomly replace some Hard and Easy with Good, and see whether it will degrade the performance of FSRS. Spoiler: yes, it will.
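
A minimal sketch of what “randomly replace some Hard and Easy with Good” could look like, assuming a review log reduced to a list of Anki ratings (2 = Hard, 3 = Good, 4 = Easy). The function name and the `fraction` parameter are hypothetical. Note that this only rewrites the recorded grades; it does not re-simulate the intervals that different grades would have produced.

```python
import random

HARD, GOOD, EASY = 2, 3, 4


def soften_grades(ratings, fraction, seed=0):
    """Return a copy of ratings where each Hard/Easy is turned into Good
    with probability `fraction`. Again and Good are left untouched."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    return [
        GOOD if r in (HARD, EASY) and rng.random() < fraction else r
        for r in ratings
    ]
```

FSRS would then be evaluated on both the original and the modified log, and the two results compared.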

guillempalausalva · May 25, 2024, 8:13pm

I mean the priority of the element when you do the review (repetition in SM jargon); as priorities are dynamic and the element will slightly change the priority as soon as the user grades the recall.

It does. To name a few examples: a very low priority item will get longer intervals, either directly or indirectly through a higher A-factor increase. Lapsing a very high priority item will give a shorter next interval than lapsing a low priority item (the difference can be days or several weeks), etc.

guillempalausalva · May 25, 2024, 8:15pm

Just for the record, SuperMemo has 5 buttons on the interface but the rating 0 (null recall) can still be used by the keyboard shortcut, so the latest algorithm SM-18 still has 6 grade options.

The button for grade 0 was removed to discourage users from overusing it.

nmjkjm · May 25, 2024, 8:28pm

I had no idea there was a hidden button! My guess is that it is practically not known by anyone and therefore rarely used.

In Anki, it would not make much difference if the default view showed only 2 buttons and one could turn on 4 buttons once in the settings and never have to worry about it again. But I suspect that 2 vs. 4 buttons is not the reason ppl don’t use Anki. Many things have been suggested that supposedly affect retention, and I doubt that implementing all of them would have much effect on retaining users.

sorata · May 25, 2024, 8:29pm

So what might that mean now? Also, Hard is often pressed instead of Again, so there’s that.

Okay, but it still shows that the average 4-button user would only benefit from shifting to 2-button use. Or are you claiming that people who use 4 buttons will continue to see higher RMSE?

nmjkjm · May 25, 2024, 8:32pm

It does not show this. That’s a misinterpretation of the data.

sorata · May 25, 2024, 8:36pm

Okay, so what’s causing it? You mentioned difficulty of material, but what does this have to do with difficulty of material?

sorata · May 25, 2024, 8:41pm

Hey @Expertium, I saw someone quoting this from the FSRS wiki’s FAQ. Shouldn’t this be changed?

A12: Yes. FSRS is about equally accurate for people who rarely use “Hard” and “Easy” and for people who use all 4 buttons a lot. However, this is not the final conclusion, and as we gather more data, this conclusion may change.

Expertium · May 25, 2024, 8:50pm

github.com

open-spaced-repetition/fsrs4anki/blob/main/docs/tutorial.md#faq

For the Chinese version, see: [Anki 新算法 FSRS 配置指南](https://zhuanlan.zhihu.com/p/664758200)

# Table of contents
- [The Ultra Short Version](#the-ultra-short-version)
- [Step 1: Enable FSRS](#step-1-enable-fsrs)
- [Step 2: Configure FSRS settings](#step-2-configure-fsrs-settings)
- [Step 3: Find optimal parameters](#step-3-find-optimal-parameters)
- [Step 4: (optional) Evaluate the parameters](#step-4-optional-evaluate-the-parameters)
- [Step 5: (optional) Compute optimal retention](#step-5-optional-compute-optimal-retention)
- [Step 6: (optional) Custom Scheduling](#step-6-optional-custom-scheduling)
- [FAQ](#faq)

## The Ultra Short Version

Are you busy and have no time to waste? Here's a summary of the guide.

1) Go to deck options and enable FSRS under "Advanced" ("FSRS" in Anki 24.04), at the bottom of the deck options window.
2) Ensure that all your learning and re-learning steps are shorter than `1d` and that all steps can be completed on the same day. `23h` is not recommended because, while it's technically less than one day, it's very unlikely that you will be able to finish this step on the same day as your first review.
3) Click the "Optimize" button under the "Optimize FSRS parameters" section. The optimal parameters will replace the default parameters automatically. Parameters are preset-specific. If an error message pops up, it means you have less than 1000 reviews (400 in Anki 24.04) across all cards that this preset is applied to. In that case, just use the default parameters; it's still better than using the legacy SM-2 algorithm.
4) Choose a value of desired retention: the proportion of cards recalled successfully when they are due. **This is the most important setting in FSRS. Higher retention leads to shorter intervals and more reviews per day.** 80-95% is reasonable, 90% should work fine for most people.


L.M.Sherlock · May 26, 2024, 4:27am

A supplement for current suggestion:

This correlation appeared to be weak due to the fact that all users tend to deploy their own grading systems, which is often inconsistent.
…
In that light, two grade systems would have the exact same effect on the algorithm as the six grade system.
…
Grade-retrievability correlations are also collected, however, their weight is negligible.

Source: First data-driven spaced repetition algorithm: Algorithm SM-8 - supermemo.guru

guillempalausalva · May 26, 2024, 3:59pm

For anyone curious, if Anki works best for pass/fail with grades 1 (again) and 3 (good), in SuperMemo the equivalent is 2 (fail) and 4 (good).

Here I explain it in more detail (time stamped) https://youtu.be/P22ig_erHoE?t=622

nmjkjm · May 27, 2024, 11:45am

I don’t have enough familiarity to know which buttons to replace with which (@sorata, the idea is to compare the result of that with the 2-button users), but is this even possible? Because changing a grade affects when the card would next be displayed, which is no longer in the dataset, so one doesn’t know which grade the user would have selected on that day since it didn’t take place.
In other words, any changing of grades affects future dates, which then affects the next date, etc.

sorata · May 27, 2024, 11:50am

Why you pinging me though? I understand what’s being attempted and what issues the other person raises. The question is “What if the 4 button users start using 2 buttons? Would the RMSE improve or not?”

One very flawed way of doing that might be actually asking people to change their behaviour, preferably in the same deck they’ve been using before, then compare before/after results. But we can’t possibly do any of that. It would also be nice to have a control group and make everyone learn similar material, etc etc.

What the other person suggested doesn’t work for the reasons you stated.
