Pass/Fail Grading as Default

To add to this, just yesterday a student came to me to ask something about Anki. I had already explained to him how to use Anki 6 months ago, and he uses it regularly with decks premade by me (including explicit instructions about only using Again and Good, both when I explained it and within the cards).

Well, while talking to him he reviewed a card in front of me and he pressed Easy. I was surprised and asked him why. And he said he always pressed Easy if he knew the answer.

What I mean is, the visual cue of having 4 buttons is strong. And the color/position of them (Easy being the opposite of Again, for example) has a tremendous influence on people’s interpretations.

I believe 2 buttons is the way to go for default settings. Any user who uses the 4 buttons correctly, or understands how they work, would be able to activate them in the settings without problems. The opposite group, people who don’t understand the system, would have trouble even changing from 4 to 2 buttons (via an option or an add-on).


I also used to do the same when I started using Anki. Only when the intervals started getting too long did I realise that I should have pressed Good rather than Easy.


I’m linking here to a post in another thread where I shared some other thoughts that I think are relevant to this conversation: The reason the "forgot" button is incorrectly named "again" - #17 by suiyuan

I don’t believe it’s correct to compare statistics for those who use 2-buttons vs. those that use 4, because these are 2 different groups of ppl. The ppl who use all 4 prefer to use 4. If they were required to use 2, I’m fairly certain the algorithm accuracy would degrade.

The most data historically is for the SM algorithm, which uses 5 buttons. It used to use 6. It only went to 5 when there was extensive data and analysis to show that there was minimal, if any, difference. I highly doubt that this would hold up if going from 4 to 2.

The other thing is that there isn’t a priority queue in Anki, which is a factor for large, complex collections. This can be offset by having 4 buttons.

If someone wants 2 buttons, they can just use 2 buttons even if 4 are visible?

This is moot with FSRS: even if he is using Easy, the algorithm will adjust.


This makes me angry, honestly. I just can’t understand how someone can not check out the deck options. Lol, I didn’t see the dates: September of last year. Back then FSRS wasn’t that famous. Most people I saw online had no idea what FSRS was. Understandable.

Yes, but does FSRS come pre-enabled with Anki? If that were done, then maybe it wouldn’t be too problematic. Hey @Expertium, can you verify this statement? Also, now that you can do optimisation with only a few reviews, how correct is it to say that if someone uses Easy like Good it won’t create problems even at the very beginning? I imagine using Hard instead of Again would be a different case though.

New users don’t have such preferences. Nor do they understand why they would need them.


That might be true, but

  • he doesn’t have FSRS enabled, and having to activate it when starting to use Anki would add complexity to the process (following instructions they don’t understand)

  • this doesn’t help with decision fatigue

  • having 4 buttons makes you think you should use them, and it’s not clear at all that Hard and Easy should be rarely used, as dar pointed out

That’s only true for people who had the time/willingness/patience to try to understand how Anki works at a deeper level and what’s most suitable for them. Otherwise they will use them just because they’re there and they seem equally important.

Maybe @L.M.Sherlock has some info about that. But with how inconsistent the use of 4 buttons is, I dare predict the difference won’t be very big, if any.

Which statement?

If you mean 4 buttons vs 2 buttons, here: RMSE and button usage - Album on Imgur

The RMSE values are somewhat outdated (we changed the methodology), but the overall trend should be the same

suiyuan said one of his students uses the Easy button just like you’d use the Good button. My question was: does this become a non-issue if you use FSRS? By extension, I can also ask: what if I never use Good/Easy but instead use only Hard/Again?

Tbh, I’m not sure how well Again+Easy and Again+Hard would work. I’m assuming FSRS would adapt, but I don’t have the data to back it up.

You will at least lose your normal “Difficulty” job. Why would you do that?

True, but the trend is that FSRS would be enabled as default at some point in the future.

With a small amount of regular use and reviews, intra-individual consistency becomes high. I.e., someone who has a certain level of recall of 2 different items will score them the same way. This is also affected by suboptimal formatting of cards, which will improve as ppl get used to Anki.

@sorata, as I suggested these groups are systematically different so the lower RMSE does not hold up for 4 buttons vs. 2 buttons.

No, I’m not doing that lol. I’m just asking: in case somebody does it, will having FSRS reduce the damage? The point was brought up by others actually.

@sorata @nmjkjm @suiyuan @Keks @L.M.Sherlock @dae I re-did the analysis with new RMSE values, here: FSRS-4.5 - Album on Imgur
The conclusion is the same, but I also added another interesting comparison.

EDIT: give it some time, for some reason imgur keeps cutting off my text.
EDIT 2: ok, I don’t know what’s happening, imgur text is just broken.

Text:

All users were put either in the “two buttons group” or in the “four buttons group”. If the % of times the user used Hard + the % of times the user used Easy exceeded the threshold, the user would be put in the “four buttons group”, otherwise in the “two buttons group”.
Example: a user pressed Hard 5% of the time and Easy 10% of the time. The threshold is 12%. 0.05+0.1 > 0.12, hence this user belongs in the “four buttons group”.
Then I tried lots of different thresholds (x axis) and plotted the RMSE values of both groups. The green area indicates statistical significance, meaning that if the curves are in the green area, the difference between them is not a fluke (p-value<0.01). If the curves are in the white area, the difference between them might be a fluke.
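For anyone who wants to see the grouping rule spelled out, here is a minimal Python sketch. It is not the code used for the analysis: the function names are made up for illustration, and the Again/Good split in the example is an assumption. Only the Hard 5% / Easy 10% / threshold 12% numbers come from the example above.

```python
from collections import Counter

BUTTONS = ("again", "hard", "good", "easy")

def button_shares(presses):
    """Return the fraction of reviews answered with each button."""
    if not presses:
        raise ValueError("need at least one review")
    counts = Counter(presses)
    total = sum(counts.values())
    return {button: counts.get(button, 0) / total for button in BUTTONS}

def assign_group(presses, threshold):
    """Apply the rule from the post: share(Hard) + share(Easy) > threshold."""
    shares = button_shares(presses)
    if shares["hard"] + shares["easy"] > threshold:
        return "four buttons"
    return "two buttons"

# Hard 5%, Easy 10% as in the post; the Good/Again split is invented.
example = ["hard"] * 5 + ["easy"] * 10 + ["good"] * 70 + ["again"] * 15
print(assign_group(example, 0.12))
```

Sweeping `threshold` over a range of values and computing RMSE separately for each group would give the two curves described above.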

FSRS is more accurate for users who only use two buttons (lower RMSE = better).

I also put users into 3 different groups: those who use Again and Hard, those who use Again and Good, and those who use Again and Easy. In each group, users press their pair of buttons 95% (or more) of the time and the other two buttons <=5% of the time.
The difference was statistically significant (p-value<0.01) for Again+Hard vs Again+Good and for Again+Easy vs Again+Good, but not for Again+Hard vs Again+Easy.
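A rough Python sketch of that second grouping, under my own assumptions: the function name is hypothetical, the counts are per-user button totals, and picking the most-used non-Again button as the "partner" is just one way to resolve users who barely press anything but Again. The actual analysis may have handled this differently.

```python
def pair_group(counts, min_percent=95):
    """Return 'again+hard', 'again+good', 'again+easy', or None.

    counts maps button names to the number of presses for one user.
    A user qualifies when Again plus one other button covers at least
    min_percent of all reviews (so the other two cover the remainder).
    """
    total = sum(counts.values())
    if total == 0:
        return None
    # Assumption: the candidate partner is the most-used non-Again button.
    partner = max(("hard", "good", "easy"), key=lambda b: counts.get(b, 0))
    paired = counts.get("again", 0) + counts.get(partner, 0)
    # Integer comparison avoids floating-point edge cases at exactly 95%.
    if paired * 100 >= min_percent * total:
        return "again+" + partner
    return None

print(pair_group({"again": 10, "hard": 2, "good": 86, "easy": 2}))
print(pair_group({"again": 10, "good": 50, "easy": 40}))
```

The second user falls in none of the three groups, which matches the point that most users are left out of this comparison.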

Oh, and of course most users were not included in any of those groups.

EDIT 3: I made a Reddit post: Reddit - Dive into anything
EDIT 4: screw imgur, here are the images



So, is it correct to assume that FSRS will be less efficient the longer 4 options have been used? Personally, I have used 4 options since I started in 2009; the only restriction was when Anki didn’t have the Hard option for learning-state cards for quite some years.


If you have been using 4 buttons for so long, it would be unwise to switch at this point. FSRS is trained on your review history, so you would have to accumulate lots and lots of new reviews before you see the benefits of using 2 buttons.
EDIT: You could use “Ignore reviews before” [date], but it will still take you a lot of time to accumulate a lot of reviews. I suppose long-term it could be better though. The thing is, I don’t know how consistently people use 4 buttons. I suspect that the problem is not with FSRS and the buttons themselves, but with people using them inconsistently. You know, someone presses Easy but then they actually end up remembering those cards worse than Good cards, that kind of stuff.


Only if you assume they still have those very old decks. For new decks this would work.

Anyways, we now have a good reason to have Again/Good as the defaults. Dae argued “it would somewhat hamper the scheduler’s performance, FSRS or no.”, which doesn’t hold up now. I still think using Hard/Easy occasionally would help. I use Easy probably on 1%-2% of my cards. Wait. Maybe, and this is a maybe, the reason FSRS is performing better for 2-button users is exactly because we’re conservative when it comes to pressing Hard/Easy. I don’t know. Just a guess.

Yes, that might be the reason. People who rarely press Easy would only use it when they actually feel like the review was too easy for them. I think that’s what happens with me. I never really think about the Hard/Easy buttons, but when something’s too easy I do press the Easy button. Paradoxically enough, not thinking about Easy might be making my Easy gradings more accurate.

On the other side, people who use Hard/Easy heavily might be grading things badly.

Actually, now that I think about it, this data only shows us that using Easy/Hard too much might be bad. Moderate use of those buttons might still have a positive effect on scheduling, so maybe dae isn’t wrong.