Automated leech detection

Given the card’s history, we can either store or re-calculate the probability of recall that FSRS predicted for each review, and then use the Poisson binomial distribution to calculate the probability of a given number of successes.
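For reference, this is just the textbook Poisson binomial PMF and CDF (the complex numbers only show up in the characteristic-function/FFT method, not here):

$$
P(X = k) = \sum_{\substack{A \subseteq \{1,\dots,n\} \\ |A| = k}} \prod_{i \in A} p_i \prod_{j \notin A} (1 - p_j),
\qquad
P(X \le k) = \sum_{m=0}^{k} P(X = m)
$$

where $p_i$ is the probability of success on trial $i$ (here, the FSRS-predicted probability of recall for review $i$).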

I am not even going to try to understand the math with complex numbers, but the usage is actually fairly simple: you give it a list of probabilities, one for each trial, plus the number of successes, and it tells you the probability of that many successes or even fewer.
Example:

```python
import numpy as np

p = np.asarray([0.9, 0.85, 0.95, 0.92, 0.87])  # FSRS-predicted P(recall) for each review
n_succ = 2  # number of successful reviews
```
This gives a p-value of 0.836%. So if a card has been reviewed 5 times with these probabilities (note that the order doesn’t matter), there is a 0.836% chance that 2 or fewer of those reviews would be successful.
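Here’s a minimal Python sketch of the exact calculation, using the standard O(n²) dynamic-programming recurrence; the function name is mine, not an existing library API:

```python
def poisson_binomial_cdf(n_succ: int, probs) -> float:
    """P(X <= n_succ), where X is the number of successes over independent
    trials with per-trial success probabilities `probs`."""
    dp = [1.0] + [0.0] * len(probs)  # dp[k] = P(exactly k successes so far)
    for i, p in enumerate(probs):
        for k in range(i + 1, 0, -1):  # reverse order, so dp[k-1] is still the old value
            dp[k] = dp[k] * (1 - p) + dp[k - 1] * p
        dp[0] *= 1 - p
    return sum(dp[: n_succ + 1])

print(poisson_binomial_cdf(2, [0.9, 0.85, 0.95, 0.92, 0.87]))  # ≈ 0.00836, i.e. 0.836%
```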

We can identify leeches with as few as two or three reviews!
For example, if the probabilities of recall are 90%, 92% and 93%, then the probability of getting the card right zero times is 0.056%. The probability of getting the card correct once or zero times is 1.948%.
The higher the desired retention, the higher the probabilities, and the faster we can identify leeches. At DR = 95% we can identify leeches with merely 2 reviews! Btw, the probability of 0 successes when both reviews had a 95% chance of success is 0.25%.
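The sketch above reproduces these numbers:

```python
print(poisson_binomial_cdf(0, [0.90, 0.92, 0.93]))  # ≈ 0.00056 → 0.056%
print(poisson_binomial_cdf(1, [0.90, 0.92, 0.93]))  # ≈ 0.01948 → 1.948%
print(poisson_binomial_cdf(0, [0.95, 0.95]))        # 0.0025   → 0.25%
```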

Btw, I think 1% is a reasonable cutoff. If a card has been failed so often that the chance of that many failures (or even more) is <1% according to FSRS’s own predictions, I think it’s most likely a leech.
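In code, the basic rule would look roughly like this (hypothetical names, and without the threshold correction mentioned in the EDIT below):

```python
LEECH_CUTOFF = 0.01  # 1%

def is_leech(n_succ: int, probs) -> bool:
    # Flag the card if this few successes (or fewer) would occur
    # less than 1% of the time under FSRS's own predictions.
    return poisson_binomial_cdf(n_succ, probs) < LEECH_CUTOFF
```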

EDIT: I came up with a good way to correct the threshold to ensure that we don’t tag too many cards, but that is beyond the scope of this topic.

There are 2 challenges:

  1. Implementing this mathematical function in Rust.
  2. Storing or re-calculating R for every review (see the sketch below).
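For challenge 2, here is a sketch of what re-calculating R would involve, assuming the FSRS-4.5/FSRS-5 power forgetting curve (FSRS-6 makes the decay a trainable per-card parameter, so treat this as illustrative):

```python
DECAY = -0.5
FACTOR = 0.9 ** (1 / DECAY) - 1  # = 19/81, chosen so that R = 90% when t = S

def retrievability(elapsed_days: float, stability: float) -> float:
    """FSRS-predicted probability of recall `elapsed_days` after the last review."""
    return (1.0 + FACTOR * elapsed_days / stability) ** DECAY
```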

Then we can add an “Automatic leech detection” button here as an alternative to “Leech threshold” when FSRS is enabled.

Now the big question is: do we want a “Recalculate leeches” button if automatic leech detection is enabled? :thinking:
Changing the FSRS parameters changes retrievability at the time of each review, which in turn can change whether a card counts as a leech or not.

@L.M.Sherlock

Also, I asked Claude 3.7 Thinking to rewrite it in Rust and remove the p-value and CDF calculations (I compute those from the PMF), leaving only the PMF. Idk if it’s any good, but so far Claude 3.7 Thinking has been really freaking good, at least for Python.

Btw, this repo uses the Fast Fourier Transform to calculate the probabilities approximately, but Alex and I tested the exact (combinatorics) method and found that for n = 64 reviews it’s fast enough that we don’t need the FFT, so the Rust code uses the exact approach.
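As a rough illustration of the “fast enough” claim (a hypothetical pure-Python micro-benchmark, not our actual test; timings vary by machine):

```python
import random
import timeit

probs = [random.uniform(0.7, 0.99) for _ in range(64)]  # 64 reviews
secs = timeit.timeit(lambda: poisson_binomial_cdf(32, probs), number=1000)
print(f"{secs:.3f} ms per call")  # 1000 calls took `secs` seconds → `secs` ms each
```

Even interpreted Python stays around a millisecond per call at n = 64; compiled Rust is far faster still.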

7 Likes

That’s a really good idea, and way more fitting for FSRS than the old one (counting lapses).

1 Like

Does the detector consider same-day reviews?

1 Like

No, since we need the probabilities of recall predicted by FSRS, and FSRS doesn’t predict retrievability for same-day reviews.

1 Like

@dae I want to bring your attention to this
Right now we have two ideas for a new leech detector: this one (with a little bit of extra math and rules not mentioned in this topic) and a machine-learning-based detector. The latter would require thousands, if not hundreds of thousands, of manually labeled (leech / not a leech) cards, so that is not going to happen.

The problem with my idea is that we won’t know how well it works until we try it. Jarrett cannot implement it in the Helper add-on first.

And he doesn’t want to do all the work of implementing a leech detector only for you to not merge the PR.

So ideally I’d like you to say “Sure, we can test this idea in a beta and/or as an experimental feature that can be removed if it doesn’t work well”, and then Jarrett would (hopefully) feel motivated to do it.

EDIT: @rossgb implemented it (not in Anki itself): GitHub - rbrownwsws/leechkit
We’re currently testing it.

1 Like

I made an attempt at graphing the cards’ percentages:

Here’s how I implemented it if someone wants to check it:

Doesn’t seem to work well in its current form.

1 Like

Are you sure you aren’t using same-day reviews and the first review?

Also, I can’t verify the code, so idk if it’s implemented correctly. I guess you can give me the probabilities for a given card and the output of your function, and I’ll see if it matches mine.

Can you give me a list of probabilities for any card, and what your function outputs for those?

I don’t know TypeScript or any Anki-specific code.

Sorry, this is the best I can do:
I think I forgot to multiply the percentages by 100 :sweat_smile:

50% for this review history:

[screenshot of the card’s review history]
Seems about right

1 Like

Btw, Jarrett said “Anki doesn’t provide the API to calculate the historical retrievability”. Maybe you can help him? Then we could try the leech detector out using the Helper add-on.

I think he’s got that covered.

2 Likes

Users would avoid wasting time on failed reviews. When a user is alerted, they can improve the card, put more effort in, or drop it. This is the kind of thing the first paragraph should say for us laymen.

The higher the desired retention, the higher the probabilities, and the faster we can identify leeches.

I don’t get this part.

My reasoning is that if the DR is higher, the card will be shown sooner, so the probability of getting it wrong is lower. What makes you think otherwise?

If a card is a leech, it will be failed more often than FSRS predicts; that’s how we define leeches with the new detector. So yes, the probability of recall will be higher at higher DR, but since leeches have a lower p(recall) than FSRS predicts, they will still be failed more often than expected. So depending on how much lower it is exactly, leeches may be identified faster at higher DR: you do reviews more frequently, so the detector gathers the necessary information faster.
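To make that concrete with hypothetical numbers (reusing the earlier sketch): take the same outcome, 2 successes out of 4 reviews, at two different desired retentions. At the higher DR the p-value is already near a 1% cutoff, and those 4 reviews also happen sooner:

```python
print(poisson_binomial_cdf(2, [0.90] * 4))  # ≈ 0.052 → 5.2%, not flagged at a 1% cutoff
print(poisson_binomial_cdf(2, [0.95] * 4))  # ≈ 0.014 → 1.4%, already close to the cutoff
```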

Anyway, @dae, we figured out all the math, and I already wrote a detailed specification of the leech detector for @jakep (which he may or may not decide to implement, lol), but there is a problem. The current leech tag works on a per-note basis, meaning that all siblings get tagged as leeches. This is undesirable, and even more undesirable with the new detector. So we need to use a flag instead of a tag for marking individual cards as leeches. Ideally, both the old lapse count detector and the new one should use a flag instead of a tag.

(writing this made me realize how confusing the whole tag vs flag thing is)

@L.M.Sherlock

1 Like

Do we really need to use flags or tags? Can’t we just store the p-value in the data column (just like FSRS memory states) and then have Anki show all cards with p < 0.05 (or whatever you choose) when the user searches is:leech?

Just like FSRS memory states, the stored p value will be updated on each review and each time the FSRS parameters are updated.

This approach will also allow the user to search cards by defining their own threshold for the p value.

2 Likes

Then we have to explain probabilities and whatnot to users instead of a simple “This card is a leech” thingy. I’m trying to keep things simple. Btw, that also includes no settings for the automatic leech detector. It will be a black box with a toggle. We’ll need to choose a leech threshold, like 1% or 2% or 5%.

We can make it possible to search for cards based on their p(leech) so that power users can do power user things, but the detector should work purely automatically, so that most users can just turn it on and forget about it.

1 Like

You don’t have to explain anything to a basic user.

  • The basic user would search is:leech in the Browser and Anki will “internally” search for cards with p < 0.05
  • The advanced user can search for prop:leech-p<0.01

That’s pretty much what I’m saying. The detector adds and removes the leech flag automatically, and the user doesn’t need to think about it. If the user wants to think about it, he can do something like prop:leech-p<0.01

If we don’t use flags/tags to mark leeches, that’s a net loss of functionality.