Creating decks based on DR/interdependent facts

Hey all.

I’m new to Anki and SRS in general and I’m still in my how-to-structure-my-decks phase.

It feels like structuring my decks around categories of knowledge like “math”, “languages” or “misc” is a dead end.

My problem is that different scenarios seem to need wildly different values for desired retention (DR).

Consider three different scenarios:

  • Enriching my vocabulary by doing “A Word A Day”
  • Studying words for vocabulary tests over a school year
  • Learning Morse code

For the first one, a pretty low retention is still pretty good. If I have 1000 cards but can only remember half of them, that would still be pretty useful.

On the other hand, studying words for tests would probably fall into the same broad category, but a higher DR than 50% would surely be needed.

The biggest problem is something like the Morse code alphabet, where the facts feel interdependent. If you want to write something in Morse code, you need to know nearly all the letters; knowing only a handful of them is virtually useless.

For the sake of argument and simplicity, let’s assume I have one card for each letter (26 cards). If I want to use Morse, chances are good I’ll need all the letters. But given a DR of 90%, and treating each card as an independent trial (a simple cumulative binomial distribution), I would still have a ~50% chance of not knowing 3 or more letters and a ~10% chance of having forgotten 5 or more. The chance of remembering the whole alphabet is only ~6%.
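The figures above can be reproduced with a short sketch. It assumes every card is recalled independently with probability exactly equal to the DR, which is a simplification; the function name and defaults are just for illustration.

```python
from math import comb

def prob_forgot_at_least(k, n=26, recall=0.90):
    """P(at least k of n cards are forgotten), assuming independent cards
    that are each recalled with probability `recall`."""
    forget = 1.0 - recall
    # Binomial probability of forgetting 0..k-1 cards, then the complement.
    at_most = sum(comb(n, i) * forget**i * recall**(n - i) for i in range(k))
    return 1.0 - at_most

p_all = 0.90 ** 26               # every one of the 26 letters recalled
p_3 = prob_forgot_at_least(3)    # 3 or more letters forgotten
p_5 = prob_forgot_at_least(5)    # 5 or more letters forgotten

print(f"whole alphabet recalled: {p_all:.1%}")
print(f"3+ letters forgotten:    {p_3:.1%}")
print(f"5+ letters forgotten:    {p_5:.1%}")
```

Under these assumptions the three values come out at roughly 6%, 49% and 11%.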

What would be a good way to manage that? A few things that come to mind:

  1. Create decks based on needed DR instead of conceptual category. “A word a day” would go into something like a 70% deck, test vocabulary into a 90% deck, and Morse code and other facts that are either important or interdependent would go into a 99% deck (giving roughly a 77% chance of remembering a full Morse deck, since 0.99^26 ≈ 0.77).

  2. Do basically 1), but there may be larger sets of interdependent facts for which 99% is not enough, or 99% may be too much work for anything that isn’t super important. In that case it might be better to add new cards that test the whole interdependent group (or parts of it) once you have the individual cards mostly down. For Morse, that could mean cards with single words, a sentence that contains every letter, or even the whole alphabet as a table.

  3. Maybe SRS is not a good way to learn those interdependent, high DR facts at all?
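For option 1, the per-card retention needed for a given chance of recalling a whole set can be sketched the same way. This again assumes independent cards that all sit exactly at the same recall probability; the helper names are hypothetical.

```python
def chance_of_full_recall(recall, n=26):
    """Chance that all n independent cards are recalled together."""
    return recall ** n

def required_recall(target, n=26):
    """Per-card recall probability needed so that all n independent
    cards are recalled together with probability `target`."""
    return target ** (1.0 / n)

full_at_99 = chance_of_full_recall(0.99)   # whole Morse deck at 99% per card
needed_for_80 = required_recall(0.80)      # per-card recall for an 80% chance

print(f"full recall at 99% per card: {full_at_99:.1%}")
print(f"per-card recall for 80%:     {needed_for_80:.2%}")
```

With 26 cards, 99% per card gives about a 77% chance of full recall, and an 80% chance would need a little over 99.1% per card. The required value climbs quickly as the set grows, which is the concern behind option 2.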

Are any of the above the “right way” to do it? Any other suggestions on how to tackle the problem?

I want to point out a few details to start.

  1. As a baseline, you’ll get a lot more mileage out of Anki if you organize your decks based on your subjects/material, not based on the DR you want. If you have a particular section within a subject that you want a particular DR for, you could create a subdeck for that.
  2. DR is set for an Options preset – which can contain any number of decks/subdecks. The same goes for FSRS parameter optimization (which you didn’t mention, but seems like it would be a significant factor in your analysis).
  3. Your cards are scheduled for review when your Retrievability (the chance of getting a card correct) drops to the threshold of your DR. Every day before that, when the card is not yet due, you have a higher chance of remembering it. [And as your intervals get longer, the chance of remembering ]

    • So, if you study all 26 cards of your Morse deck on any given day, you’re studying a portion of the deck early, and your retention should be higher than your DR. The same goes for using Morse on any given day. See the linked Reddit thread for more – and Stats > Card Retrievability (e.g., average retrievability, estimated total knowledge).
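To make the "higher than your DR" point concrete, here is a rough sketch. It assumes the power forgetting curve published for FSRS-4.5, R(t) = (1 + 19/81 · t/S)^(-0.5); newer FSRS versions use a trainable decay, so treat the constants as an assumption rather than what your Anki version necessarily uses. The stability value is hypothetical, and averaging uniformly over one interval is a simplification of what the Stats screen reports.

```python
DECAY = -0.5
FACTOR = 19 / 81  # makes R equal 0.9 when elapsed time equals stability

def retrievability(t, stability):
    """Assumed FSRS-4.5 forgetting curve: recall chance after t days."""
    return (1 + FACTOR * t / stability) ** DECAY

def average_retrievability(stability, interval, steps=10_000):
    """Mean of R over [0, interval], computed with the midpoint rule."""
    dt = interval / steps
    total = sum(retrievability((i + 0.5) * dt, stability) for i in range(steps))
    return total / steps

stability = 10.0  # hypothetical card; with DR = 0.9 the interval equals stability
r_due = retrievability(stability, stability)          # recall chance on the due date
r_avg = average_retrievability(stability, stability)  # average before the due date

print(f"retrievability when due:   {r_due:.3f}")
print(f"average over the interval: {r_avg:.3f}")
```

Under these assumptions a card is at 0.9 on the day it comes due but averages close to 0.95 over the time it spends waiting, so a deck sampled on a random day sits above the DR.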

I don’t claim to understand how you’re getting from 90% DR to a 6% chance of knowing the whole deck, so perhaps you won’t find the rest of this interpretation useful. But in addition to the above, it’s worth noting – frequency of letter use is not evenly distributed (I generalize from English), so I think you’ll get through most exchanges even without knowing 3-5 letters. And finally – a 90% chance of remembering is just a prediction, not a foregone conclusion.

Hi, thank you for your reply!

You’re completely right that I misinterpreted “desired retention.” I thought FSRS tries to keep the chance of recalling a card at that level, but it’s more like a lower bound. It tries to guarantee that your chance to recall a fact is never lower than that level (by scheduling cards that are reaching that level).

Looking at the chart from the linked Reddit thread, when using a DR of 90%, the average recall chance is more like 95%. But even given that, if I have 26 cards with an average recall chance of 95%, the chance of recalling the whole deck is 0.95^26 ≈ 26% (the 6% figure was just 0.9^26). (I’m aware that you can’t just raise the average to a power to combine multiple probabilities, but since they are roughly in the same range and accounting for it would only make matters worse, I think it’s good enough.)
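A quick check of both claims, as a sketch under the same independence assumption. The "spread" deck below is made up purely for illustration: it has the same 95% average, but the individual cards differ.

```python
from math import prod

def chance_all_recalled(recall_probs):
    """Chance of recalling every card, assuming independent cards."""
    return prod(recall_probs)

n = 26
uniform = [0.95] * n                  # every card exactly at the average
spread = [0.99] * 13 + [0.91] * 13    # hypothetical mix, same 0.95 mean

mean_spread = sum(spread) / n
p_uniform = chance_all_recalled(uniform)
p_spread = chance_all_recalled(spread)

print(f"mean of spread deck:     {mean_spread:.3f}")
print(f"all recalled (uniform):  {p_uniform:.1%}")
print(f"all recalled (spread):   {p_spread:.1%}")
```

The uniform deck gives about 26%, and the spread deck comes out slightly lower. That is the general pattern: for a fixed average, the product of the probabilities is largest when they are all equal (the AM-GM inequality), so raising the average to the 26th power is an optimistic estimate.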

Regarding the distribution of letters: It’s true that letter usage in languages isn’t equally distributed. But since FSRS doesn’t discriminate, the chance of forgetting a commonly used letter is not less than that of forgetting a rarely used one (if you only ever see them while doing Anki), so I don’t think the uneven distribution of letters helps.

So, if I understand correctly, your recommendation is basically to keep my broad high-level decks, but then have different subdecks—maybe one for Morse—and those subdecks can have different presets (and therefore different DR)?

That does sound like a good solution, at least for Morse, where the number of interdependent facts isn’t that high. :+1:

That’s a pretty big “if” though, right? :wink:

That sounds good. For studying something like an alphabet – Morse, Kana, Arabic, Cyrillic, etc. – with a small, fixed set, where you’re just trying to encode a sound to a shape, it’s perfectly reasonable that you might want a higher DR. Each subdeck can use a separate preset (with its own parameters and DR), regardless of how the decks are arranged in relation to each other.