FSRS Desired Retention Adjustment Program

Hi everyone, about 4 months ago I finished a program that calculates how much you need to lower your desired retention setting in FSRS-5 to account for higher order learning.

To better explain it, imagine this: let’s say you’re preparing for a trigonometry test at a math competition. Anki would be very helpful for memorizing things like trig identities, the law of cosines, and other formulas, but it’s more important to do practice tests where you’re applying and connecting your knowledge to solve actual problems. While you’re doing practice tests, you’ll likely remember the formulas better as well.

The idea behind my program is this: if you’re using higher-order study techniques in addition to studying flashcards in Anki, but your deck is relatively small and/or you don’t have the volume of reviews for FSRS-5 to fully adapt in time, you can conservatively estimate how effective your higher-order techniques are and lower the desired retention setting with minimal risk to save time.

I also made a big spreadsheet with the exact value that you can decrease your desired retention to, based on your existing desired retention, effectiveness estimate, and other info. There’s also a bunch of other data, but that’s the most important one.

Right now I’m just looking for initial feedback before I attach the code + spreadsheet and explain how everything works in more detail. I want to know whether it’s worth, for example, trying to turn this into an add-on or sharing it with more people. I see it as a more precise way to determine what desired retention you should use in Anki, because right now it seems like you have to guess, and there’s no way to find the exact value except through trial and error.


There is Compute Minimum Recommended Retention, but it will likely be removed in the next update, as it’s kind of useless with FSRS-6: it just outputs 70% most of the time.


So maybe your idea will be fruitful. Though, it sounds like you are trying to estimate things that happen outside of Anki, and I’m very sceptical of how well that can be quantified.


The code is pasted below. I couldn’t attach the spreadsheet, so I just took multiple screenshots.

How the code works: it’s FSRS-5, but instead of the decay exponent being fixed at -0.5, it can be scaled by E, which represents how effective your outside study techniques (forming relationships, practice, class time, etc.) are compared to just isolated study or retrieval practice in Anki.

When E is 100, the original forgetting curve decays 2x as fast as the modified one.
When E is 50, the original decays 1.5x as fast as the modified one.
When E is 200, the original decays 3x as fast.
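
Here is a minimal sketch of that rescaling, mirroring the FSRSContext class in the code pasted below (the standalone helper names are mine, added for illustration):

def scaled_decay(E, base_decay=-0.5):
    # E = 0 reproduces plain FSRS-5; larger E flattens the forgetting curve
    return base_decay * (100 / (100 + E))

def retrievability(t, S, decay, factor=19/81):
    # FSRS-4.5/5 forgetting curve: R(t) = (1 + factor * t / S) ** decay
    return (1 + factor * t / S) ** decay

for E in (0, 50, 100, 200):
    d = scaled_decay(E)
    # decay magnitudes: 0.5, ~0.33, 0.25, ~0.17, i.e. the original curve
    # decays 1x, 1.5x, 2x, 3x faster than the E-modified one
    print(f"E={E}: decay={d:.3f}, R at t=S: {retrievability(10, 10, d):.3f}")

With E = 0 this gives the familiar R = 90% at t = S; larger E keeps retrievability higher at the same elapsed time, which is what allows the matched desired retention to be lower.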

Then the program finds the desired retention in the original algorithm (without E) that best corresponds to the modified algorithm affected by E.
There are two ways I did this. The first was taking fixed grade sequences (e.g. "4313", "3313", where each digit is a rating: 1 = again, 3 = good, 4 = easy) and finding the original-algorithm retention whose intervals had the least average % deviation from the modified ones.
The second method only uses again and good grades. You input a timeframe, and the program generates every again/good combination that could occur within that timeframe in which good is pressed at least the desired-retention % of the time. Each generated sequence is then checked and matched in the same way.
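
Condensed, the matching step for a single sequence looks roughly like this (a sketch only, reusing simulateSequence, FSRSContext, and calculatePercentDeviation from the files below; the real findMatchingRetention in main.py also applies the timeframe filter and prints its progress):

def matchRetention(sequence, targetRetention, E):
    modified = simulateSequence(sequence, FSRSContext(E), targetRetention)
    bestRetention, minDeviation = targetRetention, float("inf")
    # Scan candidate retentions downward (to 70%) and keep the one whose
    # original (E = 0) intervals deviate least, on average, from the modified ones
    for r in range(int(targetRetention * 100), 69, -1):
        candidate = r / 100
        original = simulateSequence(sequence, FSRSContext(0), candidate)
        deviation = calculatePercentDeviation(modified, original)
        if deviation < minDeviation:
            bestRetention, minDeviation = candidate, deviation
    return bestRetention, minDeviation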

My main findings for the default parameters were:

So to drop from 90% to 88% desired retention and get similar retrievability, E would need to be about 40.

Though, it sounds like you are trying to estimate things that happen outside of Anki, and I’m very sceptical of how well that can be quantified.

This is the biggest challenge of my program. E has to be estimated by the user, but one way you could do this is by comparing the time you spend reviewing in class, for example, with the time you spend in Anki. However, as long as the relative decrease in retention isn’t too large (say, no more than 6%), there’s not too much risk.
But if you’re spending twice as much time doing practice tests as reviewing in Anki, you don’t need to set E as low as 5. It’s unlikely for any study system in real life to decrease forgetting speed by more than half (E > 100). The estimate just needs to be reasonable.

It’s geared more towards students who may have a few hundred cards or fewer, not huge 3000+ card long-term knowledge decks with info you need to be able to retrieve in 10+ years. Another limitation is that it doesn’t work well for <80% desired retention, but if you’re a student you’re probably not going to want retention that low, especially if you’re doing outside practice. The goal is to be more precise with desired retention instead of just using 95%, 90%… in steps of 5, and to help save time for work, extracurriculars, or leisure.

algorithm.py
import math
import itertools


"""
S : Stability
D : Difficulty, D ∈ [ 1 , 10 ]
R : Retrievability (probability of recall)
r : Request retention
t : Days since last review ("elapsed days")
I : Interval ("scheduled days"). Number of days before the card is due next

 G : Grade (card rating)
    1 : again
    2 : hard
    3 : good
    4 : easy
"""

# Default parameters FSRS-5 (CHANGEABLE)
w = {0:0.40255, 1:1.18385, 2:3.173, 3:15.69105, 4:7.1949, 5:0.5345, 6:1.4604, 7:0.0046, 8:1.54575, 9:0.1192, 
10:1.01925, 11:1.9395, 12:0.11, 13:0.29605, 14:2.2698, 15:0.2315, 16:2.9898, 17:0.51655, 18:0.6621}

# Forgetting curve constants (FSRS-4.5 and 5)
baseDecay = -0.5 # Base decay exponent
factor = 19/81 # chosen so that R = 90% when t = S (with decay = -0.5)
minDifficulty = 1
maxDifficulty = 10

# Algorithm configuration
# Context class so decay isn't recalculated after every interval
class FSRSContext:
    def __init__(self, E=0):
        self.decay = baseDecay * (100 / (100 + E)) # E=0 would be original FSRS-5

    def retrievability(self, t, S):
        return math.pow(1 + factor * t/S, self.decay)

    def calculateInterval(self, r, S):
        rawInterval = (S/factor) * (math.pow(r, 1/self.decay) - 1)
        if rawInterval < 1:
            return 1
        return round(rawInterval)

def initialDifficulty(grade):
    return min(max(w[4] - math.exp(w[5] * (grade - 1)) + 1, minDifficulty), maxDifficulty)

def initialStability(grade):
    return w[grade-1]

def updateDifficulty(currentD, grade):
    deltaD = -w[6] * (grade - 3)
    dPrime = currentD + deltaD * ((10 - currentD)/9) # linear damping: changes shrink as D approaches 10
    dTarget = initialDifficulty(4)
    return w[7] * dTarget + (1 - w[7]) * dPrime # mean reversion towards the initial "easy" difficulty

def stabilityIncrease(D, S, R, grade):
    w15 = w[15] if grade == 2 else 1
    w16 = w[16] if grade == 4 else 1
    return 1 + w15 * w16 * math.exp(w[8]) * (11 - D) * math.pow(S, -w[9]) * (math.exp(w[10] * (1-R)) - 1)

def updateStability(currentS, D, R, grade):
    if grade == 1:
        # Post-lapse stability, capped so a lapse can never increase stability
        newS = w[11] * math.pow(D, -w[12]) * (math.pow(currentS + 1, w[13]) - 1) * math.exp(w[14] * (1-R))
        return min(newS, currentS)
    else:
        sInc = stabilityIncrease(D, currentS, R, grade)
        return currentS * sInc

def updateSameDayStability(currentS, grade):
    # Short-term (same-day) review update
    return currentS * math.exp(w[17] * (grade - 3 + w[18]))


# Not a part of the algorithm, but combinations function for the secondary program
# Constants for string length (CHANGEABLE) maxLength can be larger for more accuracy but slower run time
minLength = 1
maxLength = 15

def generateCombinations(minPercent3):
    if not (1 <= minPercent3 <= 100):
        raise ValueError ("Percentage must be between 1 and 100")
    validCombinations = []
    for length in range (minLength, maxLength + 1):
        for combo in itertools.product("13", repeat=length):
            string = "".join(combo)
            count3 = string.count("3")
            percent3 = (count3/length) * 100
            if percent3 >= minPercent3:
                validCombinations.append(string)
    return validCombinations
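
# Worked illustration (added for clarity, not part of the original file): with the
# default length limits, generateCombinations(90) keeps strings such as "3", "33",
# and "3331333333" (9 of 10 presses are good), while "31" (only 50% good) is rejected.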


"""
NOTES: values aren't 100% precise 
https://open-spaced-repetition.github.io/anki_fsrs_visualizer/
is close enough though

https://www.reddit.com/r/Anki/comments/18jvyun/some_posts_and_articles_about_fsrs/ 
https://expertium.github.io/Algorithm.html  

https://github.com/ankitects/anki/issues/3094 
"""
main.py
from algorithm import FSRSContext, updateStability, updateDifficulty, initialDifficulty, initialStability, generateCombinations

def simulateSequence(grades, context, targetRetention):
    # Replay a grade sequence, scheduling each review at the interval where R drops to targetRetention (rounded to whole days)
    D = initialDifficulty(int(grades[0]))
    S = initialStability(int(grades[0]))
    intervals = []

    for i in range(1, len(grades)):
        t = context.calculateInterval(targetRetention, S)
        R = context.retrievability(t, S)
        intervals.append(t)

        grade = int(grades[i])
        D = updateDifficulty(D, grade)
        S = updateStability(S, D, R, grade)

    if len(grades) > 1:
        t = context.calculateInterval(targetRetention, S)
        intervals.append(t)
    
    return intervals

def checkTimeframe(intervals, timeframeLimit):
    if not intervals or timeframeLimit is None:
        return True
    
    # Sum of all intervals should be at least timeframeLimit
    total_sum = sum(intervals)
    if total_sum < timeframeLimit:
        return False
    
    # If sum without last interval exceeds timeframeLimit, exclude
    sum_without_last = sum(intervals[:-1])
    if sum_without_last > timeframeLimit:
        return False
    
    return True

def calculatePercentDeviation(intervals1, intervals2):
    if (not intervals1) or (not intervals2) or (len(intervals1) != len(intervals2)):
        return float("inf")
    deviations = []
    for i1, i2 in zip(intervals1, intervals2):
        deviation = abs(i1 - i2) / i1 * 100
        deviations.append(deviation)
    return sum(deviations) / len(deviations)

def findMatchingRetention(sequence, targetRetention, E, maxTimeframe=None):
    modifiedContext = FSRSContext(E)
    originalContext = FSRSContext(0)
    print(f"\nANALYZING SEQUENCE {sequence}")
    modifiedIntervals = simulateSequence(sequence, modifiedContext, targetRetention)

    if maxTimeframe and not checkTimeframe(modifiedIntervals, maxTimeframe):
        print("Sequence invalid due to timeframe constraint")
        return 0, float('inf')
    
    print(f"Modified intervals (E={E}): {[f'{x:.2f}' for x in modifiedIntervals]}")
    minDeviation = float('inf')
    bestRetention = targetRetention

    # Start from target retention and go down
    for retention in range(int(targetRetention * 100), 69, -1):
        testRetention = retention / 100
        originalIntervals = simulateSequence(sequence, originalContext, testRetention)
        deviation = calculatePercentDeviation(modifiedIntervals, originalIntervals)
        if deviation < minDeviation:
            minDeviation = deviation
            bestRetention = testRetention
            print(f"\nNew best match found!")
            print(f"Retention: {bestRetention:.4f}")
            print(f"Unmodified intervals: {[f'{x:.2f}' for x in originalIntervals]}")
            print(f"Average deviation: {deviation:.2f}%")

    return bestRetention, minDeviation

fixedList = ["3313", "4313", "3314", "4314", "3333", "4333"]
# Inputs and validation
while True:
    try:
        retentionInput = float(input("Enter desired retention (70-95): "))
        if 70 <= retentionInput <= 95:
            targetRetention = retentionInput / 100
            break
        else:
            print("Value must be between 70 and 95.")
    except ValueError:
        print("Number not valid.")

while True:
    try:
        eValue = float(input("Enter E value (0-200): "))
        if 0 <= eValue <= 200:
            break
        else:
            print("Please enter a non-negative value between 0 and 200.")
    except ValueError:
        print("Please enter a valid number.")

timeframeInput = input("Enter timeframe limit (days) or press Enter to skip: ")
timeframeLimit = None
if timeframeInput.strip():
    try:
        timeframeLimit = int(timeframeInput)
        if timeframeLimit <= 0:
            print("Invalid timeframe. Proceeding without timeframe limit.")
            timeframeLimit = None
    except ValueError:
        print("Invalid input. Proceeding without timeframe limit.")

# List choice input and simulation
while True:
    listChoice = input("Choose list (1 for fixed, 2 for generated): ")
    if listChoice in ['1', '2']:
        break
    print("Please enter either 1 or 2.")

if listChoice == "1":
    sequences = fixedList
else:
    minPercent3 = targetRetention * 100
    sequences = generateCombinations(minPercent3)

totalRetention = 0
totalError = 0
validSequences = 0

print("\nAnalyzing sequences...")
for sequence in sequences:
    matchedRetention, error = findMatchingRetention(sequence, targetRetention, eValue, timeframeLimit)
    if error != float('inf'):
        totalRetention += matchedRetention
        totalError += error
        validSequences += 1

if validSequences == 0:
    print("No valid sequences found with given parameters.")
else:
    averageMatchedRetention = totalRetention / validSequences
    averageError = totalError / validSequences
    relativeDecrease = (targetRetention - averageMatchedRetention) / targetRetention * 100

    print(f"\nResults:")
    print(f"Average Matched Retention: {averageMatchedRetention * 100:.2f}%")
    print(f"Average Error Margin: {averageError:.2f}%")
    print(f"Relative Decrease (Risk Factor): {relativeDecrease:.2f}%")
    print(f"Number of valid sequences analyzed: {validSequences}")

There’s also a third file I made, called visualizer.py, which calculates the time saved for different E values and the corresponding modified desired retentions. The data I got from it is in the spreadsheet, but the two files above are far more relevant.

Spreadsheet






In FSRS-6, decay (how flat the forgetting curve is) is an optimizable parameter, so while this may sound harsh, I think your idea is already obsolete. Btw, the Anki 25.05 beta is available.


Understood. I read up on FSRS-6, and it seems there aren’t any more obvious things that could be improved, except maybe the difficulty calculation. I think my code could still be used for default parameters or for optimizations with very few reviews, but it’s definitely not worth making into an add-on. I won’t mark the solution yet in case someone wants to add anything, but you gave me the answer I was looking for. Thank you

Why is 70% from the calculator a bad thing?

I hope we would only say that because the actual results demonstrably contradict the calculator’s assumptions later on! Otherwise, I’m really worried about FSRS caving in to pressure from people who don’t actually understand what spaced repetition is, when we should be doing a better job of educating people about the fundamentals.

I’m not worried about the integrity or aptitude of the team but about the collective pressure created by ignorance. Even among heavy users, there are many people incorrectly envisioning their mind as a leaky sieve, and retrieval as something necessary to pull droplets back in before they hit the floor and disappear forever. So when they see a card they need to struggle for, they worry something is wrong, when the entire point of spacing is to create those opportunities to struggle. If we struggle, we don’t even lose much of the spacing effect when we fail at retrieving—that’s how important trying hard to remember is (Unsuccessful retrieval attempts enhance subsequent learning).

On Bahrick’s account (The Importance of Retrieval Failures to Long-term Retention: A Metacognitive Explanation of the Spacing Effect), spacing creates no magic in itself; it only gives us the opportunity to discover whether the way we originally encoded the information is capable of building a memory that is durable over the space we’re testing over. On this account, there is no necessary transfer between the ability to recall an item over small gaps and the ability to recall it over large gaps. No matter how many times we retrieve over days, we eventually have to attempt to retrieve over months to discover if our encoding method was capable of making a memory that can last for months. Failing during these attempts is then quite literally the only thing that can provide the opportunity to realize that we need to try a new encoding method if we want a memory that can last more than days.

Don’t all these lines of reasoning broadly imply that lower retention targets should tap deeper into the core of why spaced repetition is powerful in the first place?

The influence of people not really understanding spaced repetition is even inescapable within the data. Some people fly through their reviews without thinking and immediately hit “Again” on anything that isn’t immediately obvious. To the degree that they do that — and we all do it to some degree some of the time — they lose the actual benefits of spaced repetition, and then the spacing effect itself will not show up so clearly.


It’s just that if it always outputs 70%, we might as well replace it with a sticker that says “70%”.
Welcome the new CMRR:


lol

I do wonder though, if it is hard-capped at 70% now… isn’t it likely that it’s actually recommending many people a number lower than that? There could be a big difference hidden in there between the people/decks getting 67 and the ones that are actually getting 70, no?

Yes, that’s probably the case. The “true” minimum is below 70%.