The code is pasted. I couldn’t attach the spreadsheet so I just took multiple screenshots.
Here’s how the code works: it’s FSRS-5, but instead of the decay exponent being fixed at -0.5, it can be modified by a parameter E, which represents how effective your outside study techniques (forming relationships, practice, class time, etc.) are compared to just isolated study or retrieval practice in Anki.
When E is 100, the original forgetting curve decays 2x as fast as the modified one.
When E is 50, the original decays 1.5x as fast as the modified one.
When E is 200, the original decays 3x as fast as the modified one.
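Concretely, the only change to FSRS-5 is the decay exponent. A minimal sketch of the modification (this mirrors the FSRSContext class in algorithm.py below):

baseDecay = -0.5  # standard FSRS-5 decay exponent

def modifiedDecay(E):
    # E = 0 reproduces plain FSRS-5; larger E flattens the forgetting curve
    return baseDecay * (100 / (100 + E))

for E in (50, 100, 200):
    print(E, baseDecay / modifiedDecay(E))  # prints 1.5, 2.0, 3.0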
Then the program finds the desired retention in the original algorithm (E = 0) whose intervals best match those produced by the E-modified algorithm.
There are two ways I did this. The first takes fixed grade sequences (e.g. “4313”, “3313”) and finds the retention whose intervals have the least average % deviation from the modified ones.
The second method only uses Again and Good grades. You input a timeframe, and the program generates every possible Again/Good sequence that could occur within that timeframe where Good is pressed at least the desired-retention% of the time. Each generated sequence is then matched normally.
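For example, here is a self-contained sketch of that generation step (the real version is generateCombinations in algorithm.py below):

import itertools

def sketchCombinations(minPercentGood, maxLength=4):
    # Keep every Again ("1") / Good ("3") sequence whose share of Good
    # presses is at least the desired retention percentage
    kept = []
    for length in range(1, maxLength + 1):
        for combo in itertools.product("13", repeat=length):
            seq = "".join(combo)
            if seq.count("3") / length * 100 >= minPercentGood:
                kept.append(seq)
    return kept

print(sketchCombinations(75))
# ['3', '33', '333', '1333', '3133', '3313', '3331', '3333']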
My main findings for default params are in the screenshots. The headline result: to drop from 90% to 88% desired retention and still get similar retrieval, E would need to be about 40.
> Though, it sounds like you are trying to estimate things that happen outside of Anki, and I’m very sceptical of how well that can be quantified.
This is the biggest challenge of my program. E has to be estimated by the user, but one way you could do this is by comparing the time you spend reviewing in class, for example, against your time spent in Anki. As long as the relative decrease in retention isn’t too large, say no more than 6%, there isn’t much risk.
And if you’re spending twice as much time doing practice tests as reviewing in Anki, you don’t need E to be as low as 5. It’s unlikely for any real-life study system to cut forgetting speed by more than half (E > 100). The estimate just needs to be reasonable.
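As a purely illustrative example of that kind of estimate (this heuristic is my own assumption, not something the program computes):

# Hypothetical heuristic for picking E, not part of the program:
# scale E by how much effective study happens outside Anki.
def estimateE(outsideMinutes, ankiMinutes, relativeEffectiveness=0.25):
    # relativeEffectiveness is a guess at how much a minute of outside
    # study is worth compared to a minute of Anki review
    return 100 * (outsideMinutes / ankiMinutes) * relativeEffectiveness

print(estimateE(60, 30))  # a 2:1 time ratio gives E = 50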
It’s geared more towards students who may have a few hundred cards or fewer, not huge 3000+ card long-term knowledge decks with information you need to be able to retrieve in 10+ years. Another limitation is that it doesn’t work well below 80% desired retention, but if you’re a student you’re probably not going to want retention that low, especially if you’re doing outside practice. The goal is to be more precise with desired retention instead of just stepping through 95%, 90%, … in increments of 5, and to help save time for work, extracurriculars, or leisure.
algorithm.py
import math
import itertools
"""
S : Stability
D : Difficulty, D ∈ [ 1 , 10 ]
R : Retrievability (probability of recall)
r : Request retention
t : Days since last review ("elapsed days")
I : Interval ("scheduled days"). Number of days before the card is due next
G : Grade (card rating)
1 : again
2 : hard
3 : good
4 : easy
"""
# Default parameters FSRS-5 (CHANGEABLE)
w = {0:0.40255, 1:1.18385, 2:3.173, 3:15.69105, 4:7.1949, 5:0.5345, 6:1.4604, 7:0.0046, 8:1.54575, 9:0.1192,
10:1.01925, 11:1.9395, 12:0.11, 13:0.29605, 14:2.2698, 15:0.2315, 16:2.9898, 17:0.51655, 18:0.6621}
# Forgetting curve constants (FSRS-4.5 and 5)
baseDecay = -0.5 # Base decay value
factor = 19/81
minDifficulty = 1
maxDifficulty = 10
# Algorithm configuration: context class so decay isn't recalculated for every interval
class FSRSContext:
def __init__(self, E=0):
self.decay = baseDecay * (100 / (100 + E)) # E=0 would be original FSRS-5
    def retrievability(self, t, S):
        # R(t, S) = (1 + factor * t/S) ** decay
        return math.pow(1 + factor * t/S, self.decay)
    def calculateInterval(self, r, S):
        # Inverse of the forgetting curve: the interval at which R falls to r
        rawInterval = (S/factor) * (math.pow(r, 1/self.decay) - 1)
        if rawInterval < 1:
            return 1  # minimum interval of one day
        return round(rawInterval)
def initialDifficulty(grade):
    # D0(G) = w4 - e^(w5 * (G - 1)) + 1, clamped to [1, 10]
    return min(max(w[4] - math.exp(w[5] * (grade - 1)) + 1, minDifficulty), maxDifficulty)
def initialStability(grade):
    # Initial stability after the first review is w0..w3, indexed by grade
    return w[grade-1]
def updateDifficulty(currentD, grade):
    # Linear change by grade, then mean reversion towards D0(4)
    deltaD = -w[6] * (grade - 3)
    dPrime = currentD + deltaD * ((10 - currentD)/9)
    dTarget = initialDifficulty(4)
    return w[7] * dTarget + (1 - w[7]) * dPrime
def stabilityIncrease(D, S, R, grade):
    # Stability multiplier with hard penalty (w15) and easy bonus (w16)
    w15 = w[15] if grade == 2 else 1
    w16 = w[16] if grade == 4 else 1
    return 1 + w15 * w16 * math.exp(w[8]) * (11 - D) * math.pow(S, -w[9]) * (math.exp(w[10] * (1-R)) - 1)
def updateStability(currentS, D, R, grade):
    if grade == 1:
        # Post-lapse stability, capped so a lapse can't increase stability
        newS = w[11] * math.pow(D, -w[12]) * (math.pow(currentS + 1, w[13]) - 1) * math.exp(w[14] * (1-R))
        return min(newS, currentS)
    else:
        sInc = stabilityIncrease(D, currentS, R, grade)
        return currentS * sInc
def updateSameDayStability(currentS, grade):
    # Same-day review: S' = S * e^(w17 * (G - 3 + w18))
    return currentS * math.exp(w[17] * (grade - 3 + w[18]))
# Not part of the algorithm: combinations generator for the secondary program
# Sequence length bounds (CHANGEABLE); a larger maxLength is more accurate but slower
minLength = 1
maxLength = 15
def generateCombinations(minPercent3):
    # Generate every Again ("1") / Good ("3") string whose share of "3"s is at least minPercent3
    if not (1 <= minPercent3 <= 100):
        raise ValueError("Percentage must be between 1 and 100")
    validCombinations = []
    for length in range(minLength, maxLength + 1):
        for combo in itertools.product("13", repeat=length):
            string = "".join(combo)
            percent3 = string.count("3") / length * 100
            if percent3 >= minPercent3:
                validCombinations.append(string)
    return validCombinations
"""
NOTES: values aren't 100% precise, but
https://open-spaced-repetition.github.io/anki_fsrs_visualizer/
is close enough. References:
https://www.reddit.com/r/Anki/comments/18jvyun/some_posts_and_articles_about_fsrs/
https://expertium.github.io/Algorithm.html
https://github.com/ankitects/anki/issues/3094
"""
main.py
from algorithm import FSRSContext, updateStability, updateDifficulty, initialDifficulty, initialStability, generateCombinations
def simulateSequence(grades, context, targetRetention):
    # Walk a grade string (e.g. "4313") through FSRS, scheduling each review
    # at the interval that hits targetRetention under the given context
    D = initialDifficulty(int(grades[0]))
    S = initialStability(int(grades[0]))
    intervals = []
    for i in range(1, len(grades)):
        t = context.calculateInterval(targetRetention, S)
        R = context.retrievability(t, S)
        intervals.append(t)
        grade = int(grades[i])
        D = updateDifficulty(D, grade)
        S = updateStability(S, D, R, grade)
    # Append the interval scheduled after the final grade
    # (single-grade sequences return [] and are skipped later)
    if len(grades) > 1:
        t = context.calculateInterval(targetRetention, S)
        intervals.append(t)
    return intervals
def checkTimeframe(intervals, timeframeLimit):
    if not intervals or timeframeLimit is None:
        return True
    # The sum of all intervals should be at least timeframeLimit
    if sum(intervals) < timeframeLimit:
        return False
    # If the sum without the last interval already exceeds timeframeLimit, exclude
    if sum(intervals[:-1]) > timeframeLimit:
        return False
    return True
def calculatePercentDeviation(intervals1, intervals2):
    # Average % deviation, measured relative to the first list (the modified intervals)
    if (not intervals1) or (not intervals2) or (len(intervals1) != len(intervals2)):
        return float("inf")
    deviations = []
    for i1, i2 in zip(intervals1, intervals2):
        deviations.append(abs(i1 - i2) / i1 * 100)
    return sum(deviations) / len(deviations)
def findMatchingRetention(sequence, targetRetention, E, maxTimeframe=None):
    modifiedContext = FSRSContext(E)
    originalContext = FSRSContext(0)
    print(f"\nANALYZING SEQUENCE {sequence}")
    modifiedIntervals = simulateSequence(sequence, modifiedContext, targetRetention)
    if maxTimeframe and not checkTimeframe(modifiedIntervals, maxTimeframe):
        print("Sequence invalid due to timeframe constraint")
        return 0, float("inf")
    print(f"Modified intervals (E={E}): {[f'{x:.2f}' for x in modifiedIntervals]}")
    minDeviation = float("inf")
    bestRetention = targetRetention
    # Start from the target retention and search down to 70%
    for retention in range(int(targetRetention * 100), 69, -1):
        testRetention = retention / 100
        originalIntervals = simulateSequence(sequence, originalContext, testRetention)
        deviation = calculatePercentDeviation(modifiedIntervals, originalIntervals)
        if deviation < minDeviation:
            minDeviation = deviation
            bestRetention = testRetention
            print("\nNew best match found!")
            print(f"Retention: {bestRetention:.4f}")
            print(f"Unmodified intervals: {[f'{x:.2f}' for x in originalIntervals]}")
            print(f"Average deviation: {deviation:.2f}%")
    return bestRetention, minDeviation
fixedList = ["3313", "4313", "3314", "4314", "3333", "4333"]
# Inputs and validation
while True:
try:
retentionInput = float(input("Enter desired retention (70-95): "))
if 70 <= retentionInput <= 95:
targetRetention = retentionInput / 100
break
else:
print("Value must be between 70 and 95.")
except ValueError:
print("Number not valid.")
while True:
try:
eValue = float(input("Enter E value (0-200): "))
if 0 <= eValue <= 200:
break
else:
print("Please enter a non-negative value between 0 and 200.")
except ValueError:
print("Please enter a valid number.")
timeframeInput = input("Enter timeframe limit (days) or press Enter to skip: ")
timeframeLimit = None
if timeframeInput.strip():
try:
timeframeLimit = int(timeframeInput)
if timeframeLimit <= 0:
print("Invalid timeframe. Proceeding without timeframe limit.")
timeframeLimit = None
except ValueError:
print("Invalid input. Proceeding without timeframe limit.")
# List choice input and simulation
while True:
listChoice = input("Choose list (1 for fixed, 2 for generated): ")
if listChoice in ['1', '2']:
break
print("Please enter either 1 or 2.")
if listChoice == "1":
sequences = fixedList
else:
minPercent3 = targetRetention * 100
sequences = generateCombinations(minPercent3)
totalRetention = 0
totalError = 0
validSequences = 0
print("\nAnalyzing sequences...")
for sequence in sequences:
matchedRetention, error = findMatchingRetention(sequence, targetRetention, eValue, timeframeLimit)
if error != float('inf'):
totalRetention += matchedRetention
totalError += error
validSequences += 1
if validSequences == 0:
print("No valid sequences found with given parameters.")
else:
averageMatchedRetention = totalRetention / validSequences
averageError = totalError / validSequences
relativeDecrease = (targetRetention - averageMatchedRetention) / targetRetention * 100
print(f"\nResults:")
print(f"Average Matched Retention: {averageMatchedRetention * 100:.2f}%")
print(f"Average Error Margin: {averageError:.2f}%")
print(f"Relative Decrease (Risk Factor): {relativeDecrease:.2f}%")
print(f"Number of valid sequences analyzed: {validSequences}")
There’s also a third file I made called visualizer.py, which calculates the time saved for different E values and desired retentions once modified. The data I got from it is in the spreadsheet, but the two files above are far more relevant.
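I haven’t pasted visualizer.py, but the core of its comparison is roughly this (a sketch under my own assumptions, not the actual file): simulate a single always-Good card and count how many reviews fit in a window with and without E.

from algorithm import FSRSContext, updateStability, updateDifficulty, initialDifficulty, initialStability

def countReviews(days, retention, E, grade=3):
    # Count the reviews of one always-"good" card that fit within `days`
    context = FSRSContext(E)
    D = initialDifficulty(grade)
    S = initialStability(grade)
    elapsed = reviews = 0
    while True:
        t = context.calculateInterval(retention, S)
        if elapsed + t > days:
            return reviews
        elapsed += t
        reviews += 1
        R = context.retrievability(t, S)
        D = updateDifficulty(D, grade)
        S = updateStability(S, D, R, grade)

# e.g. compare countReviews(120, 0.90, 40) with countReviews(120, 0.90, 0)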
Spreadsheet