Measures to prevent new FSRS parameters from being worse than the old ones

LMSherlock shared this graph with me:

[graph: how often re-optimizing improves log loss, per 1000 reviews]
If you optimize after every 1000 reviews, there is a ~75% chance that the log loss will improve. That said, this is just one collection, so the number could be different for other people.
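As a sketch of one possible safeguard: evaluate both the old and the newly optimized parameters on the same review history and keep whichever scores the lower log loss. This is a minimal illustration of the idea, not how the FSRS optimizer actually works; `predict_retention` and the `recalled` attribute are hypothetical stand-ins, not part of any real FSRS API.

```python
import numpy as np

def log_loss(y, p, eps=1e-15):
    """Binary log loss (cross-entropy) between outcomes y and predictions p."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def pick_better_params(old_params, new_params, reviews, predict_retention):
    """Keep the new parameters only if they score a lower log loss on the
    review history than the old ones; otherwise fall back to the old ones.

    predict_retention(params, reviews) is a hypothetical helper that maps
    FSRS parameters to predicted recall probabilities for each review.
    """
    y = np.array([r.recalled for r in reviews], dtype=float)  # 1 = pass, 0 = lapse
    old_loss = log_loss(y, predict_retention(old_params, reviews))
    new_loss = log_loss(y, predict_retention(new_params, reviews))
    return new_params if new_loss < old_loss else old_params
```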
Also, here are his thoughts:

Stochastic gradient descent is stochastic; there is no theory guaranteeing that the algorithm will find the global minimum. And you can see that the difference between the last weights and the current weights is very small.
Also, log loss is related to the average retention.
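To illustrate that last point: even a perfectly calibrated predictor that always outputs the average retention rate has a log loss equal to the binary entropy of that rate, so the achievable floor shifts with retention and raw log loss values are not comparable across collections. A small illustrative Python snippet:

```python
import numpy as np

def baseline_log_loss(retention):
    """Log loss of a constant, perfectly calibrated predictor p = retention.
    This equals the binary entropy of the retention rate."""
    r = retention
    return -(r * np.log(r) + (1 - r) * np.log(1 - r))

for r in (0.80, 0.90, 0.95):
    print(f"retention {r:.2f} -> baseline log loss {baseline_log_loss(r):.3f}")
```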
