There are a lot of variables in Anki where we might have an intuitive sense about whether or not specific settings are a good idea, but we don’t have empirical data.
We have a theoretical case that the new algorithm is better than SM-2, but we don’t really know.
Most technology companies use AB testing to learn whether or not a new feature is actually an improvement. Anki would gain in efficiency if new features were also AB tested.
In Anki’s case, I would look at the metrics of total time spent per card and user retention to decide whether to adopt a new algorithm or even whether to change the default font size.
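To make the two metrics concrete, here is a minimal sketch of how they could be computed per experiment group. The `Review` record and its fields are hypothetical, not Anki's actual review log format, and "retention" is read here as the share of reviews that were recalled; if retention of users (people who keep using the app) is meant, a different calculation would be needed.

```python
from dataclasses import dataclass


@dataclass
class Review:
    group: str      # hypothetical experiment arm, e.g. "A" or "B"
    card_id: int    # which card was reviewed
    seconds: float  # time spent on this review
    recalled: bool  # True if the card was remembered


def summarize(reviews):
    """Return {group: (total seconds per distinct card, share of reviews recalled)}."""
    stats = {}
    for r in reviews:
        s = stats.setdefault(
            r.group, {"seconds": 0.0, "cards": set(), "recalled": 0, "n": 0}
        )
        s["seconds"] += r.seconds
        s["cards"].add(r.card_id)
        s["recalled"] += r.recalled  # bool counts as 0 or 1
        s["n"] += 1
    return {
        g: (s["seconds"] / len(s["cards"]), s["recalled"] / s["n"])
        for g, s in stats.items()
    }


if __name__ == "__main__":
    # Made-up data, purely for illustration.
    sample = [
        Review("A", 1, 10.0, True),
        Review("A", 1, 6.0, False),
        Review("A", 2, 4.0, True),
        Review("B", 3, 5.0, True),
    ]
    for group, (time_per_card, retention) in summarize(sample).items():
        print(group, time_per_card, retention)
```

Whether a lower time per card at the same retention should count as "better" is a decision the sketch does not make.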
Wouldn’t that mean Anki would have to have two separate versions, one being the default and the other one being for experiments?
I guess that would contribute to way more complexity and confusion for the average user.
We could already ask one group of users to use FSRS and the other group to use SM-2, then perform statistical analysis based on their results. There are problems though, including:
Finding enough willing participants could prove very difficult or even impossible.
The effectiveness of the algorithms is influenced by a variety of factors, including:
a) button usage (e.g. how consistently people grade themselves)
b) individual memory and cognitive performance (which depends on things like age, health & disorders, prior exposure to certain drugs, how well and how much you sleep (REM sleep in particular), genetics, stress etc. → you’d basically have to track a lot of things, which is only possible in larger scale scientific studies)
c) The topics that those users are learning.
d) card design (card template, stuff like the 20 rules…)
I don’t think that is reasonably achievable (though I’d personally very much welcome any empirical efforts in this regard).
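For what the "statistical analysis" step could look like in its simplest form, here is a minimal sketch of a two-sided permutation test on a difference in means. It assumes one number per participant (for example, a per-user recall rate), which is a hypothetical setup, and it does not model any of the factors a) to d) above.

```python
import random


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (mean(a) - mean(b), approximate p-value).
    """
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly relabel participants while keeping the group sizes fixed.
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = sum(left) / len(left) - sum(right) / len(right)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Made-up per-participant values, purely for illustration.
    fsrs_group = [0.91, 0.88, 0.93, 0.90, 0.87]
    sm2_group = [0.89, 0.86, 0.90, 0.85, 0.88]
    print(permutation_test(fsrs_group, sm2_group))
```

With only a handful of participants per group, a test like this has very little power, which ties back to the recruitment problem.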
I’m not sure that ‘intuitive’ sense and ‘theoretical’ case are entirely accurate appraisals, here. This topic has been pretty well hashed-out at this point.
The idea of AB testing has similarly been discussed.
There are already some add-ons that can be used for research purposes, so I think it is technically possible for me to develop an add-on for that (e.g. the add-on randomly assigns participants to FSRS or SM-2 and automatically sends stats and data from Anki for desktop to a server).
So I think, as Anon_0000 says, the problem is finding participants. Medical school teachers might be able to do that (e.g. collect data from medical students who use Anki and compare it to their exam scores).
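As a rough illustration of the random assignment step, here is a minimal sketch that maps a participant ID to one of the two schedulers by hashing. The function name, the `salt` parameter and the ID format are all hypothetical, and the sketch does not use any Anki add-on API; how an add-on would actually switch the scheduler or upload data is left out.

```python
import hashlib

ARMS = ("FSRS", "SM-2")


def assign_arm(participant_id: str, salt: str = "study-1") -> str:
    """Deterministically map a participant to one arm of the experiment.

    The same ID and salt always give the same arm, so the assignment
    survives restarts without having to be stored anywhere.
    """
    digest = hashlib.sha256(f"{salt}:{participant_id}".encode("utf-8")).digest()
    # Use the first byte of the hash to pick an arm.
    return ARMS[digest[0] % len(ARMS)]


if __name__ == "__main__":
    for pid in ("user-001", "user-002", "user-003"):
        print(pid, assign_arm(pid))
```

Hashing instead of calling a random number generator at install time keeps the split reproducible, and changing the salt would reshuffle participants for a new study.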