I’ll first briefly note down the two issues with evaluation:
Log loss and RMSE (bins) are technical concepts that not everyone finds easy to understand.
There was never a good way of telling people what log-loss is. Now, after we add a recency weighting function in evaluation, it’ll become even harder.
As an example of the first point, a user on GitHub posted:
What’s log loss? What’s RMSE(bins)? How small should the numbers be to be acceptable? For what purpose is this information presented?
Information is confusing when people don’t know its purpose.
Evaluation did have one practical use, but it can be replaced: a better way of seeing whether FSRS works for your collection is to look at “True Retention”.
Okay, then remove log loss, but leave Evaluate alone otherwise. RMSE isn’t too technical to understand. It’s been described in the tutorial from the start, and it is now in the manual:
RMSE (bins) can be interpreted as the average difference between the predicted probability of recalling a card (R) and the actual probability measured from your review history. For example, RMSE=5% means that, on average, FSRS is off by 5% when predicting R.
It’s a very useful tool to determine whether someone is overreacting to perceived changes or whether there is actually a problem that needs to be addressed. It’s also immediate, and that’s much better than waiting a week (or more) for signs to show up in True Retention.
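For anyone who wants to see the two numbers side by side, here is a minimal Python sketch of log loss and a binned RMSE. It is only an illustration: the function names are made up, and grouping reviews into equal-width bins by predicted R is an assumption chosen for simplicity. The binning Anki actually uses for RMSE (bins) may differ, and no recency weighting is applied here.

```python
import math
from collections import defaultdict


def log_loss(predicted, recalled):
    """Mean negative log-likelihood of the observed outcomes.

    predicted: predicted recall probabilities (R), each in [0, 1]
    recalled:  1 if the card was recalled, 0 if it was forgotten
    """
    eps = 1e-15  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(predicted, recalled):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predicted)


def rmse_bins(predicted, recalled, n_bins=10):
    """Binned RMSE between predicted R and measured recall.

    ASSUMPTION: reviews are grouped into equal-width bins by predicted R.
    This is a simplification for illustration, not Anki's binning scheme.
    """
    bins = defaultdict(list)
    for p, y in zip(predicted, recalled):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the last bin
        bins[idx].append((p, y))

    weighted_sq_err = 0.0
    for items in bins.values():
        mean_pred = sum(p for p, _ in items) / len(items)
        mean_actual = sum(y for _, y in items) / len(items)
        # weight each bin by the number of reviews it contains
        weighted_sq_err += len(items) * (mean_pred - mean_actual) ** 2
    return math.sqrt(weighted_sq_err / len(predicted))


# Hypothetical data: R = 90% predicted for ten reviews, eight of them recalled.
predicted = [0.9] * 10
recalled = [1] * 8 + [0] * 2
print(f"RMSE (bins): {rmse_bins(predicted, recalled):.1%}")
print(f"Log loss:    {log_loss(predicted, recalled):.3f}")
```

In this made-up case the predicted R is 90% and the measured recall is 80%, so the binned RMSE comes out at about 10%, which reads directly as "off by ten percentage points". The log loss for the same data has no comparably intuitive reading, which is the contrast being discussed in this thread.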
Evaluate has no active effect on a collection. It’s confusing for most users and difficult to find in the manual, yet in terms of visual hierarchy we’re giving it the same importance as ‘Optimize’. From a UX perspective, we should reduce its importance.
At the very least, move it to an advanced section. In this case I’d vote to remove it: it’s not a deck option.
General thoughts on how we should approach FSRS options (I don’t believe I’ve posted about this previously):
The more options we have, the more we contribute to information overload. I feel we (mostly @Expertium ) handle a lot of user support for FSRS currently and we should be working towards a simpler UX. This level of support indicates that we could be doing better with in-app messaging, and means there’s likely a silent majority of people who aren’t reaching out.
A user looking to use the FSRS options to get the most effective Anki experience currently needs three things, whereas an optimal UX would need only one. I’ll strongly support anything we can do to move in this direction:
Desired Retention
We will always want this
Over time, make it more significant, likely via including a visual/interactive explanation of what it is
Enable/Disable FSRS
With FSRS as a default, this can be moved to ‘advanced’
Optimize
Over time, we should move to automate & remove this
Based on a tiny sample size of 13, people are in favor of keeping it, at least people on this forum. I imagine the ratio would be very different if we asked completely random users.
Regarding “Optimize” and desired retention, as I said here, I don’t see how we could explain what those are and why they exist without an onboarding tutorial, which would also solve, like, 20 other UX issues.