[Question] How difficult would it be to give FSRS access to certain info?

Specifically, I mean:

  1. The time of day at which the review happened, in 24-hour format. Not the answer time (in seconds), but the clock time of the review
  2. The number of reviews done before the current review

If someone is doing their 500th review at 3 AM, they are probably more fatigued than someone who is doing their first review at 3 PM. We could incorporate that information into FSRS to improve it. I think LMSherlock and I have squeezed almost every drop of accuracy out of FSRS, which only uses interval lengths and grades. So any further major improvement would require new input features.

@dae, can you tell me how difficult it would be to make it so that FSRS can access the two pieces of information I mentioned above? Like, “just add 2 more lines of code” or “we would have to change some hard-coded stuff that is deeply ingrained into Anki”? If it doesn’t require remaking 90% of Anki’s backend from zero and 10 000 hours of work, that would be good news. Of course, this also depends on how much you and other devs care about pursuing algorithmic efficiency further.
EDIT: just to clarify, when I say “give FSRS access”, I mean at scheduling time, when the answer buttons are shown. Well, and give that info to the optimizer too, of course. But from my (very limited) understanding, it’s the former that is the problem, not the latter: accessing that info during reviews.
EDIT 2: for 2, I mean “total reviews done on day X before the review of card Y”, not “number of times card Y has been reviewed”. I don’t know how to explain it better. Basically, I need this number:
[image]

If you can access that information, can you also calculate the actual delay between two reviews (during review)?

Then maybe FSRS could be made to “somehow” take that extra information into account, which would be great (for others: right now, all other things being equal, if one card goes through 1m 5m and another through 1m 60m, both get assigned the same initial stability, which is counterintuitive).

Regardless, I think it is useful for the optimiser to be able to work with that info, provided FSRS is able to work with it too.
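
To make “actual delay” concrete, here is a minimal sketch. It assumes the revlog `id` of each review is an epoch-millisecond timestamp and that you already have the ids for a single card; `actual_delays_seconds` is a hypothetical helper name, not something that exists in Anki or FSRS.

```python
def actual_delays_seconds(card_revlog_ids):
    """Elapsed seconds between consecutive reviews of one card.

    card_revlog_ids: revlog ids for a single card, assumed to be
    epoch-millisecond timestamps (hypothetical input, not an Anki API).
    """
    ids = sorted(card_revlog_ids)
    # Pair each review with the next one and convert the gap from ms to s.
    return [(later - earlier) / 1000 for earlier, later in zip(ids, ids[1:])]


# Two cards with the same grades but different real gaps between reviews:
print(actual_delays_seconds([0, 60_000, 360_000]))    # gaps of 1 min and 5 min
print(actual_delays_seconds([0, 60_000, 3_660_000]))  # gaps of 1 min and 60 min
```

How FSRS would actually use these delays is a separate question; this only shows that the number itself is cheap to derive.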

At first glance, these changes are technically possible without major refactoring or performance impact, assuming we trust the unreliable rep number on the card instead of consulting the review history. That said, there’s a large backlog of other tasks (FSRS and otherwise) that need doing, and I’d really like to see us whittle them down some more before we start chasing further diminishing returns (or have evidence to indicate otherwise).

That said, there’s a large backlog of other tasks (FSRS and otherwise) that need doing, and I’d really like to see us whittle them down some more before we start chasing further diminishing returns

Well, in that case, I’m waiting for you here :slightly_smiling_face:.

Just to clarify, for 2, I mean “total reviews done on day X before the review of card Y”, not “number of times card Y has been reviewed”. I don’t know how to explain it better. Basically, I need this number:
[image]

Also, I assume that even if both suggestions are implemented, the values won’t be recalculated for every user and their entire history? If so, that would be very problematic, since LMSherlock and I need a huge dataset to run benchmarks and test ideas. This would mean that even if 1 and 2 are implemented right now, we won’t get a good dataset for 1-3 years. Though, of course, that depends on how many Anki users always use the newest version and how many reviews they do.

The values you mention in 1 & 2 can be easily calculated if the dataset includes the id column from the revlog table and the value of the “next day starts at” setting (adjusted for time zone).
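
A rough sketch of that calculation, assuming the revlog `id` is an epoch-millisecond timestamp. The function name `review_features` and the parameters `utc_offset_hours` and `rollover_hour` are made up for illustration; they stand in for the user’s time zone and “next day starts at” value and are not read from Anki here.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def review_features(revlog_ids, utc_offset_hours=0, rollover_hour=4):
    """Return (hour_of_day, reviews_done_earlier_that_day) per review, in time order.

    revlog_ids: revlog ids across all cards, assumed epoch milliseconds.
    utc_offset_hours, rollover_hour: assumed user settings (hypothetical inputs).
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    counts = defaultdict(int)  # reviews seen so far, per study day
    features = []
    for rid in sorted(revlog_ids):
        local = datetime.fromtimestamp(rid / 1000, tz=tz)
        # Shift back by the rollover hour so that reviews done before it
        # are counted as part of the previous study day.
        study_day = (local - timedelta(hours=rollover_hour)).date()
        features.append((local.hour, counts[study_day]))
        counts[study_day] += 1
    return features
```

With a rollover hour of 4, a review at 3 AM still counts toward the previous day’s total, which matches the “500th review at 3 AM” scenario from the first post.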

Yeah, but they have to be available in real time, when “Show Answer” appears.

Sorry, I wasn’t clear enough. I was responding to the concern you expressed here:

This would mean that even if 1 and 2 are implemented right now, we won’t get a good dataset for 1-3 years.

As I said in my previous comment, whenever dae finds the time to build the dataset, you will get the dataset immediately. There won’t be any need to wait for years.

That would be sweet!