Statistics: plot #stable, #learning (etc) over time

The statistics section doesn’t show the cumulative number of cards that have been learned (at the various degrees of maturity) as a function of time. This seems like the key progress statistic that users would want to follow to feel like they are accomplishing their learning goals. Is it truly missing? If so, why not add this?

Furthermore, the help pages suggest that I access the raw revlog database myself, but there is no documentation of where the revlog is stored. I ran find and grep within the Anki desktop package, and they returned no files.

The straightforward part first: your revlog should be included in your collection, which is stored per profile. Did you check there? See "Managing Files" in the Anki Manual.
Don't go in without a map! See "Database Structure" in the ankidroid/Anki-Android wiki on GitHub.
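For anyone following along, here is a minimal sketch of checking that a collection file really contains a revlog table, using Python's built-in sqlite3 module. The function name `revlog_columns` and the path in the usage comment are placeholders of mine, not anything from Anki itself, and it is safest to point this at a copy of the collection rather than the live file.

```python
import sqlite3
from pathlib import Path


def revlog_columns(collection_path):
    """Return the column names of the revlog table in a collection file.

    `collection_path` is a placeholder: pass the path to your own
    collection file (ideally a copy of it).
    """
    # Open read-only so the collection cannot be modified by accident.
    uri = Path(collection_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA table_info(revlog)").fetchall()
    finally:
        conn.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk); keep the name.
    return [row[1] for row in rows]


# Usage (placeholder path):
# print(revlog_columns("/path/to/your/profile/collection.anki2"))
```

An empty list would mean the file has no revlog table, which usually means the wrong file was opened.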

I can think of 2 reasons not to have that stat, but they both probably come down to “just another opinion.”

  1. A card – ideally – moves through all 3 phases, so separating the phases would primarily show the difference between the number of learning, young, and mature cards (which is already included in Card Counts, albeit not as a function of time).
  2. Cards don’t necessarily move in only one direction through those phases, and aren’t necessarily in each phase only once, so – do you count only the first time a card is in that phase? Or every time? But either way, what are you counting that isn’t tracked elsewhere?

Thanks, Danika. I found the db file. I regard the mature cards as my current vocabulary size (I'm learning a language), and that should naturally expand or contract over time depending on how much effort I devote to studying and how often I use the language. So simply tracking the quantity in each category would provide a snapshot of my progress or regression. I like to have time plots because they show me clearly how much progress I've made for all my effort, whereas I can only guess at my progress by looking at the pie chart and trying to remember what it used to look like.

I’ll try to learn to use the db to compute these stats myself, but I think people would find this satisfying to see in their app. The other stats do not seem that useful to me right now.

Interesting! Thanks for responding, so I could see your perspective!

For your target language, you can find estimates of the number of words that typical native speakers know, so one's total stable vocabulary size gives a progress indicator that can be calibrated against a real-world baseline. I don't think of that as a matter of perspective. Anyway, I found the db table and executed an SQL query, so I can presumably compute whatever I want now.

I think the SQL query below gives the correct number of mature cards, expressed as a cumulative count over time. Does type=4 mean suspended cards? I excluded those rows from consideration. Maybe this code will be useful to someone. It would be awesome if this could be added as a progress graph in the app somehow. I've added a screenshot of a Python plot to show what it might look like. You can see that the plot does go down in mid-December, since I was busy preparing for the holidays and must have forgotten more than I learned. That's a good sanity check on the validity of the code.

From this plot, you can see the speed of my learning mature cards: very roughly 150/mo, or about 2 notes/day. This seems quite slow to me, but I had no idea without generating this plot. I can also extrapolate to guess when I might reach certain goals, and tune my study strategy/duration to increase my learning rate over time. These capabilities that are enabled by the plot are all of significant practical value to learners, but I don’t think you can find them in the app today. Definitely you can’t see them intuitively like this. I wonder if anyone would be interested to collaborate to bring this into the desktop and/or mobile app.

-- Compute the cumulative maturations over time, just for times when it changes
WITH CountedMaturityChanges AS ( 
    -- Mark maturation with +1 and de-maturation with -1, else 0
    WITH MaturityChanges AS ( 
        -- Add the previous type of the card to each row
        WITH TypeLagged AS ( 
            SELECT 
                *,
                LAG(type) OVER (PARTITION BY cid ORDER BY id) AS type_lagged
            FROM 
                revlog
            WHERE type != 4  -- Are these suspensions? The times were 0.
        )
        SELECT 
            *, 
            CASE
                WHEN (type = 1 AND ivl >= 21 AND lastIvl < 21) THEN 1
                WHEN (type = 1 AND ivl < 21 AND lastIvl >= 21) THEN -1
                WHEN (type != 1 AND type_lagged = 1 AND lastIvl >= 21) THEN -1
                ELSE 0
            END AS maturity_change
        FROM TypeLagged
    ) 
    SELECT 
        *,
        SUM(ABS(maturity_change)) OVER (PARTITION BY cid) AS total_card_maturity_changes
    FROM MaturityChanges
)
SELECT
    datetime(id/1000, 'unixepoch') AS dt,
    id,
    ease,
    ivl,
    lastIvl,
    factor,
    time,
    type,
    maturity_change,
    total_card_maturity_changes,
    SUM(maturity_change) OVER (ORDER BY id) AS cumulative_maturity_changes
FROM CountedMaturityChanges
WHERE ABS(maturity_change) = 1
ORDER BY id
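In case it helps anyone reproduce the CSV step without the SQLite command line, here is a minimal sketch that runs a query and writes the result, with a header row, to a CSV file. The helper name `export_query_to_csv` and the file name `cumulative_query.sql` in the usage comment are my own placeholders. Note that the query above uses window functions, which need SQLite 3.25 or newer.

```python
import csv
import sqlite3


def export_query_to_csv(db_path, sql, csv_path):
    """Run `sql` against the SQLite file at `db_path` and write the
    result to `csv_path` with a header row. Returns the number of rows."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(sql)
        # Column names of the result set, in SELECT order.
        header = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        conn.close()

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


# Usage (placeholder names; work on a copy of the collection):
# with open("cumulative_query.sql") as f:
#     sql = f.read()
# export_query_to_csv("collection_copy.anki2", sql, "cumulative.csv")
```

The header row matters because the plotting code skips the first line of the file.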

And here is the plotting code

import csv
import matplotlib.pyplot as plt
from datetime import datetime

# This is the output file I saved using SQLite
SQL_QUERY_OUTPUT_AS_CSV = 'cumulative.csv'

dates = []
matures = []
with open(SQL_QUERY_OUTPUT_AS_CSV, newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    next(csvreader, None)  # Skip the header row
    for row in csvreader:
        # First column is the timestamp, last column is the cumulative count
        date = datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S')
        dates.append(date)
        matures.append(float(row[-1]))

# Plot it
plt.figure(figsize=(10, 7), facecolor='white')
plt.plot(dates, matures)
plt.xlabel('Date')
plt.ylabel('Mature cards')
plt.xticks(rotation=0)
plt.grid(True)
plt.gcf().autofmt_xdate()
plt.title('Cumulative number of mature cards')
plt.show()
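To put a number on the learning rate and the extrapolation mentioned earlier, here is a rough sketch that fits a straight line to the cumulative counts and projects when a goal would be reached. It assumes the trend is roughly linear, which is only an approximation, and the function names `fit_rate` and `projected_date` are my own. It takes the same `dates` and `matures` lists that the plotting code builds.

```python
from datetime import timedelta


def fit_rate(dates, counts):
    """Ordinary least-squares fit of counts against days since dates[0].

    Returns (slope, intercept), where slope is in cards per day.
    Needs at least two distinct dates.
    """
    t0 = dates[0]
    xs = [(d - t0).total_seconds() / 86400 for d in dates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def projected_date(dates, counts, goal):
    """Date at which the fitted line reaches `goal`, or None if the
    fitted rate is not positive (the line never gets there)."""
    slope, intercept = fit_rate(dates, counts)
    if slope <= 0:
        return None
    days_needed = (goal - intercept) / slope
    return dates[0] + timedelta(days=days_needed)


# Usage with the lists from the plotting code (5000 is an arbitrary goal):
# rate, _ = fit_rate(dates, matures)
# print(f"{rate:.1f} cards/day, about {rate * 30:.0f} per month")
# print(projected_date(dates, matures, 5000))
```

A straight line is the simplest choice here; if the rate changes a lot over time, fitting only the most recent few months would probably give a more honest projection.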
