Large collection size (246mb) although Note Size reports only 66mb text size

Hello,

I searched the forum already and tried every posted solution including:
Deleting large decks then performing check database.
Exporting decks then re-importing them.
Using Clean disconnect review logs.
Emptying the trash can of media.
Searching for ‘base64’ cards.

My problem is that AnkiWeb reports a collection size of 246mb even though I have already deleted many decks. This has already led to an error when trying to sync.
I installed the Note Size add-on, and it reports a raw text size of 66.7mb, which seems far more realistic. Still, the collection size is reported as 246mb.
Using the Note Size add-on, I also don’t see any cards that are unusually large, looking at either the raw text size or the overall size.

Could you help me identify where this amount of data comes from, so I can import my currently exported decks once again and sync?

Kind regards.

Most likely the revlog table is taking up a lot of space. It holds every review you have ever done, including those for cards/notes you have since deleted. For reference, I have more than 500mb in reviews alone (several million reviews).

Perhaps you might want to delete the reviews from deleted cards, or reschedules (time=0), etc.
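
To see how much of the revlog falls into those two categories before deleting anything, a rough sketch like the following could help. It only counts rows and changes nothing. It assumes the usual Anki schema (a `revlog` table with `cid` and `time` columns and a `cards` table with an `id` column); the function name and the file path in the comment are just placeholders, and it should be run against a copy of the collection with Anki closed.

```python
import sqlite3


def revlog_breakdown(conn: sqlite3.Connection) -> dict:
    """Count all revlog rows, rows whose card is gone, and rows with time = 0."""
    total = conn.execute("SELECT COUNT(*) FROM revlog").fetchone()[0]
    # Reviews that belong to cards which no longer exist in the cards table.
    orphaned = conn.execute(
        "SELECT COUNT(*) FROM revlog WHERE cid NOT IN (SELECT id FROM cards)"
    ).fetchone()[0]
    # Entries logged with zero answer time (e.g. reschedules, per this thread).
    zero_time = conn.execute(
        "SELECT COUNT(*) FROM revlog WHERE time = 0"
    ).fetchone()[0]
    return {"total": total, "orphaned": orphaned, "zero_time": zero_time}


# Example use (placeholder path, pointing at a COPY of the collection):
# conn = sqlite3.connect("collection_copy.anki2")
# print(revlog_breakdown(conn))
```

If both counts are small compared to the total, the revlog is probably not where the space is going.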

The Note Size add-on shows the size of the collection.anki2 file.

So the revlog, which is stored within collection.anki2, can’t be larger than 66mb.
It looks like there is another problem on the AnkiWeb side.
Maybe it’s just a delay in updating the size in the AnkiWeb interface.

Well, the collection size is reported as 245mb now, after I deleted my review history from before May 2024. So 1mb down :wink:

But the 66mb I’m referring to is what the browse menu shows when I select all cards and look at the note size for all of them.

Don’t use the add-on as a reference. It does not account for everything else stored in the collection. Please check the size of your collection.anki2 file, located in your user folder.

I attach an example of a collection that exceeds the limit.

Roughly the same, 256mb.

Then you have confirmed your collection is indeed 256mb, of which 66mb is taken up by notes alone.

Perhaps you have a ton of graves; that is another possibility. I suggest running a collection check, and later installing DB Browser for SQLite to actually see the contents.
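
If you would rather not install a browser, a quick sketch with Python's built-in `sqlite3` module can list every table with its row count, which is usually enough to spot an oversized `graves` or `revlog` table. The function name and the path in the comment are placeholders; this assumes the file is an ordinary SQLite database and should be run on a copy with Anki closed.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict:
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        if not row[0].startswith("sqlite_")  # skip SQLite's internal tables
    ]
    counts = {}
    for name in names:
        # Table names come from sqlite_master itself, so quoting them is enough here.
        counts[name] = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    return counts


# Example use (placeholder path, pointing at a COPY of the collection):
# conn = sqlite3.connect("collection_copy.anki2")
# for name, n in sorted(table_row_counts(conn).items(), key=lambda kv: -kv[1]):
#     print(f"{name:12} {n}")
```

Row counts are not sizes, but a table with hundreds of thousands of rows more than expected is a good place to look next.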

That leaves the question of what the other 190mb are. My review history now runs from the beginning of September until today. I can’t imagine this adding up to 190mb.

If you have added and deleted tons of stuff, which I imagine is the case given that you have 100k cards, graves are the likely cause. Or hundreds of thousands of reschedules, which show up in the revlog table with time=0.

It seems there are none.

Deleting review logs makes FSRS inaccurate. So, if you are using FSRS or plan to try it in the future, please restore from a backup.

Thanks for the heads up, but I steered clear of deleting review history on cards I actively use. I only really started using Anki in September, so everything before that was safe to delete.

I think it is safe to delete reschedules. After a certain version, Anki started logging reschedules with a time of 0 milliseconds. Once you switch to FSRS, for instance, and reschedule all your cards, that can quickly add many entries to the revlog.

What about the revlog table?

I think if you create a new profile and move the decks and media over one at a time, you might be able to identify which deck or cards are causing the problem.

The revlog has 20,620 entries. But I don’t know what’s common, or whether this could be the problem.

Yep, I had this in mind already and have now gone ahead with it. It turns out that the largest share comes from the Ankizin deck, which adds a whopping 140mb. With a few other decks installed and a few months of review history, the limit is promptly reached. I will send the creators this thread and see if something can be done about it.

  • It’s very, very, very unusual for a young collection like yours to have a database of that size because of revlog entries. 20K revlog entries is a bit high for only a few months of use, but not outlandishly so. Have you been using FSRS “reschedule cards on change” frequently?

  • Have you run Tools > Check Database again since you started deleting things from the database? [Manually editing your database is always riskier than not doing so, so it is better to save that for a last resort.]

  • But 100K notes is a lot. Have you taken a look at Note Size again since it caught up with reality (I assume it was having a caching issue before)? If you sort by the “Size (texts)” column, especially on that Ankizin deck, you might be able to narrow in on the culprits. Look for exceptional amounts of hidden HTML/CSS formatting in those notes.
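
As a cross-check on the add-on's numbers, here is a rough, read-only sketch that ranks notes by the byte size of their field text straight from the database. It assumes the usual Anki schema, where the `notes` table stores all fields of a note in one `flds` column; the function name and the path in the comment are placeholders. Run it on a copy of collection.anki2 with Anki closed.

```python
import sqlite3


def largest_notes(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return (note_id, size_in_bytes) pairs for the notes with the most field text."""
    return conn.execute(
        # CAST to BLOB so LENGTH counts bytes rather than characters.
        "SELECT id, LENGTH(CAST(flds AS BLOB)) AS nbytes "
        "FROM notes ORDER BY nbytes DESC LIMIT ?",
        (limit,),
    ).fetchall()


# Example use (placeholder path, pointing at a COPY of the collection):
# conn = sqlite3.connect("collection_copy.anki2")
# for note_id, nbytes in largest_notes(conn):
#     print(note_id, nbytes)
```

The note IDs it prints can be pasted into the browser search as `nid:...` to inspect the offending notes.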

Yes, deleting reschedules is safe, but deleting other review logs (if you haven’t stopped studying the card) is not.