As a long-time Anki enthusiast, I am immensely grateful for the success it has brought to my studies and am proud to be part of this wonderful project. I would gladly contribute a monthly fee to ensure its continued existence and development.
Today, I seek your help in addressing a challenge that has arisen with my collection. After a decade of extensive use, primarily for medical studies, my collection has grown to about 400 MB, exceeding the 250 MB limit. I can still sync, but I cannot add new decks or set up AnkiHub, for which I just signed up.
Here are the key points of my situation:
• Anki Version: Using the latest version, 23.12.1, on macOS.
• Collection Size: Currently at 408 MB (as shown on Ankiweb), over the 250 MB limit.
• Decks and Cards: Several dozen decks with nearly 100,000 cards, many including large media files (which, to my understanding, don't affect the collection size).
• Attempted Solutions: Followed standard recommendations like deleting unused decks, “Check Database”, using the “localize media” add-on and deleting empty note types, with no significant reduction in collection size.
• Media Files: Not specifically compressed or removed.
• Review history: I suspect the review history is the culprit. Deleting 25,000 cards from an unused deck did not significantly reduce the collection size, but deleting a few cards that I have had in my collection for years did free up a few MB. I have statistics spanning the past ten years, but the review history is not at all important to me. I'd rather keep the scheduling data for each card.
• Backups: Complete backup of my collection is available.
• Synchronization: I usually synchronize with AnkiWeb multiple times daily when opening the iOS app or the desktop one.
I am seeking advice on how to effectively manage my collection size without compromising scheduling data. I am open to advanced methods (though I don’t really speak Python or any database languages) and to insights into whether review history affects collection size and, if so, how to remove it without compromising my scheduling (and without starting my ten-year-old cards anew).
Thank you for your time and assistance. Any suggestions or guidance would be greatly appreciated!
Almost all of your collection is taken up by note data. You likely have one or more fields that include paragraphs or pages of text from a website, which will take up a lot of space when multiplied by many notes.
Thank you for your replies and considerations! I think self-hosting might be an option, but I don’t see myself setting it up, as I’m nowhere near proficient with the command line.
Regarding dae’s suggestion: Is there a smart way to determine which note type contains the fields taking up the most space, and by how much? I have quite a few from all the years of collecting decks, but I think I could browse through the types in an afternoon and report the effects here!
Also, is review history a significant factor, or not worth worrying about?
Thanks again, everyone, for your time and effort!
You can use regex to search for fields with a high number of characters, which should at least give you some clues about where to look. For instance, to search for notes with fields containing 1000+ characters – re:.{1000} – or you can restrict that to a specific field – FIELDNAME:re:.{1000} – but you might need to start even higher than 1000.
Look for fields that look like they’ve been harvested from a website, like a ‘definition’ field. The more text you see, the more space it’s likely taking up.
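If you'd rather measure than browse, here is a minimal Python sketch of one way to total the note text per note type. It is an illustration, not an official Anki tool. It assumes the collection file is an SQLite database with a `notes` table whose `mid` column holds the note type ID and whose `flds` column holds all of a note's fields joined together. The function name `note_bytes_by_notetype` is made up for this example. Only run it on a backup copy of your collection file, with Anki closed.

```python
import sqlite3
import sys
from pathlib import Path


def note_bytes_by_notetype(conn):
    """Return (notetype_id, note_count, total_field_bytes) rows, largest first.

    Assumes a `notes` table with `mid` and `flds` columns.
    """
    query = """
        SELECT mid,
               COUNT(*) AS note_count,
               SUM(LENGTH(CAST(flds AS BLOB))) AS total_bytes
        FROM notes
        GROUP BY mid
        ORDER BY total_bytes DESC
    """
    return conn.execute(query).fetchall()


def main(path):
    # Open the COPY read-only so nothing can be modified by accident.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        for notetype_id, note_count, total_bytes in note_bytes_by_notetype(conn):
            print(f"{notetype_id}: {note_count} notes, {total_bytes / 1_000_000:.1f} MB")
    finally:
        conn.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

To try it, save it under any name (say, note_sizes.py) and pass the path to your backup copy as the only argument. The output lists note type IDs rather than names; as far as I know, searching mid:ID in the browser (with ID replaced by one of the printed numbers) will show the notes that use that note type.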
The pre-2.1.54 community add-on is hard to set up; even a college grad who was first in their class failed at it.
But the 2.1.57 one is quite user-friendly, maybe even for a dummy (I don't know). I guess a primary school student could set it up. (The college grad mentioned above did succeed in setting this one up.)
By using re:.{1000} and focusing on the suspended cards, I managed to delete 15,000 large, unused cards from a French deck. These cards contained embedded scripts and lengthy definitions (and represented the less useful Canadian French portion of the deck). This action effectively reduced my collection size by more than half. Now, syncing is smooth sailing. Thank you, @Danika_Dakika, and all others!
I am not sure if this is to be expected. If I search re:.{1000}, the results come up instantaneously but if I search re:.{2000}, it takes several minutes for the results to come up.
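I can't say why the larger pattern is so much slower, but one possible way to sidestep the regex search altogether is to measure field lengths directly on a backup copy of the collection. The sketch below is only an illustration under a few assumptions: that the collection file is an SQLite database with a `notes` table containing `id` and `flds` columns, and that the fields inside `flds` are separated by the `\x1f` character. The function name `notes_with_long_fields` is hypothetical. The result is close to, but not identical to, the regex search, since `.` in a regex normally doesn't match line breaks.

```python
import sqlite3

FIELD_SEPARATOR = "\x1f"  # assumed separator between a note's fields in `flds`


def notes_with_long_fields(conn, min_chars):
    """Return (note_id, longest_field_length) pairs, longest first.

    A note is included if at least one of its fields has `min_chars`
    characters or more.
    """
    hits = []
    for note_id, flds in conn.execute("SELECT id, flds FROM notes"):
        longest = max(len(field) for field in flds.split(FIELD_SEPARATOR))
        if longest >= min_chars:
            hits.append((note_id, longest))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits
```

To use it, open a backup copy with `sqlite3.connect(...)` while Anki is closed and call, for example, `notes_with_long_fields(conn, 2000)`. The returned note IDs can then be looked up in the browser with an nid: search.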