Size efficient way to handle redundant text across cards

Hi. I’ve been generating a pretty big deck through genanki. The deck itself isn’t inherently super massive, but I really like having a few paragraphs of context about the section I’m learning on the answer side of the cards, which balloons the size.

As is, this is manually entered as a field for each note, but for large groups of notes the value of this field is the same. Is there some better way to do things, so I’m not slamming the Ankiweb servers with a lot of redundant info?

Uncompressed, the deck is about 3.3GB. By napkin math it really should be closer to 200MB, with most of that being media.

It’s not an answer for the topic, but maybe useful to identify big notes: Note Size addon, which shows note sizes in the browser (with sorting by size) and overall collection size.

I don’t think you’re getting to 3.3GB on the back of a few paragraphs of text per note. [Really though – are you including media in that? Have you run a Check Media lately to weed out unused files that can take up a ton of space? Maybe you should run a Check Database too?]

Text – if it’s just the text – is pretty cheap. If it’s more than just the text – like if you have big swaths of redundant/unnecessary HTML/CSS that were copied in – that can start to eat up space.

When you’re searching for solutions to this redundancy problem, look for questions and answers related to keeping this extra info synchronized across many notes, without having to change it in multiple places with each edit – like this one I am pretty excited about experimenting with myself. That will help you reduce redundant text as well – even if updating it isn’t your first concern.

The size difference between media and text is really enormous. A gigabyte of plain text is roughly a billion characters, which is hundreds of long novels' worth.
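
To put rough numbers on that, here is a minimal sketch of the napkin math. The paragraph length and note count are illustrative assumptions, not figures from the deck in question, and `utf8Bytes` is a hypothetical helper name.

```javascript
// Rough size estimate for context text repeated on every note.
// All numbers below are illustrative assumptions.
function utf8Bytes(text) {
  // TextEncoder always encodes to UTF-8, so this is the stored byte size.
  return new TextEncoder().encode(text).length;
}

const contextText = "x".repeat(2000); // assume ~2,000 characters of context per note
const noteCount = 10000;              // assume 10,000 notes

const perNote = utf8Bytes(contextText);
const total = perNote * noteCount;

console.log(`${perNote} bytes per note, ${(total / 1e6).toFixed(0)} MB in total`);
```

Even with the same few paragraphs copied onto every one of ten thousand notes, the text stays in the tens of megabytes under these assumptions, which is why gigabytes point to media or a bug.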

Thanks for the replies! Turns out you were right, I had some bug in my code meaning I was accumulating a lot more of that text towards the end of the deck than I reasonably needed to (deck size ended up being quadratic in number of cards :grimacing:).
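
The actual generator code was not posted, so the following is only a guess at the general shape of such a bug: if each note's context field keeps everything carried over from earlier notes instead of just its own section's text, total size grows with the square of the note count. `totalContextChars` and its parameters are hypothetical names used for illustration.

```javascript
// Hypothetical reconstruction of an accumulation bug (not the poster's code).
// accumulate = false: every note stores one section's worth of context.
// accumulate = true:  every note stores all context seen so far.
function totalContextChars(noteCount, paragraphLength, accumulate) {
  let carried = 0; // characters stored on the current note
  let total = 0;   // characters stored across the whole deck
  for (let i = 0; i < noteCount; i++) {
    carried = accumulate ? carried + paragraphLength : paragraphLength;
    total += carried;
  }
  return total;
}

const linear = totalContextChars(1000, 500, false);   // 1000 * 500
const quadratic = totalContextChars(1000, 500, true); // 500 * (1 + 2 + ... + 1000)

console.log(linear, quadratic, quadratic / linear);
```

With these made-up numbers the accumulating version stores about 500 times more text than the intended one, and the gap widens as the deck grows.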

I am still interested in the answer here from a speculative perspective, but it’s not so much a pressing concern.

I suppose you’d include all the possible context paragraphs in your card template. Then you’d add one field per topic to the note type, and use those fields to mark which topic each note pertains to. The value in each topic field would be either empty or “1”. The card template would then use conditional fields like this:

{{#Topic A}}
  Context paragraph for topic A
{{/Topic A}}

{{#Topic B}}
  Context paragraph for topic B
{{/Topic B}}

If there are dozens of different topics, I’d use a single topic field, put the topic paragraphs into a .js file included in the collection.media folder and then use JavaScript to load the correct paragraph:

<p id="context-paragraph"></p>

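<script>
  // Load the shared paragraphs from the media folder and show the one matching this note's Topic field.
  import("/_context_paragraphs.js").then(moduleObj => {
    const topicKey = "{{Topic}}";
    const contextParagraph = moduleObj.dict[topicKey];
    document.getElementById("context-paragraph").innerText = contextParagraph;
  });
</script>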

Contents of _context_paragraphs.js:

export const dict = {
  "Topic A": "Context paragraph for topic A",
  "Topic B": "Context paragraph for topic B",
}
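
One gap in the lookup above is that a misspelled or missing topic leaves the paragraph showing "undefined". As a minimal sketch, a small helper could trim the key and fall back to a default. `lookupContext` and the fallback text are hypothetical additions, not part of the suggestion above.

```javascript
// Hypothetical helper: look up a context paragraph, with a fallback for unknown topics.
function lookupContext(dict, rawKey, fallback = "") {
  const key = String(rawKey).trim(); // tolerate stray whitespace in the field value
  // hasOwnProperty avoids matching inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(dict, key) ? dict[key] : fallback;
}

// Same shape as the dict exported from _context_paragraphs.js.
const dict = {
  "Topic A": "Context paragraph for topic A",
  "Topic B": "Context paragraph for topic B",
};

console.log(lookupContext(dict, " Topic A "));
console.log(lookupContext(dict, "Topic Z", "(no context available)"));
```

In the template, the result of `lookupContext(moduleObj.dict, "{{Topic}}")` would be assigned to `innerText` in place of the direct index.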