Hi, I have a platform that generates tons of Anki .apkg packages. The problem is updating a package: no matter what I do, the imported cards are duplicated. What I do:
I make sure the GUID is stable for each card
I tried adding a custom, stable first field based on the GUID
I am using some heavy JavaScript and HTML, a custom template, and a Python Anki generator.
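For reference, a stable GUID can be derived deterministically from whatever key identifies a card in the source data. This is only a minimal sketch, not the generator's actual code: `stable_guid`, the namespace string, and the sample keys are hypothetical, and it assumes Anki accepts any unique string as a note GUID.

```python
import hashlib


def stable_guid(source_key: str, namespace: str = "KanTanJi") -> str:
    """Return the same GUID for the same source key on every run (hypothetical helper)."""
    digest = hashlib.sha256(f"{namespace}:{source_key}".encode("utf-8")).hexdigest()
    # 20 hex characters (80 bits) keep the GUID short while making collisions unlikely.
    return digest[:20]


# The sample keys are made up for illustration.
guid_a = stable_guid("kanji-0001")
guid_b = stable_guid("kanji-0001")
guid_c = stable_guid("kanji-0002")
```

Because the value depends only on the key, regenerating the package should produce identical GUIDs as long as the source keys themselves do not change.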
I am unable to get stable updates working with my decks. I tried:
re-importing a small dataset, with a constant added - worked OK
re-importing a small dataset, with a dynamic value changed in the note field - a single note (the first one) was duplicated, rest OK (see the image)
re-importing a bigger dataset, with a constant added - a single note (the first one) was duplicated (?), rest updated OK
re-importing a bigger dataset, with a dynamic value changed - all notes are duplicated (I can see Card and Card 1 in the card browser, each row twice, both updated to the new value)
[without the custom stable first field] re-importing a big collection, with data items modified and/or model content updated - complete mess
Thank you for your reply. From the attached post, I can see that
“just write the script in Python instead and use Anki’s API.”
is one piece of advice, and that’s what I am doing. I cannot use text files, because they do not allow me to make complex HTML templates for my cards. I need the full flexibility of apkg. I am using the Python API, which, IMHO, was developed so that people can produce apkg files this way, written by external tools. Why make an API otherwise?
I would just prefer it if Anki had a solid way of merging notes, so that one can use an ID field to really control this most basic feature that all such systems need. Even if I used a CSV file, I might need to change the rows in a way that gives Anki no chance to reconstruct the link, so I need to rely on a solid ID entry anyway. I would therefore expect .apkg to offer similar logic… the basics of any database system.
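The merge behaviour described above can be sketched as an upsert keyed by a stable ID. This only illustrates the expected semantics and is not how Anki's importer is implemented; `NoteRecord`, `merge_notes`, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NoteRecord:
    guid: str     # stable identifier used as the merge key
    fields: list  # field contents, e.g. [UID, Q, A]
    mod: int      # modification timestamp in seconds


def merge_notes(existing: dict, incoming: list) -> dict:
    """Upsert incoming notes into a copy of `existing`, keyed by guid."""
    merged = dict(existing)
    for note in incoming:
        current = merged.get(note.guid)
        # Insert unknown notes; overwrite known ones only when the incoming copy is newer.
        if current is None or note.mod > current.mod:
            merged[note.guid] = note
    return merged


library = {"a1": NoteRecord("a1", ["a1", "Q1", "A1"], mod=100)}
update = [
    NoteRecord("a1", ["a1", "Q1", "A1 (revised)"], mod=200),
    NoteRecord("b2", ["b2", "Q2", "A2"], mod=200),
]
result = merge_notes(library, update)
```

With this rule, a note whose GUID is already known is updated in place rather than duplicated, and only genuinely new GUIDs create new notes.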
You’re using a third-party API, which might not be up-to-date with the latest changes. I recommend giving the official API a try - it’s the same code used by Anki to export/import files, so you should get better results.
So I tried, and failed. Basically the only thing that changed was the creation strategy.
import os
import shutil
import tempfile
import time
from pathlib import Path

from anki.collection import Collection
from anki.exporting import AnkiPackageExporter

# FIELD_IDS, TEMPLATE_ID, MODEL_ID, css, HANZIWRITER_LIB_INLINE,
# HANZIWRITER_INIT_JS and HANZIWRITER_LIB_PATH are defined elsewhere in the generator.


def create_anki_deck(key, reader, filename):
    # Build the deck in a throwaway collection, then export it as an .apkg.
    temp_dir = tempfile.mkdtemp()
    col_path = os.path.join(temp_dir, "collection.anki2")
    col = Collection(col_path)
    try:
        deck_name = f"KanTanJi::{key}"
        deck_id = col.decks.id(deck_name)
        model = col.models.new("KanTanJi Anki Model")
        field_names = ["UID", "Q", "A"]
        for i, name in enumerate(field_names):
            fld = col.models.new_field(name)
            fld['id'] = FIELD_IDS[i]  # Force the field ID
            fld['ord'] = i
            col.models.add_field(model, fld)
        template = col.models.new_template("KanTanJi")
        template['id'] = TEMPLATE_ID  # Force the template ID
        template['ord'] = 0
        template['qfmt'] = (
            "<div class='c'>{{Q}}</div>"
            "<script>['click','touchstart'].forEach(event=>document.addEventListener(event,()=>document.querySelectorAll('ruby rt, .rlbl').forEach(x=>x.style.visibility='visible')));</script>"
            + HANZIWRITER_LIB_INLINE + HANZIWRITER_INIT_JS
        )
        template['afmt'] = (
            "<div id='hw-back-marker' style='display:none'></div><div class='c qa'>{{Q}}</div><br><br><div class='c'>{{A}}</div>"
            "<script>['click','touchstart'].forEach(event=>document.addEventListener(event, ()=>document.querySelectorAll('ruby rt, .rlbl').forEach(x=>x.style.visibility='visible')));</script>"
            + HANZIWRITER_LIB_INLINE + HANZIWRITER_INIT_JS
        )
        col.models.add_template(model, template)
        model['css'] = css
        # Add the model first and force its ID afterwards; otherwise the library crashes.
        col.models.add(model)
        model['id'] = MODEL_ID
        col.models.update(model)
        model = col.models.get(MODEL_ID)
        col.models.set_current(model)
        for row in reader:
            # Row layout: 0 = Q, 1 = A, 2 = UID, 4 = card type, 5 = significance.
            note = col.new_note(model)
            note.guid = row[2]
            note.mod = int(time.time())
            note.fields[0] = row[2]  # UID
            note.fields[1] = row[0]  # Q
            note.fields[2] = row[1]  # A
            card_type, significance = row[4], row[5]
            if card_type == "kanji":
                note.add_tag("KanTanJi_Kanji")
            else:
                note.add_tag("KanTanJi_Tango")
            # tag_map has int keys; a csv.reader yields strings, so convert first if needed.
            tag_map = {0: "KanTanJi_Learn_Now", 1: "KanTanJi_Learn_Deck"}
            note.add_tag(tag_map.get(significance, "KanTanJi_Learn_Future"))
            col.add_note(note, deck_id)
        # Handle media
        if HANZIWRITER_LIB_PATH.exists():
            # Use the absolute path to ensure Anki finds it during the temporary session
            col.media.add_file(str(HANZIWRITER_LIB_PATH.absolute()))
        exporter = AnkiPackageExporter(col)
        exporter.did = deck_id
        exporter.includeSched = False
        exporter.exportInto(str(Path(filename).absolute()))
    finally:
        col.close()
        # Small delay to ensure SQLite releases the file handle before deletion
        time.sleep(0.1)
        shutil.rmtree(temp_dir)
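One way to debug re-import problems is to compare the note GUIDs and notetype IDs of two exports before importing them. The sketch below rests on assumptions: `note_keys` and `compare_exports` are hypothetical helpers, the `collection.anki2` file is assumed to have been extracted from each .apkg (which is a zip archive), and the `notes` table is assumed to expose `guid` and `mid` columns. Newer package formats may store the collection under a different name or in compressed form.

```python
import sqlite3


def note_keys(collection_path: str) -> dict:
    """Map each note GUID to its notetype ID (the `mid` column)."""
    con = sqlite3.connect(collection_path)
    try:
        return dict(con.execute("SELECT guid, mid FROM notes"))
    finally:
        con.close()


def compare_exports(old: dict, new: dict) -> dict:
    """Report GUIDs that appear, disappear, or change notetype between two exports."""
    shared = old.keys() & new.keys()
    return {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "notetype_changed": sorted(g for g in shared if old[g] != new[g]),
    }


# In practice both dicts would come from note_keys(); these values are made up.
report = compare_exports({"a1": 1, "b2": 1}, {"a1": 1, "b2": 2, "c3": 2})
```

If `notetype_changed` is non-empty for data that should be identical, the notetype ID is not stable between runs, which would be consistent with the importer creating a new model instead of updating the existing one.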
I tried two note sets, A and B, where B contains the whole of A; both belong to the same pack but to different sub-packs. Importing a fresh package is always OK. Importing B afterwards is always wrong. I even stopped changing the datasets: re-importing exactly the same data breaks too. So I tried hardcoding the IDs of the entities to enforce the same ID and an overwrite. No luck.
without hardcoded IDs - Anki tries to create new ‘+’ models and stacks them (‘++’ etc.), but it does not move the entries into these new ‘+’ fields, so the cards are empty. Or, sometimes, it updates the cards and breaks the model. I could not reach stable behavior, but I suspect the content only broke earlier, when I was still changing the card content between re-imports; now it just makes the cards look empty: it bumps the model name to ‘+’ and leaves the ‘+’ fields empty.
with hardcoded IDs - it seems this should not be done. Setting the model ID before add() makes the library crash (`thread ‘’ panicked at rslib\src\storage\notetype\mod.rs:248:9`). The behavior was worse: instead of ‘+’ it appended a hash to the model name, but it still updated the model and generated empty cards. At least it felt a little more consistent, updating everything.
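For the hardcoded-ID variant, the IDs have to be identical on every generator run. A minimal sketch of one way to get repeatable values, assuming any positive integer in the chosen range is acceptable: `stable_id` and the name strings are hypothetical, and the range [2^30, 2^31) is an arbitrary choice, not something Anki requires.

```python
import hashlib


def stable_id(name: str) -> int:
    """Derive a repeatable integer ID in [2**30, 2**31) from a name (hypothetical helper)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return (1 << 30) + int.from_bytes(digest[:8], "big") % (1 << 30)


# The name strings are illustrative; any fixed, unique strings would do.
MODEL_ID = stable_id("KanTanJi Anki Model")
TEMPLATE_ID = stable_id("KanTanJi Anki Model/template/KanTanJi")
FIELD_IDS = [stable_id(f"KanTanJi Anki Model/field/{name}") for name in ["UID", "Q", "A"]]
```

Plain integer constants written into the source would work just as well; what matters is only that the values never change between exports.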
I was unable to get the Anki import to work. It should update notes that are the same, but it does not. It always tries to bump the model level, always generates random IDs, and never updates the notes properly.
Then, while writing this post, I retried everything once more with the hardcoded-ID version, and suddenly it started working.
I will try a little bit more; I just want to keep track of the progress here.
It seems to work now that I set the IDs directly. The library still does not seem to support this well (setting the ID directly on collection creation crashes the library), but so far the updates have looked as desired. I just hope some update does not break this. I still have not managed to implement a stable way of transferring the scheduling info from the old package to the new one, but at least updates from the new package no longer seem to break anything.