Created .colpkg; invalid file format for Import

This is a cross post from the Anki reddit post here.

Somehow I cannot import the deck collection I exported before I wiped my old drive.

I tried version 2.1.15 from Ubuntu’s repository as well as the newest version, 2.1.44, downloaded directly from https://apps.ankiweb.net/.

I open Anki, go to Import, select the .colpkg and receive this error:

This file does not appear to be a valid .apkg file. If you’re getting this error from a file downloaded from AnkiWeb, chances are that your download failed. Please try again, and if the problem persists, please try again with a different browser.

Okay I’m going to go out on a limb and start hacking away at this, because 20 hours of work shall not go out without a bang.

My thought: What if I can convert the colpkg to a ZIP, then rename to apkg? By the way, I’m using Ubuntu 20.04.

This first command tries to convert a colpkg to a ZIP archive.

sudo zip -FF ANKI-collection.colpkg --out anki.zip

Here is the output of that command:

Fix archive (-FF) - salvage what can
    zip warning: Missing end (EOCDR) signature - either this archive
        is not readable or the end is damaged
Is this a single-disk archive?  (y/n): y
    Assuming single-disk archive
Scanning for entries...
copying: collection.anki2  (68665 bytes)
copying: 0  (24466 bytes)
copying: 1  (21607 bytes)
copying: 2  (52674 bytes)
...
copying: 432  (99158 bytes)
zip warning: unexpected signature 50 4b 01 00 on disk 0 at 59504457

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 05 00 on disk 0 at 65937896

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 08 05 on disk 0 at 66926129

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 0f 0a on disk 0 at 70069916

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 0c 03 on disk 0 at 71301166

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 04 00 on disk 0 at 74944458

zip warning: skipping this signature...
zip warning: unexpected signature 50 4b 08 06 on disk 0 at 79761354

zip warning: skipping this signature...

I also opened up the permissions on the file (please do not run chmod 777 on any system files or folders! It’s a nightmare you don’t want to face undoing):

sudo chmod 777 anki.zip

and then renamed it:

mv anki.zip anki.apkg

Now I attempt to open this fresh ZIP archive renamed to anki.apkg in Anki and receive this error:

Import failed. Debugging info:
Traceback (most recent call last):
File "aqt/importing.py", line 395, in on_done
File "concurrent/futures/_base.py", line 432, in result
File "concurrent/futures/_base.py", line 388, in __get_result
File "concurrent/futures/thread.py", line 57, in run
File "anki/importing/apkg.py", line 37, in run
File "zipfile.py", line 1473, in read
File "zipfile.py", line 1512, in open
File "zipfile.py", line 1439, in getinfo
KeyError: "There is no item named 'media' in the archive"

I feel like I’m getting a lot further than the original error and with some tweaking, perhaps I can produce some results.
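
For anyone following along, here is a minimal sketch of how to check a repaired archive before handing it to Anki. It assumes, based only on the KeyError above, that the importer wants entries named collection.anki2 and media; the function name missing_entries is made up for this example.

```python
import zipfile

# Entry names taken from the zip -FF output and the KeyError above.
# This is an assumption about what the importer needs, not a full
# description of the package format.
REQUIRED = ("collection.anki2", "media")


def missing_entries(archive_path, required=REQUIRED):
    """Return the required entry names that are absent from the archive."""
    with zipfile.ZipFile(archive_path) as zf:
        names = set(zf.namelist())
    return [name for name in required if name not in names]
```

Calling missing_entries("anki.zip") on the salvaged file should report "media" if that entry did not survive, which would match the traceback.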

Crisis averted (sorta): I have found a solution, and although it produces errors, upon reopening the program, it imports everything (haven’t gone in-depth; could have some missing attributes somewhere).

This process excludes media in colpkg. If we’re lucky, someone will chime in on a fix for this part.

Step 1): Before you do this, back up your colpkg. You’ve been warned!

Run the commands in Steps 3 to 5 on the duplicate colpkg (go to its directory and right-click, then “Open in Terminal” if using Ubuntu 20.04).

I’m sure someone here will give some pointers on how to do this for Windows 10.

Step 2): Open Anki and create a New Profile:

File > Switch Profile… > Add > Enter name > Press OK > Open

This ensures any mishaps or erasures of other data do not interfere with your current profile. Do not skip this step: You’ve been warned.

Also, instead of “anki”, feel free to rename your colpkg to whatever works for you.

Step 3): Force conversion with ZIP:
sudo zip -FF ANKI-collection.colpkg --out anki.zip

Step 4): Set permissions to anybody (do not perform this chmod on system files or directories; it is a nightmare and a security risk):
sudo chmod 777 anki.zip

Step 5): Rename using the mv command:
mv anki.zip anki.colpkg

Step 6): Open Anki to the New Profile you’ve created, then:

File > Import… > Select your converted anki.colpkg file > Open

It will give you a warning:

This will delete your existing collection and replace it with the data in the file you’re importing. Are you sure?

Because you’re on a New Profile, you’re not worried about the data loss, right? If you skipped the step where I said, “Create a New Profile”, then click “No” and go back to Step 2).

Step 7): Click “OK”.

This step will produce the error:

The provided file is not a valid .apkg file.

Just accept the error. It’s okay, I promise.

Step 8): Now close & reopen Anki.

Step 9): Now that you have your collection loaded into Anki, create new exports, to save the repaired data.

Create a .colpkg and an .apkg, selecting “All Decks” for each, in the profile into which you imported the repaired colpkg file.

File > Export… > colpkg
File > Export… > apkg

Step 10): Go study and be good to each other.

Importing a collection package (.colpkg) replaces your whole collection, and I think you need to open the file normally (double-clicking) to be able to import it. File > Import doesn’t work with .colpkg files.

I see what you’re saying.

However, the colpkg always gave me the error described above:

This file does not appear to be a valid .apkg file. If you’re getting this error from a file downloaded from AnkiWeb, chances are that your download failed. Please try again, and if the problem persists, please try again with a different browser.

whether I double-clicked it or attempted to import it.
I will run a test: save the recovered collection as a colpkg and double-click it to open it in a New Profile. Perhaps that will properly import the media.

Nope - this does not properly import the media.

Like I mentioned, the old colpkg does not import regardless of how I tried opening it in Anki, whether with a double-click or Import.

I will try a few other ways once I am back from work; perhaps there is a way to recover the media (images, videos) in my cloze cards (which make up 99% of the ~462 cards in one deck).

Thank you for your response, @abdo

Actually, you can import a colpkg file using File > Import. I was too lazy to test before writing my previous reply and thought that the fact that the error message specifically mentions .apkg was significant.

Perhaps there is a way to merge the ZIP file’s media, as this ZIP contains all the images that are missing from converting the ZIP to a COLPKG.

I have not yet figured out a way to re-merge the media files so my solution only brings back the textual data in corrupted colpkg files.

Next time, I will test my backups before erasing a drive.

Your file appears to have been truncated, so the list of media filenames is missing. The numbered files are media files, and the ones you can read can be “recovered”, but you’ll need to label them again with appropriate filenames.
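
As a rough aid for the relabelling, the sketch below guesses an extension for each numbered file from its first bytes. It only knows a handful of well-known signatures (JPEG, PNG, GIF) plus a crude check for SVG text; guess_extension is a hypothetical helper and it will return None for anything else.

```python
# Well-known leading byte signatures for common image formats.
SIGNATURES = (
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
)


def guess_extension(path):
    """Guess a file extension from the first bytes of the file, or None."""
    with open(path, "rb") as handle:
        head = handle.read(512)
    for signature, extension in SIGNATURES:
        if head.startswith(signature):
            return extension
    # SVG is text, so look for its root tag near the start (a heuristic).
    if b"<svg" in head.lower():
        return ".svg"
    return None
```

This only tells you the kind of file, not its original name, so the filenames referenced by the cards still have to be matched up by hand or from a surviving filename list.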

OK on what you’re saying regarding the truncated media filenames.

Question: How do I rename the numbered files so that the apkg will recognize them as its media?

Inside the converted colpkg file, I see “collection.anki2”; by renaming it with a “.json” extension and opening it in a text editor, I see it contains all the textual information on cards from the decks - great!

However, many of the characters are unreadable; is there a program or file extension that I can rename this anki2 file to make the contents human-readable?
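
On the readability question: as far as I know, collection.anki2 is an SQLite database and not a text format, which would explain the unreadable characters. Here is a small sketch for dumping the note text from a copy of the file. It assumes a notes table with id and flds columns and fields joined by the 0x1f control character, which matches the Anki 2.1 schema as I understand it; read_notes is a name invented for this example.

```python
import sqlite3

# Assumption: Anki joins the fields of a note with this control character.
FIELD_SEPARATOR = "\x1f"


def read_notes(collection_path):
    """Return [(note_id, [field, ...]), ...] from a copy of collection.anki2."""
    conn = sqlite3.connect(collection_path)
    try:
        rows = conn.execute("SELECT id, flds FROM notes").fetchall()
    finally:
        conn.close()
    return [(note_id, flds.split(FIELD_SEPARATOR)) for note_id, flds in rows]
```

Run it against a duplicate of the extracted file, never the only copy, for example read_notes("collection-copy.anki2"). A GUI tool such as DB Browser for SQLite should also be able to open the file.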

I feel like there is some scripting coming up in my near future, to recover the card data while retaining the media.

The .anki2 file can be manually placed in a profile. But all the media filenames will have been lost - there is no way to automatically associate them, so you will need to manually rename the files one by one.

I will import the anki2 file - er no, I won’t.

.anki2 files are not directly importable - please import the .apkg or .zip file you have received instead.

I’m okay with this; after converting the colpkg to ZIP and then renaming it with the “.colpkg” extension, it imports the collection and card data, minus the media.

Note: Importing a colpkg after it was force-converted to a ZIP archive produces errors; just click “OK”, close & reopen Anki in the New Profile you created in Step 2 above.

At the moment, the media files in the ZIP archive are labeled 1, 2, 3, 4…462.
How should I go about finding the correct filenames to rename them?

There is an option in the editing screen to show the HTML of a field, which will reveal the image filename. Or you can export to a text file.
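
If checking fields one at a time gets tedious, the image filenames can also be pulled out of field HTML in bulk. This is a sketch using Python's standard html.parser; image_sources is a hypothetical helper and it simply collects the src attribute of every img tag it sees.

```python
from html.parser import HTMLParser


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def image_sources(field_html):
    """Return the image filenames referenced in one field's HTML."""
    collector = ImageSourceCollector()
    collector.feed(field_html)
    collector.close()
    return collector.sources
```

For example, image_sources('<img src="paste-abc.jpg">') returns ['paste-abc.jpg'].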

Okay, I am seeing a recurring pattern in how Anki converts fields to HTML in the text export:

<a href="" https://www.pinterest.com/pin/246994360796228444/""> src="" paste-8996d51358ffff248ec810078d80852b5d388d10.jpg"">

This is what the export looks like. Maybe I am nitpicking here, but I feel like there are two too many quotation marks and extra spacing surrounding all attribute data. This makes for messy getting of tag & attribute data.
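
The doubled quotation marks look like CSV-style escaping, where a field containing a quote is wrapped in quotes and each inner quote is doubled. Assuming the export is tab-separated and quoted that way (I have not confirmed this for every Anki version, and it does not explain the extra spaces), Python's csv module should undo the doubling. parse_export is a name made up for this sketch.

```python
import csv
import io


def parse_export(text):
    """Parse tab-separated export text, undoing CSV-style doubled quotes."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    # Skip completely empty lines; each remaining row is a list of fields.
    return [row for row in reader if row]
```

Each returned row is a list of fields with the quotes restored, so the HTML can then be searched for src attributes as usual.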

Anki sometimes doesn’t include any filename whatsoever, such as here:

<a href="" https://fitnessandnutritious.wordpress.com/tag/carbohydrates/"">
<img src="" .jpg (3)"">

As you can see, some images don’t even have filenames; they’ve only got what looks to be soft references (if I were to call it anything) to some mysterious list, such as “.jpg (3)” above.

I wonder if this has any implication on opening it in a web browser, but I don’t think it matters at this point anyway. Let me explain in another reply.

After prancing about with the plain text export and gaining some (perhaps useless) insight into how Anki converts files to plain text & HTML, I recalled seeing a list of images (1, 2, 3…) inside the zip -FF archive I created out of my colpkg.

In that archive, there’s also a file named “media”. This might be what I need, as I may be able to parse it and automate file renaming. Here’s a snippet:

{"0": "paste-0d2f16d92608eb5cdb82979703f28f837134252b.jpg", "1": "paste-d1174f45a1583a7680d3edc235139e54dca29540.jpg", "2": "9e0f70b9869d4ad7affdfcb8d03c8a1f-ao-O.svg", "3": ".jpg (43)",

But there it is again! An unnamed file, “3”: “.jpg (43)”. That means the file numbered 3 in the archive has the filename “.jpg (43)”.
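
If the media file is intact, the renaming could be automated along these lines. This is a sketch, not a tested recovery tool: it assumes the archive has been extracted to a folder, that the media file is a JSON object mapping each numbered entry to its original filename (as the snippet suggests), and it copies the files so the originals stay untouched. restore_media_names and its parameters are hypothetical names.

```python
import json
import shutil
from pathlib import Path


def restore_media_names(extracted_dir, output_dir, map_name="media"):
    """Copy numbered media files to output_dir under their mapped names.

    Returns (restored, skipped): the filenames written, and the numbers
    whose source file was missing or whose target already existed.
    """
    extracted = Path(extracted_dir)
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    mapping = json.loads((extracted / map_name).read_text(encoding="utf-8"))

    restored, skipped = [], []
    for number, filename in mapping.items():
        source = extracted / number
        # Keep only the last path component so nothing lands outside output_dir.
        target = output / Path(filename).name
        if not source.is_file() or target.exists():
            skipped.append(number)
            continue
        shutil.copy2(source, target)
        restored.append(filename)
    return restored, skipped
```

The renamed copies would then go into the profile's collection.media folder. Odd names such as “.jpg (43)” are copied as they appear in the map, since that is presumably what the cards reference.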

By the way, I had no plan on where I was going with this; I’m merely sharing some stuff I’ve come across as strange. I will be back with some more information when I get it.

I apologize, but I am not seeing the option in the editing screen to show the HTML of a field.

Would you kindly share a shortcut key or screenshots of what you’re referring to?

Thanks @dae

Depending on your version it should look similar to this:

The error you pasted at the start said that the ‘media’ file could not be found when importing. If you’ve repaired the zip file and managed to recover the media file undamaged, then you should just be able to rename it back to .apkg and import it as normal. If for some reason that does not work, yes, you can use that file to determine the original filenames.

I’m gonna thank @cqg in this post, since a separate post just for that seems excessive. So, thank you @cqg for the screenshot.

@dae: Unfortunately, converting the colpkg to a ZIP and then renaming with an apkg extension and importing it into Anki does not produce the correct media images.

Note: I believe merely renaming archives (instead of using ZIP to convert them) will cause data loss. For instance: I manually renamed anki.apkg to anki.zip and opened it in Archive Manager to find tens of media (images, in my case) missing. So, if you convert anything, do it with command line tools.

Continues it will, solutions shall be found - eventually.
I will progress in the direction of writing a program or script to convert filenames. It will take me quite some time, as I’ve never written a program before.

This is the best solution for the problem I experienced.

This step-by-step guide will not recover media files that were lost either by exporting the original colpkg or by force-converting the colpkg to a ZIP archive with sudo zip -FF...

Regardless, these ten steps should save you tens of hours of retyping and re-clozing your cards; all you have to do then is replace the missing media.

Important note 1: Please check all media before publishing your decks; after recovering a colpkg, Anki’s file references may be corrupt - linking the wrong media files to cards.

Important note 2: Please manually export your collections (colpkg) consistently; this gives you a chance to save the backup off-site (such as a USB drive) in case of data loss.

I know it’s a lot of work, but the struggle to publish helpful, good, and accurate information is a worthy battle. Go study and be good to each other.