Import of CSV not reading all File Headers in v. 25.07.01

I have four File Headers in the CSV I want to import, following the instructions in the File Headers section of the Anki Manual.

With this recent release, importing now ignores the trailing separators (commas, in my case) that follow a header. At least it seems to. However, only the separator, html, and deck headers were honoured.

The notetype header was not honoured.

However, if I open the CSV in a text editor and manually delete all the trailing separators, the import works fine.
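The manual clean-up described above could be scripted. The following is a minimal sketch, not part of Anki: `strip_trailing_separators` is a hypothetical helper, and it assumes that the file headers are the leading lines starting with `#` and that no header value legitimately ends with the separator.

```python
def strip_trailing_separators(lines, separator=","):
    """Remove trailing separators from the leading '#' header lines only.

    Assumption: headers sit at the top of the file; data rows are left untouched.
    """
    cleaned = []
    in_headers = True
    for line in lines:
        if in_headers and line.startswith("#"):
            cleaned.append(line.rstrip(separator))
        else:
            # First non-header line ends the header section.
            in_headers = False
            cleaned.append(line)
    return cleaned


rows = [
    "#separator:Comma,,,,",
    "#notetype:Basic (optional reversed card) EO 1000,,,,",
    "front,back,,,",
]
for row in strip_trailing_separators(rows):
    print(row)
```

Data rows keep their trailing separators, since empty trailing columns there are ordinary CSV fields.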

The File Headers of my CSV follow.

#separator:Comma,,,,,,,,
#html:true,,,,,,,,
#notetype:Basic (optional reversed card) EO AK,,,,,,,,
#deck:EO All Official Esperanto Words and Roots,,,,,,,,

The File Headers of another file, which shows the same behaviour (i.e. the Note Type is ignored), follow.

#separator:Comma,,,,
#html:true,,,,
#notetype:Basic (optional reversed card) EO 1000,,,,
#deck:EO 1000 Most Frequently Occuring Words,,,,

The #notetype:... header is free-form, unlike #separator:... and #html:..., which have a restricted set of possible values. In this case, the notetype header’s value is read as Basic (optional reversed card) EO AK,,,,,,,,

Then why is the deck header being honoured? I’d have expected both of them to be processed with the trailing separators ignored.

It shouldn’t be. It’s also free-form, i.e. the import looks for a deck named EO All Official Esperanto Words and Roots,,,,,,,,
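To illustrate why a free-form header keeps its trailing commas, here is a minimal sketch. It is not Anki's actual parser; `split_header` is a hypothetical function that simply splits at the first colon and takes everything after it verbatim, which is the behaviour described above.

```python
def split_header(line):
    """Split '#key:value' at the first colon, keeping the value verbatim."""
    key, _, value = line[1:].partition(":")
    return key, value


key, value = split_header("#notetype:Basic (optional reversed card) EO AK,,,,,,,,")
print(key)    # notetype
print(value)  # Basic (optional reversed card) EO AK,,,,,,,,
```

With this kind of parsing, the commas become part of the note type or deck name, so no existing note type matches.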

Refer to pull request #3862, "Loosen csv metadata parsing" by iamllama (ankitects/anki on GitHub). The manual should probably be updated with a notice for this, if anyone wants to get on that.

So no chance of removing the separators, if:

  1. they have been declared by e.g. #separator:Comma, and
  2. the separators are the last characters on the line?

Using File Headers is the only way I can see to document what Note Type and Deck to use for a given tab of data in a spreadsheet. Some decks include more than one Note Type and this relatively new feature has been a boon for me to document the relationships between Note Types and data. It thus also ensures that data is presented correctly in a deck.

I use it in conjunction with the Save Note Type settings add-on, so that I can always recreate the deck, its associated Note Types, and its data (Notes). This helps when I am asked to share or re-share a deck that I no longer have in my own Anki collection.

Is there another way to link Note Types, Decks, and Notes, apart from an Anki export? Keeping the data in a spreadsheet allows me to update the data quickly, if a word or term needs to be modified, added or deleted. If I were only able to store them in an Anki export, then I’d have to load that export into a dummy profile, make the correction, and then export it again.

It’s going to need some pretty involved changes in the parser code to support this, from what I understand.

And this is only going to make things more confusing, as the way the free-form headers are parsed would now depend on whether there’s a valid #separator:... header or not. To say nothing of whether said header comes before or after the other headers, assuming there’s only one.

Bummer! Oh well, I thought we had a solution. Looks like there will still be manual editing steps before we can import.
Thanks for your time, @llama

Yeah, I suspected this might happen - having it work automatically for only certain headers will inevitably be confusing.

Instead of the alphanumeric stripping we were talking about in the past, maybe we could change it to a 2-part process? The first step would be to look for ‘#separator:…’ with all trailing alphanumerics stripped out, and fall back on detection otherwise. And then the second step would be to limit any other header lines to the first column, based on that separator. We’d presumably also need to support quoted first columns. WDYT?
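The 2-part process proposed above could look roughly like the sketch below. This is an illustration under stated assumptions, not Anki's implementation: the function names and the `SEPARATOR_NAMES` table are hypothetical, the stripping step is read as removing trailing non-alphanumeric characters from the separator value, and Python's `csv` module stands in for the real parser so that a quoted first column is handled.

```python
import csv
import re

# Illustrative subset of named separators (assumption, not an exhaustive list).
SEPARATOR_NAMES = {"comma": ",", "semicolon": ";", "tab": "\t", "pipe": "|"}


def find_separator(header_lines):
    """Step 1: locate '#separator:...' and ignore trailing non-alphanumerics."""
    prefix = "#separator:"
    for line in header_lines:
        if line.lower().startswith(prefix):
            name = re.sub(r"[^0-9A-Za-z]+$", "", line[len(prefix):])
            return SEPARATOR_NAMES.get(name.lower())
    return None  # the caller would fall back on separator detection


def parse_headers(header_lines):
    """Step 2: limit every header line to its first column."""
    separator = find_separator(header_lines)
    headers = {}
    for line in header_lines:
        if separator is not None:
            # csv.reader also unwraps a quoted first column.
            line = next(csv.reader([line], delimiter=separator))[0]
        key, _, value = line[1:].partition(":")
        headers[key] = value
    return headers


headers = parse_headers([
    "#separator:Comma,,,,",
    "#html:true,,,,",
    "#notetype:Basic (optional reversed card) EO 1000,,,,",
    '"#deck:Words, Roots",,,,',
])
print(headers["notetype"])  # Basic (optional reversed card) EO 1000
print(headers["deck"])      # Words, Roots
```

Because the separator is resolved in a first pass over all header lines, the result does not depend on where the `#separator:` line appears among them.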

That sounds like a perfect solution!
If you decide to support quoted columns, then we can add that to the manual. I got a surprise the first time I tried quoting: spaces in Note Type and Deck names were OK.
Thanks for the thinking, @dae !

I’m assuming this would mean running the csv parser over the metadata alone? That’d make quoting mandatory then, right? I’m not sure how often existing/old csv files get shared and used, and this might break them, unless we keep the existing behaviour as a fallback.

The only time a field needs to be quoted is if it contains a line break, double quotes, or the delimiter itself.
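That quoting rule matches the default "minimal" quoting in common CSV writers. As a small illustration, the sketch below uses Python's `csv` module, not a spreadsheet program or Anki; `to_csv_row` is a hypothetical helper.

```python
import csv
import io


def to_csv_row(fields, delimiter=","):
    """Serialise one row with minimal quoting and return it without the line ending."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(fields)
    return buffer.getvalue().rstrip("\r\n")


print(to_csv_row(["#deck:EO 1000 Words", ""]))  # spaces alone: no quotes
print(to_csv_row(["#deck:Words, Roots", ""]))   # delimiter inside: quoted
print(to_csv_row(['say "hi"', ""]))             # double quotes: quoted and doubled
```

A deck name containing only spaces stays unquoted, while one containing a comma is wrapped in double quotes when the separator is a comma.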

If they are old CSV files, they wouldn’t have the File Headers, would they? I thought they were recent functionality. If that is the case, then just the normal scan of contents should work, should it not?

Yes, you are right. One of my deck names had a comma in it. I have since removed it, shared the newly named deck, and retired the old one with a comma in its name.

I don’t import csv so I wouldn’t know. Every change breaks something for someone. I’ll wait for dae’s reply before putting time into it.


It probably depends on what you call old. I have a .csv that is approx. 1 year old and has been in use since at least Anki version 24.11. As far as I know, file headers had already been supported for a long time when I first created my .csv.

Yep, that’s a good point, and perhaps that’s not the best idea. To be clear, I think we should fix this somehow, but have no strong thoughts about the best way to do it yet.

Ok, back to the drawing board. What if we make the separator field special/mandatory for this case? E.g. Anki could see #separator:Comma, and note down the trailing characters, then remove them from any other file headers. Provided users ensured the file headers were only placed in the first column of the spreadsheet and didn’t pick a separator that requires quoting of file headers, that might work?
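The idea of noting the trailing characters on the separator line and removing them from the other headers could be sketched as follows. This is only an illustration of the proposal: `strip_recorded_suffix` is a hypothetical name, and it assumes a named separator such as Comma or Tab rather than a literal character.

```python
import re


def strip_recorded_suffix(header_lines):
    """Record what trails the separator name, then remove that suffix elsewhere."""
    suffix = ""
    for line in header_lines:
        match = re.match(r"#separator:([A-Za-z]+)(.*)$", line)
        if match:
            suffix = match.group(2)  # e.g. ",,,," after "Comma"
            break
    if not suffix:
        return list(header_lines)
    return [
        line[: -len(suffix)] if line.endswith(suffix) else line
        for line in header_lines
    ]


cleaned = strip_recorded_suffix([
    "#separator:Comma,,,,",
    "#html:true,,,,",
    "#notetype:Basic (optional reversed card) EO 1000,,,,",
    "#deck:EO 1000 Most Frequently Occuring Words,,,,",
])
print(cleaned[2])  # #notetype:Basic (optional reversed card) EO 1000
```

Only the exact recorded suffix is removed, so a header whose value ends differently is left as-is.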

Other ideas welcome :)


Sounds good to me! That’s the way I’d have expected it to work.

Is there a reason why quoting of the file headers is a bad thing? (Dumb question probably, but I don’t understand.)

I’ve logged this as issue #4205, "Better handle file headers when exporting from a spreadsheet program" (ankitects/anki on GitHub).

The file header code does its own parsing and happens before we treat the file as a csv file, so it doesn’t understand quoted text.


Thanks, @dae !
