Anki creates unnecessary duplicates

Anki version on PC:
Version 2.1.44 (b2b3275f)
Python 3.8.6 Qt 5.14.2 PyQt 5.14.2

I am learning Serbian, which uses special characters such as ć. For example:
odeća
IPA(key): /ôdet͡ɕa/
ȍdeća f
(short vowel with falling tone)

Sing
Nom. odeća
Gen. odeće
Dat. odeći
Akk. odeću
Lok. odeći
Ins. odećom
Vok. odećo!

Plur
Nom. odeće
Gen. odeća
Dat. odećama
Akk. odeće
Lok. odećama
Ins. odećama
Vok. odeće!

When I import my Excel input file, Anki doesn’t detect that the entry already exists and creates a duplicate. Each time I import the file, I have to manually delete the entries that contain a ć.

The whole entry looks like this:
#word picture gender,personal connection, extra info (back side) pronunciation (recording and/or IPA) test spelling tags

odeća
IPA(key): /ôdet͡ɕa/
ȍdeća f
(short vowel with falling tone)

Sing
Nom. odeća
Gen. odeće
Dat. odeći
Akk. odeću
Lok. odeći
Ins. odećom
Vok. odećo!

Plur
Nom. odeće
Gen. odeća
Dat. odećama
Akk. odeće
Lok. odećama
Ins. odećama
Vok. odeće! clothing

Wear comfortable, loose clothing to your exercise class.
Nosite udobnu široku odeću na času vežbanja.
weibliche Substantive in der Regel auf ein -a enden. [sound:odeća.mp3] y clothing

It looks as if Anki has difficulties with word1.

The note type is “picture words”. The import option is “Update existing notes when first field matches”, and “Allow HTML in fields” is ticked.

Could you export your deck along with the Excel file and post them here so I can test it?
You can use this service for uploading: Gofile - Free file sharing and storage platform

Hi,

I uploaded it to Gofile. The download link is here:

https://gofile.io/d/sVDpVW

I have attached the Excel file as well.

regards Andreas

(Attachment 625 wichtigsten Wörter auf Serbisch csv Vorlage notiz typ picture words.csv is missing)

I can only download the Excel file. Maybe the deck file is set as private or something?

Hi Andi

No, there is nothing special about this CSV file. Can you give me an email address so I can send it to you there?
Regards, Andreas

I mean you appear to only have uploaded the Excel file, without the deck package. See Exporting - Anki Manual for info on exporting decks. This is also needed so I can test with the same note type that you’re using.

I guess the problem is that the first field in your note type is set to a field you don’t want to check duplicates against. Anki only checks for duplicates in the first field (the one that appears at the top of the note editor).

You can go to Tools > Manage Note Types, select the note type you’re using, click Fields, and use the Reposition button to move the field you want Anki to check for duplicates (the “word” field, I guess) into first position.

hi Abdo

Link for 625 wichtigsten Wörter.apkg:
https://gofile.io/d/wVRiox

Link for 625 wichtigsten Wörter auf Serbisch csv Vorlage notiz typ picture words.csv:
https://gofile.io/d/FBQMfI

The import setting is “Update existing notes when first field matches”.

This setting works for 90% of the notes, just not for notes containing the character “ć”, as in iseći / seći / odeća / ćerka / naći / ići / srećan / kuća / cveće.

Unfortunately, the Find Duplicates function is not much help either.

regards Andreas

I think I figured it out: it’s caused by Anki not normalizing the first field of imported notes before version 2.1.45. Upgrading to 2.1.45 (which was released today) should fix the problem. It was fixed here: normalize first field before comparing with local DB · ankitects/anki@a90a6ab · GitHub

Technical details: some characters like “ć” can be written using either two separate codepoints for both the “c” letter and the accent (U+0063 + U+0301), or alternatively as a precomposed character (U+0107). The two forms should be treated as equivalent for comparison purposes but Anki was failing to do that when importing.
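
To illustrate the point about the two encodings, here is a minimal Python sketch using the standard unicodedata module. It is not Anki's actual import code; the helper name same_first_field is hypothetical, and the choice of NFC as the normalization form is an assumption made for the example.

```python
import unicodedata

# Two ways of writing "ć":
decomposed = "c\u0301"   # U+0063 (c) followed by U+0301 (combining acute accent)
precomposed = "\u0107"   # U+0107 (single precomposed character)

# A plain string comparison treats them as different strings.
print(decomposed == precomposed)

# After normalizing both sides to the same form (NFC here), they compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)


def same_first_field(a: str, b: str) -> bool:
    """Hypothetical helper: compare two field values after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# "odeća" written with a decomposed ć versus a precomposed ć.
print(same_first_field("odec\u0301a", "ode\u0107a"))
```

Without the normalization step, the two spellings of “odeća” look identical on screen but do not match as strings, which is consistent with duplicates appearing only for words containing “ć”.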

hi Abdo

Right you are. Now it works fine. It is a big relief that I no longer have to delete unnecessary duplicates manually.

Thanks for your support.

regards Andreas