Long search query optimization/debugging/update

Aleksej · February 10, 2021, 1:49pm

I am checking if I can upgrade from 2.1.22. I use Cardistry, so it will be to .33.

A 15 KB search query I use for prioritization takes 18 seconds with 2.1.22. With .40 and .33 or .26 (the latter is slow either way), it takes over 1.5 minutes.

Filtered decks based on related queries (much shorter and much longer) find nothing.

What should I do?

Rumo · February 10, 2021, 6:24pm

Well, without seeing the queries it will be hard to give any advice. Anyway, 15 kB is 15,000 ASCII characters, which seems like an exorbitant length for a search query, something that is generally typed in by hand. Why do you need such long queries?

Aleksej · February 10, 2021, 6:45pm

I export the important cards as text and feed them to MorphMan’s Readability Analyzer, which generates a frequency.txt file used to prioritize vocab cards (cards with only one unknown morph) by frequency.
The goal is to learn the more important new cards first, and to review only the most important ones when I don’t have enough time in a day.

Aleksej · February 10, 2021, 6:54pm

A short example query of that kind: example/важность-важное.ankisearch · master · aleksejrs / anki-priofiltergenerator · GitLab
But now many tags begin with an asterisk.

This is prepended to it: -is:learn -is:buried deck:/ (is:due OR is:new)

Rumo · February 10, 2021, 9:07pm

Can’t really help you with the issue of not finding any cards as this is dependent on your personal collection.
But I did some quick benchmarking and it looks indeed as if the search is drastically slower on 2.1.35 than on 2.1.22 and again a whole lot slower on 2.1.41.
I’ll dig a little deeper. (I have contributed some code to this part of Anki in the last year.)

Rumo · February 11, 2021, 9:01pm

The bad performance seems mostly restricted to tag searches. On 2.1.22, those were implemented with a fast SQL comparison, but they didn’t support escaping wildcards and, especially, didn’t respect tag boundaries. These issues have since been addressed, and tags are now compared by regex, which is probably also the main contributor to the performance loss.
It should be possible to have much faster and still correct tag matching, but not with the way tags are currently stored (as a single whitespace-separated string). I happen to have suggested that only recently; here is the answer I got:

that’s a change we could potentially make in the future, but it will make maintaining compatibility with older clients more of a headache & slow down the upgrade/downgrade process, so I think it’s best to keep the notes table untouched for now.

So for the time being, you’re out of luck but maybe tag searches will become fast again in the future.
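To make the tag-boundary problem described above concrete, here is a minimal sketch using Python's sqlite3 module. It is not Anki's actual code: the `notes` table, the sample tags, the `LIKE` pattern and the regex are all assumptions chosen for illustration, and the tags are assumed to be stored as one space-separated string with a leading and trailing space.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite evaluates "X REGEXP Y" as regexp(Y, X), so the pattern comes first.
conn.create_function(
    "regexp", 2, lambda pattern, value: int(re.search(pattern, value) is not None)
)

# Hypothetical table: all tags of a note packed into one string.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [
        (1, " verb "),
        (2, " adverb noun "),
        (3, " noun verb::irregular "),
    ],
)


def ids(sql, param):
    """Run a one-parameter query and return the matching note ids."""
    return [row[0] for row in conn.execute(sql, (param,))]


# Substring comparison: cheap, but ignores where a tag starts and ends.
like_hits = ids("SELECT id FROM notes WHERE tags LIKE ? ORDER BY id", "%verb%")

# Regex comparison: requires a space (or string edge) on both sides of the tag.
regex_hits = ids(
    "SELECT id FROM notes WHERE tags REGEXP ? ORDER BY id",
    r"(?i)(?:^| )verb(?: |$)",
)

print(like_hits)   # notes 1, 2 and 3: "adverb" and "verb::irregular" also match
print(regex_hits)  # only note 1
```

The regex version is correct about boundaries, but it has to be evaluated against the tag string of every note, which fits the slowdown reported above for a query with many tag terms.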

khs · February 11, 2021, 11:16pm

I don’t know the SQLite schema for current Anki, but are tags indexed? Indexes are cheap in SQLite.

dae · February 12, 2021, 12:36am

Tags are packed into a single field, and need to be prefix matched, so indexes do not help us here.
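For contrast, here is a rough sketch of the alternative storage suggested earlier in the thread: one row per note and tag, where an index can serve both exact and prefix matches. The `note_tags` table, the index name and the helper function are hypothetical and not part of Anki's schema; the sketch also compares tags case-sensitively, which is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical layout: one row per (note, tag) instead of a packed string.
    CREATE TABLE note_tags (note_id INTEGER NOT NULL, tag TEXT NOT NULL);
    CREATE INDEX idx_note_tags_tag ON note_tags (tag, note_id);
    """
)
conn.executemany(
    "INSERT INTO note_tags VALUES (?, ?)",
    [(1, "verb"), (2, "adverb"), (2, "noun"), (3, "noun"), (3, "verb::irregular")],
)


def notes_with_tag_prefix(prefix):
    """Return ids of notes having a tag that starts with `prefix`.

    The prefix match is written as a range [prefix, upper), where `upper`
    is the prefix with its last character incremented, so the comparison
    can be answered from an ordered index on `tag`.
    """
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT DISTINCT note_id FROM note_tags "
        "WHERE tag >= ? AND tag < ? ORDER BY note_id",
        (prefix, upper),
    )
    return [row[0] for row in rows]


# Exact match: tag boundaries come for free, "adverb" is a different row value.
exact = [
    row[0]
    for row in conn.execute(
        "SELECT note_id FROM note_tags WHERE tag = ? ORDER BY note_id", ("verb",)
    )
]
children = notes_with_tag_prefix("verb::")

print(exact)     # note 1
print(children)  # note 3
```

This only illustrates why the storage format matters; the compatibility concerns quoted above still apply to any such change.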

khs · February 12, 2021, 12:46am

Thanks. I need to download and look at the code to see how it’s put together (I’m busy with my day job just now). Another option is to keep a small SQLite database in memory for the most common queries and sync it with the on-disk database from time to time, assuming the database hits for tags are the bottleneck. Or unpack the single field and its prefix values into something easier to query in this in-memory database.

A long time ago I had to do something similar with a then-known photo app, but I ended up just using NSMutableDictionary caches to speed up XML export writing.
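The in-memory idea above could be sketched roughly as follows with Python's sqlite3 module and its `Connection.backup` method. The file name, table and `refresh` helper are made up for the example, and whether this would help at all depends on disk access really being the bottleneck, which is only an assumption here.

```python
import os
import sqlite3
import tempfile

# Stand-in for the on-disk collection (hypothetical file and table).
path = os.path.join(tempfile.mkdtemp(), "collection_demo.db")
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, tags TEXT)")
disk.executemany(
    "INSERT INTO notes VALUES (?, ?)", [(1, " verb "), (2, " adverb noun ")]
)
disk.commit()

# In-memory copy that the frequent queries would run against.
memory = sqlite3.connect(":memory:")


def refresh():
    """Overwrite the in-memory copy with the current on-disk contents."""
    disk.backup(memory)


def note_count():
    return memory.execute("SELECT COUNT(*) FROM notes").fetchall()[0][0]


refresh()

# A later write to the file is invisible in memory until the next refresh.
disk.execute("INSERT INTO notes VALUES (3, ' noun ')")
disk.commit()
stale = note_count()
refresh()
fresh = note_count()

print(stale, fresh)  # 2 3
```

The trade-off is visible in the last lines: queries against the copy can return stale results between refreshes.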
