How to search for head tags without sub tags

I think it would be better if we didn’t leak the internal tag storage format, as that makes it harder to change in the future. The find&replace code avoids this by splitting the tags first, and applying the regex to each in turn. We could take a similar approach here, though we’ll probably need to add another SQL function for it.
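As a rough sketch of the split-then-match idea (not Anki's actual code), the snippet below registers a custom SQL function through Python's sqlite3 module. The function name regexp_tags, the simplified notes table, and the assumption that tags are stored as one whitespace-separated string are all hypothetical, chosen only for illustration.

```python
import re
import sqlite3


def regexp_tags(pattern: str, tags: str) -> int:
    """Return 1 if any single tag fully matches the pattern, else 0."""
    regex = re.compile(pattern, re.IGNORECASE)
    # Split first, so the pattern never sees the packed storage format.
    return int(any(regex.fullmatch(tag) for tag in tags.split()))


conn = sqlite3.connect(":memory:")
# Hypothetical SQL function name; registered for this connection only.
conn.create_function("regexp_tags", 2, regexp_tags)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO notes (tags) VALUES (?)",
    [(" a ",), (" a::child ",), (" a a::child ",), (" b ",)],
)

rows = conn.execute(
    "SELECT id FROM notes WHERE regexp_tags(?, tags) ORDER BY id", ("a",)
).fetchall()
matching_ids = [row[0] for row in rows]
print(matching_ids)  # [1, 3]
```

Because each tag must match the pattern in full, the pattern a finds notes 1 and 3, which carry the head tag itself, and skips note 2, which only carries a::child.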

Agreed, also because that would be pretty counterintuitive for most users.

Maybe we could simply add a not operator that negates the query. Then matching the head tag a is simply (not tag:a::*) and tag:a. This way we keep the query at a sufficiently high level of abstraction that users don’t need to learn regex. It also closely mirrors standard SQL functionality.

Anki already has a negation operator: -. Is “not” supposed to do something different?

But then, the query -tag:a::* and tag:a does what the OP wanted.

No, because it doesn’t match a note with tags a a::child.
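To make the counterexample concrete, here is a toy model of the query -tag:a::* tag:a. It assumes each term is a glob tested against every tag on the note, and that tag:a matches only the exact tag a; Anki's real search may behave differently, so treat this as an illustration of the logic only.

```python
from fnmatch import fnmatchcase


def has_tag(tags, glob):
    """True if any tag on the note matches the glob."""
    return any(fnmatchcase(tag, glob) for tag in tags)


def matches_query(tags):
    # Toy model of: -tag:a::* tag:a
    return has_tag(tags, "a") and not has_tag(tags, "a::*")


notes = {
    "head only": ["a"],
    "child only": ["a::child"],
    "head and child": ["a", "a::child"],
}
results = {name: matches_query(tags) for name, tags in notes.items()}
print(results)
# {'head only': True, 'child only': False, 'head and child': False}
```

The last note does carry the head tag a, but the negated term throws it out because it also carries a::child.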

You’re right, I’m an idiot, and it has already been proposed…
Maybe tag searching should be exact? (Or there should be an option for exact searching, since there are glob-like patterns such as * to cover other needs.)

I don’t think excluding child decks would be what users expect in the common case, as when you click on a deck to study it, subdecks are included. Adding a regex search that requires an exact match solves two problems - it addresses this particular (somewhat niche) case, and allows searches to be more complicated than simple globs.

I often encounter a somewhat similar problem.
As far as I know, there is currently no simple (not involving regex) and direct way to search for notes that have tag:a AND at least one other tag different from tag:a.

E.g.

  1. First search: “tag:Europe”
    → all notes that contain the tag “Europe” are yielded as search result.
  2. Second search: “tag:Europe” AND “tag!=Europe”
    → notes that only have “Europe” as a tag would now be excluded from the search results, while notes that have “Europe” AND also 1+ other tags (e.g. Iceland/Spain/Flags etc.) would not.
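The tag!= operator used above is a proposal, not an existing Anki feature. As a sketch of the intended semantics, the hypothetical helper below treats a note's tags as a whitespace-separated string and keeps the note only if it carries the given tag plus at least one different tag.

```python
def has_tag_and_another(tags: str, name: str) -> bool:
    """True if the note has `name` and at least one other, different tag."""
    names = {tag.casefold() for tag in tags.split()}
    target = name.casefold()
    return target in names and len(names - {target}) > 0


print(has_tag_and_another("Europe", "Europe"))          # False
print(has_tag_and_another("Europe Iceland", "Europe"))  # True
print(has_tag_and_another("Iceland Flags", "Europe"))   # False
```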

I’d say ‘anything but’ is not a pattern usually supported by search engines.
If you want to match these notes, it might indicate that your ‘Europe only’ cards should have a different tag instead, or the other tags should have a common parent.

So basically like note field searches. Maybe the current search routine would benefit from that as well? Then we could ditch the custom regex construction I wrote at that time, and use SQL comparisons instead.

I agree that better tagging could prevent this problem.
Unfortunately, my most common use case for such a search would be for studying/changing the way the tags of an imported deck are structured. For instance, to re-organize them in a hierarchical fashion

(E.g. You could run a first search for “tag:Europe”, then a second search “tag:Europe and tag!=Europe”, then a third search “tag:Europe and tag!=Europe and tag!=Iceland”, then a fourth, a fifth, etc. until you’ve reorganized every relevant tag and made them a child/grandchild of Europe, with the certainty that you’ve missed nothing)

Not sure I understand. Why don’t you just drag&drop all these tags onto Europe in the sidebar?

The sidebar is useful, and sometimes it is the best option, but it also has a couple of limitations:
(sorry for the long answer)

  1. not every tag with the same name wants the same parent.
    To make an example: let’s say I have downloaded this hypothetical deck. It does not have subdecks, but it is divided in three main sections by way of tags: “History”, “Geography”, “Art”. I decide to make these tags the parent tags.
    I then focus my attention on children tags.
    In this pre-made deck, there are some notes that have “History” and “Germany” as their tags, others that have “Geography” and “Germany”, and others that have “Art” and “Germany”.
    I cannot simply drag and drop “Germany” under any of the three parent tags, or else I would incorrectly tag 2/3 of the notes, e.g. all the Art and History notes about Germany would end up tagged as “Geography::Germany”. So what I have to do is click on “Germany” in the sidebar, type “tag:Geography” in the browser bar, use the replace tag add-on to replace all the “Germany” tags → “Geography::Germany”, and then rinse and repeat for “tag:Art” and “tag:History”.
    That is to say, I cannot simply rely on the sidebar and I still have to use the manual browser search quite a bit.
    (I can see how in this case it might be better to organize the tags in some other way e.g. Europe::Germany::History, Europe::Germany::Geography, Europe::Germany::Art etc., but bear in mind that this is only an example.)

  2. Within the sidebar there are all my tags (a lot of tags), not just those of the newly imported deck. So it can be a bit messy and tedious to manually check all the tags, click on them to see if they are from this new deck, then decide under which parent-tag they belong etc.
    The most natural and orderly workflow would be to start from a given parent tag, then moving downwards.
    E.g.
    step1: search tag:“Europe”; I see that the first note also has tag:“Germany”, so I do a new browser search with the keywords: tag:“Europe” tag:“Germany”. I then select all the resulting notes and replace tag:“Germany” with tag:“Europe::Germany” and delete tag:“Europe” from these notes.
    step2: I do a new browser search with these new keywords: -tag:*Germany tag:“Europe”. I see that the first note has the tag “Norway”. I repeat step1, except this time I replace tag:“Norway” with tag:“Europe::Norway” etc.
    step3: I search -tag:*Germany -tag:*Norway tag:Europe. I see that the first note has “Portugal”. I repeat again step1 etc.
    And so on.
    This works quite well, but it can be suboptimal if not all the notes have to-be-made child tags (or grandchild or great-grandchild tags, depending on which nesting level you’re working at). Let’s say I have “Art::1800” as the parent tag and that most notes that result from the search query do not have any other tag besides “Art::1800”. Some, however, do. Since I am now looking for all the tags which belong as children of “Art::1800”, all the notes that only have “Art::1800” as their tag are nothing but noise. I would gladly remove them from the query, but I currently cannot do that without cumbersome workarounds.
    So I have to scroll through all the notes manually, hope that my eye catches a note with some other tag besides “Art::1800”, replace it the way I explained before, then go back and search manually for to-be-made child tags, and so on.
    The process can get even more complicated if the nesting is complex and these children have grandchildren etc.
    It would be much faster and more convenient if you could just type: tag:“Art::1800” tag!=“Art::1800” to exclude all the notes that do not have any to-be-made child tag.
    (example of two more iterations: tag:“Art::1800::France” tag!=*France; tag:“Art::1800::France::Impressionism” tag!=*Impressionism)
    This way, if you did everything correctly, you would be assured that you’ve addressed each and every to-be-made child tag, something that cannot be easily achieved through manual searches, as some tags can be very rare and human error is always to be considered.
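The workflow described above could be sketched outside Anki roughly as follows. Both helpers are hypothetical and work on plain Python lists of tags rather than on a real collection: one lists the tags that still sit next to a parent tag without being nested under it, and the other performs the replacement from step1.

```python
from collections import Counter


def pending_child_tags(notes, parent):
    """Count tags that co-occur with `parent` but are not yet nested under it."""
    prefix = parent + "::"
    counts = Counter()
    for tags in notes:
        if parent in tags:
            counts.update(
                t for t in tags if t != parent and not t.startswith(prefix)
            )
    return counts


def reparent(tags, parent, child):
    """Replace `child` with `parent::child` on a note carrying both tags."""
    if parent in tags and child in tags:
        return sorted((set(tags) - {parent, child}) | {f"{parent}::{child}"})
    return sorted(tags)


notes = [
    ["Art::1800"],
    ["Art::1800", "France"],
    ["Art::1800", "France", "Impressionism"],
    ["History", "France"],
]
pending = pending_child_tags(notes, "Art::1800")
print(dict(pending))  # {'France': 2, 'Impressionism': 1}
print(reparent(["Art::1800", "France"], "Art::1800", "France"))
# ['Art::1800::France']
```

Notes that only carry “Art::1800” contribute nothing to the count, which is the noise reduction the proposed tag!= search is after; an empty result would mean no to-be-made child tag is left.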

I see, that’s quite a complex use case, thanks for explaining. This is not related to the original topic anymore, but what I’d do in this situation is to import the new deck into a clean profile first. Then you can reorder things to your heart’s content, and import it into your real profile once you’re done.

What I meant was doing something like your regexp_fields() function, except the separator would be different, and there’d be no separate indices argument. We could attempt to combine the two functions into one, but it might be clearer to keep them separate. Or were you suggesting something else?

No, exactly that. I just thought that instead of something like n.tags regexp "\smyTag(::|\s)", which is what we currently do, we could do tag = "myTag" or tag like "myTag::%". But on second thought, I may be wrong, and it’s not possible after all. (Haven’t looked at the code yet.)

Because the tags are packed in together, we couldn’t do an equality check, and SQLite’s like function isn’t Unicode-aware. The fastest approach would likely be splitting and checking in Rust, like in regexp_fields.
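A small illustration of both points, written in Python rather than Rust purely for brevity. The function name tag_or_child and the packed sample string are hypothetical, and the snippet assumes a stock SQLite build without the ICU extension, where like folds case for ASCII characters only.

```python
import sqlite3


def tag_or_child(tags: str, name: str) -> int:
    """Return 1 if any tag equals `name` or is nested under it, else 0."""
    target = name.casefold()
    prefix = target + "::"
    for tag in tags.split():
        folded = tag.casefold()  # Unicode-aware case folding
        if folded == target or folded.startswith(prefix):
            return 1
    return 0


conn = sqlite3.connect(":memory:")
conn.create_function("tag_or_child", 2, tag_or_child)

packed = " École::Maths history "  # several tags packed into one string

# Equality against the packed string never matches a single tag.
equal = conn.execute("SELECT ? = 'history'", (packed,)).fetchone()[0]
# like is case-insensitive for ASCII letters...
ascii_like = conn.execute("SELECT 'HISTORY' LIKE 'history'").fetchone()[0]
# ...but on a default build it does not fold non-ASCII letters such as É.
# Splitting and comparing in application code handles both cases.
split_match = conn.execute(
    "SELECT tag_or_child(?, 'école')", (packed,)
).fetchone()[0]
print(equal, ascii_like, split_match)  # 0 1 1
```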