Automatic removal of `<div></div>`

aPaci · April 8, 2021, 8:46am

Sometimes I find my fields cluttered with useless <div></div> or random &nbsp; instead of simple spaces. It may be due to add-ons, aggressive copy-pasting, or God knows what.
Sometimes I even find something like this:

Wouldn’t it be possible to add a simple feature that detects empty <div></div>, or more generally “useless HTML junk”, and deletes it? Maybe when one checks the database, or when Anki shuts down?

kleinerpirat · April 8, 2021, 9:58am

Regarding the divs

You can use the “Find & Replace” function in the browser to remove these empty divs.

Cleaning up with Regex

To clean up monsters like the one in your image, regular expressions will not be as good as a proper Python script / parser. But you could try anyways:

Since Anki’s regular expression engine doesn’t support recursion like (<div>)(?R)?(<\/div>) (Perl syntax), a pattern cannot check that opening and closing tags are balanced, so you would risk leaving unmatched tags behind with this regular expression:
<div>(<div>|<\/div>)*<\/div>

You could test this on a single note and see if it’s worth risking some side effects.
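
For anyone who prefers the script route mentioned above, here is a minimal Python sketch. It is not part of Anki or any add-on, and the function name `strip_empty_divs` is made up for illustration. It operates on a plain HTML string (for example, a field's content) and removes only the innermost empty pair on each pass, repeating until nothing changes, so nested empty divs disappear without recursion support and no stray tags are left behind. The second pattern is the one from this post, included to show the side effect it can have.

```python
import re

# An empty div: opening tag, optional whitespace, closing tag.
EMPTY_DIV = re.compile(r"<div>\s*</div>")

# The pattern from the post, for comparison (it cannot count nesting depth).
RISKY = re.compile(r"<div>(<div>|</div>)*</div>")


def strip_empty_divs(html: str) -> str:
    """Remove empty <div></div> pairs, including nested ones."""
    previous = None
    while previous != html:
        previous = html
        # Each pass removes the innermost empty pairs only.
        html = EMPTY_DIV.sub("", html)
    return html


sample = "<div>text<div></div></div>"
print(strip_empty_divs(sample))  # <div>text</div>
print(RISKY.sub("", sample))     # <div>text   (closing tag of the outer div is lost)
```

The loop costs a few extra passes on deeply nested junk, but every pass removes a balanced pair, so the result stays well-formed as long as the input was.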

With the current version of Anki, these empty divs should not appear as often anymore, because Enter no longer inserts <div></div>:

github.com/ankitects/anki

Prefer <br> over <div></div> in Editor

ankitects:master ← hgiesel:preferbr

opened 01:28PM - 14 Jan 21 UTC by hgiesel (+60 -18)

In the Anki Editor, pressing Enter will create a mix of `<div></div>` and `<br>`. Most WYSIWYG editors out there either use `<div></div>` or even `<p></p>`. But I think the Anki Editor is somewhat special here, because it allows you to use templating tools, like MathJax or Anki clozes. This behavior of occasionally using `<div>` makes using these template languages significantly harder.

1. You have to fit MathJax into one line. MathJax can get quite out of hand sometimes, but in Anki, you always have to fit it in a single line, making it very hard to read. MathJax offers an [option to allow specific self-closing tags within itself](http://docs.mathjax.org/en/latest/options/document.html), which Anki would greatly benefit from. **EDIT**: I noticed that actually Anki strips html inside MathJax tags now, and at the time of writing I wasn't aware of "Shift+Enter". So this problem is somewhat alleviated now.
2. Anki Clozes break when used with multiple lines.

This is an example of a `Cloze 1` with broken Anki Clozes:

![Screenshot 2021-01-14 at 14 01 48](https://user-images.githubusercontent.com/7188844/104594215-219a8500-5671-11eb-87cc-b1ba46160ca7.png)

There's three things wrong here:

1. The clozes are missing the blue highlight entirely (sometimes only the first line has it), because the Anki cloze wants to wrap it in a span, but the `div`s, break the layout.
2. The newlines are inconsistent. `}}` on a new line might or might not create a newline.
3. When clicking three times on a line, and activating the cloze shortcut, you might get one of the following results:

```
{{c1::Another line}}
// or
{{c1::Another line }}
// or
{{c1:: Another line }}
```

What I would expect is the first behavior, always.

I introduced this small change into `editor.ts`, which would change this behavior to always insert `<br>` instead. After that, you get consistent behavior in that regard:

![Screenshot 2021-01-14 at 13 30 45](https://user-images.githubusercontent.com/7188844/104595317-b94ca300-5672-11eb-84e5-9c7cd3f38635.png)

Wondering about this, I dug a little bit in Anki history, and found [this old post](https://anki.tenderapp.com/discussions/ankidesktop/4363-div-instead-of-br-are-now-inserted-when-pressing-enter), and this old commit 077d6b8187db4e5b2cfc44fe58acad2bd6614c53 from 2013, which seem to imply that this used to be the behavior, but was changed because of unspecified issues (at least I couldn't find an exact description).

Considerations:

* `execCommand("insertLineBreak")` [only works in WebKit](https://w3c.github.io/editing/docs/execCommand/#the-insertlinebreak-command), which means it won't work on AnkiMobile (even though I don't know whether the editor there is a webview).

Regarding non-breaking-spaces

I can reliably reproduce the insertion of &nbsp; by copy-pasting formatted content in the editor:

It doesn’t matter whether you paste it between different fields or into the same one. I’d really appreciate this getting fixed.

How to clean up non-breaking-spaces

In the meantime @aPaci, you can select all notes with Ctrl+A and (again) use the “Find & Replace” feature like this:

dae · April 9, 2021, 10:58am

The non-breaking spaces are generated by the web toolkit, and it’s not trivial to work around them without breaking things.

aPaci · April 9, 2021, 5:04pm

I see that there is already a lot of work on this topic. What I was thinking of was a simple workaround: for example, when you check the database, an automatic process could find every non-breaking space and replace it with a normal space…

garrettm30 · April 12, 2021, 2:14am

I would recommend against automatically converting nonbreaking spaces, as they are a legitimate character with legitimate uses. We use them relatively little in English, but some languages use them much more. For example, my work is in French for a small publisher, and they are in everyday use for punctuation in French. Similarly, I do use them with Anki, and sometimes I have to go out of my way to make sure they are not wrongly converted by Anki.
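
To make the concern concrete, here is a rough Python sketch of a more conservative cleanup. It is purely illustrative and not something Anki does. The function name `soften_nbsp` and the punctuation sets are assumptions chosen for this example: the function turns non-breaking spaces into plain spaces everywhere except directly before `; : ! ? »` and directly after `«`, where French typography expects them. Note that it also normalises the `&nbsp;` entity to the literal U+00A0 character.

```python
NBSP = "\u00a0"

# Assumed rule set for illustration: French puts a non-breaking space
# before these marks and after an opening guillemet.
KEEP_BEFORE = ";:!?»"
KEEP_AFTER = "«"


def soften_nbsp(text: str) -> str:
    """Replace non-breaking spaces with plain spaces, except next to French punctuation."""
    text = text.replace("&nbsp;", NBSP)  # treat the entity and the character alike
    out = []
    for i, ch in enumerate(text):
        if ch != NBSP:
            out.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # The truthiness checks matter: "" in "abc" is True in Python.
        legit = (next_ch and next_ch in KEEP_BEFORE) or (prev_ch and prev_ch in KEEP_AFTER)
        out.append(NBSP if legit else " ")
    return "".join(out)


print(repr(soften_nbsp("a&nbsp;b")))  # 'a b'
```

Even a rule like this only covers one language's conventions, which is why a blanket automatic conversion is risky.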

royst0n · April 12, 2021, 6:06am

I saw find-and-replace mentioned. If anybody needs a GUI text preparation tool, I created detergent.io (there’s an npm package too), which lets you detect non-breaking space characters (U+00A0) and decide what to do with them. It can also strip HTML, encode/decode entities (including non-breaking spaces), set the letter case, and collapse white-space: a full toolkit. I originally created it to prepare text for pasting into HTML email templates, but it also helps me prepare text for Anki, especially lower-casing Cyrillic and German letters. Detergent is open-source and not monetized or tracked. I hope it helps somebody.

