PDF to notes [Support thread]

Using the poppler library (https://poppler.freedesktop.org/) the add-on converts a PDF file into notes.

  • Question/prompt extracted from each page (the first line of text)
  • Convert PDF pages to separate “normal” (front/back) notes or separate clozes in a single cloze type note.
  • PDF pages inserted as images or HTML (using poppler pdftoppm and pdftohtml).

  • Deck: Which deck the added note(s) will be inserted into
  • Note type: Which note type to use for insertion, supports “normal” (front/back) note types as well as cloze note types.
  • Front (“normal” (front/back) note type): Which field to insert the “question” (first line of text in the page) in.
  • Back (“normal” (front/back) note type): Which field to insert the “answer” in.
  • Title field (cloze note type): Which field to insert PDF file name as title (for note types, such as the built-in cloze, that do not have a suitable field for this, select <none>).
  • Cloze field (cloze note type): Which field to insert clozes into. Clozes are inserted as prompt: {{c1::<br>answer}}<br> where prompt is the first line of text extracted from the page and answer is either an image of the page or a <div> with the page HTML.
  • Format: Format to insert the pages in, either as images (will preserve exact layout and work well on all screen sizes but no editable/selectable text) or HTML (does not give perfect results on any screen, especially not small screens but text can be copied/edited).
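
The cloze layout described above can be sketched roughly as follows. This is only an illustration, not the add-on's actual code: the helper name build_cloze_field is made up, and the assumption that each page gets its own cloze number (c1, c2, ...) is inferred from "separate clozes in a single cloze type note".

```python
def build_cloze_field(pages):
    """Build the cloze field text from (prompt, answer) pairs, one pair per page.

    Assumption: page n becomes cloze number n (c1, c2, ...).
    """
    parts = []
    for n, (prompt, answer) in enumerate(pages, start=1):
        # Layout from the description: prompt: {{cN::<br>answer}}<br>
        parts.append(prompt + ": {{c" + str(n) + "::<br>" + answer + "}}<br>")
    return "".join(parts)


if __name__ == "__main__":
    example = [
        ("Cell membrane", '<img src="page-1.png">'),
        ("Mitochondria", '<img src="page-2.png">'),
    ]
    print(build_cloze_field(example))
```

Here the answer is an image tag; with the HTML format it would be a <div> containing the page HTML instead.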

INSTALLATION

On Linux I got
[Errno 2] No such file or directory: ‘/tmp/tmp5hp11s3m/pgtxt.txt’

OK, could you run Anki from bash and see what you get on stderr? pgtxt.txt is the temp file that subprocess.run([PDFTOTXT, "-layout", pdf, tmp_file], stdout=subprocess.PIPE, universal_newlines=True, shell=True) is supposed to write to, so I am guessing that call fails for some reason. Are you certain you have pdftotext installed and in the path?
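
One thing that may be worth checking: on Linux and macOS, when subprocess.run is given a list together with shell=True, only the first item is used as the command and the remaining items go to the shell itself, so pdftotext could end up being started without its arguments and never write the temp file. A minimal sketch of the same call without the shell, assuming the tool is named pdftotext and using a made-up helper name (extract_text) that is not part of the add-on:

```python
import shutil
import subprocess


def extract_text(pdf, out_file, tool="pdftotext"):
    """Run pdftotext on `pdf`, writing the text to `out_file`.

    Hypothetical helper for illustration; not the add-on's own function.
    """
    exe = shutil.which(tool)  # None if the tool is not on PATH
    if exe is None:
        raise FileNotFoundError(f"{tool} was not found on PATH")
    # Pass the arguments as a list and leave shell=False (the default),
    # so every argument reaches pdftotext itself.
    result = subprocess.run(
        [exe, "-layout", pdf, out_file],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"{tool} failed: {result.stderr.strip()}")
    return out_file
```

If the tool is missing, this fails early with a clear message instead of the later "No such file or directory" for the temp file.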

This add-on is very good, but please add a feature where one page of the PDF becomes the front and the next page the back, or even where one PDF supplies all the fronts and another PDF all the backs. This would be very useful for making occlusion cards from PDFs.

I am not sure I understand, do you mean something like this?

  • Page 1 and 2 make front and back of one note
  • Page 3 and 4 make front and back of a second note
  • Etc

When you say “occlusion cards”, do you mean image occlusions? Or something else?

Sorry for not being clear:
Using the image occlusion add-on, it takes too long for me to make the cards that I want.

What I do is take the professor’s PDF slides, convert them to images, and hide the important parts: keywords, definitions, etc.

But when I use the image occlusion add-on, it takes a very long time to hide all the text I need to hide (I have to do about 400 slides for 8 subjects each semester), so what I want to be able to do is the following:

  1. I have a PDF reader on my iPad called “Documents by Readdle”
  2. I open the PDF on the iPad using it
  3. I highlight in black the text I want to hide (using the iPad would be 10x faster than using the image occlusion add-on to hide the text)
  4. I open Anki on my PC and have the original PDF and the PDF with hidden text
  5. I add the PDFs to your add-on and the add-on makes them into flashcards (the details are below)

Idea/feature 1:
One PDF: page 1 front, page 2 back; page 3 front, page 4 back (like you said).

Idea/feature 2:
Two PDFs (in my screenshot, pdf1 is on the right and pdf2 on the left):
Card #1
(pdf1 page 1) front
(pdf2 page 1) back
Card #2
(pdf1 page 2) front
(pdf2 page 2) back

(meaning that the fronts of all the cards will come from pdf1 and the backs of all the cards from pdf2)

I have one PDF that is normal and a second PDF in which I have covered some text. I want to be able to use one PDF as the front of the cards and the other as the back of the cards (idea 2).

Or I have a way to merge these PDFs in an alternating fashion with an online tool (page 1 of pdf1, then page 1 of pdf2, page 2 of pdf1, page 2 of pdf2), so I can just use one PDF, with the odd-numbered pages being the fronts and the even-numbered pages the backs (idea 1).

I will post a screenshot of a merged PDF so you can see what I mean by idea 1, in case it was unclear.

It would be best to implement both ideas so people can use idea 1 or 2 depending on their situation.
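
To make the two requested pairings concrete, here is a small sketch. It is not part of the add-on; the function names are made up, and "pages" stands for whatever per-page content (image or HTML) would be extracted from a PDF.

```python
def pair_alternating(pages):
    """Feature 1: one PDF, pages 1, 3, 5, ... are fronts and 2, 4, 6, ... are backs."""
    if len(pages) % 2 != 0:
        raise ValueError("expected an even number of pages")
    return list(zip(pages[0::2], pages[1::2]))


def pair_two_sources(front_pages, back_pages):
    """Feature 2: two PDFs, page n of the first is the front and page n of the second is the back."""
    if len(front_pages) != len(back_pages):
        raise ValueError("both PDFs must have the same number of pages")
    return list(zip(front_pages, back_pages))


if __name__ == "__main__":
    print(pair_alternating(["p1", "p2", "p3", "p4"]))
    print(pair_two_sources(["pdf1-p1", "pdf1-p2"], ["pdf2-p1", "pdf2-p2"]))
```

Each resulting (front, back) pair would then become one note.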

OK, I understand. I can maybe have a look at it when I have some spare time (a few other things are ahead in the pipeline).

Seeking assistance with PDF to notes: unclear why Anki is still unable to detect poppler (installed with Homebrew)

I installed Homebrew and then poppler from Homebrew. Then I entered the code for the add-on and restarted Anki, but it showed an error message.

So I reinstalled the poppler library and checked that the necessary components are in the correct location by running the command “pdftotext -v”, which checks whether the pdftotext utility from the poppler library is accessible from the command line.

Based on the terminal output, it appears that the poppler library was successfully reinstalled using Homebrew. The output indicates that poppler version 23.08.0 was downloaded and poured into the appropriate location. But Anki still keeps showing the same error. I am using the latest version of Anki. I kindly request your guidance and support in resolving these issues. Any insights, suggestions, or fixes you can provide would be greatly appreciated.

Hmm, I don’t have a Mac to test on. Could it be that the poppler files are not in the path that is used by Anki? Or that sys.platform doesn’t return darwin? Do you have any familiarity with any programming languages? If so, you could:

  1. Go into the addon directory (From the Addon dialog → View Files)
  2. Edit __init__.py:
    a. Insert a new line at line 351 (the line after if not (sys.platform == 'win32' or sys.platform == 'cygwin'):)
    b. Enter the following on the inserted line: print(f'----------------------\n{sys.platform}\n{os.environ}\n----------------')
  3. Run Anki from the console. This way you will see what is printed, and I would be interested in seeing the output between the two ----------------------- lines (edit out any sensitive info before posting)

I’m not familiar with what you said. I even did the Homebrew and poppler installation with the help of ChatGPT. I’ll try doing this with the help of ChatGPT again.
Edit:
I did insert the line you mentioned.

But I couldn’t run Anki from the console. I don’t know how to do it.

I don’t know if it’s right, but I tried to run Anki from the console.

When I opened Anki after inserting the line you mentioned, it showed me some kind of error.

Yes, Python is a whitespace sensitive language, you need to indent the inserted line to the same level as the next line, i.e. so it looks like this:
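
The original screenshot is not available here, so as a rough sketch of the indentation only (assuming the add-on indents with four spaces, and with a comment standing in for the add-on's existing code, which is not shown):

```python
import os
import sys

if not (sys.platform == 'win32' or sys.platform == 'cygwin'):
    # Inserted debug line: indented to the same level as the line that follows it.
    print(f'----------------------\n{sys.platform}\n{os.environ}\n----------------')
    # ... the add-on's existing code continues here, at this same indentation ...
```

If the inserted line starts further left or right than the line after it, Python raises an IndentationError when Anki loads the add-on.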

To run from the command line, try opening a terminal (untested as I don’t have a Mac):

  • cd /Applications/Anki.app/Contents/MacOS/
  • ./Anki

Then you should see some stuff printed to the terminal, among that info, look for the stuff between the ------------------------

If you are unable to get Anki running from the command line, we can try this instead: at the same place in the same file, insert the following two lines, indented to the same level as the line after them:
show_info(f'platform: {sys.platform}')
show_info(f'env: {os.environ}')

That will show the same info but in two dialogs instead.

Hi Triaiou,
I have the same issue as ruban, who unfortunately didn’t respond to your last message.
After installing the poppler library and checking every step twice with ChatGPT, I always get this message from Anki:

The following parts were not detected on the system:

  • pdfinfo
  • pdftohtml
  • pdftoppm
  • pdftotext

The poppler library cannot be included with the addon on this plattform. please install the poppler library through a package manager (with homebrew (brew.sh) “brew install poppler-utils” or equivalent) and ensure the above are in the path.

I double-checked this.

So I tested your method of inserting code into the __init__.py file, and after that I got an extra message saying “Platform:Darwin” that I couldn’t click away. So I undid that step, and after so many hours I want to find out what else I can do to use your add-on, because it’s so cool.

Could the reason be that I am using a MacBook? (The correct version of Anki is installed.)

Thank you for your response.
Regards, Jan


After trying your way a second time, I undid it again because Anki showed me the same error @ruban had, so this can’t be the solution.
You were interested in the output between the -------- lines:

darwin
environ({‘__CFBundleIdentifier’: ‘com.apple.Terminal’, ‘TMPDIR’: ‘/var/folders/bl/bm3kz5k11bg67wpmrjrvxnmh0000gn/T/’, ‘XPC_FLAGS’: ‘0x0’, ‘LaunchInstanceID’: ‘B687C63F-65CB-4853-8983-9364DF05E895’, ‘TERM’: ‘xterm-256color’, ‘SSH_AUTH_SOCK’: ‘/private/tmp/com.apple.launchd.ucBTawK4iv/Listeners’, ‘SECURITYSESSIONID’: ‘186a3’, ‘XPC_SERVICE_NAME’: ‘0’, ‘TERM_PROGRAM’: ‘Apple_Terminal’, ‘TERM_PROGRAM_VERSION’: ‘447’, ‘TERM_SESSION_ID’: ‘26300E9D-933A-4849-B426-E3DA977096B4’, ‘SHELL’: ‘/bin/zsh’, ‘HOME’: ‘/Users/jannixdorf’, ‘LOGNAME’: ‘jannixdorf’, ‘USER’: ‘jannixdorf’, ‘PATH’: ‘/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/usr/local/Cellar/poppler/23.10.0/bin:/usr/texbin:/Library/TeX/texbin’, ‘SHLVL’: ‘1’, ‘PWD’: ‘/Users/jannixdorf/Desktop/Anki.app/Contents/MacOS’, ‘OLDPWD’: ‘/Users/jannixdorf’, ‘LANG’: ‘de_DE.UTF-8’, ‘_’: ‘/Users/jannixdorf/Desktop/Anki.app/Contents/MacOS/./Anki’, ‘__CF_USER_TEXT_ENCODING’: ‘0x1F5:0x0:0x3’, ‘TERMINFO_DIRS’: ‘/usr/share/terminfo’, ‘LIBOVERLAY_SCROLLBAR’: ‘0’, ‘PLATFORM’: ‘mac:13.4.1’, ‘QT_SCALE_FACTOR’: ‘1.0’, ‘QT_ENABLE_HIGHDPI_SCALING’: ‘1’, ‘QT_SCALE_FACTOR_ROUNDING_POLICY’: ‘PassThrough’, ‘QTWEBENGINE_DICTIONARIES_PATH’: ‘/Users/jannixdorf/Library/Application Support/Anki2/dictionaries’})

I pasted that into ChatGPT, which told me everything with darwin was fine, but to be honest nothing was fine, haha.

Hi, OK, thanks. Everything looks as it should, so I don’t really understand what is not working. Could you try a few things from the terminal (i.e. open a terminal window) and paste the output here?

  • pdftotext --version
  • which pdftotext
  • ls -al /usr/local/bin
  • ls -al /usr/local/Cellar/poppler/23.10.0/bin

Also, if you run Anki from the terminal, do you get the same error? I think you run Anki from the terminal on macOS through open /Applications/Anki.app, depending on how you installed it.
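
The same kind of check can also be sketched in Python, which shows what a Python process actually sees on its PATH. This is only an illustration, not the add-on's detection code: the helper name find_tools is made up, and the extra directories are the usual Homebrew locations (/usr/local/bin on Intel Macs, /opt/homebrew/bin on Apple Silicon), which may not match every setup.

```python
import os
import shutil

POPPLER_TOOLS = ["pdfinfo", "pdftohtml", "pdftoppm", "pdftotext"]


def find_tools(tools, extra_dirs=()):
    """Return {tool: full path, or None if not found} for each tool name.

    Searches the current PATH first, then any extra directories given.
    """
    search_path = os.pathsep.join([os.environ.get("PATH", ""), *extra_dirs])
    return {tool: shutil.which(tool, path=search_path) for tool in tools}


if __name__ == "__main__":
    print("PATH only:     ", find_tools(POPPLER_TOOLS))
    print("plus Homebrew: ", find_tools(POPPLER_TOOLS, ["/usr/local/bin", "/opt/homebrew/bin"]))
```

If the tools only show up in the second line, they are installed but not on the PATH that the process was started with.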

Thanks.

Update:
I tried importing a PDF converted from uptodate, and this time the sorting seems to be working. Thanks.

Hi, I made the following comment on the add-on page:
The plugin does its job, however I am NOT using it because it puts the first line of text into the sort field by default.
Personally I would prefer the image’s name, e.g. book-abc-p001.png etc., so that I can sort the notes myself.

Currently the sort order is RANDOM, which makes it rather unusable for ME.
But if that changed, it could save you from exporting a PDF to *.png and importing it with the media import add-on.
I forget: can this add-on import into image occlusion note types?
Thanks.

If I go the export -> media import way,
the “media import” add-on allows choosing which value goes into which field when importing,
so I can put the image name/page name into the sort field.

Right now the sort field is random, which makes the add-on quite limited.
Thanks.
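
As an illustration of the kind of sortable name meant above (book-abc-p001.png), here is a minimal sketch. The helper name and the three-digit padding are assumptions for the example, not something the add-on currently provides.

```python
from pathlib import Path


def page_image_name(pdf_path, page, digits=3):
    """Return a zero-padded, sortable image name for one page of a PDF."""
    stem = Path(pdf_path).stem  # "book-abc.pdf" -> "book-abc"
    return f"{stem}-p{page:0{digits}d}.png"


if __name__ == "__main__":
    names = [page_image_name("book-abc.pdf", n) for n in (10, 2, 1)]
    print(sorted(names))
```

Because the page number is zero-padded, plain alphabetical sorting of the sort field keeps the pages in order.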

Hi, I imported a PDF into the deck.
Besides the image,
the note also included the text from the PDF,
which makes the deck very large.
Is it possible to skip the text?

Besides,
as many PDFs are black text on a white background,
is it possible to invert them?

I asked GPT-4 for a dark/white toggle,
but it would be good to have dark by default.
Thanks.