PDF to notes [Support thread]

TRIAEIOU · November 30, 2021, 9:33am

Using the poppler library (https://poppler.freedesktop.org/) the add-on converts a PDF file into notes.

Question/prompt extracted from each page (the first line of text)
Convert PDF pages to separate “normal” (front/back) notes or separate clozes in a single cloze type note.
PDF pages inserted as images or HTML (using poppler pdftoppm and pdftohtml).

Deck: Which deck the added note(s) will be inserted into
Note type: Which note type to use for insertion, supports “normal” (front/back) note types as well as cloze note types.
Front (“normal” (front/back) note type): Which field to insert the “question” (first line of text in the page) in.
Back (“normal” (front/back) note type): Which field to insert the “answer” in.
Title field (cloze note type): Which field to insert PDF file name as title (for note types, such as the built-in cloze, that do not have a suitable field for this, select <none>).
Cloze field (cloze note type): Which field to insert clozes into. Clozes are inserted as prompt: {{c1::<br>answer}}<br> where prompt is the first line of text extracted from the page and answer is either an image of the page or a <div> with the page HTML.
Format: Format to insert the pages in, either as images (will preserve exact layout and work well on all screen sizes but no editable/selectable text) or HTML (does not give perfect results on any screen, especially not small screens but text can be copied/edited).
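
The cloze layout described above can be sketched in Python. This is only an illustration, not the add-on's actual code: the names first_line, cloze_for_page and build_cloze_field are hypothetical, and numbering the clozes c1, c2, ... per page is an assumption (the description only shows c1).

```python
def first_line(page_text: str) -> str:
    """Return the first non-empty line of a page's text, stripped."""
    for line in page_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def cloze_for_page(page_text: str, answer_html: str, number: int) -> str:
    """Build one cloze in the 'prompt: {{cN::<br>answer}}<br>' shape."""
    prompt = first_line(page_text)
    return f"{prompt}: {{{{c{number}::<br>{answer_html}}}}}<br>"


def build_cloze_field(pages) -> str:
    """Join clozes for (page_text, answer_html) pairs.

    Assumption: each page gets its own cloze number, starting at 1.
    """
    return "".join(
        cloze_for_page(text, answer, number)
        for number, (text, answer) in enumerate(pages, start=1)
    )
```

Here answer_html would be either an image tag for the page or a <div> holding the page HTML, depending on the Format setting.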

INSTALLATION

Windows binaries of poppler are packaged in the addon (GitHub - oschwartz10612/poppler-windows: Prebuilt Poppler binaries packaged for windows with dependencies).
On many Linux installations it is included in the default install; otherwise install it with a package manager, for instance apt: sudo apt install poppler-utils.
On macOS it can be installed with the homebrew package manager: brew install poppler (untested as I don’t have access to a Mac).
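
To check an installation from Python, a minimal sketch along these lines may help. It only uses the standard library; the four tool names are the ones the add-on reports as missing later in this thread, and missing_tools is a hypothetical helper, not part of the add-on.

```python
import shutil

# Command-line tools from poppler that the add-on reports on.
POPPLER_TOOLS = ("pdfinfo", "pdftohtml", "pdftoppm", "pdftotext")


def missing_tools(tools=POPPLER_TOOLS, path=None):
    """Return the tools that cannot be found on the search path.

    With path=None, shutil.which uses the PATH of the current process.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that this reports what the current process can see, which is not necessarily the same PATH that Anki gets when it is started from a desktop launcher.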

superj · December 8, 2022, 11:22pm

On linux I got
[Errno 2] No such file or directory: ‘/tmp/tmp5hp11s3m/pgtxt.txt’

TRIAEIOU · December 9, 2022, 7:52am

Ok, could you run Anki from bash and see what you get from stderr? pgtxt.txt is the temp file that subprocess.run([PDFTOTXT, "-layout", pdf, tmp_file], stdout=subprocess.PIPE, universal_newlines=True, shell=True) is supposed to write to, so I am guessing that fails for some reason. Are you certain you have pdftotext installed and in the path?
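
A guess at what may be going wrong, based on how Python documents subprocess: on POSIX, when shell=True is combined with a list, only the first item is treated as the command and the remaining items become arguments to the shell itself, so pdftotext could be started without any arguments and never write the temp file. A minimal sketch of the same call without a shell follows; run_tool and pdf_to_text are hypothetical names, not functions from the add-on.

```python
import subprocess
import sys


def run_tool(args):
    """Run a command given as a list, without a shell, and capture its output.

    Raises FileNotFoundError if the executable itself cannot be found.
    """
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode, result.stdout, result.stderr


def pdf_to_text(pdf, txt, pdftotext="pdftotext"):
    """Ask pdftotext to write the text of `pdf` into the file `txt`."""
    code, _out, err = run_tool([pdftotext, "-layout", pdf, txt])
    if code != 0:
        raise RuntimeError(f"pdftotext failed ({code}): {err.strip()}")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python script.py input.pdf output.txt
    pdf_to_text(sys.argv[1], sys.argv[2])
```

Capturing stderr this way also means the error text from pdftotext can be shown to the user instead of only a missing-file error later on.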

HamidG · January 19, 2023, 5:06am

This addon is very good, but please add a feature where one page of the PDF becomes the front and the next page the back, or even where one PDF supplies all the fronts and another PDF all the backs. This would be very useful for making occlusion cards from PDFs.

TRIAEIOU · January 22, 2023, 3:32pm

I am not sure I understand, do you mean something like this?

Page 1 and 2 make front and back of one note
Page 3 and 4 make front and back of a second note
Etc

When you say “occlusion cards”, do you mean image occlusions? Or something else?

HamidG · January 23, 2023, 8:05am

Sorry for not being clear.
Using the image occlusion add-on takes too long for me to make the cards that I want.

What I do is take the professor’s PDF slides, convert them to images and hide the important parts (keywords, definitions, etc.).

But when I use the image occlusion add-on it takes a very long time to hide all the text I need to hide (I have to do about 400 slides for 8 subjects each semester), so what I want to be able to do is the following:

I have a PDF reader on my iPad called “Documents by Readdle”
I open the PDF on the iPad using it
Highlight in black the text I want to hide (using the iPad would be 10x faster than using the image occlusion add-on to hide the text)
Open Anki on my PC and have the original PDF and the PDF with hidden text
Add the PDFs to your addon and the addon makes them into flashcards (the details are below)

idea/ feature 1:
one PDF: page 1 front page 2 back, page 3 front page 4 back (like you said)

idea/ feature 2:

pdf1 on the right pdf2 on the left
two PDFs:
Card #1
(pdf1 page 1) front
(pdf2 page 1) back
Card #2
(pdf1 page 2) front
(pdf2 page 2) back

(meaning that all the front of all the cards will be from pdf1 and all the back of the cards will be from pdf2)

I have one PDF that is normal and a second PDF in which I have covered some text. I want to be able to use one PDF as the front of the cards and the other as the back of the cards (idea 2),

or I have a way to merge these PDFs in an alternating fashion with an online tool (page 1 of pdf1, then page 1 of pdf2, page 2 of pdf1, page 2 of pdf2), so I can just use one PDF with the odd-numbered pages being the front and the even-numbered pages the back (idea 1).

I will post a screenshot of a merged PDF so you can see what I mean for idea 1 if you didn’t understand it.

It would be best to implement both ideas so people can use idea 1 or 2 depending on their situation.
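
The two pairings requested above could be expressed roughly as follows. This is a sketch of the idea only, not something the add-on currently does; pair_alternating and pair_two_pdfs are hypothetical names, and the page items can be anything (image file names, HTML snippets) in page order.

```python
def pair_alternating(pages):
    """Idea 1: pages 1 and 2 form note 1, pages 3 and 4 form note 2, etc."""
    if len(pages) % 2:
        raise ValueError("need an even number of pages")
    # Even indexes (pages 1, 3, ...) are fronts, odd indexes are backs.
    return list(zip(pages[0::2], pages[1::2]))


def pair_two_pdfs(front_pages, back_pages):
    """Idea 2: page n of the first PDF is the front, page n of the second the back."""
    if len(front_pages) != len(back_pages):
        raise ValueError("both PDFs need the same number of pages")
    return list(zip(front_pages, back_pages))
```

For example, pair_alternating(["p1", "p2", "p3", "p4"]) gives [("p1", "p2"), ("p3", "p4")], each tuple being the (front, back) of one note.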

TRIAEIOU · February 20, 2023, 9:57pm

Ok, I understand. I can maybe have a look at it when I have some spare time (a few other things are ahead in the pipeline).

ruban · August 5, 2023, 12:00pm

Seeking assistance with PDF to notes: unclear why Anki is still unable to detect poppler (installed with Homebrew)

I installed Homebrew and then poppler from Homebrew. Then I entered the code for the add-on and restarted Anki, but it showed an error message.

So I reinstalled the poppler library and checked that the necessary components are in the correct location by running the command “pdftotext -v”, as this command checks whether the pdftotext utility from the poppler library is accessible from the command line.

Based on the terminal output it appears that the poppler library was successfully reinstalled using Homebrew. The output indicates that poppler version 23.08.0 was downloaded and poured into the appropriate location. But Anki still keeps showing the same error. I am using the latest version of Anki. I kindly request your guidance and support in resolving these issues. Any insights, suggestions, or fixes you can provide would be greatly appreciated.

TRIAEIOU · August 5, 2023, 10:46pm

Hmm, I don’t have a mac to test on, could it be that the poppler files are not in the path that is used by Anki? Or that sys.platform doesn’t return darwin? Do you have any familiarity with any programming languages? If so, you could:

1. Go into the addon directory (from the Addon dialog → View Files)
2. Edit __init__.py:
a. Insert a new line at line 351 (the line after if not (sys.platform == 'win32' or sys.platform == 'cygwin'):)
b. Enter the following on the inserted line: print(f'----------------------\n{sys.platform}\n{os.environ}\n----------------')
3. Run Anki from the console. This way you will see what is printed, and I would be interested in seeing the output between the two ----------------------- (edit out any sensitive info before posting)

ruban · August 6, 2023, 12:57pm

I’m not familiar with what you said. I even did the Homebrew and poppler installation with the help of ChatGPT. I’ll try doing this with the help of ChatGPT again.
Edit:
I did insert the command you mentioned.

But I couldn’t run Anki from the console. I don’t know how to do it.

ruban · August 6, 2023, 1:39pm

I dunno if it’s right, but I tried to run Anki from the console.

ruban · August 6, 2023, 1:43pm

When i opened Anki after inserting the line you mentioned it showed me some kind of error.

TRIAEIOU · August 8, 2023, 5:46pm

Yes, Python is a whitespace sensitive language, you need to indent the inserted line to the same level as the next line, i.e. so it looks like this:

To run from the command line, try opening a terminal (untested as I don’t have a Mac):

cd /Applications/Anki.app/Contents/MacOS/
./Anki

Then you should see some stuff printed to the terminal, among that info, look for the stuff between the ------------------------

If you are unable to get Anki running from the command line we can try this instead, at the same place in the same file insert
show_info(f'platform: {sys.platform}')
show_info(f'env: {os.environ}')

So that it looks like this:

That will show the same info but in two dialogs instead.

Jan7898 · October 19, 2023, 5:34pm

Hi TRIAEIOU,
I have the same issue as ruban, who unfortunately didn’t respond to your last message.
After installing the poppler library and checking every step twice with ChatGPT, I always get this message from Anki:

The following parts were not detected on the system:

pdfinfo
pdftohtml
pdftoppm
pdftotext
The poppler library cannot be included with the addon on this plattform. please install the poppler library through a package manager (with homebrew (brew.sh) “brew install poppler-utils” or equivalent) and ensure the above are in the path.

I checked twice.

So I tested your method of adding the code to the __init__.py file, and after that I got an extra message saying “Platform:Darwin” that I couldn’t click away. So I undid that step, and after so many hours I want to find out what else I can do to use your add-on, because it’s so cool.

Could the reason be that I am using a MacBook? (The correct version of Anki is installed.)

Thank you for your response.
Regards, Jan

Jan7898 · October 19, 2023, 5:52pm

After trying your way the second time, I undid it again because Anki showed me the same error @ruban had, so this can’t be the solution.
You were interested in the output between the --------:

darwin
environ({‘__CFBundleIdentifier’: ‘com.apple.Terminal’, ‘TMPDIR’: ‘/var/folders/bl/bm3kz5k11bg67wpmrjrvxnmh0000gn/T/’, ‘XPC_FLAGS’: ‘0x0’, ‘LaunchInstanceID’: ‘B687C63F-65CB-4853-8983-9364DF05E895’, ‘TERM’: ‘xterm-256color’, ‘SSH_AUTH_SOCK’: ‘/private/tmp/com.apple.launchd.ucBTawK4iv/Listeners’, ‘SECURITYSESSIONID’: ‘186a3’, ‘XPC_SERVICE_NAME’: ‘0’, ‘TERM_PROGRAM’: ‘Apple_Terminal’, ‘TERM_PROGRAM_VERSION’: ‘447’, ‘TERM_SESSION_ID’: ‘26300E9D-933A-4849-B426-E3DA977096B4’, ‘SHELL’: ‘/bin/zsh’, ‘HOME’: ‘/Users/jannixdorf’, ‘LOGNAME’: ‘jannixdorf’, ‘USER’: ‘jannixdorf’, ‘PATH’: ‘/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/usr/local/Cellar/poppler/23.10.0/bin:/usr/texbin:/Library/TeX/texbin’, ‘SHLVL’: ‘1’, ‘PWD’: ‘/Users/jannixdorf/Desktop/Anki.app/Contents/MacOS’, ‘OLDPWD’: ‘/Users/jannixdorf’, ‘LANG’: ‘de_DE.UTF-8’, ‘_’: ‘/Users/jannixdorf/Desktop/Anki.app/Contents/MacOS/./Anki’, ‘__CF_USER_TEXT_ENCODING’: ‘0x1F5:0x0:0x3’, ‘TERMINFO_DIRS’: ‘/usr/share/terminfo’, ‘LIBOVERLAY_SCROLLBAR’: ‘0’, ‘PLATFORM’: ‘mac:13.4.1’, ‘QT_SCALE_FACTOR’: ‘1.0’, ‘QT_ENABLE_HIGHDPI_SCALING’: ‘1’, ‘QT_SCALE_FACTOR_ROUNDING_POLICY’: ‘PassThrough’, ‘QTWEBENGINE_DICTIONARIES_PATH’: ‘/Users/jannixdorf/Library/Application Support/Anki2/dictionaries’})

I pasted that into ChatGPT, which told me everything with darwin was fine, but tbh nothing was fine haha

TRIAEIOU · October 24, 2023, 5:26pm

Hi, ok thanks. Everything looks as it should, so I don’t really understand what is not working. Could you try a few things from the terminal (i.e. open a terminal window) and paste the output here?

pdftotext --version
which pdftotext
ls -al /usr/local/bin
ls -al /usr/local/Cellar/poppler/23.10.0/bin

Also, if you run Anki from the terminal, do you get the same error? I think you run Anki from the terminal on MacOS through open /Applications/Anki.app depending on how you installed it.

Thanks.
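
One thing that might be worth ruling out, stated as an assumption rather than a diagnosis: on macOS an app started from Finder or the Dock may see a shorter PATH than a terminal session does, and Homebrew installs into /opt/homebrew on Apple silicon or /usr/local on Intel Macs by default. A minimal sketch of a lookup that also checks those directories; HOMEBREW_DIRS, search_path_with_extras and find_tool are hypothetical names, not part of the add-on.

```python
import os
import shutil

# Default Homebrew bin directories (Apple silicon, then Intel).
HOMEBREW_DIRS = ("/opt/homebrew/bin", "/usr/local/bin")


def search_path_with_extras(env_path, extras=HOMEBREW_DIRS):
    """Return env_path with any missing extra directories appended."""
    parts = [p for p in env_path.split(os.pathsep) if p]
    for extra in extras:
        if extra not in parts:
            parts.append(extra)
    return os.pathsep.join(parts)


def find_tool(name, env_path=None):
    """Look for `name` on PATH, then in the extra directories."""
    if env_path is None:
        env_path = os.environ.get("PATH", "")
    return shutil.which(name, path=search_path_with_extras(env_path))
```

find_tool("pdftotext") returns the full path of the executable, or None if it is not in any of the searched directories.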

krstoevan · December 3, 2023, 5:19am

Update:
I tried importing a PDF converted from UpToDate,
and it seems the sorting is working this time. Thanks.

Hi, I made the following comment on the add-on page:
The plugin does its job, however I am NOT using it, as it puts the first line into the sort field by default.
Personally I would prefer the image’s name, e.g. book-abc-p001.png etc., so that I can sort it myself.

Currently the sort order is RANDOM, which makes it rather unusable for ME.
But if that changed, it could save you from exporting a PDF to *.png and importing it with media import.
I forget: can this addon import into image occlusion note types?
Thanks.

If I go the export → media import way,
the “media import” addon allows choosing what becomes what when importing,
so I could put the image name/page name into the sort field.

Right now the sort field is random, which makes it quite handicapped.
Thanks
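
Names in the book-abc-p001.png style sort correctly as plain text only if the page number is zero-padded. A small sketch of generating such names; page_image_name is a hypothetical helper, not a function of the add-on or of the media import add-on.

```python
def page_image_name(stem, page, total_pages, ext="png"):
    """Build a name like 'book-abc-p001.png' that sorts in page order.

    The number is padded to at least three digits, or more if the
    document has 1000 pages or more.
    """
    width = max(3, len(str(total_pages)))
    return f"{stem}-p{page:0{width}d}.{ext}"
```

With padding, page 2 sorts before page 10; without it, "p10" would come before "p2" in a text sort.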

krstoevan · December 13, 2023, 5:13pm

Hi, I imported a PDF into the deck.
Besides the image,
it included the text from the PDF,
which makes the deck size very large.
Is it possible to skip the text?

Besides,
as many PDFs are black text on a white background,
is it possible to invert them?

I asked GPT-4 for a dark/white toggle,
but it would be good to have dark by default.
Thanks

paulj · February 19, 2025, 4:31am

Hi,
Not sure if you’ve already found a solution for this problem, but I had the exact same issue of finding a reliable way to mass-convert PDF notes to Anki decks, since it was easier for me to make handwritten flashcards using a note-taking tool on my iPad (GoodNotes, Notability, etc.). Any app would work as long as the final files are exported to PDF.

I built an app, NotesAnkify, to help automatically process and push PDF flashcards to Anki. It has different processing modes (e.g. the top half of a PDF page can be taken as the question and the bottom half as the answer). If you use a specific marker or page dimension for making the cards, then you could have these cards in any of your PDF files and NotesAnkify can find and extract those cards across all the files in a nested folder.

Do check it out and let me know if you need any help setting it up.

NotesAnkify
NotesAnkify - Documentation

Relevant reddit post mentioning the usecase and features
