Implementing Range Requests for local html in collection.media through anki webserver?

Apologies, this is not my area of expertise.

I’m working on building Anki functionality that works by referencing a local webapp (css/html/js files in the collection.media folder) through an iframe.

I’ve gotten this to work well locally (i.e., just pointing directly to the html file in the iframe), but some features require that the web app be hosted behind a server. The main one I am struggling to replicate locally is the use of range request headers to load partial data from local resources. Think of it as loading only pages 50-55 of a pdf.

I don’t know a TON about how Anki works, but I know that it serves up a webserver front-end, where the port can vary depending on when you start Anki (e.g., 127.0.0.1:5678). I’ve tried dynamically pointing the iframe at the webserver (essentially the path of the html with “http://127.0.0.1:5678/” prepended to it) but am still not getting the functionality I am after.

The issue here is that I am really out of my depth on this specific integration with how anki works. Can anyone with knowledge of how the anki webserver works and the potential for range request headers point me in the right direction?

Thank you!

Anki’s local server is implemented in Flask: https://github.com/ankitects/anki/blob/main/qt/aqt/mediasrv.py

If support for range headers is missing from it, I suggest you implement a local server in an add-on and add whatever features are required to get your idea going. I have an add-on that uses a local server to serve a pop-up dictionary, though it doesn’t use range headers: https://github.com/abdnh/anki-zim-reader/blob/05637870fe1b967b14d9108a29cbf44a16d0100a/src/server.py

This is great, thank you!

Unfortunately an add-on defeats the purpose of the project. It needs to be cross-platform. It currently works across all platforms, but without support for range requests.

It looks like adding support for range requests for anki’s server through flask is prohibitive. I see things like this online: Flask-RangeRequest · PyPI. Is it reasonable to try to implement something like this in Anki?

Sorry, I’m not really keen on adding extra dependencies to the core code that would not receive wide-spread use. You may be able to do so using an add-on though.

After a little bit of research, there are a lot of resources about adding range requests without additional dependencies.
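For reference, the core of single-range support can be sketched with the standard library alone. This is a minimal illustration, not Anki's code: `parse_byte_range` and `slice_for_range` are hypothetical names, only one range per header is handled, and malformed headers are treated the same as unsatisfiable ones (a stricter server would ignore a malformed header and return the full file).

```python
from typing import Dict, Optional, Tuple


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return an inclusive (start, end) pair for a single byte range, or None.

    Handles the three common forms: "bytes=0-499", "bytes=500-" and "bytes=-200".
    """
    unit, sep, spec = header.strip().partition("=")
    if unit.strip().lower() != "bytes" or not sep or "," in spec:
        return None  # unknown unit, or multiple ranges (not handled in this sketch)
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first == "":
            # Suffix form: the last N bytes of the file.
            length = int(last)
            if length <= 0 or size == 0:
                return None
            return max(size - length, 0), size - 1
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or end < start:
        return None
    # Clamp the end to the last byte that actually exists.
    return start, min(end, size - 1)


def slice_for_range(data: bytes, header: str) -> Tuple[int, Dict[str, str], bytes]:
    """Build a (status, headers, body) triple for a Range request on in-memory data."""
    rng = parse_byte_range(header, len(data))
    if rng is None:
        # Simplification: anything we cannot serve becomes 416.
        return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
    start, end = rng
    body = data[start : end + 1]
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{len(data)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

With this sketch, a request for `bytes=2-5` against a 10-byte body would come back as status 206 with `Content-Range: bytes 2-5/10`. A real server would read the slice from disk instead of holding the whole file in memory.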

I’ll do more research and see if it warrants a pull request.

If this works, it will receive extremely wide-spread use. Happy to work through more of that idea if I find a solution.

“it will receive extremely wide-spread use”

I’m rather skeptical I’m afraid. I suggest you focus on an add-on for now, and if it proves popular, a PR can be considered at a later date.

I don’t understand… Did I go into the actual solution here? I’m not sure how you can be skeptical of something I haven’t even told you about.

Again, I will not be focusing on an add-on. An add-on defeats the entire purpose of this project, and will not lead to wide-spread use. Thank you for the suggestion though.

Maybe someone will have different suggestions for you if you post more details about the project?

Why would you need range requests anyway? They make sense when downloading data from the web (why download the whole file if you only need a part of it?), but your Anki media is already on your device, so loading a file whole or in part would not make much of a difference. A file in your media folder can be at most 100MB, so you can just load it whole.


It’s a web app that uses pdf.js along with machine learning components to auto-tag and present pdfs as an embedded iframe to users. I’m working on it with the Ankihub team, and it essentially removes the need for copyrighted images in Anki decks.

Ie, instead of giving everyone screenshots of First Aid, they place their First Aid pdf in a folder, and everything is auto-tagged and navigated to based on bookmarks in the pdf and actual content.

I’m running a modified version of pdf.js and am trying to get progressive loading to work. Pdf.js accomplishes progressive loading with range request headers.
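As a rough illustration of what progressive loading looks like from the client side, the sketch below plans the byte ranges a viewer might ask for and builds one such request. The 64 KiB chunk size, the function names, and the example URL are assumptions for illustration only; pdf.js itself issues these requests from JavaScript.

```python
from typing import Iterator, Tuple
from urllib.request import Request


def chunk_ranges(size: int, chunk_size: int = 65536) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte ranges that cover a file of `size` bytes."""
    for start in range(0, size, chunk_size):
        yield start, min(start + chunk_size, size) - 1


def range_request(url: str, start: int, end: int) -> Request:
    """Build (but do not send) a GET request for one byte range."""
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


if __name__ == "__main__":
    # Hypothetical URL; the port changes each time Anki starts.
    url = "http://127.0.0.1:5678/example.pdf"
    for start, end in chunk_ranges(150_000):
        req = range_request(url, start, end)
        print(req.get_header("Range"))
```

A server that supports this answers each request with status 206 and only the requested slice; a server that does not will usually answer 200 with the whole file, which is the behaviour I am trying to avoid.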

Right now I’m evaluating two options: pre-processing pdfs into chunks that can be progressively loaded, or just getting range request headers to work in Anki. I would highly prefer the latter for simplicity.

Any advice aside from “I’m skeptical” is welcome.

This is a great point. I would assume that this has to do with how the document is parsed. Maybe the answer is to tweak the worker that parses the document to function progressively? My understanding is that the headers do this, but maybe I can move the process slightly downstream.

This will be my new next step before I try to implement my own completely new progressive loading system.

You said before that you want your solution to work cross-platform. The AnkiDroid reviewer currently serves media using the file:// protocol, which supports neither fetch nor XMLHttpRequest, so most likely your pdf.js solution won’t work on Android anyway.

AFAIK, it’s possible to fetch() media files on Android: Feature Request: Support for external (in /collection.media) javascript files · Issue #9098 · ankidroid/Anki-Android · GitHub

In response to your comments, it already works across all platforms, as I have already stated.

To solve the XMLHTTP requests on ankidroid, you need to dynamically insert the iframe so it points to the (edit: https://appassets.androidplatform.net) path as a webserver.

Also, fetch requests work within the iframe as long as the html page called in the iframe and the dynamic requests are in the same folder.

I think we may not have been on quite the same page here. When I mentioned “wide-spread use”, what I was trying to communicate is “is this something that will have a wide range of applications, or is it a change that will mainly only benefit your particular use-case?” You replied with “it will receive extremely wide-spread use” and I thought you were trying to convey the former, which I felt was unlikely given how we’ve lived without it up until now.

Thank you for elaborating on your idea. While it sounds like a clever technical hack, I think it would have a number of downsides:

  • You said that this works cross-platform. If you have tested this on AnkiWeb and found it to work, it will be doing so in quite an expensive fashion, as due to the way media files are stored by AnkiWeb, the entire file must be read in, even if a small part is desired. And if it’s not currently working, then this solution isn’t cross-platform.
  • Media files have a maximum size limit, which large pdfs could exceed.
  • It’s not the most efficient use of resources - parsing and rendering a pdf on each card transition is considerably more expensive than displaying an image. It will likely wear down mobile devices’ batteries more quickly, and it’s more moving parts that could go wrong in the future.

Instead of an on-demand approach, have you considered pre-processing instead, where the user provides their own source data and it spits out the image files?

Ah, that makes much more sense. I was very confused. Either way, it would benefit only a few use cases but would theoretically be a very small change for a big payout. Honestly, I’m still not even sure how to implement the change needed. There seems to be little documentation on the headers, etc. required by different programs.

Direct responses to each downside:

  • With some tinkering, I did just get it to work on AnkiWeb. Although it does work, AnkiWeb is not a priority for me. “the entire file must be read in, even if a small part is desired” is the exact problem I am trying to solve here on other platforms though. Solutions I have already discussed would also work in AnkiWeb. I have not yet checked if the server that AnkiWeb is using supports range requests, which would mean that this is a non issue. It is likely that it does, since most modern servers do.
  • I have tested this on 500mb pdf files with over 20,000 pages. I’m not sure where the limit is, but it is unlikely it will ever be reached if I have not reached it in my testing. This seems like a pdf pre-processing problem (ie, linearization, optimization, etc). Also related to my current attempts to fetch only part of a pdf at a time with range requests.
  • I don’t really know how to respond to this without intense testing. I have been using this extensively myself, and have not noticed battery changes. The iframe isn’t loaded unless the user clicks a button to expand it. If the problem of range requests is solved, this would also be a non issue, since it would only be loading the portion of the pdf that the user wants.

I’ve iterated through pre-processing the pdf using an image-only version and pdf2html, which converts a pdf to a static background image and editable text. Both work fine but are arguably more complex in some ways and prone to breaking. The major benefit in using pdf.js is preserving the text for future editing and collaboration. Ie, I can literally highlight text on the page automatically that is associated with the card. I can integrate collaboration through a number of pdf collaboration tools. I can view comments and annotations made in the pdf. It also works with the Amboss addon. Many, many benefits to rendering the pdf as opposed to using some of these other workarounds.

The fact that pdf.js is standardized and working off of the immediate source material is a major benefit. It’s essentially like viewing a pdf on your computer. You click the pdf in your downloads folder, and it opens in a web browser. This is parsing the pdf in the same way.

One concept I have played with is getting the iframe to somehow maintain its last loaded state across card reviews, so the pdf would only have to be loaded once in a review session. I didn’t think this was possible honestly. If it is, please let me know.

Understood, but as the one who runs it and pays its bills, it is for me 🙂

The frontend webserver may support range requests, but behind the scenes, AnkiWeb would have to fetch the entirety of that large file every time a small part of it is desired.

The maximum media file size for syncing is 100MB. But ideally, files are a few megabytes at most, because if a file upload or download is terminated part-way through due to flaky internet, it has to start again from the beginning.

Once the user has successfully converted the pdf to one or more images, they no longer need either the converter or the source file, and as it’s just standard images that need to be displayed during review, there’s not much that can go wrong.

I can see the advantages, but my concerns are the downsides.

If the user has multiple PDFs, it sounds like the user would rapidly run out of available memory on devices with lower specs. This is another advantage of doing the conversion up-front: you can extract all the images from one PDF at a time on a computer (or web service), and don’t need to hold multiple files in memory at once for performance.


While I am slightly confused at the turn this conversation has taken, from asking for help with an isolated issue to defending the entire solution, I hope that if I present my case well you will help me with my current issue. I would ask that although this response is very long, you read it in its entirety when you have time.

Problem
I will describe the problem so that we can evaluate the costs/benefits of each solution in context. Anki is an incredible tool for retention of information. The downside of Anki (and we can go back and forth about this, but this sentiment is shared by almost all of the students and faculty I have spoken to), is that it does not provide adequate context for learning and relearning material. The very nature of disjointed cards to test retention is antithetical to learning things by chunking them in context. Many counsel students to learn the material through lectures, etc, and then reinforce with Anki, but this does not solve the problem of cards that are not present in those lectures and relearning cards you get wrong (ie, going back and listening to a whole lecture for a single concept forgotten is not something people do). This means that functionally, learning cards without context is not as effective and relearning a card is not as effective as the first learn, and both are more likely to be forgotten.

This problem is, in my opinion, the main systemic problem that Anki faces. This is why the AnKing decks (with screenshotted context for most cards), amboss addon, pop-up dictionary addon, etc are so popular. Unfortunately, problems with copyright means that many of these solutions operate in a legal gray area. It is also extremely difficult to add in context from other sources, such as books, lectures, etc.

Solution
An ideal solution would be that a user can upload any document (slides from lectures, a textbook, novels, a study guide, etc.), and Anki will automatically provide the part of that document that can give context to the card when a card is forgotten. This is essentially what I am trying to build. If a student is studying Spanish, I want them to be able to upload a copy of Don Quixote and automatically see the page with an example of the phrase they are trying to learn. I want them to be able to upload a Spanish textbook that will automatically show them a table of conjugations with the conjugation from their notecard in context. I want them to be able to automatically parse through massive amounts of information and immediately see context for each specific card.

Implications of Ideal Solution
Now, with that context of the problem and solution, here are a few key points that I hope will now make sense:

  1. An add-on will not work for the viewer. This needs to be cross-platform, otherwise the user will not be able to obtain the context they need when relearning cards.
  2. The viewer cannot use plain images. The user will need to be able to scroll, search, edit, etc to contextualize the information needed to remember each card. The problem I am trying to solve here is context. Using plain images would fundamentally hinder this goal.
  3. The viewer needs to be built on a gold-standard base that is as lightweight and future proof as possible.

Addressing the Selective Rendering Problem
With this in mind, let me address what seems to be the only major non-addressable downside from your comments: computing cost of rendering the pdf on each card.

The main assumption here is that the cost of rendering the pdfs through pdf.js is significantly greater than that of just rendering images through plain html. I believe that the downsides of using plain images are much greater, and the computing cost of rendering pdfs is much lower than you believe. Let me go into exactly what I mean with a real-world example:

Boards and Beyond has 11,167 slides for step1 content. When these 11,167 images are parsed through multiple cycles of compression and optimization, the lowest file-size I can achieve is around 250mb for all of them. When I add overlayed text to these images to enable searching, copying, collaboration, etc, the total size jumps to about 750mb. The pdf of all 11,167 slides is 172mb including images, text, bookmarks, etc.
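To put those numbers side by side, here is a quick back-of-the-envelope calculation. It assumes 1 MB = 1,000 kB and uses the 100MB sync limit mentioned earlier in the thread; the helper names are made up for this example.

```python
import math

SLIDES = 11_167
SYNC_LIMIT_MB = 100  # per-file media sync limit quoted earlier in the thread

# Total sizes in MB for the three options described above.
totals_mb = {
    "images only": 250,
    "images + overlaid text": 750,
    "single pdf": 172,
}


def kb_per_slide(total_mb: float, slides: int = SLIDES) -> float:
    """Average size per slide in kB, assuming 1 MB = 1,000 kB."""
    return total_mb * 1000 / slides


def min_files_under_limit(total_mb: float, limit_mb: float = SYNC_LIMIT_MB) -> int:
    """Fewest files a single document must be split into to fit the per-file limit."""
    return math.ceil(total_mb / limit_mb)


if __name__ == "__main__":
    for name, mb in totals_mb.items():
        print(f"{name}: {kb_per_slide(mb):.1f} kB per slide")
    parts = min_files_under_limit(totals_mb["single pdf"])
    print(f"single pdf: at least {parts} file(s) to stay under {SYNC_LIMIT_MB}MB each")
```

That works out to roughly 22 kB, 67 kB, and 15 kB per slide respectively, and the single pdf would need to be split into at least two files to sync.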

When I think about which of these three solutions fits within our ideal solution, the images-only option is eliminated for the reasons expressed above. Images with overlaid text seem space-inefficient and more likely to put a load on AnkiWeb servers. The best solution seems to be to use pdfs.

Touching on the dynamic loading of pdfs, I agree that this is a major issue. That is why I started this thread in the first place. However, the same solutions that make images work can be applied to pdfs. I can fragment a pdf into one document for each page, then dynamically load it with XMLHttpRequest. This would make any difference in processing power between images and pdfs functionally negligible. The problem here is not the PDF format, which seems to be the best option in terms of space efficiency and implementing a gold standard solution (pdf.js is used by major web browsers and is backed by over 10 years of work and innumerable contributors). The problem is how to get the document to load in hopefully a better way than splitting it into individual pages, and to take advantage of the years of work that have gone into getting pdf.js to work so seamlessly.

For example, if I can get the worker to render pages selectively, and eliminate the need for range requests, then there would be negligible increased load. This isn’t done in pdf.js, because it isn’t designed to be implemented with local pdf files outside of a server, so range requests are the only way to prevent it from downloading the whole file. However, we are not downloading anything. The file is local. The only operation is to render it.

This morning, I manually loaded my 172mb boards and beyond file without incremental loading onto my 6 year old $200 Moto G5. It rendered in under 3 seconds.

Potential Solutions to the Selective Rendering Problem
Below are solutions in order from most preferred to least preferred.

  1. Somehow enable range requests and run the files through the anki webservers (will disable support for ankiweb to avoid increasing your bill, or work to find a solution there).
  2. Alter pdf.js to render pages dynamically and load the local file in the iframe
  3. Split pdfs into multiple files by top-level bookmark, create a landing page pdf document that includes all of the bookmarks and links to each pdf within the viewer (simple using url hashes pointing to the different files, and this solution will likely be done even if another solution is implemented to overcome size limits, etc).
  4. Split the pdf into individual pages that are loaded dynamically with XMLHTTP requests if no other dynamic loading option exists (last resort).

Addressing Specific Responses

I hope that my potential solutions and problem context address this, but essentially I will either find a way to make it render dynamically or will take out support for ankiweb.

Addressed in “Potential Solutions to the Selective Rendering Problem”. Again, pdfs are still the right choice, it just depends on how they need to be pre-processed and how much of a pain that is.

As someone who has gone through this process with a few different resources, I can say that it is not as simple as this. Apart from plain images being fundamentally against the goals of this project (ie, providing navigable context), the process of updating, editing material, adjusting tags, etc is where the complexity is. Images are just not scalable. We would need to parse this through the AI tagging system, then convert to what could be thousands of images, then potentially overlay text? This will cost you more space on ankiweb for a less-robust solution. It’s also a non-issue, because for many reasons stated before, we aren’t converting or downloading pdfs, we are rendering them, and it isn’t that resource intensive. This becomes a complete non-issue the second we implement any working form of incremental loading.

This is a good point, and definitely something to think about with that concept. However, it does not support completely switching to images for the reasons above. We’re not converting anything, we are rendering it. Very similarly to how a web browser renders a video, image, audio clip, html page, etc. The argument that we are rendering more of it will become a non-issue once any form of incremental loading is implemented.

Conclusion
Anyway, I still don’t fully understand why we are having this conversation. I just made a thing that could solve a major problem, and ran into an issue that could potentially be solved with an anki-specific solution, so I posted here. The majority of these solutions are not anki-specific though, so I would imagine this whole conversation is moot unless you have been withholding some magical solution for range request headers that will be revealed if I can manage to convince you of the merits of this approach.

It sounds like you want to change the goal of Anki from a tool that helps you remember something (that has already been learnt and understood) to a tool that helps one with the initial learning of a subject or concept.

If that is the case, Anki as it stands will not serve you well. It is designed to help one in remembering and therein lies its strength.

It is clear you did not read my previous comment.

There are multiple issues with this that I have already reviewed in detail. If you would like to question these assumptions, question them directly, instead of vague statements like

As stated previously, there are cards that exist in premade decks that are not reviewed in lecture material or through another source, and context is also important for relearning cards, which is the entire point of anki, ie, identifying and relearning things you don’t remember to keep the remembering curve fresh.