Feature request: Inform users when an Anki addon communicates with third parties

joshdavham · September 29, 2024, 5:23pm

Anki addons can communicate with third parties via API calls, etc., meaning that a user’s data could be shared inappropriately and without the user’s knowledge or consent.

Given this, I think it would be helpful if there could be a little info tag or something on an addon’s page that says something like ‘this addon communicates with third parties’ just to keep the user informed.

In order to create this feature, there would likely need to be something like a quick code scan of the addon to see if there’s something like an ‘import requests’, etc in the addon’s code.
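
As a rough illustration of what such a quick scan might look like, here is a minimal sketch using Python’s standard `ast` module. The module list `NETWORK_MODULES` and the function name `find_network_imports` are assumptions made up for illustration, not anything AnkiWeb actually runs, and a scan like this would miss dynamic imports or bundled executables.

```python
import ast

# Assumed, non-exhaustive list of modules that usually indicate network access.
NETWORK_MODULES = {"requests", "urllib", "urllib3", "http", "socket", "aiohttp", "httpx"}


def find_network_imports(source: str) -> set[str]:
    """Return the top-level network-related modules imported by the given source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import requests as req" still records the real module name.
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from urllib.request import urlopen"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in NETWORK_MODULES:
                found.add(top_level)
    return found


example = "import requests as req\nfrom urllib.request import urlopen\nimport json\n"
print(sorted(find_network_imports(example)))
```

Because this parses the import statements instead of searching the raw text, an aliased import such as `import requests as req` is still reported under its real name.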

Let me know what you think!

Shigeyuki · September 30, 2024, 1:47am

It may not be possible to automatically identify only communications with third parties.

E.g. I’m developing Anki Leaderboard (fork), which automatically sends user statistics to the server. Simplified, that part of the code looks like this:

import requests
url = "server URL"  # placeholder for the server endpoint
data = "statistics data"  # placeholder for the user statistics
response = requests.post(url, data=data, timeout=15)

This call is almost the same as one that accesses a URL to retrieve data. E.g. if I only want to receive data without sending any, it looks like this:

import requests
url = "server URL"  # placeholder for the server endpoint
response = requests.get(url, timeout=15)

So if we tried to identify functions that send data, add-ons that merely receive data would be misdetected, e.g. getting a definition from a dictionary, getting an image from Google, IPA, translating via Google or DeepL, audio, using AI, etc. This is a very common feature in Anki add-ons.

Other challenges include:

Developers can also use “urllib” instead of “requests”.

Developers can rename the module when importing.

import requests as req
response = req.post(url, data=data, timeout=15)

Developers can bundle separate apps that are not easily readable, e.g. in my add-ons:
- AnkiRestart: There is a small app built in to restart the system.
- Bulk Image Downloader: There is a small app built in to convert PNG and JPEG to WebP (cwebp).

Some add-ons have extremely large source code, so I don’t think they can be automatically parsed by AnkiWeb’s server. (AnkiWeb may not have the ability to parse source code at all.)

Another way would be to read the add-on’s code. Many add-ons have simple source code and can be easily read. The most commonly used programming language in Anki is Python, which is the easiest to learn, and these days you can also use AI such as ChatGPT to help.

However, even if we check them completely, I don’t think that makes them completely safe, e.g. add-ons can be auto-updated, so even if you read the source code completely, features may be added later.

So I think a practical solution is to check whether the developer is trustworthy. E.g. Anking and Migaku are reliable because they are third-party organizations that hire professional programmers. Some well-known individual add-on developers on Anki are Glutanimate, Abdo, ijgnd, Tatsumoto, Arthur Milchior, etc.; they have been active on Anki for many years and are very reliable.

eroscard · September 30, 2024, 4:53pm

I imagine it would be difficult to identify malicious code within addons, but it would be cool to have an addon like this, as it would provide more security to users.

Shige, you are also part of the history and select group of great addon creators.

My knowledge of databases is almost zero at the moment, but I hope to one day create an addon that works like a DB, updating the necessary information automatically.

The idea would be for users to communicate with each other within Anki, either through messages or through competition.

For example: in a study competition to see who can make the same deck the fastest, it would be fun to study like this and see each other’s level.

joshdavham · September 30, 2024, 8:51pm

Thanks for your thoughts!

To respond to a couple of your points:

- I don’t think that Anki should only look for when data is being sent to third parties. To simplify things, I’d suggest adding the ‘third party’ tag if an addon so much as retrieves data from third parties (e.g., calling a dictionary API).
- I understand that the ‘requests’ module isn’t the only Python module that can communicate with third parties and that it can be renamed on import, but I still think detecting imports of these types of modules should be possible most of the time with good enough accuracy.
- You bring up a good point about updating addons. That would require a bit of work, for example if an addon previously made calls to a locally installed dictionary and later switched to a third-party dictionary.
- When it comes to checking the trustworthiness of the developers, I wouldn’t be so generous (even if they are organizations with professional programmers). I won’t name names, but I strongly distrust one of the names you mentioned.

Overall, while I don’t think adding this tag would make addons safe in itself, I do think it would encourage users to exercise more caution within the Anki ecosystem and keep them better informed as well.

addons_zz · October 2, 2024, 1:18am

For Sublime Text packages (add-ons), a public repository is reviewed before a new package is added: Pull requests · wbond/package_control_channel · GitHub

Adding a community-driven review process for add-ons before submitting them could be a start. However, that does not help if the add-on developer later adds such third-party communication (i.e., after the add-on is accepted and published). This happened with one famous Sublime Text add-on with about 3 million downloads. The community spotted the ‘intrusive’ code in one update and removed the add-on from the public channel some days later. Later, the ‘intrusive’ code was removed, and the add-on was re-accepted for new downloads.

A reliable solution could be an AI that analyzes each add-on update before publishing and studies its code to look for bad actors. However, the AI would be prone to error; someone would have to supervise it and correct its behavior when it goes wrong, not to mention the cost of running this AI for each add-on submission.

Given the current situation, someone looking for a trustworthy add-on should read its evaluations, who the author is, and, if possible, look into its source code (if they can understand it).

joshdavham · October 2, 2024, 4:18am

Given the current situation, someone looking for a trustworthy add-on should read its evaluations, who the author is, and, if possible, look into its source code (if they can understand it).

Agreed.

Adding a community-driven review process for add-ons before submitting them could be a start.

What could such a community-driven review process look like? I’ve personally never heard of anything like that.

sorata · October 2, 2024, 4:48am

It can probably be used alongside community review. In Wikipedia, we have tools that rate contributions on whether they are damaging or not, whether they were made in good faith, etc. It helps those who are patrolling. I imagine it would be helpful if particular parts of the code could be flagged in a similar way.

addons_zz · October 3, 2024, 12:39am

A more restricted code review could use a GitHub repository: every add-on author would have to open a pull request committing their code. Selected community members would review this pull request and, if approved, merge it. After the code is merged, Anki would be able to download the add-on. This is entirely restricted, as any add-on update would need a new pull request to be reviewed and approved again before being released to Anki users.

This is also the safest option for any Anki user, as it would significantly reduce the probability of malicious code being installed. However, it would perhaps burden the community, which would have to review pull requests and merge them constantly.

It seems like a good idea. A GitHub bot with AI capabilities could review these pull requests and pre-approve or reject them, and later, some community members could merge or close them.

joshdavham · October 3, 2024, 4:01pm

A community review sounds ideal in many ways, but, as you said, it would likely burden the community. Not only would it be a lot of work for good open source devs to maintain, but I think it would also discourage more anki addon development.

Personally, I’m more in favor of some sort of an automated code scan - be it AI or otherwise, just to look for high risk code.

More generally, however, it’s not like I think addons communicating with third parties is bad in any way; I just think that there is currently a dangerous level of trust (naivety?) in the Anki ecosystem. People actually do store sensitive information in their decks, and they do put too much trust in the addons and decks they download (which aren’t vetted). Fundamentally, I was just thinking that the ‘communicates with third parties’ tag could make users just a bit more thoughtful (/paranoid) about what they download and thus a bit safer.

Shigeyuki · October 5, 2024, 3:05pm

I think that malicious programs that work offline are also highly dangerous, e.g. malware built in from the start, corrupting the PC irretrievably, corrupting the deck schedule. So I think if we base it on sends and receives, there may be a possibility to miss something like that.

Basically decks do not contain any info about money or privacy, so even if the add-on is collecting and submitting data about Anki and decks there is little risk to the user (such data is usually collected for research and improvement purposes). If the add-on is trying to access data that has nothing to do with Anki it becomes suspicious.

Otherwise, I think of reliability like this:

Does the author do a lot of volunteer work?
- Basically, programming is a high-paying job. If the author wants money, they can make more by doing programming work that has nothing to do with Anki. If the author wanted to make a profit, it would be unreasonable to develop a lot of free add-ons or to volunteer, so an author who does a lot of volunteer work is more reliable.
Is the author a professional programmer or not?
- A novice developer may accidentally cause significant problems even if there is no malicious intent (e.g. making a critical operational error, incorporating dangerous code, developing an add-on that makes the user do something illegal). With experienced professional programmers such risks are quite low. Developers who contribute a lot to Anki for Desktop or AnkiDroid rather than personal add-ons are more reliable.
Is the author not anonymous?
- An author can be arrested if they distribute illegal malware, so an author who discloses their name, photo, address, and place of business is more reliable.
Is the author really that person?
- Even if a name or URL is listed on AnkiWeb, it may not be the same person, so a simple fact check is ideal.
Is the add-on original or not?
- If there is a malicious developer, they might embed malware in a popular add-on and distribute it, so an add-on by the original author is more trustworthy than a fork.
Are there many contributors?
- An add-on with many contributors means that many developers are reading the code. If there is malicious code, it is more likely to be discovered by the developers, so the more contributors the more reliable.
Are there lots of users?
- The more users there are, the more chances there are for developers to read the code, so popular add-ons that have been developed for many years are more reliable. Newly released add-ons are less reliable because no one may have checked them yet.
Is the code written for easy reading?
- Friendly developers add many comments to the code to make it easier for other developers to understand the add-ons. This makes the code more reliable because it is easier to understand all of it.
Is the add-on code simple and short?
- Simple and short add-ons are reliable because they can be read quickly, but they are often less convenient as well.
Does the author list the license correctly?
- Meticulous authors correctly list the license for the code and materials they use in their add-ons. If code is clearly copied but there is no mention of it, or if material from an unknown source is used, it is a bit suspicious.
Is the add-on publicly available on GitHub?
- If the code is publicly available on GitHub, it will be easier for many developers to check the add-on.
Does the author actively interact with users?
- Sometimes authors actively interact with users on GitHub, Discord, Reddit, etc. In these cases I think it is unlikely that they are incorporating malicious code for money or mischief.

So in my opinion third parties and well-known developers are very reliable: they fulfill many of these checkpoints, they develop for free in huge quantities, and their code is very serious, so it is clear from a developer’s view that they are not interested in profit or mischief at all.

These developers’ activities are difficult for the average user to understand, e.g. the developers’ hard work often changes nothing on the surface, so from the average user’s view the developers seem to be doing nothing. (Thus, to average users, third parties and monetizing authors look like scammers.)

However, as already explained, even if all of these are checked, add-ons are not completely safe.

E.g. these are the risks:

Embedding malware when updating an add-on.
Embedding malicious code in a sophisticated way that is not known to the average developer.
Using different code on AnkiWeb than what is publicly available on GitHub.
The account is real but has been hacked.
Photos and biographies are AI generated.
Author appears friendly but is actually a scammer.
Removing malicious code in some way after execution.
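
For the ‘different code on AnkiWeb than on GitHub’ risk, one rough way to check by hand is to hash the installed add-on’s files and compare them with a local checkout of the public repository. This is only a sketch: the function names `hash_tree` and `differing_files` are hypothetical, it only looks at `.py` files, and differences in line endings or build steps would show up as mismatches even when nothing malicious is going on.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each .py file's path (relative to root) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.py"))
    }


def differing_files(installed: Path, published: Path) -> list[str]:
    """List files that are missing on one side or whose contents differ."""
    a, b = hash_tree(installed), hash_tree(published)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differing_files` with the installed add-on folder and the repository checkout returns the relative paths worth reading first; an empty list means the `.py` files match byte for byte.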

In short, add-ons can be developed to do anything, so any malicious workaround can be developed. If users want to be as safe as possible, it is safest to use native Anki without add-ons. Anki for desktop is checked by official Anki and is read by many developers, so it is the most reliable. (Basically, to develop add-ons, developers need to read Anki’s code.)

So in my case, the measures are like this:

Make sure that if my PC is broken or infected with malware, it doesn’t matter.
- Simply put even if my PC is completely broken or hacked, if the contents are empty there is nothing to steal and no damage, so I keep backups of all important data and I do the important stuff on a different PC.
Toggle off suspicious add-ons and examine the code before running them.
- E.g. there are sometimes add-ons that have just been released and have no description and unknown purpose. It is dangerous if these add-ons contain malware, so at first I toggle it off and read the code.

However, I think it is extremely unlikely that add-ons actually contain any kind of malware. According to AnkiForums, there may have been only one suspicious case so far. (Though it is possible that such add-ons were quickly removed by the official Anki, or possibly went undetected.)

I guess the reason for this is that the number of add-on users is extremely small. E.g. according to the author’s page for my add-on releases, the number of downloads of the usual add-ons (not so popular but still useful) is in the tens to hundreds, and even Leaderboard, one of the popular add-ons, currently has only about 1700 active users.

This means that even if a malicious developer develops add-ons, only tens or hundreds of copies will be downloaded. So if the malware is for profit, there is no benefit to developing add-ons at all; it would be easier and more reasonable to develop a Chrome extension with a large number of users instead, or to send a lot of spam emails.

So I check the security of these as well just to be sure, but in reality I’m mainly trying to prevent errors and bugs in add-ons, like this:

Install and update add-ons one by one.
- If you do not know which add-ons are the cause of the error it can be quite troublesome, thus it is easier to identify errors if you update or download add-ons one by one while checking the working of each add-on instead of installing them in batches.
Wait a week or so to update add-ons.
- If a busy developer rushes to fix a problem, they may submit problematic code and break it even more. Waiting a week or so makes it easier to get a stable version.
See AnkiWeb page before updating.
- If there are critical errors in popular add-ons they are usually reported in the ratings, so if you are concerned it is relatively safe to read the AnkiWeb page to make sure it is working.
Toggle off add-ons that are only used occasionally.
- As already explained updates can be a workaround for malware, and they can also cause errors so you can reduce unexplained errors and security risks by toggling them on only when you use them.

If measures like this are taken for errors maybe it will help a little bit for security measures. (e.g. by delaying the update someone might discover the malicious code first and the add-on will be removed.)
