Embed Google Images into a note?

I have Google Images embedded on the back of my note, showing me images for {{Expression}}.
Is it possible to have it show not the grid of images but only the first image that the search returns?
For example, if it googles Cheese, could it display the first image in the row at a larger size, so I don’t have to look through all the small images? Perhaps someone here uses another search provider for this? I learn languages monolingually and my main approach is word -> image, so this method is essential to me. Image-downloading extensions work, but I’d like to know if there’s a more dynamic approach. If there’s another, easier way to do this, please share it! Thanks for the help.

Anyone?

How are you embedding Google Images? As far as I know Google doesn’t allow embedding Search other than with a Custom Search Engine.

This requires some JS skills:

You would have to access the content of the Google Images iframe and scrape the information you need. You’ll need to analyze the structure of the image results page, then query the images with document.querySelectorAll() (perhaps they all share the same class; they usually have cryptic names like “rg_i Q4LuWd”) and pick the desired image from the NodeList it returns.

The best way to go about this is to use the AnkiWebView Inspector add-on (available on AnkiWeb).

Hey,
Thus far I’ve been using Bing like this:
<embed src="https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT" style="width: 100%; height:100%;min-height: 500px">
But Bing isn’t very good for anything other than English most of the time.
I’ve been looking into embedding the website differently or finding an image provider that handles queries in multiple languages, for example Japanese or Korean expressions. Haven’t had any luck so far.
I have no JS or programming skills to speak of, apart from fixing some code here and there, so it would be rather complicated for me to do.
Do you have any other suggestions as to how I could approach this problem?
Thanks a lot!

Maybe, I was thinking, one could simplify this by writing a script that fetches a random image from a specific URL. It doesn’t have to be Google or Bing. Pinterest is wonderful for image learning. Would something like this be easier?

I forgot about the same-origin policy here, which prevents a page from reading the content of a page from another site embedded inside it (such as an iframe). To get around it, you need to pull the page’s content through an external proxy API. Try this code:

<script>
   function scrapeGoogleImage(content) {
      // get the src of the first <img> whose class attribute is followed by alt and src
      var match = content.match(/<img class=\".+?alt.+?src=\"(.+?)\"/)
      if (match != null) {
          var img = document.createElement("img")
          img.src = match[1]
          document.getElementById("google-image").appendChild(img)
      }
   }
   
   function scrapeBingImage(content) {
      // get the src of the first <img> whose class starts with "mimg"
      var match = content.match(/<img class=\"mimg.+?src=\"(.+?)\"/)
      if (match != null) {
          var img = document.createElement("img")
          img.src = match[1]
          document.getElementById("bing-image").appendChild(img)
      }
   }
   
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent('https://www.google.com/search?tbm=isch&q={{Expression}}')}`)
   .then(response => {
   	if (response.ok) return response.json()
   	throw new Error('Network response was not ok.')
   })
   .then(data => scrapeGoogleImage(data.contents));
   
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent('https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT')}`)
   .then(response => {
   	if (response.ok) return response.json()
   	throw new Error('Network response was not ok.')
   })
   .then(data => scrapeBingImage(data.contents));
   
</script>

<div id="bing-image">Bing Image:<br></div>
<br>
<div id="google-image">Google Image:<br></div>

Edit: I expanded the script to include the first Google image as well (in a very lazy way). I could set it up in a way that there are multiple results too, if you want?
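
The proxy step in the script above can be isolated into two small helpers, which may make it easier to swap in other search URLs. This is only a sketch: the helper names (buildSearchUrl, buildProxyUrl, fetchPageHtml) are made up for illustration, while the allorigins endpoint and its JSON response with a "contents" field are taken from the script above. Encoding the search term separately is an extra precaution so that spaces or non-Latin characters in {{Expression}} survive as part of the query.

```javascript
// Endpoint taken from the script above; helper names are hypothetical.
const PROXY = "https://api.allorigins.win/get?url=";

// Build a Google Images search URL with the term safely percent-encoded.
function buildSearchUrl(term) {
  return "https://www.google.com/search?tbm=isch&q=" + encodeURIComponent(term);
}

// Wrap any target URL so that it is requested through the proxy.
function buildProxyUrl(targetUrl) {
  return PROXY + encodeURIComponent(targetUrl);
}

// Fetch the raw HTML of a page through the proxy.
// fetchImpl defaults to the global fetch and can be replaced for offline experiments.
async function fetchPageHtml(targetUrl, fetchImpl = fetch) {
  const response = await fetchImpl(buildProxyUrl(targetUrl));
  if (!response.ok) throw new Error("Network response was not ok.");
  const data = await response.json();
  return data.contents; // the proxied page's HTML as a string
}
```

In a card template this could be called as fetchPageHtml(buildSearchUrl("{{Expression}}")).then(scrapeGoogleImage).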

Understanding it a bit better now!
Does this script work standalone, without the card template, or do I need to edit anything else?

You can simply paste it into your card template. The HTML beneath the <script> (the two <div> elements with the ids bing-image and google-image) is where the images land. You can adjust them to your liking (and of course the links in the script as well).

Working! Two questions:
Is it possible to make the fetched images bigger?
GIFs don’t seem to play. Is this a limitation, or something that needs to be added to the code?
Regarding your question about multiple images: yes! It would be really cool to have that option :slight_smile:
Thanks so much for your help
Edit: Is it possible to make this work with Pinterest too? I tried adding an additional code block based on what you gave me, but nothing happens.

Hey, you must be very busy. Just wondering if you’ve had any luck?
Have a great day!

Hi Yannick!

This is not a trivial task, since the images you get from my script are just the thumbnails from the search page, not the full-size ones. To get to those, you would need to fetch the website the image links to and scrape that page for the full image. I’ve experimented a bit today, but so far I haven’t come up with a presentable solution.
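
To give an idea of what "scraping that page for the full image" could look like, here is a rough sketch that looks for an Open Graph image tag in a page's HTML. It rests on assumptions: that you already have the linked page's HTML as a string, that the page declares an og:image meta tag (many sites do, many do not), and that this image is the one you want, which is not guaranteed. The function name findOpenGraphImage is made up for illustration.

```javascript
// Return the URL from the first <meta property="og:image" content="..."> tag,
// or null if the page does not declare one. Attribute order does not matter.
function findOpenGraphImage(html) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    // skip meta tags that are not the Open Graph image declaration
    if (!/property\s*=\s*["']og:image["']/i.test(tag)) continue;
    const content = tag.match(/content\s*=\s*["']([^"']+)["']/i);
    if (content) return content[1];
  }
  return null;
}
```

Each thumbnail would need its own extra request for the linked page, so this is noticeably slower than showing thumbnails.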

That GIFs don’t play is also due to the fact that we’re only scraping thumbnails right now.

Here you go:

<div id="bing-image">Bing:<br></div>
<br>
<div id="google-image">Google:<br></div>

<script>

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{Expression}}",
      selector: "a > div > img",
      amount: 3
   },
   {
      name: "bing",
      url: "https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT",
      selector: ".mimg",
      amount: 3
   }
]

function scrapeImages(content, site) {

   // parse the fetched HTML in a detached container so it can be queried
   let container = document.createElement("div")
   container.innerHTML = content
   let i = 0
   let img

   // move up to `amount` matching images into the div for this site
   while ((img = container.querySelector(site.selector)) != null && i++ < site.amount) {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (const site of sites) {
   fetchHTML(site)
}

</script>

I think this code is relatively intuitive to understand, so you might be able to expand it for other search engines. You just need to find a proper CSS selector for the desired images.
To find a fitting selector, right-click an image on the search page, choose Inspect, and look for a class or pattern in the source code.

With “amount” you set the number of images you want from each search page.
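
One detail that is easy to miss when adding another engine: scrapeImages looks up its target element as site.name + "-image", so every entry in sites needs a matching div in the template. A sketch of what an extra entry might look like follows; the engine, URL and selector are placeholders, not tested values, and containerIdFor is a made-up helper that only spells out the naming rule.

```javascript
// Hypothetical extra entry: URL and selector are placeholders for illustration only.
const exampleSite = {
  name: "example",                                   // needs <div id="example-image"></div> in the template
  url: "https://images.example.com/search?q=cheese", // search URL of the engine
  selector: "img.thumbnail",                         // CSS selector that matches the result images
  amount: 5                                          // number of images to take from the page
};

// The id of the element that scrapeImages appends this site's images to.
function containerIdFor(site) {
  return site.name + "-image";
}
```

If the div with that id is missing from the template, document.getElementById returns null and nothing is displayed for that entry.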

Pinterest is very restrictive: it forces a login page on every search URL I’ve tried so far. I think Google Images and Bing are the only ones I can provide to you.


That’s all the effort I am willing to spend on this issue at the moment. University duties are catching up to me.

Perhaps I’ll post a guide on how to integrate a Google Programmable Search Engine into Anki templates some time in the summer. That might also be of interest to you. I’ll @ you should I ever create that guide.

Please let me know if the script works for you. Have a nice day!

Thanks so much for your help and yes please! If you post that guide do let me know :slight_smile:
Edit: Script working flawlessly :smiley:

@kleinerpirat Is it possible to have the images picked at random? For example 3, 6, 8 or whatever number, but so that each time a card is shown a different random set is fetched?
Thanks!

/* Helper function
https://stackoverflow.com/a/19270021 */
function getRandom(arr, n) {
    // pick n distinct elements of arr at random (expects n <= arr.length)
    var result = new Array(n),
        len = arr.length,
        taken = new Array(len)
    while (n--) {
        var x = Math.floor(Math.random() * len)
        result[n] = arr[x in taken ? taken[x] : x]
        taken[x] = --len in taken ? taken[len] : len
    }
    return result
}

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{Expression}}",
      selector: "a > div > img",
      amount: 3,
      random: true
   },
   {
      name: "bing",
      url: "https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT",
      selector: ".imgpt > a.iusc > div > .mimg[alt]",
      amount: 3,
      random: true
   }
]

function scrapeImages(content, site) {

   let container = document.createElement("div")
   container.innerHTML = content
   let imageList = container.querySelectorAll(site.selector)
   // only consider the first 20 matches, and never ask for more images than were found
   let pool = Array.from(imageList).slice(0, 20)
   let count = Math.min(site.amount, pool.length)

   let selected = (site.random ? getRandom(pool, count) : pool.slice(0, count))
   for (const img of selected)  {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (const site of sites) {
   fetchHTML(site)
}
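
A note on the random pick: the getRandom helper above assumes the array holds at least n items. If a search page yields fewer matches than requested, it can return duplicates or undefined entries. As an alternative sketch (not the helper from the Stack Overflow answer, but a partial Fisher-Yates shuffle with a made-up name), the following version copies the input and clamps n to what is available:

```javascript
// Return up to n distinct items of `items`, chosen uniformly at random.
function sampleWithoutReplacement(items, n) {
  const pool = items.slice();              // copy, so the caller's array is not reordered
  const count = Math.min(n, pool.length);  // never take more than is available
  for (let i = 0; i < count; i++) {
    // pick a random index from the part of the pool that has not been fixed yet
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}
```

It could stand in for getRandom(images.slice(0, 20), site.amount) in scrapeImages without other changes.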

Love ya :slight_smile:

@kleinerpirat Hey, another question, but it will probably take up a lot of your time, so don’t worry about answering until you’re free!
I tried scraping Baidu for my Chinese deck, but apparently it doesn’t have the selectors you mention.
This article explains it: https://edmundmartin.com/scraping-baidu-with-python/

using relatively simply BeautifulSoup selectors. Before appending the result to our results list.

Getting the Underlying URL

As previously mentioned the full underlying URL is not displayed anywhere in Baidu’s search results. This means we must write a couple of functions to extract the full underlying URL. There may be another way to get this URL, but I’m not aware of it. If you know how, please share the method with me in the comments.

If this were possible to implement, I’d be scraping images directly from Chinese content, which would be wonderful.
Have a nice day :smiley:

Hey again!
Was wondering if it’s possible for the different entries in var sites not to have a <br> between them?
What I mean is that each search seems to append a line break, so the next images start one line below the first ones. I’d like to have them all together (from different URLs, of course) but without the seams. Is this possible?
Thanks for your help :smiley:

Just delete the <br>'s from the template :point_up_2:t2:
(and the search engine names too if you want. you only need the divs)

Doesn’t seem to be doing it… both URLs fetch images, but there is still a line break between them.

Code:

<div id="google2-image"></div><div id="google-image"></div>
<script>
function getRandom(arr, n) {
    var result = new Array(n),
        len = arr.length,
        taken = new Array(len)
        while (n--) {
            var x = Math.floor(Math.random() * len)
            result[n] = arr[x in taken ? taken[x] : x]
            taken[x] = --len in taken ? taken[len] : len
        }
    return result
}

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{English}}+clipart",
      selector: "a > div > img",
      amount: 3,
      random: true
   },

{
      name: "google2",
      url: "https://www.google.com/search?tbm=isch&q={{japanese}}+クリップアート",
      selector: "a > div > img",
      amount: 3,
      random: true
   }
]

function scrapeImages(content, site) {

   let container = document.createElement("div")
   container.innerHTML = content
   let imageList = container.querySelectorAll(site.selector)
   let images = Array.from(imageList)

   let selected = (site.random ? getRandom(images.slice(0, 20), site.amount) : images.slice(0, site.amount))
   for (img of selected)  {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (site of sites) {
   fetchHTML(site)
}
</script>
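
A likely reason the images still end up on separate lines: each <div> is a block-level element, so the second container starts on a new line even without any <br>. One way around that, sketched here under that assumption, is to let both searches write into a single shared container. The target property, the id "all-images" and the helper resolveContainerId are hypothetical additions, not part of the script above.

```javascript
// `target` is a hypothetical optional property: the id of the element that receives the images.
const sharedSites = [
  { name: "google",  target: "all-images", amount: 3 },
  { name: "google2", target: "all-images", amount: 3 }
];

// Use the shared target if one is given, otherwise fall back to the "<name>-image" convention.
function resolveContainerId(site) {
  return site.target || site.name + "-image";
}
```

In scrapeImages, document.getElementById(site.name + "-image") would then become document.getElementById(resolveContainerId(site)), and the template would hold a single <div id="all-images"></div> in place of the two divs. Because the two fetches finish in no fixed order, the order of the two image groups inside the shared container may vary between cards.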