Embed google images into note?

yaoberh · April 26, 2021, 11:01pm

I have google images embedded into my back note, showing me images from {{Expression}}.
Is it possible to have it show me not the grid of images but the first image that the search gives me?
For example, if it googles Cheese, could I have it display the first image in the row bigger, so I don’t have to look at all the small images in a file? Perhaps someone here uses another search provider for this? I learn languages monolingually and my main approach is word->image, so this method is essential to me. Image downloading extensions work, but I’d like to know if there’s a more dynamic approach. Maybe there’s an easier way to do this; if there is, please share it! Thanks for the help.

yaoberh · May 16, 2021, 11:53pm

Anyone?

kleinerpirat · May 17, 2021, 6:46am

How are you embedding Google Images? As far as I know Google doesn’t allow embedding Search other than with a Custom Search Engine.

This requires some JS skills:

You would have to access the content of the Google Images iframe and scrape the information you need. You’ll need to analyze the structure of the image results page, then query the images with document.querySelectorAll() (perhaps they all share the same class; they usually have cryptic names like “rg_i Q4LuWd”) and pick the desired image from the NodeList it returns.
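A minimal sketch of the querying step described above. It assumes `root` is a document or element the script is actually allowed to read, and the class name in the comment is only an example, since these names change. The helper name `firstImageSrc` is hypothetical.

```javascript
// Hypothetical helper: return the src of the first element matching
// `selector` inside `root`, or null when nothing matches.
// `root` is assumed to be a Document or Element you are allowed to read.
function firstImageSrc(root, selector) {
  const images = root.querySelectorAll(selector); // NodeList of all matches
  return images.length > 0 ? images[0].src : null;
}

// In a card template this could be called like this
// (the class name is an example only):
// firstImageSrc(document, "img.rg_i");
```

Returning null instead of throwing keeps the card usable when the page structure changes and the selector stops matching.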

The best way to go about this is to use the AnkiWebView Inspector add-on from AnkiWeb.

yaoberh · May 17, 2021, 12:39pm

Hey,
thus far I’ve been using Bing like this:
<embed src="https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT" style="width: 100%; height:100%;min-height: 500px">
But Bing isn’t very good for anything other than English most of the time.
I’ve been looking into embedding the website differently or finding an image provider that handles queries in multiple languages, for example Japanese or Korean expressions. I haven’t had any luck so far.
I have no JS or programming skills to speak of, apart from fixing some code here and there, so it would be rather complicated for me to do.
Do you have any other suggestions as to how I could approach this problem?
Thanks a lot!

yaoberh · May 17, 2021, 2:59pm

Maybe, I was thinking, one could simplify this by writing a script that fetches a random image from a specific URL. It doesn’t have to be Google or Bing. Pinterest is wonderful for image learning. Would something like this be easier?

kleinerpirat · May 17, 2021, 6:02pm

Here I forgot about the Same-origin policy, which prevents webpages embedded within other webpages from interacting with one another. To get around it, you need to pull the page’s content via an external API. Try this code:

<script>
   function scrapeGoogleImage(content) {
      // get the src of the first <img> that has class, alt and src attributes
      const match = content.match(/<img class=\".+?alt.+?src=\"(.+?)\"/)
      if (match != null) {
          var img = document.createElement("img")
          img.src = match[1]
          document.getElementById("google-image").appendChild(img)
      }
   }
   
   function scrapeBingImage(content) {
      // get the src of the first <img> whose class starts with "mimg"
      const match = content.match(/<img class=\"mimg.+?src=\"(.+?)\"/)
      if (match != null) {
          var img = document.createElement("img")
          img.src = match[1]
          document.getElementById("bing-image").appendChild(img)
      }
   }
   
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent('https://www.google.com/search?tbm=isch&q={{Expression}}')}`)
   .then(response => {
   	if (response.ok) return response.json()
   	throw new Error('Network response was not ok.')
   })
   .then(data => scrapeGoogleImage(data.contents));
   
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent('https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT')}`)
   .then(response => {
   	if (response.ok) return response.json()
   	throw new Error('Network response was not ok.')
   })
   .then(data => scrapeBingImage(data.contents));
   
</script>

<div id="bing-image">Bing Image:<br></div>
<br>
<div id="google-image">Google Image:<br></div>

Edit: I expanded the script to include the first Google image as well (in a very lazy way). I could set it up in a way that there are multiple results too, if you want?
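The URL-building step of the script can be isolated as a small helper. This is only a sketch: the name `buildProxyUrl` and the idea of passing the query text separately are assumptions, not part of the script above. It encodes the query on its own first, so that spaces and non-Latin text form a valid search URL, and then encodes the whole search URL so it can be passed as the single `url` parameter.

```javascript
// Hypothetical helper: build the allorigins request URL for a search page.
// `searchBase` is assumed to end right before the query text, e.g. "...&q=".
function buildProxyUrl(searchBase, query) {
  // Encode the query first so spaces and non-ASCII text form a valid target URL
  const target = searchBase + encodeURIComponent(query);
  // Then encode the whole target URL so it travels as one parameter value
  return "https://api.allorigins.win/get?url=" + encodeURIComponent(target);
}

const proxyUrl = buildProxyUrl("https://www.google.com/search?tbm=isch&q=", "blue cheese");
console.log(proxyUrl);
```

The result can be handed to fetch() in place of the template literal used in the script.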

yaoberh · May 17, 2021, 6:16pm

Understanding it a bit better now!
This script works as a standalone without the card template or do I need to edit anything else?

kleinerpirat · May 17, 2021, 6:23pm

You can simply paste it into your card template. The HTML beneath the <script> (the two <div> elements with the ids bing-image and google-image) is the place where the images land. You can adjust them to your liking (and ofc the links in the script as well).

yaoberh · May 18, 2021, 3:20pm

Working! Two questions:
Is it possible to have the fetched images be bigger?
Gifs don’t seem to play, is this a limitation or an addition that needs to be made to the code?
Regarding your question about multiple images, yes! It would be really cool to have that option.
Thanks so much for your help
Edit: Is it possible to make this work with Pinterest too? Tried adding an additional code block from what you gave me but nothing happens.

yaoberh · May 22, 2021, 3:30pm

Hey, you must be very busy. Just wondering if you’ve had any luck?
Have a great day!

kleinerpirat · May 22, 2021, 8:09pm

Hi Yannick!

Bigger images: this is not a trivial task, since the images you get from my script are just the thumbnails of the search page, not the full-size ones. To get to those, you would need to fetch the page each thumbnail links to and scrape that page for the full image. I’ve experimented a bit today, but so far I haven’t come up with a presentable solution.

GIFs not playing: this is also due to the fact that we’re only scraping thumbnails right now.

Multiple images: here you go:

<div id="bing-image">Bing:<br></div>
<br>
<div id="google-image">Google:<br></div>

<script>

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{Expression}}",
      selector: "a > div > img",
      amount: 3
   },
   {
      name: "bing",
      url: "https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT",
      selector: ".mimg",
      amount: 3
   }
]

function scrapeImages(content, site) {

   let container = document.createElement("div")
   container.innerHTML = content
   let i = 0

   let img
   while ((img = container.querySelector(site.selector)) != null && i++ < site.amount) {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (const site of sites) {
   fetchHTML(site)
}

</script>

I think this code is relatively intuitive to understand, so you might be able to expand it for other search engines. You just need to find a proper CSS selector for the desired images.
To find a fitting selector, you need to Right-click -> Inspect on an image of the search page and look for a class or pattern in the source code.

With “amount” you set the number of images you want from each search page.
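As a sketch of what adding another search page could look like: each entry needs a name, a URL, a selector and an amount, and the template needs a <div> whose id is the name plus "-image". The helper `addSite` is hypothetical, and the URL and selector below are placeholders, not values from a real search engine.

```javascript
// The list the script iterates over; entries follow the shape used above.
const sites = [];

// Hypothetical helper: register one more search page and return the id of
// the <div> the template must contain to receive its images.
function addSite(list, name, url, selector, amount) {
  list.push({ name, url, selector, amount });
  return name + "-image"; // same id convention as scrapeImages()
}

// Placeholder values only: this URL and selector are NOT from a real engine.
const containerId = addSite(
  sites,
  "example",
  "https://example.com/images?q=cheese",
  ".result img",
  3
);
console.log(containerId); // the template then needs a <div> with this id
```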

Pinterest is very restrictive: it forces a login page on every search URL I’ve tried so far. I think Google Images and Bing are the only ones I can provide to you.

That’s all the effort I am willing to spend on this issue at the moment. University duties are catching up to me.

Perhaps I’ll post a guide on how to integrate a Google Programmable Search Engine into Anki templates some time in the summer. That might also be of interest to you. I’ll @ you should I ever create that guide.

Please let me know if the script works for you. Have a nice day!

yaoberh · May 23, 2021, 12:51pm

Thanks so much for your help and yes please! If you post that guide do let me know
Edit: Script working flawlessly

yaoberh · May 23, 2021, 2:34pm

@kleinerpirat is it possible to have the images fetched at random? For example 3, 6, 8 or whatever number, but picked at random each time they are fetched for a note/card, so they are different every time?
Thanks!

kleinerpirat · May 23, 2021, 7:57pm

/* Helper function
https://stackoverflow.com/a/19270021 */
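function getRandom(arr, n) {
    var len = arr.length
    // never ask for more items than the array holds
    n = Math.min(n, len)
    var result = new Array(n),
        taken = new Array(len)
    while (n--) {
        // pick a random remaining index; "taken" remaps indices already used
        var x = Math.floor(Math.random() * len)
        result[n] = arr[x in taken ? taken[x] : x]
        taken[x] = --len in taken ? taken[len] : len
    }
    return result
}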

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{Expression}}",
      selector: "a > div > img",
      amount: 3,
      random: true
   },
   {
      name: "bing",
      url: "https://www.bing.com/images/search?q={{Expression}}&form=HDRSC2&first=1&scenario=ImageHoverTitle&cc=IT",
      selector: ".imgpt > a.iusc > div > .mimg[alt]",
      amount: 3,
      random: true
   }
]

function scrapeImages(content, site) {

   let container = document.createElement("div")
   container.innerHTML = content
   let imageList = container.querySelectorAll(site.selector)
   let images = Array.from(imageList)

   let selected = (site.random ? getRandom(images.slice(0, 20), site.amount) : images.slice(0, site.amount))
   for (const img of selected) {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (const site of sites) {
   fetchHTML(site)
}
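For comparison, a sketch of another way to do the sampling step, using a partial Fisher-Yates shuffle. The name `sample` is hypothetical. It is written so that it simply returns fewer items when the search page yields fewer results than requested.

```javascript
// Sketch: pick up to n distinct items at random (partial Fisher-Yates shuffle).
function sample(items, n) {
  const pool = items.slice();             // copy so the caller's array is untouched
  const count = Math.min(n, pool.length); // never ask for more than exists
  for (let i = 0; i < count; i++) {
    // choose a random index from the not-yet-fixed tail [i, pool.length)
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}
```

In scrapeImages() it could stand in for the sampling call, for example sample(images.slice(0, 20), site.amount).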

yaoberh · May 23, 2021, 9:43pm

Love ya

yaoberh · May 23, 2021, 10:53pm

@kleinerpirat Hey, another question, but it’s probably gonna consume much of your time perhaps so don’t worry about answering until you’ve got free time!
I tried scraping Baidu for my Chinese deck, but apparently it doesn’t have the selectors you mention.
It explains it here: https://edmundmartin.com/scraping-baidu-with-python/

using relatively simply BeautifulSoup selectors. Before appending the result to our results list.

Getting the Underlying URL

As previously mentioned the full underlying URL is not displayed anywhere in Baidu’s search results. This means we must write a couple of functions to extract the full underlying URL. There may be another way to get this URL, but I’m not aware of it. If you know how, please share the method with me in the comments.

If this were possible to implement, I’d be scraping images directly from Chinese content, which would be wonderful.
Have a nice day

yaoberh · June 13, 2021, 12:31am

Hey again!
Was wondering if it’s possible to have the different var sites not have a <br> between them?
What I mean is that each search seems to append a break, so the next images come one line after the first ones. What I would like is to have them all together, from different URLs of course, but without the seams. Is this possible?
Thanks for your help

kleinerpirat · June 13, 2021, 2:30am

Just delete the <br>'s from the template
(and the search engine names too if you want. you only need the divs)

yaoberh · June 13, 2021, 7:44pm

Doesn’t seem to be doing it…both urls fetch images but they still have a <br>
Code:

<div id="google2-image"></div><div id="google-image"></div>
<script>
function getRandom(arr, n) {
    var result = new Array(n),
        len = arr.length,
        taken = new Array(len)
        while (n--) {
            var x = Math.floor(Math.random() * len)
            result[n] = arr[x in taken ? taken[x] : x]
            taken[x] = --len in taken ? taken[len] : len
        }
    return result
}

var sites = [{
      name: "google",
      url: "https://www.google.com/search?tbm=isch&q={{English}}+clipart",
      selector: "a > div > img",
      amount: 3,
      random: true
   },

{
      name: "google2",
      url: "https://www.google.com/search?tbm=isch&q={{japanese}}+クリップアート",
      selector: "a > div > img",
      amount: 3,
      random: true
   }
]

function scrapeImages(content, site) {

   let container = document.createElement("div")
   container.innerHTML = content
   let imageList = container.querySelectorAll(site.selector)
   let images = Array.from(imageList)

   let selected = (site.random ? getRandom(images.slice(0, 20), site.amount) : images.slice(0, site.amount))
   for (img of selected)  {
      img.parentNode.removeChild(img)
      document.getElementById(site.name + "-image").appendChild(img)
   }
}

function fetchHTML(site) {
   // pull content from page via API to avoid Same-origin policy
   fetch(`https://api.allorigins.win/get?url=${encodeURIComponent(site.url)}`)
      .then(response => {
         if (response.ok) return response.json()
         throw new Error('Network response was not ok.')
      })
      .then(data => scrapeImages(data.contents, site))
}

// Initiate search
for (site of sites) {
   fetchHTML(site)
}
</script>
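One possible explanation, assuming default styling: each result set sits in its own <div>, and a <div> is a block element, so it starts on a new line even when the template contains no <br>. A minimal sketch that switches the containers to inline display follows. The helper name `makeInline` is hypothetical; the ids are the ones from the template above.

```javascript
// Hypothetical helper: make each container flow inline instead of as a block.
function makeInline(containers) {
  for (const el of containers) {
    if (el) el.style.display = "inline"; // skip ids that were not found
  }
}

// Only runs inside a page (e.g. the card template), not in plain Node.
if (typeof document !== "undefined") {
  makeInline(["google2-image", "google-image"].map((id) => document.getElementById(id)));
}
```

The same effect should be reachable with a CSS rule in the card styling; the script form is shown only because the rest of the template is already JavaScript.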
