
Extract The Number Of Results From Google Search

I am writing a web scraper to extract the number of results for a Google search, which appears at the top left of the search results page. I have written the code bel

Solution 1:

You're actually using the wrong URL to query Google's search engine. You should be using http://www.google.com/search?q=<query>.

So it'd look like this:

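import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):
    address = 'http://www.google.com/search'
    # Pass the query via params so requests URL-encodes spaces and symbols
    page = requests.get(address, params={'q': word})
    soup = BeautifulSoup(page.content, 'html.parser')
    # The id depends on the markup Google serves (Solution 2 uses "result-stats")
    phrase_extract = soup.find(id="resultStats")
    print(phrase_extract)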

You also probably just want the text of that element, not the element itself, so you can do something like

phrase_text = phrase_extract.text

or to get the actual value as an integer:

val = int(phrase_extract.text.split(' ')[1].replace(',',''))
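
The split-based one-liner assumes the text always looks like "About N results" and that the element was found. A slightly more defensive sketch, assuming the count is the first run of digits with comma separators in the text; the helper name parse_result_count is hypothetical, and locales that use other separators are not handled:

```python
import re

def parse_result_count(text):
    """Return the first comma-grouped integer found in text, or None."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))

print(parse_result_count("About 3,890,000,000 results (0.65 seconds)"))
# 3890000000
```

Returning None instead of raising lets the caller decide what to do when the element is missing or empty.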

Solution 2:

You could also check what the div above that element returns. Sometimes the result count shows up there.

Also, make sure you're sending a user-agent header, since Google could otherwise treat your script as a tablet user-agent (or something different) and serve markup with different .class and #id selectors. This could be the reason why your output is an empty list, [].
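
As a rough illustration of attaching a user-agent and encoding the query safely, here is a sketch that only builds the request and does not send it. It uses the standard library's urllib rather than requests so it runs offline; the names DESKTOP_USER_AGENT and build_search_request are hypothetical, and the user-agent string is just an example value:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed example value: any current desktop browser string can be substituted.
DESKTOP_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36"
)

def build_search_request(query, user_agent=DESKTOP_USER_AGENT):
    """Build (but do not send) a search request with an explicit User-Agent."""
    # urlencode escapes spaces, ampersands, and other reserved characters
    url = "https://www.google.com/search?" + urlencode({"q": query})
    return Request(url, headers={"User-Agent": user_agent})

request = build_search_request("beautiful cookies")
print(request.full_url)
# https://www.google.com/search?q=beautiful+cookies

# urllib stores header names capitalized, hence "User-agent" here
print(request.get_header("User-agent"))
```

With requests, the same headers dictionary is passed through the headers= argument of requests.get.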

Here's the code to get the number of search results:

from lxml import html
import requests

headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

response = requests.get('https://www.google.com/search?q=beautiful+cookies',
                        headers=headers,
                        stream=True)

response.raw.decode_content = True

tree = html.parse(response.raw)

# lxml is used to select the element by XPath
# Requests + lxml: https://stackoverflow.com/a/11466033/1291371
# note: you can achieve it easily with bs4 as well by grabbing the "#result-stats" id selector.
result = tree.xpath('//*[@id="result-stats"]/text()')[0]

print(result)

# About 3,890,000,000 results

Alternatively, you can use the Google Search Engine Results API from SerpApi to achieve the same thing more easily.

Part of JSON:

"search_information":{"organic_results_state":"Results for exact spelling","total_results":3890000000,"time_taken_displayed":0.65,"query_displayed":"beautiful cookies"}

Code to integrate:

import os
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "beautiful cookies",
    "api_key": os.getenv("API_KEY"),
}

search = GoogleSearch(params)
results = search.get_dict()

result = results["search_information"]['total_results']
print(result)

# 4210000000
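
Direct indexing raises KeyError if either key is missing from the response. A minimal sketch of a more tolerant lookup, run here against the JSON fragment shown above rather than a live API call; the helper name total_results is hypothetical:

```python
import json

# Sample built from the JSON fragment shown above, wrapped in braces to form a full object.
sample = json.loads(
    '{"search_information": {"organic_results_state": "Results for exact spelling",'
    ' "total_results": 3890000000, "time_taken_displayed": 0.65,'
    ' "query_displayed": "beautiful cookies"}}'
)

def total_results(results):
    """Return total_results from a response dict, or None if it is absent."""
    return results.get("search_information", {}).get("total_results")

print(total_results(sample))
# 3890000000
print(total_results({}))
# None
```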

Disclaimer: I work for SerpApi.
