
Data Scraping From Vivino.com

Long-time lurker here; this community has helped me a lot, thank you all. I am trying to collect data from vivino.com, but the DataFrame comes out empty. I can see that

Solution 1:

The previous answer is correct, but the request also needs a User-Agent header set:

import requests
import pandas as pd

r = requests.get(
    "https://www.vivino.com/api/explore/explore",
    params = {
        "country_code": "FR",
        "country_codes[]":"pt",
        "currency_code":"EUR",
        "grape_filter":"varietal",
        "min_rating":"1",
        "order_by":"price",
        "order":"asc",
        "page": 1,
        "price_range_max":"500",
        "price_range_min":"0",
        "wine_type_ids[]":"1"
    },
    headers= {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0"
    }
)
results = [
    (
        t["vintage"]["wine"]["winery"]["name"], 
        # Separate the wine name and the vintage year with a space
        f'{t["vintage"]["wine"]["name"]} {t["vintage"]["year"]}',
        t["vintage"]["statistics"]["ratings_average"],
        t["vintage"]["statistics"]["ratings_count"]
    )
    for t in r.json()["explore_vintage"]["matches"]
]
dataframe = pd.DataFrame(results,columns=['Winery','Wine','Rating','num_review'])

print(dataframe)

To get the next results, increment the page parameter on each request.
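Incrementing the page parameter could be sketched roughly as follows. The names fetch_vivino_page, iter_matches, max_pages and delay are made up for illustration, and the stopping rule (an empty "matches" list means there are no more results) is an assumption about how the endpoint behaves, not something confirmed here.

```python
import time
from typing import Callable, Iterator


def fetch_vivino_page(page: int) -> list:
    """Fetch one page of matches, using the same request as in the answer above."""
    import requests  # imported here so iter_matches can be used without it

    r = requests.get(
        "https://www.vivino.com/api/explore/explore",
        params={
            "country_code": "FR",
            "country_codes[]": "pt",
            "currency_code": "EUR",
            "grape_filter": "varietal",
            "min_rating": "1",
            "order_by": "price",
            "order": "asc",
            "page": page,
            "price_range_max": "500",
            "price_range_min": "0",
            "wine_type_ids[]": "1",
        },
        headers={
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0"
        },
        timeout=10,
    )
    r.raise_for_status()  # fail loudly on an HTTP error instead of parsing an error page
    return r.json()["explore_vintage"]["matches"]


def iter_matches(
    fetch_page: Callable[[int], list],
    max_pages: int = 5,
    delay: float = 0.0,
) -> Iterator[dict]:
    """Yield matches page by page, starting at page 1.

    Stops at the first empty page (assumed to mean "no more results")
    or after max_pages pages, whichever comes first.
    """
    for page in range(1, max_pages + 1):
        matches = fetch_page(page)
        if not matches:
            break
        yield from matches
        time.sleep(delay)  # optional pause between requests
```

For example, list(iter_matches(fetch_vivino_page, max_pages=3, delay=1.0)) would collect up to three pages, and the list comprehension from the answer above can then be run over that list instead of a single page's matches.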

Solution 2:

The data is likely loaded by JavaScript, so it is not in the page's HTML; fortunately, it is available as JSON. I checked the browser's Network tab and found the request.

import requests

url = "https://www.vivino.com/api/explore/explore?country_code=AU&country_codes[]=pt&currency_code=AUD&grape_filter=varietal&min_rating=1&order_by=price&order=asc&page=1&price_range_max=80&price_range_min=20&wine_type_ids[]=1"

# As Solution 1 notes, this request may need a User-Agent header to succeed.
r = requests.get(url)

# Your data:
r.json()

There are other JSON responses as well; you can find them in the Network tab of the browser.
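Once a JSON response is in hand, it helps to see which fields it contains before building a DataFrame. Here is a minimal sketch of a helper that lists the dotted key paths in a nested response. The name key_paths and its depth parameter are hypothetical, and lists are sampled through their first element only, so fields that appear only in later items would be missed.

```python
def key_paths(obj, prefix="", depth=4):
    """Collect dotted key paths from nested dicts.

    Lists are represented by their first item and marked with "[]";
    descending into a list does not use up a depth level.
    """
    if depth == 0:
        return set()
    paths = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            paths.add(path)
            paths |= key_paths(value, path, depth - 1)
    elif isinstance(obj, list) and obj:
        paths |= key_paths(obj[0], f"{prefix}[]", depth)
    return paths
```

Calling sorted(key_paths(r.json(), depth=6)) and printing the result gives a quick overview of where fields such as the winery name or the rating statistics sit in the response.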
