How To Specify Parameters On A Request Using Scrapy
How do I pass parameters to a request on a URL like this: Search here&e_author= How do I put the arguments on the structure
Solution 1:
Pass your GET parameters inside the URL itself:
return Request(url="")
You should probably define your parameters in a dictionary and then "urlencode" it:
from urllib.parse import urlencode

from scrapy import Request

params = {
    "action": "search",
    "description": "My search here",
    "e_author": "",
}
# The base URL is elided in the source; it should end with "?"
url = "" + urlencode(params)
return Request(url=url)
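As a minimal sketch of what urlencode produces for the dictionary above: the base URL https://example.com/search is a placeholder (not the asker's real site) and build_search_url is a hypothetical helper name, so treat both as assumptions.

```python
from urllib.parse import urlencode

# Placeholder base URL, assumed for illustration only
BASE_URL = "https://example.com/search"


def build_search_url(params):
    """Join the base URL and the encoded query string with '?'."""
    return BASE_URL + "?" + urlencode(params)


search_url = build_search_url({
    "action": "search",
    "description": "My search here",
    "e_author": "",
})
print(search_url)
# https://example.com/search?action=search&description=My+search+here&e_author=
```

The resulting string can be passed straight to Request(url=...). Note that urlencode escapes spaces as "+" and keeps empty values such as e_author.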
Solution 2:
You have to build the URL yourself from whatever parameters you have.
Python 3 or above:
import urllib.parse

import scrapy

params = {
    'key': self.access_key,
    'part': 'snippet,replies',
    'videoId': self.video_id,
    'maxResults': 100,
}
# The base URL is elided in the source; only the query string remains here
url = f'{urllib.parse.urlencode(params)}'
request = scrapy.Request(url, callback=self.parse)
yield request
Python 3+ example. Here I am trying to fetch all comments for a YouTube video using the official YouTube API. The comments come back paginated, so note how I construct the URL from params for each call.
import datetime
import json
import urllib.parse

import scrapy

from youtube_scrapy.items import YoutubeItem


# The class line is missing from the source; the name YoutubeSpider is assumed
class YoutubeSpider(scrapy.Spider):
    name = 'youtube'
    BASE_URL = ''  # base API URL elided in the source

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.access_key = 'your_youtube_api_access_key'
        self.video_id = 'any_youtube_video_id'

    def start_requests(self):
        params = {
            'key': self.access_key,
            'part': 'snippet,replies',
            'videoId': self.video_id,
            'maxResults': 100,
        }
        url = f'{self.BASE_URL}/commentThreads/?{urllib.parse.urlencode(params)}'
        request = scrapy.Request(url, callback=self.parse)
        # Carry the params along so parse() can add the page token later
        request.meta['params'] = params
        return [request]

    def parse(self, response):
        data = json.loads(response.body)

        # Collect each top-level comment
        items = data.get('items', [])
        for item in items:
            created_date = item['snippet']['topLevelComment']['snippet']['publishedAt']
            _created_date = datetime.datetime.strptime(created_date, '%Y-%m-%dT%H:%M:%S.000Z')
            comment_id = item['snippet']['topLevelComment']['id']
            record = {
                'created_date': _created_date,
                'body': item['snippet']['topLevelComment']['snippet']['textOriginal'],
                'creator_name': item['snippet']['topLevelComment']['snippet'].get('authorDisplayName', {}),
                'id': comment_id,
                # URL prefix elided in the source
                'url': f'{self.video_id}&lc={comment_id}',
            }
            yield YoutubeItem(**record)

        # Paginate if a next page with more comments is available
        next_page_token = data.get('nextPageToken', None)
        if next_page_token:
            params = response.meta['params']
            params['pageToken'] = next_page_token
            url = f'{self.BASE_URL}/commentThreads/?{urllib.parse.urlencode(params)}'
            request = scrapy.Request(url, callback=self.parse)
            request.meta['params'] = params
            yield request
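The pagination step above boils down to re-encoding the same parameters with a pageToken added. Here is a small sketch of that idea using only the standard library. The base URL, the key, the video id and the token value are all placeholders, and next_page_url is a hypothetical helper, not part of Scrapy or the YouTube API.

```python
from urllib.parse import urlencode

# Placeholder; the real API base URL is not given in the text above
BASE_URL = "https://api.example.com/v3"


def next_page_url(params, page_token=None):
    """Build the commentThreads URL, adding pageToken when one is given."""
    query = dict(params)  # copy so the caller's dict is not mutated
    if page_token:
        query["pageToken"] = page_token
    return f"{BASE_URL}/commentThreads/?{urlencode(query)}"


params = {"key": "KEY", "part": "snippet,replies", "videoId": "VID", "maxResults": 100}
first_url = next_page_url(params)
second_url = next_page_url(params, page_token="TOKEN123")
print(first_url)
print(second_url)
```

Copying the dict is a design choice of this sketch: it keeps each request's parameters independent, whereas the spider above reuses and updates one dict through request.meta.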
Solution 3:
To create a GET request with parameters using Scrapy, you can use the following example:
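# The arguments are truncated in the source; url, method and formdata are an assumed completion
yield scrapy.FormRequest(url=url, method='GET', formdata=params)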
where 'params' is a dict with your parameters.
Solution 4:
You can use add_or_replace_parameters from w3lib.
from scrapy import Request
from w3lib.url import add_or_replace_parameters

def abc(self, response):
    url = ""  # can be response.url or any other URL
    params = {
        "action": "search",
        "description": "My search here",
        "e_author": "",
    }
    return Request(url=add_or_replace_parameters(url, params))
Solution 5:
Scrapy doesn't offer this directly. What you are trying to do is build a URL, for which you can use the urlparse module (urllib.parse in Python 3).
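A rough sketch of that approach with the Python 3 standard library follows. The helper name set_query_params and the example.com URL are made up for illustration, and the sketch assumes each parameter appears at most once (repeated keys would be collapsed by the dict).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_params(url, new_params):
    """Return url with new_params added, replacing any keys already present."""
    parts = urlsplit(url)
    # keep_blank_values preserves parameters such as "e_author="
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(new_params)
    return urlunsplit(parts._replace(query=urlencode(query)))


updated = set_query_params(
    "https://example.com/?action=search&e_author=",
    {"description": "My search here", "e_author": "smith"},
)
print(updated)
# https://example.com/?action=search&e_author=smith&description=My+search+here
```

The returned string can then be handed to Request(url=...) like any other URL.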