Python Scrapy - Populate Start_urls From Mysql

I am trying to populate start_urls with a SELECT from a MySQL table in spider.py. When I run 'scrapy runspider spider.py' I get no output; it just finishes with no error. I

Solution 1:

A better approach is to override the start_requests method.

This can query your database, much like populate_start_urls, and return a sequence of Request objects.

You would just need to rename your populate_start_urls method to start_requests and modify the following lines:

for row in rows:
    yield self.make_requests_from_url(row[0])

Solution 2:

Populate start_urls in __init__:

def __init__(self):
    super(ProductsSpider, self).__init__()
    self.start_urls = get_start_urls()

This assumes get_start_urls() returns a list of URLs.
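
The answer does not show get_start_urls() itself, so here is a minimal sketch of what such a helper might look like. Everything in it is an assumption rather than the asker's code: the table name products, the column name url, the conn parameter, and the demo_connection() helper are hypothetical. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere; MySQL drivers that follow the Python DB-API (PEP 249) expose the same cursor(), execute() and fetchall() calls.

```python
import sqlite3


def get_start_urls(conn):
    """Return the non-empty URLs stored in the (hypothetical) products.url column.

    `conn` is any DB-API 2.0 connection; sqlite3 is used below only as a stand-in.
    """
    cursor = conn.cursor()
    try:
        cursor.execute("SELECT url FROM products")
        rows = cursor.fetchall()
    finally:
        # Release the cursor even if the query fails.
        cursor.close()
    # Each row is a tuple with one column; skip NULL or empty values.
    return [row[0] for row in rows if row[0]]


def demo_connection():
    """Build an in-memory database with a few sample rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (url TEXT)")
    conn.executemany(
        "INSERT INTO products (url) VALUES (?)",
        [("http://example.com/a",), ("http://example.com/b",), (None,)],
    )
    conn.commit()
    return conn
```

With a helper like this, the __init__ above would pass in a connection opened with your MySQL driver. If the returned list is empty (for example because the table has no rows or the query selects the wrong column), the spider has nothing to crawl and would finish without output or errors, so it may be worth logging the number of URLs returned.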
