Google Crawl 503 Service Unavailable
Solution 1:
Google has triggers to detect bots and other abuse of its Terms of Service, so it sets a limit (a "throttle") on the number of calls that the same IP address can make over a certain period of time. I believe it's something like 10 calls per minute. Case in point: if you paste your URL into a browser when it fails with a 503 error, you'll get a CAPTCHA challenge from Google to prove you are not a bot.
I am using the pattern.web module to do essentially the same thing as you are doing (for harmless research purposes, of course!), and the documentation for that library shows the throttling limits for most popular APIs (Google, Bing, Twitter, Facebook...).
Try spacing your requests at least 15 seconds apart, to avoid tripping the throttle.
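As a rough sketch of that advice, something like the following spaces requests out using only the standard library. The function names (fetch, throttled_fetch) and the retry-with-doubled-wait behaviour on a 503 are my own assumptions for illustration, not anything Google or pattern.web prescribes; the 15-second default is just the figure suggested above.

```python
import time
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return the response body as bytes (plain urllib, no third-party code)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def throttled_fetch(urls, delay=15.0, max_retries=3, fetcher=fetch, sleep=time.sleep):
    """Fetch URLs one at a time, waiting `delay` seconds between requests.

    Assumption: on an HTTP 503 we double the wait and retry, giving up
    after `max_retries` retries. Any other HTTP error is re-raised at once.
    """
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # space out consecutive requests
        wait = delay
        for attempt in range(max_retries + 1):
            try:
                results[url] = fetcher(url)
                break
            except urllib.error.HTTPError as err:
                if err.code != 503 or attempt == max_retries:
                    raise
                wait *= 2  # back off further before the next try
                sleep(wait)
    return results
```

Call it with your list of query URLs, e.g. throttled_fetch(my_urls). The fetcher and sleep parameters are only there so the pacing logic can be tried without hitting the network or actually waiting.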