Selenium Python Pull Data From Dynamic Table That Refreshes Every 5 Seconds
Solution 1:
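If the table is already present in the HTML that the server returns, you could skip Selenium and fetch the page with requests every 5 seconds; each response would then contain the complete data.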
import requests
import time

while True:
    url = "insert url here"
    page = requests.get(url)
    # Parse data
    time.sleep(5)
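The "# Parse data" step depends on your page, which I can't see. As a rough sketch, assuming the rows arrive as ordinary <tr>/<td> markup in the response body, the standard library's html.parser can pull the cell text out. TableRowParser and parse_table_rows are names made up for this illustration, not part of requests or Selenium.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Inside the loop you would call parse_table_rows(page.text). If that returns an empty list, the table is probably filled in by JavaScript after the page loads, and the approaches in Solution 2 are the better fit.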
Solution 2:
From the comments, you have a couple of approaches. As you're unable to share your site, the best I can do is describe what you need to do and how I got it working on an equivalent site.
Both approaches use http://www.emojitracker.com/ as an example site.
Approach 1 - get your data at the network layer:
- Go to your site in Chrome.
- Open DevTools.
- Go to the Network tab.
- Find the call that pulls down your data; you're looking for the GET request.
For the example site, I can see an entry called rankings.
The Headers tab describes the request you need to reproduce. For this site there's no auth, nothing special, and I don't need to send any payload. Only the API URL and the method are needed:
Request URL: http://www.emojitracker.com/api/rankings
Request Method: GET
Couldn't be simpler to throw that into Python:
import requests

response = requests.get("http://www.emojitracker.com/api/rankings")
data = response.json()
for line in data:
    print(line['id'])
    print(line['score'])
That prints the ID and the score of each entry in the JSON response.
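Since your table refreshes every 5 seconds, you will probably call the API repeatedly and only care about what changed between two calls. Here is a small sketch of that comparison, assuming each entry is a dict with 'id' and 'score' keys as in the loop above. score_changes is a helper name invented for this example, and the sample IDs in the usage note are made up.

```python
def score_changes(previous, current):
    """Return {id: (old_score, new_score)} for entries whose score changed.

    Entries that appear only in `current` are reported with old_score None.
    """
    old_scores = {item["id"]: item["score"] for item in previous}
    changes = {}
    for item in current:
        old = old_scores.get(item["id"])
        if old != item["score"]:
            changes[item["id"]] = (old, item["score"])
    return changes
```

For example, comparing [{"id": "A", "score": 10}] with [{"id": "A", "score": 12}] gives {"A": (10, 12)}. In a polling loop you would keep the previous response.json() result, sleep 5 seconds, fetch again, and pass both lists to this function.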
Approach 2 - Hacking the javascript
- Go to the site and let the page load.
- Open DevTools.
- Go to the console.
- Select the Sources tab and pause the JavaScript (top right corner). Pay attention to where the cursor stops. Resume and pause a few times and note the different functions involved. Also look at what they do, to discern other functions involved.
When you're ready, go to the Console tab and type this.stop().
On the site you provided, this stops the update-calls.
This should give you the time you need to get your data.
From here, you have two choices to get your data going again.
- The simplest way is to just refresh the page. This will restart the page with new, streaming data. Do this with:
driver.refresh()
- The more fun way: read the JS and figure out how to restart the stream! Use the console's IntelliSense to help you.
Reviewing the JS, where it paused (from the steps above), and with a bit of trial and error, I found:
this.startRawScoreStreaming()
It produces this output:
application.js:90 Subscribing to score stream (raw)
ƒ (event) {
    return incrementScore(event.data);
}
And the page starts streaming again.
Finally, to run these JS snippets in Selenium, use .execute_script:
driver.execute_script('this.stop()')
## do your stuff
driver.execute_script('this.startRawScoreStreaming()')
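One way to make sure the stream always gets restarted, even if your scraping code throws, is to wrap the stop/start pair in a context manager. This is only a sketch: paused_stream, snapshot_table and READ_TABLE_JS are names invented here, the 'table tr' selector is an assumption about your page, and the two JS calls are the ones that worked on the example site, so yours will likely differ.

```python
from contextlib import contextmanager


@contextmanager
def paused_stream(driver,
                  stop_js="this.stop()",
                  start_js="this.startRawScoreStreaming()"):
    """Pause the page's updates for the duration of the with-block."""
    driver.execute_script(stop_js)
    try:
        yield driver
    finally:
        # Resume the stream even if the scraping code raised.
        driver.execute_script(start_js)


# JavaScript that returns every table row as a list of cell texts.
# The 'table tr' selector is a guess; adjust it to your page.
READ_TABLE_JS = (
    "return Array.from(document.querySelectorAll('table tr'))"
    ".map(row => Array.from(row.cells).map(cell => cell.innerText));"
)


def snapshot_table(driver):
    """Read the table while updates are paused, then resume the stream."""
    with paused_stream(driver):
        return driver.execute_script(READ_TABLE_JS)
```

With a real WebDriver instance you would call rows = snapshot_table(driver) and get back a list of rows, each a list of cell strings. Reading everything in one execute_script call keeps the pause short, compared with locating each cell from Python.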