Lxml Web-scraping Is Returning Empty Values
I am trying to get all the food categories from this site: https://www.walmart.com/cp/976759. Here is a snapshot of the category container. This is my code. I want to get the categories.
Solution 1:
This answer uses BeautifulSoup, although the OP asked about parsing with lxml.
If you inspect the page source, you will see that all of the data is loaded from a script tag: the complete data is stored as JSON in a script tag with the id category.
import json
import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request.
headers = {'User-Agent': 'Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36'}
res = requests.get("https://www.walmart.com/cp/976759", headers=headers)
soup = BeautifulSoup(res.text, "html.parser")
# The page data is embedded as JSON in <script id="category">.
script = soup.find("script", {"id": "category"})
data = json.loads(script.get_text(strip=True))
# Save the whole payload to a file for inspection.
with open("data.json", "w") as f:
    json.dump(data, f)
The script above saves all of the data to a JSON file. It is a large JSON document, and it contains the links and images for the categories.
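Since the question asked about lxml, the same script-tag extraction can be sketched with lxml instead of BeautifulSoup. This is only a minimal sketch: the helper name extract_script_json is hypothetical, and it assumes the page still embeds its data in a script tag with the id category, as described above.

```python
import json

from lxml import html


def extract_script_json(page_source, script_id="category"):
    """Return the JSON embedded in <script id=script_id>, or None if absent.

    Hypothetical helper; page_source is the HTML as str or bytes.
    """
    tree = html.fromstring(page_source)
    # text() returns the raw body of the matching script tag as a list of strings.
    chunks = tree.xpath("//script[@id=$sid]/text()", sid=script_id)
    if not chunks:
        return None
    return json.loads("".join(chunks))
```

With the response from the script above, this could be called as data = extract_script_json(res.content). Returning None when the tag is missing makes it easy to tell a changed page layout apart from a JSON parsing error.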
Output for the categories in that JSON:
[
{
"image": {
"alt": "Coffee",
"assetId": "24073832",
"assetName": "42227-230294--Food_Coffee_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/coffee/1086446?povid=976759+%7C+2018-12-26+%7C+Food%20Coffee%20Shop%20by%20Category%20Tile%201",
"rawValue": "/cp/coffee/1086446",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Coffee%20Shop%20by%20Category%20Tile%201"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-4fec/k2-_d0c27367-0903-424d-9ed7-25ff31ed2078.v1.jpg",
"title": "Coffee",
"width": "320",
"size": "48986",
"contentType": "image/jpg",
"uid": "EVz8WxyK"
},
"uid": "KzzTghKO"
},
{
"image": {
"alt": "Meal Solutions, Grains & Pasta",
"assetId": "16511345",
"assetName": "41423-209368-Food-Meals_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/meal-solutions-grains-pasta/976794?povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Shop%20by%20Category%20Tile%202",
"rawValue": "/cp/meal-solutions-grains-pasta/976794",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Shop%20by%20Category%20Tile%202"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-b006/k2-_9c1d502f-c08d-4591-a734-b205d0ffe45b.v1.jpg",
"title": "Meal Solutions, Grains & Pasta",
"width": "320",
"size": "21747",
"contentType": "image/jpg",
"uid": "a0xEKGc1"
},
"uid": "Sa4hkgg8"
},
{
"image": {
"alt": "Snacks",
"assetId": "16511346",
"assetName": "41423-209369-Food_Snacks_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/snacks-cookies-chips/976787?povid=976759+%7C+2018-12-26+%7C+Food%20Snack%20Shop%20by%20Category%20Tile%203",
"rawValue": "/cp/snacks-cookies-chips/976787",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Snack%20Shop%20by%20Category%20Tile%203"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-66f5/k2-_a622db4c-a789-4f03-bf16-440ad12efcd8.v1.jpg",
"title": "Snacks",
"width": "320",
"size": "22038",
"contentType": "image/jpg",
"uid": "iYpawUR8"
},
"uid": "KN0Y6XJk"
},
{
"image": {
"alt": "Beverages",
"assetId": "31886230",
"assetName": "42592-239546 Food Beverages Featured Category Tile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/beverages/976782?povid=976759+%7C+2018-12-26+%7C+Food%20Beverages%20Shop%20by%20Category%20Tile%204",
"rawValue": "/cp/beverages/976782",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Beverages%20Shop%20by%20Category%20Tile%204"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-b691/k2-_95cdb69e-5175-408a-b18e-7c8a4902da65.v1.jpg",
"title": "Beverages",
"width": "320",
"size": "21411",
"contentType": "image/jpg",
"uid": "YDQP7Zs1"
},
"uid": "eronFjMz"
},
{
"image": {
"alt": "Chocolate, Candy & Gum",
"assetId": "16511348",
"assetName": "41423-209371-Food_Candy_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/chocolate-candy-gum/1096070?povid=976759+%7C+2018-12-26+%7C+Food%20Candy%20Shop%20by%20Category%20Tile%205",
"rawValue": "/cp/chocolate-candy-gum/1096070",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Candy%20Shop%20by%20Category%20Tile%205"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-26df/k2-_0e1ed3ed-51c5-4d55-a4b3-64d5beab75c4.v1.jpg",
"title": "Chocolate, Candy & Gum",
"width": "320",
"size": "24819",
"contentType": "image/jpg",
"uid": "khYex7Z3"
},
"uid": "N57hxj54"
},
{
"image": {
"alt": "Condiments",
"assetId": "16511349",
"assetName": "41423-209372-Food_Condiments_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/976786?povid=976759+%7C+2018-12-26+%7C+Food%20Condiments%20Shop%20by%20Category%20Tile%206",
"rawValue": "/cp/976786",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Condiments%20Shop%20by%20Category%20Tile%206"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-c487/k2-_0b0b1864-112c-4323-9474-9556739bf3b5.v1.jpg",
"title": "Condiments",
"width": "320",
"size": "12514",
"contentType": "image/jpg",
"uid": "rxFVAq08"
},
"uid": "Ych6vXbE"
},
{
"image": {
"alt": "Baking",
"assetId": "16511350",
"assetName": "41423-209373-Food_Baking_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/baking/976780?povid=976759+%7C+2018-12-26+%7C+Food%20Baking%20Shop%20by%20Category%20Tile%207",
"rawValue": "/cp/baking/976780",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Baking%20Shop%20by%20Category%20Tile%207"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-26b5/k2-_7ad38e98-0ccd-479f-9bfa-1d4d4dfe90a2.v1.jpg",
"title": "Baking",
"width": "320",
"size": "18935",
"contentType": "image/jpg",
"uid": "pI4YqGyq"
},
"uid": "07562lCu"
},
{
"image": {
"alt": "Breakfast & Cereal",
"assetId": "16511351",
"assetName": "41423-209374-Food_Breakfast_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/cp/breakfast-food-cereal/976783?povid=976759+%7C+2018-12-26+%7C+Food%20Breakfast%20&%20Cereal%20Shop%20by%20Category%20Tile%208",
"rawValue": "/cp/breakfast-food-cereal/976783",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Breakfast%20&%20Cereal%20Shop%20by%20Category%20Tile%208"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-f53c/k2-_3a8d9006-e514-48b7-ad81-c4a70a8d39e9.v1.jpg",
"title": "Breakfast & Cereal",
"width": "320",
"size": "24847",
"contentType": "image/jpg",
"uid": "PQUXkqiQ"
},
"uid": "gJmzhaYu"
},
{
"image": {
"alt": "Food Gift Baskets",
"assetId": "16511356",
"assetName": "41423-209379-Food_GiftBaskets_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/browse/food/gift-baskets/976759_1089004?povid=976759+%7C+2018-12-26+%7C+Food%20Gift%20Baskets%20Shop%20by%20Category%20Tile%209&povid=976759+%7C+2018-12-26+%7C+Food%20Gift%20Baskets%20Shop%20by%20Category%20Tile%209",
"rawValue": "/browse/food/gift-baskets/976759_1089004?povid=976759+%7C+2018-12-26+%7C+Food%20Gift%20Baskets%20Shop%20by%20Category%20Tile%209",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Gift%20Baskets%20Shop%20by%20Category%20Tile%209"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-8e1b/k2-_3e651309-806d-4633-95f6-ec015c783759.v1.jpg",
"title": "Food Gift Baskets",
"width": "320",
"size": "19695",
"contentType": "image/jpg",
"uid": "nEwGTdfg"
},
"uid": "ommZYX3q"
},
{
"image": {
"alt": "Emergency Food",
"assetId": "16511354",
"assetName": "41423-209377-Food_EmergencyFood_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/browse/meal-solutions-grains-pasta/emergency-food/976759_976794_1094144?povid=976759+%7C+2018-12-26+%7C+Food%20Emergency%20Food%20Shop%20by%20Category%20Tile%2010&povid=976759+%7C+2018-12-26+%7C+Food%20Emergency%20Food%20Shop%20by%20Category%20Tile%2010",
"rawValue": "/browse/meal-solutions-grains-pasta/emergency-food/976759_976794_1094144?povid=976759+%7C+2018-12-26+%7C+Food%20Emergency%20Food%20Shop%20by%20Category%20Tile%2010",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Emergency%20Food%20Shop%20by%20Category%20Tile%2010"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-369d/k2-_5fc9cbf1-4b2e-47ba-a35c-d5016d80a0a1.v1.jpg",
"title": "Emergency Food",
"width": "320",
"size": "12594",
"contentType": "image/jpg",
"uid": "L0VEhGaa"
},
"uid": "EV4aR1IJ"
},
{
"image": {
"alt": "Organic Foods",
"assetId": "16511352",
"assetName": "41423-209375-Food_Organic_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/browse/food/organic-foods/976759_1228024?povid=976759+%7C+2018-12-26+%7C+Food%20Organic%20Foods%20Shop%20by%20Category%20Tile%2010&povid=976759+%7C+2018-12-26+%7C+Food%20Organic%20Foods%20Shop%20by%20Category%20Tile%2010",
"rawValue": "/browse/food/organic-foods/976759_1228024?povid=976759+%7C+2018-12-26+%7C+Food%20Organic%20Foods%20Shop%20by%20Category%20Tile%2010",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Organic%20Foods%20Shop%20by%20Category%20Tile%2010"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-e889/k2-_025af29b-a175-43d9-a7f1-8a41b7f595d8.v1.jpg",
"title": "Organic Foods",
"width": "320",
"size": "14996",
"contentType": "image/jpg",
"uid": "VodpOeXr"
},
"uid": "6bskKrLd"
},
{
"image": {
"alt": "Gluten-Free Foods",
"assetId": "16511353",
"assetName": "41423-209376-Food_Gluten-Free_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/browse/food/gluten-free-foods/976759_1228023?povid=976759+%7C+2018-12-26+%7C+Food%20Gluten%20Free%20Shop%20by%20Category%20Tile%2010&povid=976759+%7C+2018-12-26+%7C+Food%20Gluten%20Free%20Shop%20by%20Category%20Tile%2010",
"rawValue": "/browse/food/gluten-free-foods/976759_1228023?povid=976759+%7C+2018-12-26+%7C+Food%20Gluten%20Free%20Shop%20by%20Category%20Tile%2010",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Gluten%20Free%20Shop%20by%20Category%20Tile%2010"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-40b6/k2-_d4adeded-bc5a-4141-8ff5-484e5a57af7b.v1.jpg",
"title": "Gluten-Free Foods",
"width": "320",
"size": "11866",
"contentType": "image/jpg",
"uid": "-BPKl3mO"
},
"uid": "DxMgKndk"
},
{
"image": {
"alt": "Meal Delivery Services",
"assetId": "16511355",
"assetName": "41423-209378-Food_MealKits_FeaturedCategoryTile_V1.jpg",
"clickThrough": {
"type": "url",
"value": "/browse/food/meal-kits-specialty-food-boxes/976759_7123943?povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Delivery%20Services%20Shop%20by%20Category%20Tile%2010&povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Delivery%20Services%20Shop%20by%20Category%20Tile%2010",
"rawValue": "/browse/food/meal-kits-specialty-food-boxes/976759_7123943?povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Delivery%20Services%20Shop%20by%20Category%20Tile%2010",
"tag": "povid=976759+%7C+2018-12-26+%7C+Food%20Meal%20Delivery%20Services%20Shop%20by%20Category%20Tile%2010"
},
"height": "320",
"src": "https://i5.walmartimages.com/dfw/4ff9c6c9-bed9/k2-_b82a8177-43e5-45d2-bc92-1dccd94d1e5d.v1.jpg",
"title": "Meal Delivery Services",
"width": "320",
"size": "20720",
"contentType": "image/jpg",
"uid": "XAyoF2GU"
},
"uid": "MeJ6LK_Z"
}
]
Update:
To get the category links and other info from the JSON, assuming the JSON is stored in the data variable:
# Find the "Shop by Category" module among the center modules.
for innerjson in data["category"]["presoData"]["modules"]["center"]:
    configs = innerjson.get("moduleData", {}).get("configs", {})
    if configs.get("title") == "Shop by Category":
        print(configs["categories"])
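The printed list has the shape shown in the output above, so the title, link, and image of each category can be pulled out with a small helper. The following is a sketch under that assumption; the names summarize_categories and BASE_URL are hypothetical and not part of the original answer.

```python
from urllib.parse import urljoin

# Assumed site root, used to turn the relative "rawValue" paths into full URLs.
BASE_URL = "https://www.walmart.com"


def summarize_categories(categories, base_url=BASE_URL):
    """Reduce each category entry to its title, absolute link, and image URL."""
    rows = []
    for entry in categories:
        image = entry.get("image", {})
        raw_link = image.get("clickThrough", {}).get("rawValue")
        rows.append({
            "title": image.get("title"),
            # Missing links stay None instead of raising.
            "url": urljoin(base_url, raw_link) if raw_link else None,
            "image": image.get("src"),
        })
    return rows
```

Passing the categories list printed by the loop above to summarize_categories gives one small dictionary per category. It reads rawValue rather than value because, in most of the entries shown, rawValue omits the povid tracking parameter.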