
I'm trying to scrape a website and get a list of items from it using Python. I parsed the HTML using BeautifulSoup and converted the JSON data into a Python object using json.loads(data). The JSON object looks like this:

{ ".1768j8gv7e8__0":{ 
    "context":{ 
       //some info
    },
    "pathname":"abc",
    "showPhoneLoginDialog":false,
    "showLoginDialog":false,
    "showForgotPasswordDialog":false,
    "isMobileMenuExpanded":false,
    "showFbLoginEmailDialog":false,
    "showRequestProductDialog":false,
    "isContinueWithSite":true,
    "hideCoreHeader":false,
    "hideVerticalMenu":false,
    "sequenceSeed":"web-157215950176521",
    "theme":"default",
    "offerCount":null
 },
 ".1768j8gv7e8.6.2.0.0__6":{ 
    "categories":[ 

    ],
    "products":{ 
       "count":12,
       "items":[ 
          { 
             //item info
          },
          { 
            //item info
          },
          { 
            //item info
          }
       ],
       "pageSize":50,
       "nextSkip":100,
       "hasMore":false
    },
    "featuredProductsForCategory":{ 

    },
    "currentCategory":null,
    "currentManufacturer":null,
    "type":"Search",
    "showProductDetail":false,
    "updating":false,
    "notFound":false
 }
}

I need the "items" list from the "products" section. How can I extract it?


3 Answers


Just do:

products = jsonObject[list(jsonObject.keys())[1]]["products"]["items"]
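
Unpacked step by step, the one-liner above might look like the sketch below. The data is a trimmed-down stand-in for the JSON in the question (the item contents are made up), and the intermediate names `keys`, `second_key`, `section`, and `items` are only illustrative. It relies on `json.loads` keeping the keys in document order, which holds for regular dicts in Python 3.7+.

```python
import json

# Trimmed-down stand-in for the JSON in the question (item contents are made up).
raw = """
{
  ".1768j8gv7e8__0": {"pathname": "abc", "theme": "default"},
  ".1768j8gv7e8.6.2.0.0__6": {
    "categories": [],
    "products": {
      "count": 12,
      "items": [{"name": "a"}, {"name": "b"}, {"name": "c"}],
      "hasMore": false
    },
    "type": "Search"
  }
}
"""

jsonObject = json.loads(raw)        # JSON string -> Python dict

keys = list(jsonObject.keys())      # top-level keys, in document order
second_key = keys[1]                # the entry that holds "products"
section = jsonObject[second_key]    # dict stored under that key
products = section["products"]      # dict with "count", "items", ...
items = products["items"]           # the list of item dicts

print(len(items))
```

Because the lookup is positional, it would break if the page ever emitted the entries in a different order.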

2 Comments

To get the items list, I tried this: ``` items = products[list(products.keys())[1]]["items"] ``` But it's not working. I'm new to Python, so a little explanation would help a lot.
I edited my solution; now you'll get the items list. It's a list inside the products dict, so you have to access that dict first.

Import the json package and map every entry to its list of items, if it has any.

This solution is more general: it checks every entry in your JSON and collects all the items without hardcoding the index of an element.

import json

data = '{"p1": { "pathname":"abc" },  "p2": { "pathname":"abcd", "products": { "items" : [1,2,3]} }}'

# use json package to convert json string to dictionary
jsonData = json.loads(data)
type(jsonData) # dictionary

# use a list comprehension to iterate over all the entries in the JSON data
# itemData['products']["items"] - select the items from an entry
# if "products" in itemData - keep only the entries that have products
items = [itemData['products']["items"] for itemId, itemData in jsonData.items() if "products" in itemData]
print(items)  # one list of items per matching entry
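
The comprehension above produces one list per matching entry, so the result is a list of lists. If a single flat list is more convenient, one possible sketch uses `itertools.chain.from_iterable`. The extra `p3` entry and the names `nested` and `flat` are hypothetical, added only so there is more than one list to merge.

```python
import json
from itertools import chain

# Same shape as the sample above, plus a hypothetical "p3" entry.
data = (
    '{"p1": {"pathname": "abc"}, '
    '"p2": {"pathname": "abcd", "products": {"items": [1, 2, 3]}}, '
    '"p3": {"products": {"items": [4]}}}'
)
jsonData = json.loads(data)

# One list of items per entry that has a "products" key.
nested = [entry["products"]["items"] for entry in jsonData.values() if "products" in entry]

# Merge the per-entry lists into one flat list.
flat = list(chain.from_iterable(nested))

print(nested)
print(flat)
```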

Edit: added comments to code

2 Comments

Can you kindly explain your code a bit? I'm new to Python.
Added explanatory comments.

I'll call the JSON source you got from BeautifulSoup "response" and use a sample key for the entries in the items array, like itemId:

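import json

# response: a file-like object holding the JSON (use json.loads for a str)
json_obj = json.load(response)
item_ids = []
# "items" is nested under "products", not at the top level
for entry in json_obj.values():
    for item in entry.get("products", {}).get("items", []):
        item_ids.append(item["itemId"])  # itemId is a sample key
print(item_ids)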
