I need to read the keys in a JSON file so that I can later use them as column names and insert/update the values belonging to those keys. The problem is that the first element of my JSON, "metadata", is a JSON object rather than an array of objects (see the JSON below).

JSON:

{
      "metadata": 
        {
          "namespace": "5.2.0",
          "message_id": "3c80151b-fcf3-4cc3-ada0-635be5b5c95f",
          "transmit_time": "2020-01-30T11:25:47.247394-06:00",
          "message_type": "pricing",
          "domain": "Pricing Service",
          "version": "1.0.0"
        }
      
      ,
      "prices": [
        {
          "price": 24.99,
          "effective_date": "2019-06-01T00:00:00-05:00",
          "strikethrough": 34.99,
          "expiration_date": "2019-06-01T00:00:00-05:00",
          "modified_date": "2019-08-30T02:14:39.044968-05:00",
          "base_price": 25.99,
          "sku_id": 341214,
          "item_number": 244312,
          "trade_base_price": 14.99,
          "competitive_price": 20.00
        },
        {
          "price": 24.99,
          "effective_date": "2019-06-01T00:00:00-05:00",
          "strikethrough": 34.99,
          "expiration_date": "2019-06-01T00:00:00-05:00",
          "modified_date": "2019-08-30T02:14:39.044968-05:00",
          "base_price": 25.99,
          "sku_id": 674523,
          "item_number": 279412,
          "trade_base_price": 14.99,
          "competitive_price": 20.00
        }
      ]
    }

The problem appears when I read "metadata" using the get_Metadata function below.

PostgreSQL table:

DROP TABLE IF EXISTS MyTable;

CREATE TABLE IF NOT EXISTS MyTable
(   
    price numeric(5,2), 
    effective_date  timestamp without time zone,
    strikethrough numeric(5,2), 
    expiration_date  timestamp without time zone,
    modified_date  timestamp without time zone, 
    base_price numeric(5,2), 
    sku_id integer CONSTRAINT PK_MyPK PRIMARY KEY NOT NULL,
    item_number integer, 
    trade_base_price numeric(5,2), 
    competitive_price numeric(5,2), 

    namespace character varying(50),
    message_id character varying(50),
    transmit_time  timestamp without time zone,
    message_type character varying(50),
    domain character varying(50),
    version character varying(50)
 )

Python 3.9:

import json
# import the psycopg2 database adapter for PostgreSQL
import psycopg2
from psycopg2 import connect, Error

with open("./Pricing_test.json") as arq_api:
    read_data = json.load(arq_api)
# converts the JSON object "metadata" to a JSON array of objects (a Python list)
# this does not work properly: the post_gre function below only reads the very last key in the resulting list
read_data["metadata"] = [{key: value} for key, value in read_data["metadata"].items()]
#print(read_data)

data_pricing = []

def get_PricingData():
    list_1 = read_data["prices"]
    for dic in list_1:
        price = dic.get("price")
        effective_date = dic.get("effective_date")
        strikethrough = dic.get("strikethrough")
        expiration_date = dic.get("expiration_date")
        modified_date = dic.get("modified_date")
        base_price = dic.get("base_price")
        sku_id = dic.get("sku_id")
        item_number = dic.get("item_number")
        trade_base_price = dic.get("trade_base_price")
        competitive_price = dic.get("competitive_price")
        data_pricing.append([price, effective_date, strikethrough, expiration_date, modified_date, base_price, sku_id, item_number, trade_base_price, competitive_price, None, None, None, None, None, None])

get_PricingData()

data_metadata = []

def get_Metadata():
    list_2 = read_data["metadata"]
    for dic in list_2:
        namespace = dic.get("namespace")
        message_id = dic.get("message_id")
        transmit_time = dic.get("transmit_time")
        message_type = dic.get("message_type")
        domain = dic.get("domain")
        version = dic.get("version")
        #if len(namespace) == 0:
            #data_pricing.append([None, None, None, None, None, version])
        #else:
            #for sub_dict in namespace:
                #namespace = sub_dict.get("namespace")
                #message_id = sub_dict.get("message_id")
                #transmit_time = sub_dict.get("transmit_time")
                #message_type = sub_dict.get("message_type")
                #domain = sub_dict.get("domain")
                #data_pricing.append([group_id, group_name, subgrop_id, subgrop_name, None, None, None])

        data_metadata.append([namespace, message_id, transmit_time, message_type, domain, version])

get_Metadata()

conn = connect(
        host="MyHost",
        database="MyDB",
        user="MyUser",
        password="MyPassword",
        # attempt to connect for 3 seconds then raise exception
        connect_timeout = 3
    )

cur = conn.cursor()

cur.execute("TRUNCATE TABLE MyTable") #comment this one out to avoid sku_id PK violation error

def post_gre():
    for item in data_pricing:
        my_Pricingdata = tuple(item)
        cur.execute("INSERT INTO MyTable VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)", my_Pricingdata)

    # updates the rows with the metadata
    for item2 in data_metadata:
        my_Metadata = tuple(item2)
        cur.execute("UPDATE MyTable SET namespace = %s, message_id = %s, transmit_time = %s, message_type = %s, domain = %s, version = %s", my_Metadata)

post_gre()

conn.commit()
conn.close()

It throws the following error:

    namespace = dic.get("namespace")
AttributeError: 'str' object has no attribute 'get'
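
For context, this is the error one would expect whenever "metadata" is still a plain dict at the point where get_Metadata loops over it: iterating a Python dict yields its keys, which are strings, so dic.get fails. A minimal sketch, assuming a trimmed copy of the metadata above (the variable names are illustrative only):

```python
# Trimmed copy of the "metadata" object from the JSON above.
metadata = {"namespace": "5.2.0", "version": "1.0.0"}

# Iterating a dict yields its keys (strings), not nested dicts.
items = list(metadata)
print(items)

try:
    for dic in metadata:
        dic.get("namespace")  # dic is the string "namespace" here
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Wrapping the JSON object in [] avoids this because the loop then iterates over a list whose single element is the whole dict.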

But if I wrap the "metadata" JSON object in array brackets [], it works perfectly fine: every key in the metadata is read as a separate column (namespace, message_id, transmit_time, message_type, domain, version).

But since I must not modify the JSON source file itself, I need to convert "metadata" to a Python list in code, so that the function can read its keys.

P.S. An almost-correct solution:

read_data["metadata"] = [{key:value} for key,value in read_data["metadata"].items()]

The suggestion provided by @Suraj works, but for some reason it inserts NULL into all of the "metadata" columns (namespace, message_id, transmit_time, message_type, domain) except "version". Any idea why? It inserts the correct values when I change the JSON by adding [], but I should not do that.

I was able to narrow down the issue: it only reads the very last key in "metadata", which happens to be "version". If you change the order of the keys, it reads whichever key is now last (e.g. "domain").
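
A likely explanation, sketched below under the assumption that the JSON is exactly as posted: the comprehension produces six single-key dicts, so get_Metadata builds six rows that each hold one value and five None entries. Because the UPDATE statement has no WHERE clause, each of the six updates overwrites all six columns in every row, and only the last one (the "version" row) survives. Wrapping the dict in a one-element list keeps all keys together. The helper to_rows and the KEYS list are hypothetical names introduced for this illustration.

```python
import json

# The "metadata" object from the question, inlined so the sketch is self-contained.
raw = """
{"metadata": {"namespace": "5.2.0",
              "message_id": "3c80151b-fcf3-4cc3-ada0-635be5b5c95f",
              "transmit_time": "2020-01-30T11:25:47.247394-06:00",
              "message_type": "pricing",
              "domain": "Pricing Service",
              "version": "1.0.0"}}
"""
read_data = json.loads(raw)

KEYS = ["namespace", "message_id", "transmit_time",
        "message_type", "domain", "version"]


def to_rows(list_of_dicts):
    """Mimic get_Metadata: one row per dict, None for any missing key."""
    return [[d.get(k) for k in KEYS] for d in list_of_dicts]


# The "almost right" conversion: six dicts with one key each.
split_rows = to_rows([{k: v} for k, v in read_data["metadata"].items()])

# Wrapping the whole dict in a list: one dict with all six keys.
wrapped_rows = to_rows([read_data["metadata"]])

print(len(split_rows), split_rows[-1])
print(len(wrapped_rows), wrapped_rows[0])
```

In the original script this would amount to replacing the comprehension with read_data["metadata"] = [read_data["metadata"]], which is the in-code equivalent of adding [] to the file.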

  • Why do you want/need to interpret the value keyed by 'metadata' as a list? Assuming read_data is a dictionary then read_data['metadata']['namespace'] is the style you need to use Commented Aug 16, 2021 at 16:38
  • Where is read_data defined? Please share it. Commented Aug 16, 2021 at 16:47
  • @balderman I've updated the complete code. Commented Aug 16, 2021 at 16:52
  • @enigma6205 we still cannot see where read_data is defined. Commented Aug 16, 2021 at 16:54
  • @balderman, read_data does not need to be defined, instead the data_metadata is used in the Update stmt at the bottom of the code Commented Aug 16, 2021 at 16:59
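
The direct-access style suggested in the first comment could be sketched as follows. This is only an illustration, assuming the JSON layout shown in the question; build_rows, PRICE_COLS and META_COLS are hypothetical names, and the sample data is a trimmed copy of the posted file.

```python
PRICE_COLS = ["price", "effective_date", "strikethrough", "expiration_date",
              "modified_date", "base_price", "sku_id", "item_number",
              "trade_base_price", "competitive_price"]
META_COLS = ["namespace", "message_id", "transmit_time",
             "message_type", "domain", "version"]


def build_rows(read_data):
    """Return one 16-value tuple per price entry, with the metadata appended."""
    meta = read_data["metadata"]  # plain dict, no list conversion needed
    meta_values = tuple(meta.get(col) for col in META_COLS)
    return [tuple(price.get(col) for col in PRICE_COLS) + meta_values
            for price in read_data["prices"]]


# Trimmed copy of the JSON from the question (first price entry only).
sample = {
    "metadata": {"namespace": "5.2.0",
                 "message_id": "3c80151b-fcf3-4cc3-ada0-635be5b5c95f",
                 "transmit_time": "2020-01-30T11:25:47.247394-06:00",
                 "message_type": "pricing",
                 "domain": "Pricing Service",
                 "version": "1.0.0"},
    "prices": [{"price": 24.99,
                "effective_date": "2019-06-01T00:00:00-05:00",
                "strikethrough": 34.99,
                "expiration_date": "2019-06-01T00:00:00-05:00",
                "modified_date": "2019-08-30T02:14:39.044968-05:00",
                "base_price": 25.99,
                "sku_id": 341214,
                "item_number": 244312,
                "trade_base_price": 14.99,
                "competitive_price": 20.00}],
}

rows = build_rows(sample)
print(rows[0])
```

Each tuple lines up with the 16 placeholders of the INSERT statement in the question, so under this approach the separate UPDATE pass would not be needed.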

1 Answer

How about now?

import json

with open('stak_flow.json') as f:
    data = json.load(f)
# turn the "metadata" dict into a list of single-key dicts
data['metadata'] = [{key: value} for key, value in data['metadata'].items()]
print(data)

output


5 Comments

Hi @Suraj, I don't think so. I just need the python dictionary (JSon object) to be interpreted as list, so that it could read keys in metadata
Hi @Suraj, it works, but for some reason it inserts NULL for all "metadata" keys column (namespace, message_id, transmit_time, message_type, domain), except for "version". Any idea why? It does not do it when changing the Json by adding []. But should not do it.
Hi @Suraj Tripathi, I was able to narrow down the issue with not reading other keys in the "metadata", it basically reads only one very last key which happens to Version, but if you change the order it would read the very last one whatever you change it to.
I've updated the code. The final version is in my post. @Suraj Tripathi
That's great. Seems like the above code was a bit of help to you. If you wish, you can accept this as the answer.
