I'm new to this forum, so please excuse me if the question is not well formatted.

I'm trying to fetch rows from a MySQL table and print them after processing the columns (one of the columns contains JSON that needs to be expanded). The source data and expected output are below. It would be great if someone could suggest an easier way to manage this data.

Note: I have achieved this with a lot of looping and parsing, but the challenges are:
1) There is no connection between col_names and the data, so when I print the data I don't know the order of the values in the result set, and the column titles I print do not match the data. Is there any way to keep them in sync?
2) I would like the flexibility to change the order of the columns without much rework.

What is the best possible way to achieve this? I have not explored the pandas library, as I was not sure it is really necessary.

Using Python 3.6.

Sample data in the table:

id, student_name, personal_details, university
1, Sam, {"age":"25","DOL":"2015","Address":{"country":"Poland","city":"Warsaw"},"DegreeStatus":"Granted"},UAW
2, Michael, {"age":"24","DOL":"2016","Address":{"country":"Poland","city":"Toruń"},"DegreeStatus":"Granted"},NCU

I'm querying the database through a MySQLdb.connect object, with the steps below:

query = "select * from student_details"
cur.execute(query)
res = cur.fetchall()  # get a collection of tuples 
db_fields = [z[0] for z in cur.description]  # generate list of col_names

Data in variables:

>>> db_fields
['id', 'student_name', 'personal_details', 'university']
>>> res
((1, 'Sam', '{"age":"25","DOL":"2015","Address":{"country":"Poland","city":"Warsaw"},"DegreeStatus":"Granted"}', 'UAW'),
 (2, 'Michael', '{"age":"24","DOL":"2016","Address":{"country":"Poland","city":"Toruń"},"DegreeStatus":"Granted"}', 'NCU'))

Desired Output:

 id, student_name, age, DOL, country, city, DegreeStatus, University
 1, 'Sam', 25, 2015, 'Poland', 'Warsaw', 'Granted', 'UAW'
 2, 'Michael', 24, 2016, 'Poland', 'Toruń', 'Granted', 'NCU'

1 Answer

A not-too-Pythonic but easy-to-understand way (maybe you can write a more Pythonic solution) might be:

import json


def unwrap_dict(_input):
    """Lift the keys of any nested dict up to the top level."""
    res = dict()
    for k, v in _input.items():
        # Assuming you know there's only one nested level
        if isinstance(v, dict):
            for _k, _v in v.items():
                res[_k] = _v
            continue
        res[k] = v
    return res


all_data = list()
for row in res:  # res is the tuple of rows returned by cur.fetchall()
    record = dict()
    # zip keeps each column name paired with its value
    for field, data in zip(db_fields, row):
        # Assuming you know personal_details is the only JSON column
        if field == 'personal_details':
            data = json.loads(data)
        if isinstance(data, dict):
            extra = unwrap_dict(data)
            record.update(extra)
            continue
        record[field] = data

    all_data.append(record)
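
For the second point in the question (changing the column order without much rework), one option is to keep the wanted order in a single list and look each value up by name. The following is only a minimal sketch, assuming each row has already been flattened into a dict like the entries of all_data. The `columns` list, the `format_rows` helper, and the literal sample rows are hypothetical names and data introduced for illustration; `csv.DictWriter` is from the standard library.

```python
import csv
import io

# Illustrative stand-in for all_data, built from the two rows in the question.
# In practice this list would come from the flattening loop.
all_data = [
    {"id": 1, "student_name": "Sam", "age": "25", "DOL": "2015",
     "country": "Poland", "city": "Warsaw", "DegreeStatus": "Granted",
     "university": "UAW"},
    {"id": 2, "student_name": "Michael", "age": "24", "DOL": "2016",
     "country": "Poland", "city": "Toruń", "DegreeStatus": "Granted",
     "university": "NCU"},
]

# Hypothetical: the one place where the output column order is defined.
columns = ["id", "student_name", "age", "DOL", "country", "city",
           "DegreeStatus", "university"]


def format_rows(rows, columns):
    """Return CSV text: a header line, then one line per row.

    Values are looked up by column name, so the header and the data
    cannot drift apart. Keys not listed in `columns` are skipped.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `print(format_rows(all_data, columns))` prints the header followed by the rows in that order; reordering or dropping entries in `columns` changes the output without touching the rest of the code.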