I am building test software as a side project. I have been given the test questions in JSON format, and I intend to parse the JSON and store it in an SQL table with the following schema:

TABLE NAME - QUESTIONS
QUESTION_NO - INT (PRIMARY KEY) - AUTO_INCREMENT,
QUESTION_DESC - VARCHAR(255),
OPTA - VARCHAR(255),
OPTB - VARCHAR(255),
OPTC - VARCHAR(255),
OPTD - VARCHAR(255),
CORRECTOPT - VARCHAR(1) [Should be 'A','B','C','D']

The JSON is in the following format:-

[
    {
        "1": "Total number of ATP produced during Kreb's cycle",
        "2": "what is referred to as reference carbohydrate?"
    },
    {
        "1": {
            "a": "8",
            "b": "11",
            "c": "12",
            "d": "36"
        },
        "2": {
            "a": "glucose",
            "b": "glyceraldehde",
            "c": "fructose",
            "d": "lactose"
        }
    },
    {
        "1": "12",
        "2": "glyceraldehyde"
    }
]

I initially tried the following Python code to parse the JSON:

import json

with open('BIOset1.json') as f:
    data = json.load(f)

print(data)

Here BIOset1.json is the name of the JSON file I am trying to parse. But, I get the following error:-

raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Can someone please help me parse this JSON file and retrieve the data in the format shown below, so that I can insert it into the SQL table?

I come from a non-programming background, and I am trying to bring about a genuine change in my university through these questions.

Any help would be much appreciated.

[{
        "QUESTION_DESC": "Total number of ATP produced during Kreb's cycle",
        "OPTA": "8",
        "OPTB": "11",
        "OPTC": "12",
        "OPTD": "36",
        "CORRECTOPT": "C"
    },
    {
        "QUESTION_DESC": "what is referred to as reference carbohydrate?",
        "OPTA": "glucose",
        "OPTB": "glyceraldehyde",
        "OPTC": "fructose",
        "OPTD": "lactose",
        "CORRECTOPT": "B"
    }
]

2 Answers

# Create a function that accepts the parsed JSON and prints the required format.
# Code tested on Python 3; it handles exactly the two questions in the sample.
def format_json(data):
    # solution list
    final_list = []
    # question descriptions
    ques_desc_1 = data[0].get("1")
    ques_desc_2 = data[0].get("2")
    # answer options
    answ_1 = data[1].get("1")
    answ_2 = data[1].get("2")
    # correct answers (given as option text, not as a letter)
    corr_1 = data[2].get("1")
    corr_2 = data[2].get("2")

    # dictionary for question 1
    dict1 = {"QUESTION_DESC": ques_desc_1}
    # (letter, text) pairs of the available options
    options1 = list(answ_1.items())
    dict1.update({
        "OPTA": options1[0][1],
        "OPTB": options1[1][1],
        "OPTC": options1[2][1],
        "OPTD": options1[3][1],
    })
    # CORRECTOPT is the letter of the option whose text equals the correct answer
    for letter, text in options1:
        if text == corr_1:
            dict1["CORRECTOPT"] = letter.upper()

    # dictionary for question 2
    dict2 = {"QUESTION_DESC": ques_desc_2}
    options2 = list(answ_2.items())
    dict2.update({
        "OPTA": options2[0][1],
        "OPTB": options2[1][1],
        "OPTC": options2[2][1],
        "OPTD": options2[3][1],
    })
    for letter, text in options2:
        if text == corr_2:
            dict2["CORRECTOPT"] = letter.upper()

    final_list.append(dict1)
    final_list.append(dict2)
    print(final_list)


if __name__ == '__main__':
    # sample list
    list_dict = [
        {
            "1": "Total number of ATP produced during Kreb's cycle",
            "2": "what is referred to as reference carbohydrate?"
        },
        {
            "1": {
                "a": "8",
                "b": "11",
                "c": "12",
                "d": "36"
            },
            "2": {
                "a": "glucose",
                # spelling corrected so the option matches the answer key
                "b": "glyceraldehyde",
                "c": "fructose",
                "d": "lactose"
            }
        },
        {
            "1": "12",
            "2": "glyceraldehyde"
        }
    ]
    # format the data
    format_json(list_dict)
1 Comment

How can I modify this program to work for n questions instead of just 2 questions?
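
One possible way to handle n questions, offered as a rough sketch rather than a tested drop-in: loop over the keys of the first dictionary instead of naming each question by hand. The function name build_rows is made up for illustration, and the sketch assumes every question key is a numeric string that appears in all three dictionaries, with options always keyed "a" to "d".

```python
def build_rows(data):
    """Combine the three parallel dicts into one row per question."""
    questions, options, answers = data
    rows = []
    # Sort keys numerically so that question "10" comes after "9".
    for key in sorted(questions, key=int):
        opts = options[key]
        row = {"QUESTION_DESC": questions[key]}
        correct = None
        for letter in "abcd":
            row["OPT" + letter.upper()] = opts[letter]
            # Keep the first option whose text equals the answer key.
            if correct is None and opts[letter] == answers[key]:
                correct = letter.upper()
        # None signals that no option text matched the answer key.
        row["CORRECTOPT"] = correct
        rows.append(row)
    return rows


# One-question sample taken from the data in the question.
sample = [
    {"1": "Total number of ATP produced during Kreb's cycle"},
    {"1": {"a": "8", "b": "11", "c": "12", "d": "36"}},
    {"1": "12"},
]
rows = build_rows(sample)
print(rows)
```

Leaving CORRECTOPT as None when nothing matches makes spelling mismatches between the options and the answer key easy to spot before the rows are inserted anywhere.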

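First, I see no error in your JSON other than a spelling mistake ("b": "glyceraldehde", which should be "glyceraldehyde" to match the answer key). An "Expecting value: line 1 column 1 (char 0)" error usually means the parser found nothing it could read at the very start of the input, for example because the file is empty, so the problem is probably with the file itself rather than with the JSON you posted. With the spelling fixed, the following almost does the job: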

import json

data = """[
    {
        "1": "Total number of ATP produced during Kreb's cycle",
        "2": "what is referred to as reference carbohydrate?"
    },
    {
        "1": {
            "a": "8",
            "b": "11",
            "c": "12",
            "d": "36"
        },
        "2": {
            "a": "glucose",
            "b": "glyceraldehyde",
            "c": "fructose",
            "d": "lactose"
        }
    },
    {
        "1": "12",
        "2": "glyceraldehyde"
    }
]"""

data = json.loads(data)

results = [{"QUESTION_DESC": data[0][k],
            "OPTA": data[1][k]["a"],
            "OPTB": data[1][k]["b"],
            "OPTC": data[1][k]["c"],
            "OPTD": data[1][k]["d"],
            "CORRECTOPT": data[2][k]} for k in data[0].keys()]

for result in results:
    print(result)

Prints:

{'QUESTION_DESC': "Total number of ATP produced during Kreb's cycle", 'OPTA': '8', 'OPTB': '11', 'OPTC': '12', 'OPTD': '36', 'CORRECTOPT': '12'}
{'QUESTION_DESC': 'what is referred to as reference carbohydrate?', 'OPTA': 'glucose', 'OPTB': 'glyceraldehyde', 'OPTC': 'fructose', 'OPTD': 'lactose', 'CORRECTOPT': 'glyceraldehyde'}

The problem is that the value for CORRECTOPT needs to be changed to the letter of the option that contains that value. So we need a post-processing step:

for result in results:
    correctopt = result["CORRECTOPT"]
    for opt in ["A", "B", "C", "D"]:
        if correctopt == result["OPT" + opt]:
            result["CORRECTOPT"] = opt
            break

for result in results:
    print(result)

Prints:

{'QUESTION_DESC': "Total number of ATP produced during Kreb's cycle", 'OPTA': '8', 'OPTB': '11', 'OPTC': '12', 'OPTD': '36', 'CORRECTOPT': 'C'}
{'QUESTION_DESC': 'what is referred to as reference carbohydrate?', 'OPTA': 'glucose', 'OPTB': 'glyceraldehyde', 'OPTC': 'fructose', 'OPTD': 'lactose', 'CORRECTOPT': 'B'}

You can convert this back to JSON with: json.dumps(results)
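
Since the end goal is an SQL table, here is a minimal sketch of the insert step using Python's built-in sqlite3 module. The question does not say which database engine is in use, so SQLite and the in-memory connection are assumptions; AUTO_INCREMENT from the question's schema is written as SQLite's AUTOINCREMENT, and the CHECK constraint is my own addition to enforce the 'A' to 'D' rule. The results list is repeated here only so the snippet runs on its own.

```python
import sqlite3

# Rows in the shape printed above (values from the question's sample).
results = [
    {"QUESTION_DESC": "Total number of ATP produced during Kreb's cycle",
     "OPTA": "8", "OPTB": "11", "OPTC": "12", "OPTD": "36",
     "CORRECTOPT": "C"},
    {"QUESTION_DESC": "what is referred to as reference carbohydrate?",
     "OPTA": "glucose", "OPTB": "glyceraldehyde", "OPTC": "fructose",
     "OPTD": "lactose", "CORRECTOPT": "B"},
]

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE QUESTIONS (
        QUESTION_NO   INTEGER PRIMARY KEY AUTOINCREMENT,
        QUESTION_DESC VARCHAR(255),
        OPTA          VARCHAR(255),
        OPTB          VARCHAR(255),
        OPTC          VARCHAR(255),
        OPTD          VARCHAR(255),
        CORRECTOPT    VARCHAR(1) CHECK (CORRECTOPT IN ('A', 'B', 'C', 'D'))
    )
""")

# Named placeholders let each dict be passed directly as the parameters.
conn.executemany(
    """INSERT INTO QUESTIONS
           (QUESTION_DESC, OPTA, OPTB, OPTC, OPTD, CORRECTOPT)
       VALUES
           (:QUESTION_DESC, :OPTA, :OPTB, :OPTC, :OPTD, :CORRECTOPT)""",
    results,
)
conn.commit()

stored = conn.execute(
    "SELECT QUESTION_NO, CORRECTOPT FROM QUESTIONS ORDER BY QUESTION_NO"
).fetchall()
print(stored)
```

Using placeholders instead of building the SQL string by hand also means the apostrophe in "Kreb's" needs no manual escaping. Other database drivers use different placeholder styles, so the INSERT statement would need adjusting for them.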
