
I have a Python list of lists that I want to convert into a pandas DataFrame. I want the DataFrame to have the following format:

   table_id            created     Mb (etc.)
1  NetworkClicks       2018-10-26  0.22
2  NetworkImpressions  2018-10-26  1519.24

(total 6 rows based on list sample below)

The column names are inside each inner list, e.g. Mb, created, modified, table_id.

List sample:

ls_all = [
    [(u'Mb', u'928.11'), (u'created', datetime.date(2018, 10, 25)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'4,378'), (u'table_id', u'NetworkActiveViews'), (u'Tb', u'0.91')],
    [(u'Mb', u'800.67'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'3,577'), (u'table_id', u'NetworkBackfillActiveViews'), (u'Tb', u'0.78')],
    [(u'Mb', u'2.44'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'11'), (u'table_id', u'NetworkBackfillClicks'), (u'Tb', u'0.00')],
    [(u'Mb', u'1190.52'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'5,269'), (u'table_id', u'NetworkBackfillImpressions'), (u'Tb', u'1.16')],
    [(u'Mb', u'0.22'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'1'), (u'table_id', u'NetworkClicks'), (u'Tb', u'0.00')],
    [(u'Mb', u'1519.24'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'7,089'), (u'table_id', u'NetworkImpressions'), (u'Tb', u'1.48')]
]

I tried df = pd.DataFrame(ls_all, columns=ls_all[0])

but it's giving me this dataframe:

    (Mb, 928.11)  ...  (Tb, 0.91)
0   (Mb, 928.11)  ...  (Tb, 0.91)
1   (Mb, 800.67)  ...  (Tb, 0.78)
2     (Mb, 2.44)  ...  (Tb, 0.00)
3  (Mb, 1190.52)  ...  (Tb, 1.16)
4     (Mb, 0.22)  ...  (Tb, 0.00)
5  (Mb, 1519.24)  ...  (Tb, 1.48)

2 Answers


Use a list of dictionaries rather than a list of lists of tuples. Calling dict() on each inner list turns its (key, value) pairs into one row:

list_of_dicts = [dict(x) for x in ls_all]

df = pd.DataFrame(list_of_dicts)

        Mb Rows_Mil    Tb     created    modified                    table_id
0   928.11    4,378  0.91  2018-10-25  2019-04-18          NetworkActiveViews
1   800.67    3,577  0.78  2018-10-26  2019-04-18  NetworkBackfillActiveViews
2     2.44       11  0.00  2018-10-26  2019-04-18       NetworkBackfillClicks
3  1190.52    5,269  1.16  2018-10-26  2019-04-18  NetworkBackfillImpressions
4     0.22        1  0.00  2018-10-26  2019-04-18               NetworkClicks
5  1519.24    7,089  1.48  2018-10-26  2019-04-18          NetworkImpressions
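
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds two rows of the sample and then reorders the columns to match the layout asked for in the question. The shortened `ls_all` and the `wanted` list are only illustrative names and choices, not part of the original answer.

```python
import datetime

import pandas as pd

# Two rows copied from the question's sample, shortened for illustration.
ls_all = [
    [('Mb', '0.22'), ('created', datetime.date(2018, 10, 26)),
     ('modified', datetime.date(2019, 4, 18)), ('Rows_Mil', '1'),
     ('table_id', 'NetworkClicks'), ('Tb', '0.00')],
    [('Mb', '1519.24'), ('created', datetime.date(2018, 10, 26)),
     ('modified', datetime.date(2019, 4, 18)), ('Rows_Mil', '7,089'),
     ('table_id', 'NetworkImpressions'), ('Tb', '1.48')],
]

# Each inner list of (key, value) pairs becomes one dict, i.e. one row.
df = pd.DataFrame([dict(row) for row in ls_all])

# Hypothetical column order, chosen to match the layout in the question.
wanted = ['table_id', 'created', 'Mb', 'Rows_Mil', 'Tb', 'modified']
df = df[wanted]
print(df)
```

Selecting with a list of column names returns the columns in exactly that order, so this does not depend on how pandas ordered them when it built the frame.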

2 Comments

Interesting! So on a list of tuples calling dict takes first item as keys?
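
On the question in the comment: yes, `dict()` accepts any iterable of two-item pairs and uses the first item of each pair as the key and the second as the value. A small sketch with a made-up `pairs` list:

```python
# dict() accepts an iterable of (key, value) pairs.
pairs = [('Mb', '0.22'), ('table_id', 'NetworkClicks'), ('Tb', '0.00')]
row = dict(pairs)
print(row['table_id'])  # NetworkClicks

# If a key appears more than once, the last pair wins.
duplicate = dict([('Mb', '1'), ('Mb', '2')])
print(duplicate)  # {'Mb': '2'}
```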

I like the list-of-dictionaries approach above. Here’s another way:

Get data from lists

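# Collect the values (second item of each pair) row by row.
# The loop variable is named `row` to avoid shadowing the built-in `list`.
lists = []

for row in ls_all:
    temp = [x[1] for x in row]
    lists.append(temp)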

Get column names

columns = [x[0] for x in ls_all[0]]

Load into DataFrame

df = pd.DataFrame(lists, columns=columns)

Result

        Mb     created    modified Rows_Mil                    table_id    Tb
0   928.11  2018-10-25  2019-04-18    4,378          NetworkActiveViews  0.91
1   800.67  2018-10-26  2019-04-18    3,577  NetworkBackfillActiveViews  0.78
2     2.44  2018-10-26  2019-04-18       11       NetworkBackfillClicks  0.00
3  1190.52  2018-10-26  2019-04-18    5,269  NetworkBackfillImpressions  1.16
4     0.22  2018-10-26  2019-04-18        1               NetworkClicks  0.00
5  1519.24  2018-10-26  2019-04-18    7,089          NetworkImpressions  1.48
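
One caveat with this positional approach: the column names come from the first inner list only, so it assumes every inner list has its keys in the same order. A rough sketch of a check for that, using a made-up two-column `ls_all` whose second row is deliberately reordered:

```python
import pandas as pd

# Made-up sample: the second row lists its keys in a different order.
ls_all = [
    [('Mb', '0.22'), ('table_id', 'NetworkClicks')],
    [('table_id', 'NetworkImpressions'), ('Mb', '1519.24')],
]

columns = [key for key, _ in ls_all[0]]

# True only if every row has the same keys in the same positions.
same_order = all([key for key, _ in row] == columns for row in ls_all)

if same_order:
    # Safe to load the values by position.
    df = pd.DataFrame([[value for _, value in row] for row in ls_all],
                      columns=columns)
else:
    # Fall back to matching values by key instead of by position.
    df = pd.DataFrame([dict(row) for row in ls_all], columns=columns)

print(df)
```

With the sample in the question the keys are in the same order in every row, so the check passes and the positional version works as shown.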
