How to format an array to a particular data frame format in pandas?

Question

I have a dictionary of arrays that looks like this:

{'loc.1': array([1, 2, 3, 4, 7, 5, 6]), 'loc.2': array([3, 4, 3, 7, 7, 8, 6]), 'loc.3': array([1, 4, 3, 1, 7, 8, 6]), ...}

After a = pd.DataFrame(array) it looks like this:

loc.1    loc.2  loc.3
1        3      1
2        4      4
3        3      3
4        7      1
7        7      7
5        8      8
6        6      6

This is however what I want:

Col.1    Col.2
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6

I need it in this particular format because I want to concatenate it with another table afterwards. Pandas would be my preferred solution.

Thank you, and apologies if this is a silly question.

jezrael · Accepted Answer · 2018-04-17 15:53:44Z

3

First, join the values of each array into a single string inside a dictionary comprehension.

Then, for a Series, use:

a = pd.Series({k:','.join(v.astype(str)) for k, v in array.items()})
print (a)
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6
dtype: object

And for DataFrame:

d = {k:','.join(v.astype(str)) for k, v in array.items()}
a = pd.DataFrame({'a': list(d.keys()), 'b': list(d.values())})

An alternative solution is to create a list of tuples:

L = [(k, ','.join(v.astype(str))) for k, v in array.items()]
a = pd.DataFrame(L, columns=['a','b'])

print (a)
       a              b
0  loc.1  1,2,3,4,7,5,6
1  loc.2  3,4,3,7,7,8,6
2  loc.3  1,4,3,1,7,8,6

If you need the arrays themselves in the column, remove the join and the cast to strings:

L = [(k, v) for k, v in array.items()]
a = pd.DataFrame(L, columns=['a','b'])
print (a)
       a                      b
0  loc.1  [1, 2, 3, 4, 7, 5, 6]
1  loc.2  [3, 4, 3, 7, 7, 8, 6]
2  loc.3  [1, 4, 3, 1, 7, 8, 6]


3 Comments

Alex Trevylan Over a year ago

Thank you - how can I remove the index on the left-hand side and make column a the index?

jezrael Over a year ago

You can use a.set_index('a')

jezrael Over a year ago

Or a = pd.Series({k:','.join(v.astype(str)) for k, v in array.items()}).to_frame('b')
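The two suggestions in these comments can be compared side by side. The following is a minimal sketch, assuming the sample data from the question; the names pairs, by_set_index and by_to_frame are only illustrative and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
array = {
    'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
    'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
    'loc.3': np.array([1, 4, 3, 1, 7, 8, 6]),
}

# Route 1: build the two-column frame from the answer, then promote
# column 'a' to the index so the default 0..n-1 index disappears.
pairs = [(k, ','.join(v.astype(str))) for k, v in array.items()]
by_set_index = pd.DataFrame(pairs, columns=['a', 'b']).set_index('a')

# Route 2: build a Series keyed by the dict keys and convert it to a
# one-column frame named 'b'; the keys become the index directly.
by_to_frame = pd.Series(
    {k: ','.join(v.astype(str)) for k, v in array.items()}
).to_frame('b')

print(by_set_index)
print(by_to_frame)
```

Both frames hold the same values in column b. The only difference is that set_index('a') keeps a as the index name, while the Series route leaves the index unnamed.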

jpp · Accepted Answer · 2018-04-17 15:55:37Z

1

A couple of options depending on what format you require:

import numpy as np
import pandas as pd

d = {'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
     'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
     'loc.3': np.array([1, 4, 3, 1, 7, 8, 6])}

res1 = pd.DataFrame([[x] for x in d.values()], index=d.keys())

#                            0
# loc.1  [1, 2, 3, 4, 7, 5, 6]
# loc.2  [3, 4, 3, 7, 7, 8, 6]
# loc.3  [1, 4, 3, 1, 7, 8, 6]

res2 = pd.DataFrame([', '.join(map(str, x)) for x in d.values()], index=d.keys())

#                          0
# loc.1  1, 2, 3, 4, 7, 5, 6
# loc.2  3, 4, 3, 7, 7, 8, 6
# loc.3  1, 4, 3, 1, 7, 8, 6


max · Accepted Answer · 2018-04-17 15:56:46Z

1

import pandas as pd

a = {'loc.1': [1,2,3,4,7,5,6],'loc.2': [3,4,3,7,7,8,6],'loc.3': [1,4,3,1,7,8,6]}
# Transpose so each key becomes a row, then collapse every row into one list.
df = pd.DataFrame(a).transpose()
df['lists'] = df.values.tolist()
df = df['lists']

Output:

loc.1    [1, 2, 3, 4, 7, 5, 6]
loc.2    [3, 4, 3, 7, 7, 8, 6]
loc.3    [1, 4, 3, 1, 7, 8, 6]
Name: lists, dtype: object


BENY · Accepted Answer · 2018-04-17 16:02:13Z

1

You can use stack with groupby (df is assumed here to be the wide frame produced by pd.DataFrame(array) in the question):

df.stack().astype(str).groupby(level=1).apply(','.join)
Out[738]: 
loc.1    1,2,3,4,7,5,6
loc.2    3,4,3,7,7,8,6
loc.3    1,4,3,1,7,8,6
dtype: object
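Since df is not defined in the snippet above, here is a self-contained sketch of the same stack/groupby idea. It assumes df is built with pd.DataFrame(array) from the question's data; the name joined is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
array = {
    'loc.1': np.array([1, 2, 3, 4, 7, 5, 6]),
    'loc.2': np.array([3, 4, 3, 7, 7, 8, 6]),
    'loc.3': np.array([1, 4, 3, 1, 7, 8, 6]),
}

# Wide frame: one column per key, one row per array position.
df = pd.DataFrame(array)

# stack() moves the column labels into the inner index level (level 1),
# giving a long Series indexed by (row position, column label).
# Grouping on level=1 then gathers the values of each original column,
# and ','.join concatenates them in row order.
joined = df.stack().astype(str).groupby(level=1).apply(','.join)

print(joined)
```

The result is a Series indexed by loc.1, loc.2 and loc.3. Calling joined.to_frame('b') would turn it into a one-column DataFrame if that is needed for the later concatenation.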
