numpy array to data frame and vice versa

Question

I'm a noob in python!

I'd like to put the sequences and the anomaly flags together in one table,
then sort out only the normal sequences (a sequence is normal if its value in the anomaly column is 0),
and finally turn the normal sequences back into a numpy array (without the anomaly column).

Each row (Sequence) is one session, so in this case there are 6 independent sequences. Each element represents some specific activity.

```
import numpy as np
import pandas as pd

sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])

anomaly = np.array([0, 0, 0, 0, 0, 1])
```

I have these two variables and need to sort out only the normal sequences.

Here is the code I tried:

```
# Sequence to DataFrame: each row of `sequence` becomes one cell.
# DataFrame.append was removed in pandas 2.0, so the rows are
# collected in a list first and the frame is built in one step.
rows = [{"Sequence": sequence[i]} for i in range(sequence.shape[0])]
sequence_df = pd.DataFrame(rows)

# Concatenate the anomaly column
anomaly_df = pd.DataFrame(anomaly)
df = pd.concat([sequence_df, anomaly_df], axis=1)
df.columns = ['Sequence', 'anomaly']
df
```

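I didn't want to call pd.DataFrame on the array directly, because it spreads each sequence across six separate columns instead of keeping one sequence per cell:

pd.DataFrame(sequence)

Anyway, after building df, I tried to sort out the normal sequences: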

```
# Sort out the normal sequences (anomaly == 0)
normal = df[df['anomaly'] == 0]['Sequence']
# Back to numpy, Sequence column only
normal = normal.to_numpy()
normal.shape
```

This array has a different shape from the variable sequence: sequence.shape is (6,6), but normal.shape is (5,).

I want to have (5,6). I tried reshape, but it didn't work. Can someone help me with this? If anything in my question is unclear, please leave a comment. I appreciate it.

What do you mean by sorting? It seems it is sorted from lowest to highest.
– Onyambu, commented Nov 20, 2020 at 8:54

Onyambu · Accepted Answer · 2020-11-20 08:52:34Z

2

I am not quite sure what you need, but here is something you could do:

import pandas as pd
df = pd.DataFrame({'sequence':sequence.tolist(), 'anomaly':anomaly})
df

                  sequence  anomaly
0        [5, 1, 1, 0, 0, 0]        0
1        [5, 1, 1, 0, 0, 0]        0
2        [5, 1, 1, 0, 0, 0]        0
3        [5, 1, 1, 0, 0, 0]        0
4        [5, 1, 1, 0, 0, 0]        0
5  [5, 1, 1, 300, 200, 100]        1
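
As a possible alternative to the layout above (not what this answer does), here is a minimal sketch that keeps one column per position and adds the anomaly flag as an extra column. The name df_wide is only illustrative. With this layout, the filtered frame converts straight back to a 2D array.

```python
import numpy as np
import pandas as pd

sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])
anomaly = np.array([0, 0, 0, 0, 0, 1])

# One column per sequence position (0..5), plus the anomaly flag.
df_wide = pd.DataFrame(sequence)
df_wide['anomaly'] = anomaly

# Keep the rows flagged 0, drop the flag column, and go back to numpy.
normal_wide = (df_wide.loc[df_wide['anomaly'] == 0]
                      .drop(columns='anomaly')
                      .to_numpy())
print(normal_wide.shape)  # (5, 6)
```

The trade-off is that a sequence is no longer a single cell, which is the layout the question wanted to avoid, so this only helps if that constraint can be relaxed.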

answered Nov 20, 2020 at 8:52

Onyambu


Pygirl · Accepted Answer · 2020-11-20 08:43:01Z

1

Convert it into a list, then create an array. Try:

normal = df.loc[df['anomaly'].eq(0), 'Sequence']
normal = np.array(normal.tolist())
print(normal.shape)

# (5, 6)
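
If the DataFrame is only an intermediate step, the same result can probably be had without pandas. This is a sketch using a plain numpy boolean mask, assuming sequence and anomaly are defined as in the question; the names mask and normal_rows are only illustrative.

```python
import numpy as np

sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])
anomaly = np.array([0, 0, 0, 0, 0, 1])

# True for every row whose anomaly flag is 0.
mask = anomaly == 0

# Boolean indexing on the first axis keeps whole rows.
normal_rows = sequence[mask]
print(normal_rows.shape)  # (5, 6)
```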

answered Nov 20, 2020 at 8:43

Pygirl

4 Comments

data_minD Over a year ago

That worked perfectly! Thank you for your succinct and correct answer! But would you be able to explain why your approach works? How come I got (5,) and you got (5,6) by turning the dataframe column into a list and then into an array?

data_minD Over a year ago

Or you could simply tell me the steps you took!

Pygirl Over a year ago

Considering only the first row: type(np.array(normal.to_numpy())[0]) --> list, while type(np.array(normal.tolist())[0]) --> numpy.ndarray. One gives me a list object and the other gives a numpy array.

Pygirl Over a year ago

You get (5,6) when you have a 2D array, not a 1D array whose values are lists, which gives you (5,). I passed a list of lists to np.array, which gave me a 2D array.
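
The difference described in these comments can be sketched as follows, assuming the frame is built with sequence.tolist() as in the other answer, so each cell holds a Python list. The names as_objects and as_matrix are only illustrative. With the frame from the question, the cells would hold arrays rather than lists, but the shape of the to_numpy() result should still be (5,).

```python
import numpy as np
import pandas as pd

sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])
anomaly = np.array([0, 0, 0, 0, 0, 1])

df = pd.DataFrame({'Sequence': sequence.tolist(), 'anomaly': anomaly})
normal = df.loc[df['anomaly'].eq(0), 'Sequence']

# to_numpy() keeps one cell per element: a 1D object array of lists.
as_objects = normal.to_numpy()
print(as_objects.dtype, as_objects.shape)

# tolist() gives a list of equal-length lists, which np.array
# turns into a regular 2D numeric array.
as_matrix = np.array(normal.tolist())
print(as_matrix.shape)
```

This is also why reshape does not help on the (5,) result: the array really contains only five elements, each of which happens to be a whole sequence.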
