I have a DataFrame in Python with N columns. For each pair of adjacent columns, I want to create a single column holding the ratio between the values of the second column and those of the first. I think I found the proper way to compute it, but I can't find a way to collect the result into a new DataFrame. My input DataFrame looks like this:

Name  1A  1B  2A  2B  3A  3B  536A  536B ...
name1 x1  x2  x3  x4  x5  x6  x7  x8 
name2 ........
namN  ........

So for each pair (taking the first one as an example), I want to create a column defined by 1B/1A, then one defined by 2B/2A, and so on. This is the code that I tried:

l = []
for i in np.arange(0,536,2):
    dic1={}
    dic1[i+1] = df.iloc[:,i]/df.iloc[:,i+1]
    l.append(dic1)

But after I tried:

pd.DataFrame(l)

I got a confusing DataFrame in which multiple values are stored in the same cell. The result looked like this:

[Screenshot of the resulting DataFrame]

I guess that is because I did not define the names of the columns that I created with the ratio, but I can't figure it out. Do you have any suggestions? Thank you!

  • It looks like your cells contain the whole row, while the rest of the cells in that row are empty. Let me see if I can help. Commented Feb 9, 2020 at 13:53
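
To illustrate the diagnosis in the comment above, here is a minimal sketch on hypothetical sample data (the values and the choice of "Name" as the index are assumptions, not the asker's real data). It suggests that a list of one-key dicts makes pandas store each ratio Series as a single cell, whereas one dict mapping a column label to each Series yields regular columns:

```python
import pandas as pd

# Hypothetical sample data; "Name" is used as the index so that the
# value columns sit at positions 0, 1, 2, 3.
df = pd.DataFrame(
    {"1A": [2.0, 4.0], "1B": [1.0, 8.0], "2A": [5.0, 10.0], "2B": [10.0, 5.0]},
    index=pd.Index(["name1", "name2"], name="Name"),
)

# A list of one-key dicts: each dict becomes a row, and each Series
# is stored as a single object inside one cell.
as_list = [
    {i + 1: df.iloc[:, i + 1] / df.iloc[:, i]}
    for i in range(0, df.shape[1], 2)
]
bad = pd.DataFrame(as_list)

# One dict mapping column label -> Series: each Series becomes a column.
as_dict = {
    f"{df.columns[i + 1]}/{df.columns[i]}": df.iloc[:, i + 1] / df.iloc[:, i]
    for i in range(0, df.shape[1], 2)
}
ratios = pd.DataFrame(as_dict)
```

Here `ratios` keeps the original index and has one column per pair, labelled `1B/1A`, `2B/2A`, and so on.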

1 Answer

Implementation for a single pair:

df['1B/1A'] = df.apply(lambda x: x['1B']/x['1A'], axis=1)

What's happening here:

  • .apply() applies a function along an axis of the DataFrame; with axis=1 the function is called once per row.
  • lambda x: receives each row as a Series, so you can access the value of each column by its name.

or even simpler:

df['1B/1A'] = df['1B'] / df['1A']
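
As a quick check that the two forms agree, here is a small sketch on made-up data (the values below are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical two-row sample with a single A/B pair.
df = pd.DataFrame(
    {"Name": ["name1", "name2"], "1A": [2.0, 4.0], "1B": [1.0, 8.0]}
)

# Row-wise version: the lambda is called once per row.
via_apply = df.apply(lambda row: row["1B"] / row["1A"], axis=1)

# Vectorized version: divides the two columns element by element.
via_columns = df["1B"] / df["1A"]

df["1B/1A"] = via_columns
```

Both give the same values; the column-wise division avoids a Python-level call per row, so it is usually the better choice on large frames.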

UPDATE: Generic Implementation

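cols = df.columns
# Select the "A" and "B" columns by suffix; zip() pairs them by position,
# so this relies on the columns being ordered 1A, 1B, 2A, 2B, ...
a_cols = [col for col in cols if col.endswith('A')]
b_cols = [col for col in cols if col.endswith('B')]
for a, b in zip(a_cols, b_cols):
    df[b + '/' + a] = df[b] / df[a]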
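
If the ratios should go into a new DataFrame rather than being appended to df, one possible variant is sketched below. It assumes every pair is named `<n>A` / `<n>B` as in the question; the sample values are made up:

```python
import pandas as pd

# Hypothetical sample data following the question's naming scheme.
df = pd.DataFrame(
    {
        "Name": ["name1", "name2"],
        "1A": [2.0, 4.0],
        "1B": [1.0, 8.0],
        "2A": [5.0, 10.0],
        "2B": [10.0, 5.0],
    }
)

# Start the result with the Name column, then add one ratio column per pair.
ratios = pd.DataFrame({"Name": df["Name"]})
for a in [col for col in df.columns if col.endswith("A")]:
    b = a[:-1] + "B"  # derive the partner name, e.g. "1A" -> "1B"
    if b in df.columns:  # skip any A column without a matching B
        ratios[f"{b}/{a}"] = df[b] / df[a]
```

Deriving the partner name from each A column means the result does not depend on the column order.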

Hope this helps!

4 Comments

No need for apply. Just do df['1B/1A'] = df['1B'] / df['1A']. The same goes for df['2B/2A'].
@RamshaSiddiqui Thank you, but it gives me the following error: 'str' object has no attribute 'contains'
fixed. try now!
@RamshaSiddiqui Thank you! I also found another way: for i in np.arange(0,536,2): df[i+1] = df.iloc[:,i]/df.iloc[:,i+1]
