I am trying to study the probability of a zero value in my data. I have written code that outputs the values of one column on the rows where another column is zero, which is what I need. Doing that by hand for each column against the other 28 columns of my 577-by-29 DataFrame is tedious, so I wrote a for loop to do it for me:

import numpy as np
import pandas as pd
allchan = pd.read_csv('allchan.csv',delimiter = ' ')
allchanarray = np.array(allchan)
dfallchan = pd.DataFrame(allchanarray,range(1,578),dtype=float)
y = pd.DataFrame()
x = pd.DataFrame()
for n in range(0,29):
    x[n] = dfallchan[(dfallchan[0]>0) & (dfallchan[n]==0)][0]
    y[n] = x[n].count()
x.to_excel('n.xlsx', index=False, sheet_name='ValForOtherZero')
y.to_excel('v.xlsx', index=False, sheet_name='CountOfZeroVlas')

The problem is that the loop seems to run properly through these lines:

 x[n] = dfallchan[(dfallchan[0]>0) & (dfallchan[n]==0)][0]
 y[n] = x[n].count()

but it repeats the value of n=6 for the second condition:

(dfallchan[n]==0)

The code should return different values of the first channel for each column, because the zeros are randomly distributed in my input file. The output is correct up to the 6th column (columns 0-5 should be empty), but from there on it repeats the same output for all other columns. Output: output 1

You can see that the loop itself runs correctly, since the output DataFrame has 29 columns, but the condition above is not applied separately to each column.

Please help, Thanks!

4 Comments

  • That isn't an error, it's a warning. See more about it here: stackoverflow.com/questions/20625582/… Commented Aug 8, 2017 at 16:41
  • I have read the warning, and it seems that the type of the variable x was inappropriate... Commented Aug 8, 2017 at 22:06
  • I am now running into another issue and have edited the question! Commented Aug 8, 2017 at 22:07
  • Check this: stackoverflow.com/questions/31674557/… Commented Aug 8, 2017 at 22:08

2 Answers

0

Finally Got it!

This code does exactly what I want!

import numpy as np
import pandas as pd

allchan = pd.read_csv('allchan.csv', delimiter=' ')
allchanarray = np.array(allchan)
dfallchan = pd.DataFrame(allchanarray, range(1, 578), dtype=float)

v = pd.DataFrame(columns=range(0, 29))
y = pd.DataFrame()
k = pd.DataFrame(columns=range(0, 29))

# For each channel n, take the channel-0 values on rows where channel 0 is
# positive and channel n is zero, and append them to y as one row.
for n in range(0, 29):
    x = dfallchan[(dfallchan[0] > 0) & (dfallchan[n] == 0)][0]
    y = y.append(x)
    v = y.transpose()
    k = v.count()

# One column per channel, plus a 1x29 row of per-channel counts.
v.columns = range(0, 29)
k = k.values.reshape(1, 29)

v.to_excel("Chan1-OthersZeroVals.xlsx", index=False)
pd.DataFrame(k).to_excel("Chan1-OtherZeroCount.xlsx", index=False)
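
For readers on pandas 2.0 or later, where `DataFrame.append` has been removed, the collection step can be sketched with `pd.concat` instead. This is only an illustration under assumptions: the small `df` below is made-up data standing in for `dfallchan`, and `zero_cooccurrence` is a hypothetical helper name that does not appear in the original code.

```python
import pandas as pd


def zero_cooccurrence(df):
    """For each column n, keep column-0 values on rows where col 0 > 0 and col n == 0.

    Hypothetical helper for illustration; not part of the original answer.
    """
    picked = [df.loc[(df[0] > 0) & (df[n] == 0), 0] for n in df.columns]
    # Place the selections side by side; rows stay aligned on the original row labels.
    values = pd.concat(picked, axis=1, keys=list(df.columns))
    # Number of non-missing entries per column.
    counts = values.count()
    return values, counts


# Made-up 4x3 frame standing in for dfallchan (an assumption, not the real data).
df = pd.DataFrame(
    [[5.0, 0.0, 1.0],
     [0.0, 0.0, 0.0],
     [7.0, 2.0, 0.0],
     [3.0, 0.0, 0.0]],
    index=range(1, 5),
)

values, counts = zero_cooccurrence(df)
print(values)
print(counts.tolist())
```

Unlike the transposed result above, this keeps the original row labels as the index, so it is easy to see which rows each value came from.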

2 Comments

Appending to dataframes is inefficient. Instead make the whole data first then make a dataframe from the final data.
Look at my answer above, and ask if you run into any issues.
0

This will be more efficient.

all_values = []
for n in range(0,29):
    condition = (dfallchan[0]>0) & (dfallchan[n]==0)
    count = condition.sum()
    vals = dfallchan[condition][0].values
    all_values.append(vals)

all_values_df = pd.DataFrame(all_values).transpose()

Here, I first build a list of arrays, appending the matching values for each column. Only at the end do I create the DataFrame and transpose it, so that column n holds the values for channel n.
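
As a self-contained sketch of the same idea that also keeps the per-channel counts, the example below runs on a tiny made-up frame. The data and the names `counts` and `counts_df` are assumptions for illustration; with the real data, `dfallchan` would come from `allchan.csv` as in the question.

```python
import pandas as pd

# Made-up data standing in for dfallchan (an assumption): 5 rows, 3 channels.
dfallchan = pd.DataFrame(
    [[4.0, 0.0, 2.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 0.0],
     [6.0, 0.0, 0.0],
     [2.0, 5.0, 1.0]]
)

all_values = []
counts = []
for n in range(dfallchan.shape[1]):
    # Rows where channel 0 is positive and channel n is zero.
    condition = (dfallchan[0] > 0) & (dfallchan[n] == 0)
    counts.append(int(condition.sum()))
    all_values.append(dfallchan.loc[condition, 0].tolist())

# Lists of unequal length are padded with NaN; transpose so each channel is a column.
all_values_df = pd.DataFrame(all_values).transpose()
# Single row holding the count for each channel.
counts_df = pd.DataFrame([counts])

print(all_values_df)
print(counts_df)
```

Both frames can then be written out with `to_excel` in the same way as in the question.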
