Iterating rows with a for loop to a csv file with Pandas and Numpy Python

Question

The code below is meant to write [Val1, Val2, Val3, Val4] to a CSV file on each iteration. It saves each iteration to the file with dataframe.to_csv("input.csv", index=False, mode='a', header=False). However, the code creates a separate row for each value, as shown in the Output below. I want val1 through val4 to be written on a single row for each iteration. How can I get the Expected output?

import numpy as np
import pandas
from numpy import random

Values = random.randint(100, size=(100000))
Number_array = random.randint(100, size=(1000))
for n in range(len(Values)):
    val1 = np.sum(Number_array) + Values[n] * len(Number_array)
    val2 = np.sum([Number_array])
    val3 = val1 * val2
    val4 = n * 2
    data = [val1, val2, val3, val4]
    # A flat list becomes one column with four rows (the problem asked about)
    dataframe = pandas.DataFrame(data)
    dataframe.to_csv("input.csv", index=False, mode='a', header=False)

input.csv file:

Val1, Val2, Val3, Val4

Output:

Val1, Val2, Val3, Val4
49793 
48793 
-1865417447 
0
82793
48793
-255248447 
2

Expected output

Val1, Val2, Val3, Val4
49793, 48793, -1865417447, 0
82793, 48793, -255248447, 2

Try dataframe = pandas.DataFrame([data]) instead of dataframe = pandas.DataFrame(data)? Just guessing from looking at your code; I haven't tried it. It looks like you want to create 4 columns, but you pass a single list instead of a list of lists, which is why you end up with 1 column instead of 4.
– anky, Commented Mar 28, 2021 at 6:16
@anky please look at the time consumed at each step: you are appending to a file object on every iteration, which takes much more time than appending to a list and then writing to the file in one step.
– Davinder Singh, Commented Mar 28, 2021 at 6:32
@ExplooreX Got you. :) I commented only on why it is not working. Maybe if we are appending to a list, we don't need mode='a' in your answer?
– anky, Commented Mar 28, 2021 at 6:35

Davinder Singh · Accepted Answer · 2021-03-28 06:44:37Z

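Check out this code:

import numpy as np
import pandas

# Values and Number_array are defined as in the question
rows = []  # one [val1, val2, val3, val4] list per iteration
for n in range(len(Values)):
    val1 = np.sum(Number_array) + Values[n] * len(Number_array)
    val2 = np.sum([Number_array])
    val3 = val1 * val2
    val4 = n * 2
    data = [val1, val2, val3, val4]
    # dataframe = pandas.DataFrame([data])
    rows.append(data)
# Build the DataFrame once and write every row in a single call
dataframe = pandas.DataFrame(rows)
dataframe.to_csv("input.csv", index=False, mode='a', header=False)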
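The list-based approach above can be sketched end to end as follows. This is only an illustration under a few assumptions that are not in the original answer: much smaller arrays so it runs quickly, a seeded generator for repeatability, a hypothetical file name `rows_demo.csv` in a temporary folder, and column names copied from the header shown in the question. Because the file is created fresh here, the header and rows are written in one call rather than appended.

```python
import os
import tempfile

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)             # fixed seed (assumption, for repeatability)
values = rng.integers(100, size=5)         # far fewer than the 100000 in the question
number_array = rng.integers(100, size=10)

rows = []
for n in range(len(values)):
    val1 = np.sum(number_array) + values[n] * len(number_array)
    val2 = np.sum(number_array)
    val3 = val1 * val2
    val4 = n * 2
    rows.append([val1, val2, val3, val4])  # one inner list per output row

# A list of lists maps to rows; the column names follow the question's header.
frame = pd.DataFrame(rows, columns=["Val1", "Val2", "Val3", "Val4"])

path = os.path.join(tempfile.mkdtemp(), "rows_demo.csv")  # hypothetical file name
frame.to_csv(path, index=False)            # header plus one line per iteration

with open(path) as handle:
    lines = handle.read().splitlines()
print(lines[0])
```

If the target file already contains the header row, as input.csv does in the question, keeping mode='a' and header=False as in the answer avoids writing the header twice.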

Method 2: If you want to go with @anky's comment, keep in mind that len(Values) is about 100k. Creating a DataFrame and appending to the file on every iteration takes much more time than appending each row to a list and writing to the file once, because in that case the file is touched in a single step.

Just change:

   dataframe = pandas.DataFrame(data)

to:

   dataframe = pandas.DataFrame([data])
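As a small illustration of this one-line change, the sketch below appends one row per iteration. The row values are placeholders rather than the question's formula, and the file name `append_demo.csv` is hypothetical.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "append_demo.csv")  # hypothetical file name

for n in range(3):
    data = [n, n + 1, n + 2, n * 2]  # placeholder values, not the question's formula
    # The extra brackets make `data` a single row with four columns.
    pd.DataFrame([data]).to_csv(path, index=False, mode="a", header=False)

with open(path) as handle:
    text = handle.read()
print(text)
# 0,1,2,0
# 1,2,3,2
# 2,3,4,4
```

Each call opens the file, appends one line, and closes it again, which is why this variant gets slow for large loops.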

Time-consumption analysis

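1st case (collect rows in a list, write the file once):
0.38395023345947266 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

2nd case (create a DataFrame and append to the file on every iteration):
350.7548952102661 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)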
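A rough way to reproduce this kind of comparison on a small scale is sketched below. It is not the benchmark used for the figures above: the helper names `write_once` and `write_per_row`, the placeholder rows, the row count of 200, and the file names are all assumptions made for illustration, and the measured times will vary by machine.

```python
import os
import tempfile
import time

import pandas as pd


def write_once(rows, path):
    """Build one DataFrame from all rows and write the file in a single call."""
    pd.DataFrame(rows).to_csv(path, index=False, header=False)


def write_per_row(rows, path):
    """Build a one-row DataFrame and append to the file for every row."""
    for row in rows:
        pd.DataFrame([row]).to_csv(path, index=False, mode="a", header=False)


rows = [[n, n + 1, n * 3, n * 2] for n in range(200)]  # placeholder data
folder = tempfile.mkdtemp()
once_path = os.path.join(folder, "once.csv")
per_row_path = os.path.join(folder, "per_row.csv")

start = time.perf_counter()
write_once(rows, once_path)
once_seconds = time.perf_counter() - start

start = time.perf_counter()
write_per_row(rows, per_row_path)
per_row_seconds = time.perf_counter() - start

print(f"single write: {once_seconds:.4f} s, per-row append: {per_row_seconds:.4f} s")
```

Both functions produce the same file contents; only the number of DataFrame constructions and file opens differs.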

ThePyGuy · Accepted Answer · 2021-03-28 06:16:53Z

That's because your DataFrame looks like this:

>>> data
[1, 2, 3, 4]
>>> dataframe = pd.DataFrame(data)
>>> dataframe
   0
0  1
1  2
2  3
3  4

You can transpose the DataFrame to get it in the form you want:

>>> dataframe.T
   0  1  2  3
0  1  2  3  4

Or you can nest the data list inside another list:

>>> dataframe = pd.DataFrame([data])
>>> dataframe
   0  1  2  3
0  1  2  3  4
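The difference between the layouts can also be checked by comparing shapes and the CSV text each one produces. This is a minimal sketch; the variable names `as_column`, `as_row`, and `transposed` are made up for illustration.

```python
import pandas as pd

data = [1, 2, 3, 4]

as_column = pd.DataFrame(data)    # flat list: 4 rows, 1 column
as_row = pd.DataFrame([data])     # nested list: 1 row, 4 columns
transposed = as_column.T          # transposing the column gives the row layout

print(as_column.shape)   # (4, 1)
print(as_row.shape)      # (1, 4)
print(transposed.shape)  # (1, 4)

# to_csv() called without a path returns the CSV text as a string.
row_csv = as_row.to_csv(index=False, header=False)
column_csv = as_column.to_csv(index=False, header=False)
print(row_csv.split())     # ['1,2,3,4']
print(column_csv.split())  # ['1', '2', '3', '4']
```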

