How to add a new column to a CSV file?

Question

I have several CSV files that look like this:

Input
Name        Code
blackberry  1
wineberry   2
rasberry    1
blueberry   1
mulberry    2

I would like to add a new column to all CSV files so that it would look like this:

Output
Name        Code    Berry
blackberry  1   blackberry
wineberry   2   wineberry
rasberry    1   rasberry
blueberry   1   blueberry
mulberry    2   mulberry

The script I have so far is this:

import csv
with open('input.csv','r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            writer.writerow(row+['Berry'])

(Python 3.2)

But in the output, the script puts a blank line after every row and the new column contains only Berry:

Output
Name        Code    Berry
blackberry  1   Berry

wineberry   2   Berry

rasberry    1   Berry

blueberry   1   Berry

mulberry    2   Berry

Possible duplicate of "Copy one column to another but with different header".
– Martijn Pieters, Commented Jun 17, 2012 at 10:12
Is it possible that you only have 'Berry' in your last column because you are only writing 'Berry' to the file (row+['Berry'])? What did you expect to write?
– Dhara, Commented Jun 17, 2012 at 10:16
@Dhara: I would like to have Berry as the header and the Name column value as the row value for Berry. See above.
– fairyberry, Commented Jun 17, 2012 at 10:31

joaquin · Accepted Answer · 2016-07-01 05:51:05Z

111

This should give you an idea of what to do:

>>> import csv
>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = r.next()
>>> row0.append('berry')
>>> print row0
['Name', 'Code', 'berry']
>>> for item in r:
...     item.append(item[0])
...     print item
...     
['blackberry', '1', 'blackberry']
['wineberry', '2', 'wineberry']
['rasberry', '1', 'rasberry']
['blueberry', '1', 'blueberry']
['mulberry', '2', 'mulberry']
>>>

Edit: note that in Python 3 (py3k) you must use next(r) instead of r.next(), and print(row0) instead of print row0.

Thanks for accepting the answer. Here you have a bonus (your working script):

import csv

with open('C:/test/test.csv','r') as csvinput:
    with open('C:/test/output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        reader = csv.reader(csvinput)

        rows = []
        row = next(reader)
        row.append('Berry')
        rows.append(row)

        for row in reader:
            row.append(row[0])
            rows.append(row)

        writer.writerows(rows)

Please note

the lineterminator parameter in csv.writer. By default it is set to '\r\n'; when the output file is opened in text mode on Windows, each '\n' is additionally translated to '\r\n', so every row ends in '\r\r\n', and this is why you get double spacing. In Python 3 you can instead open the output file with newline='' and leave lineterminator alone.
the use of a list to collect all the rows and write them in one shot with writerows. If your file is very big this is probably not a good idea (RAM), but for normal files I think it is faster because there is less I/O.
as indicated in the comments to this post, instead of nesting the two with statements, you can put them on the same line:

with open('C:/test/test.csv','r') as csvinput, open('C:/test/output.csv', 'w') as csvoutput:
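For Python 3, a streaming variant of the script above could look roughly like the sketch below. It assumes a comma-separated input whose first row is the header; the function name add_berry_column is made up for illustration. Opening both files with newline='' leaves line endings to the csv module, so the blank lines disappear without touching lineterminator, and each row is written as soon as it is read, so memory use does not grow with the file.

```python
import csv


def add_berry_column(src_path, dst_path, header="Berry"):
    """Copy src_path to dst_path, appending a column that repeats column 0."""
    # newline='' lets the csv module control line endings itself,
    # which prevents the extra blank lines seen on Windows.
    with open(src_path, "r", newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # The first row is the header: append the new column name.
        first = next(reader, None)
        if first is None:
            return  # empty input, nothing to copy
        writer.writerow(first + [header])

        # Every other row gets its own first field appended.
        for row in reader:
            if not row:
                continue  # csv.reader yields [] for blank lines; skip them
            writer.writerow(row + [row[0]])
```

Since the question mentions several files, the function can be called once per file, for example in a loop over pathlib.Path('.').glob('*.csv'), writing each result under a different name or into a different directory.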

edited Jul 1, 2016 at 5:51

answered Jun 17, 2012 at 10:32

joaquin

6 Comments

fairyberry Over a year ago

thanks for the note. I tried and it gives me attribute error: '_csv.reader' object has no attribute 'next'. Do you have any idea?

joaquin Over a year ago

I see you are in py3k. then you must use next(r) instead of r.next()

fairyberry Over a year ago

@joaquin: OMG. Thanks for the bonus!!

Caumons Over a year ago

Note: instead of nesting with statements, you can put them on the same line, separated by a comma, e.g.: with open(input_filename) as input_file, open(output_filename, 'w') as output_file

joaquin Over a year ago

@Caumons You are right and this would be nowadays the way to go. Note my answer tried to keep the OP code structure to focus on the solution to his problem.


Jough Dempsey · Accepted Answer · 2017-02-27 18:39:43Z

86

I'm surprised no one suggested Pandas. Although pulling in a dependency like Pandas might seem heavy-handed for such an easy task, it produces a very short script, and Pandas is a great library for all sorts of CSV (and really all kinds of data) manipulation. Can't argue with 4 lines of code:

import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)

Check out Pandas Website for more information!

Contents of output.csv:

Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry

edited Feb 27, 2017 at 18:39

Jough Dempsey

answered Dec 27, 2015 at 23:26

Blairg23

13 Comments

Ankit Maheshwari Over a year ago

How to update or add new column in same csv?? input.csv??

Blairg23 Over a year ago

@AnkitMaheshwari, change the name of output.csv in this example to input.csv. It will do the same thing, but output to input.csv.

Blairg23 Over a year ago

@AnkitMaheshwari Yes... that is the intended functionality. You want to replace the old content (the content with Name and Code) with the new content which has the same two columns from the old content PLUS a new column with Berries, as the OP asked.

pedrostrusso Over a year ago

A word of caution: Pandas is great for decently sized files. This answer will load all the data into memory, which can be troublesome for large files.

Blairg23 Over a year ago

@pedrostrusso But unless you're loading 4-16 gb files, you should be good on RAM. Unless you use a potato.


jgritty · Accepted Answer · 2012-06-17 10:30:23Z

18

import csv
with open('input.csv','r') as csvinput:
    # newline='' keeps the writer's '\r\n' from becoming '\r\r\n' on Windows (Python 3)
    with open('output.csv', 'w', newline='') as csvoutput:
        writer = csv.writer(csvoutput)

        for row in csv.reader(csvinput):
            if row[0] == "Name":
                writer.writerow(row+["Berry"])
            else:
                writer.writerow(row+[row[0]])

Maybe something like that is what you intended?

Also, CSV stands for comma-separated values, so I think you need commas to separate your values, like this:

Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2
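If the files really are laid out as in the question, with runs of spaces between the fields rather than commas, csv.reader will treat each line as a single field. A minimal sketch for that case follows; it assumes that no field contains whitespace itself, and the function name whitespace_table_to_csv is only illustrative.

```python
import csv


def whitespace_table_to_csv(src_path, dst_path, header="Berry"):
    """Read a whitespace-separated table and write it as CSV with one
    extra column that repeats the first field of each row."""
    with open(src_path, "r") as src, \
         open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        first_row = True
        for line in src:
            fields = line.split()  # splits on any run of spaces or tabs
            if not fields:
                continue  # skip blank lines
            # The first non-blank line is the header row.
            extra = header if first_row else fields[0]
            writer.writerow(fields + [extra])
            first_row = False
```

The output is then a regular comma-separated file that the other answers here can read directly.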

edited Jun 17, 2012 at 10:30

answered Jun 17, 2012 at 10:25

jgritty

2 Comments

jgritty Over a year ago

Create a new question on stack overflow.

pedrostrusso Over a year ago

This should be the accepted answer, since it doesn't put all of the input rows into memory at once.

Tpk43 · Accepted Answer · 2019-02-06 07:50:23Z

8

Yes, it's an old question, but this might help someone:

import csv
import uuid

# Read a pipe-delimited file and write a copy with one extra column
with open('in_file','r') as r_csvfile:
    with open('out_file','w',newline='') as w_csvfile:

        dict_reader = csv.DictReader(r_csvfile,delimiter='|')
        # Add the new column name to the existing field names
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']
        writer_csv = csv.DictWriter(w_csvfile,fieldnames,delimiter='|')
        writer_csv.writeheader()

        for row in dict_reader:
            # Filler value only: the first six decimal digits of the
            # upper 64 bits of a random UUID
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64)[0:6]
            writer_csv.writerow(row)
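The same DictReader/DictWriter pattern can be wrapped in a small helper when the new value should be computed from each row. This is only a sketch: add_computed_column and its parameters are hypothetical names, and the comma default for the delimiter is an assumption.

```python
import csv


def add_computed_column(in_path, out_path, column, compute, delimiter=","):
    """Write a copy of in_path with one extra column.

    compute is called with each row (a dict keyed by header name)
    and returns the value for the new column.
    """
    with open(in_path, "r", newline="") as r_file, \
         open(out_path, "w", newline="") as w_file:
        reader = csv.DictReader(r_file, delimiter=delimiter)
        # fieldnames is None for an empty file, so fall back to an empty list.
        fieldnames = list(reader.fieldnames or []) + [column]
        writer = csv.DictWriter(w_file, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        for row in reader:
            row[column] = compute(row)
            writer.writerow(row)
```

For the question's data, a call such as add_computed_column('input.csv', 'output.csv', 'Berry', lambda row: row['Name']) would copy the Name column into the new Berry column.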

answered Feb 6, 2019 at 7:50

Tpk43

3 Comments

Nikos Alexandris Over a year ago

Any comment on the use of the uuid?

Tpk43 Over a year ago

Just to add some random data to the column, no specifications!!!

Ahmad Over a year ago

thanks, It was useful in the case that columns have new values (not from existing rows), so it's a general solution.

enigma · Accepted Answer · 2017-12-08 07:14:42Z

I used pandas and it worked well. In my case I had to open a file, add some columns to it, and then save the result back to the same file.

This code adds multiple columns; you may edit it as much as you need.

import pandas as pd

csv_input = pd.read_csv('testcase.csv')         #reading my csv file
csv_input['Phone1'] = csv_input['Name']         #this would also copy the cell value 
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False)   #this writes back to your file

If you don't want the cell values to be copied, first create an empty column in your CSV file manually, named for example Hours. Then you can add this line to the code above:

csv_input['New Value'] = csv_input['Hours']

Or, more simply, without adding the column manually:

csv_input['New Value'] = ''    #simple and easy

I hope it helps.

RealThingMayNotBeElegant · Accepted Answer · 2022-05-03 16:08:27Z

4

You may just write:

import pandas as pd
import csv
df = pd.read_csv('csv_name.csv')
df['Berry'] = df['Name']
df.to_csv("csv_name.csv",index=False)

Then you are done. To check it, you may run:

h = pd.read_csv('csv_name.csv') 
print(h)

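If you want to add a column of arbitrary new elements (a, b, c), you may replace the 4th line of the code with the following; note that the list must contain exactly one value per data row, otherwise pandas raises a ValueError: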

df['Berry'] = ['a','b','c']

edited May 3, 2022 at 16:08

answered May 3, 2022 at 15:51

RealThingMayNotBeElegant


Ashwaq · Accepted Answer · 2018-04-19 04:36:08Z

3

This code should satisfy your request; I have tested it on the sample.

import csv

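# in_path and out_path are placeholders for your input and output file names
with open(in_path, 'r') as f_in, open(out_path, 'w', newline='') as f_out:
    # the input is assumed to be semicolon-delimited; change delimiter to match your file
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
        # append a copy of the first field as the new last column
        writer.writerow(row + [row[0]])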

answered Apr 19, 2018 at 4:36

Ashwaq


dna-data · Accepted Answer · 2020-11-05 20:59:47Z

To add a new column to an existing CSV file (with headers), when the column to be added has a small enough number of values to fit in a list, here is a convenient function (somewhat similar to @joaquin's solution). The function takes:

the existing CSV filename,
the output CSV filename (which will have the updated content), and
a list containing the header name followed by the column values.

import csv

def add_col_to_csv(csvfile,fileout,new_list):
    with open(csvfile, 'r') as read_f, \
        open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        # new_list[0] is the header; new_list[i] goes with the i-th row
        for i, row in enumerate(csv_reader):
            row.append(new_list[i])
            csv_writer.writerow(row)

Example:

new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)
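As written, add_col_to_csv raises an IndexError partway through if new_list has fewer entries than the file has rows, leaving a partial output file behind. A hedged variant that checks the lengths before writing anything is sketched below; it reads all rows into memory first, which fits the "small enough" assumption above. The name add_col_to_csv_checked is illustrative.

```python
import csv


def add_col_to_csv_checked(csvfile, fileout, new_list):
    """Append new_list (header first, then one value per row) as a column."""
    with open(csvfile, "r", newline="") as read_f:
        rows = list(csv.reader(read_f))

    # Fail early, before the output file is created.
    if len(rows) != len(new_list):
        raise ValueError(
            "expected %d values (including the header), got %d"
            % (len(rows), len(new_list))
        )

    with open(fileout, "w", newline="") as write_f:
        writer = csv.writer(write_f)
        for row, value in zip(rows, new_list):
            writer.writerow(row + [value])
```

It is called the same way as the original function, with the header name as the first element of the list.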

Existing CSV file: [image not available]

Output (updated) CSV file: [image not available]

manicphase · Accepted Answer · 2012-06-17 10:36:24Z

2

I don't see where you're adding the new column, but try this:

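    import csv
    i = 0
    # newcolumn.csv holds the new column, one value per line (header first)
    Berry = open("newcolumn.csv","r").read().splitlines()
    with open('input.csv','r') as csvinput:
        with open('output.csv', 'w', newline='') as csvoutput:
            writer = csv.writer(csvoutput)
            for row in csv.reader(csvinput):
                # row is a list, so the new value is appended as a list element
                writer.writerow(row+[Berry[i]])
                i += 1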

answered Jun 17, 2012 at 10:36

manicphase


M. Perier--Dulhoste · Accepted Answer · 2020-09-02 18:58:31Z

For a large file you can use pandas.read_csv with the chunksize argument, which reads the dataset one chunk at a time:

import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    # index=False keeps the DataFrame's row index out of the output file
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode, index=False)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

If you want to update the file in place, you can write to a temporary file and replace the original with it at the end:

import os
import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    # index=False keeps the DataFrame's row index out of the output file
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode, index=False)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

os.replace(TMP_CSV, INPUT_CSV)
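Without pandas, the same in-place update can be sketched with the standard library alone: write to a temporary file in the same directory, then swap it in with os.replace. The function name add_column_in_place and the choice to copy the first column are assumptions made for illustration.

```python
import csv
import os
import tempfile


def add_column_in_place(path, header="Berry"):
    """Rewrite path with an extra column repeating each row's first field."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as dst, \
             open(path, "r", newline="") as src:
            writer = csv.writer(dst)
            for index, row in enumerate(csv.reader(src)):
                if not row:
                    continue  # skip blank lines
                # Row 0 is treated as the header row.
                writer.writerow(row + [header if index == 0 else row[0]])
        # Only reached if the whole file was written successfully.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up on any failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Rows are streamed one at a time, so this also works for files that do not fit in memory.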

Dilip Kumar Choudhary · Accepted Answer · 2020-07-24 11:31:51Z

1

Append a new column to an existing CSV file using Python, without a header name:

import csv

default_text = 'Some Text'
# Open the input file in read mode and the output file in write mode
with open('problem-one-answer.csv', 'r') as read_obj, \
        open('output_1.csv', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = csv.reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
    # Read each row of the input CSV file as a list
    for row in csv_reader:
        # Append the default text to the row
        row.append(default_text)
        # Write the updated row to the output file
        csv_writer.writerow(row)

Thank you.

answered Jul 24, 2020 at 11:31

Dilip Kumar Choudhary
