0

hey I'm trying to extract certain row from a CSV file with content in this form:

POS,Transaction id,Product,Quantity,Customer,Date
1,E100,TV,1,Test Customer,2022-09-19
2,E100,Laptop,3,Test Customer,2022-09-20
3,E200,TV,1,Test Customer,2022-09-21
4,E300,Smartphone,2,Test Customer,2022-09-22
5,E300,Laptop,5,New Customer,2022-09-23
6,E300,TV,1,New Customer,2022-09-23
7,E400,TV,2,ABC,2022-09-24
8,E500,Smartwatch,4,ABC,2022-09-25

the code I wrote is the following

def csv_upload_view(request):
    print('file is being uploaded')

    if request.method == 'POST':
        csv_file = request.FILES.get('file')
        obj = CSV.objects.create(file_name=csv_file)

        with open(obj.file_name.path, 'r') as f:
            reader = csv.reader(f)
            reader.__next__()
            for  row in reader:
                data = "".join(row)
                data = data.split(";")
                #data.pop()
                print(data[0], type(data))
                transaction_id = data[0]
                product = data[1]
                quantity = int(data[2])
                customer = data[3]
                date = parse_date(data[4])

In the console then I get the following output:

Quit the server with CONTROL-C.
[22/Sep/2022 15:16:28] "GET /reports/from-file/ HTTP/1.1" 200 11719
file is being uploaded
1E100TV1Test Customer2022-09-19 <class 'list'>

So that I get the correct row put everything concatenated. If instead I put in a space in the " ".join.row I get the entire row separated with empty spaces - what I would like to do is access this row with

transaction_id = data[0]
                product = data[1]
                quantity = int(data[2])
                customer = data[3]
                date = parse_date(data[4])

but I always get an

IndexError: list index out of range

I also tried with data.replace(" ",";") but this gives me another error and the data type becomes a string instead of a list:

ValueError: invalid literal for int() with base 10: 'E'

Can someone please show me what I'm missing here?

2 Answers 2

2

I'm not sure why you are joining/splitting the row up. And you realize your split is using a semicolon?

I would expect something like this:

import csv
from collections import namedtuple

Transaction = namedtuple('Transaction', ['id', 'product', 'qty', 'customer', 'date'])

f_name = 'data.csv'
transactions = []  # to hold the result
with open(f_name, 'r') as src:
    src.readline()  # burn the header row
    reader = csv.reader(src)   # if you want to use csv reader
    for data in reader:
        #print(data)  <-- to see what the csv reader gives you...
        t = Transaction(data[1], data[2], int(data[3]), data[4], data[5])
        transactions.append(t)

for t in transactions:
    print(t)

The above "catches" results with a namedtuple, which is obviously optional. You could put them in lists, etc.

Also csv.reader will do the splitting (by comma) by default. I edited my previous answer.

As far as your question goes... You mention extracting a "certain row" but you gave no indication how you would find such row. If you know the row index/number, you could burn lines with readline or such, or just keep a counter while you read. If you are looking for keyword in the data, just pop a conditional statement in either before or after splitting up the line.

Sign up to request clarification or add additional context in comments.

4 Comments

thanks for your reply, just got back to my laptop, when I do this, I get an attribute error saying: AttributeError: 'list' object has no attribute 'strip' after iterating through with for row in reader: data = row.strip().split(',') print(data[0], type(data))
Yeah, I dorked that up. I forgot that csv reader spits out lists and that is already done for you. I usually don't use csv reader and DIY it, but reader is good. Anyhow, answer has been modified above.
with data = str(row).strip().split(',') it actually works
well sure, but you are undoing what has already been done, which is nonsensical. If you use csv reader, don't bother with that line....
1

This way you can split the rows (and find which row you want based on some provided value)

with open('data.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter = ',')

    line_count = 0

    for row in csv_reader:
        # Line 0 is the header

        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        
        else:
            line_count += 1
            # Here you can check if the row value is equal what you're finding
            # row[0] = POS
            # row[1] = Transaction id
            # row[2] = Product
            # row[3] = Quantity
            # row[4] = Customer
            # row[5] = Date

            if row[2] = "TV":
                #If you want to add all variables into a single string:
                data = ",".join(row) 

                # Make each row into a single variable:
                transaction_id = row[0]
                product = row[1]
                quantity = row[2]
                customer = row[3]
                date = row[4]

1 Comment

I think it's working now, need to check later with more time if everything works as intended but I can already access the items thank you :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.