
There are many Stack Overflow questions about this error when reading from a CSV file. My problem occurs while reading from STDIN.

Most of those solutions involve tweaking the open() call, which works when opening a CSV file directly but not when reading it through STDIN. So please don't mark this as a duplicate.

My Python code is:

import sys , csv
def main(argv): 
    reader = csv.reader(sys.stdin, delimiter=',')
    for line in reader:
        print line

and the returned error is:

Traceback (most recent call last):
File "mapper.py", line 19, in <module>
    main(sys.argv) 
File "mapper.py", line 4, in main
    for line in reader:
_csv.Error: line contains NULL byte

It would be enough for me to simply skip the line where the NULL byte occurs in the for loop, if that is possible.

  • And what did you pipe in then? The Python CSV reader doesn't support UTF-16 or UTF-32 data, for example. Commented Nov 24, 2014 at 9:54
  • I piped in a CSV file with cat log.csv | python mapper.py. I opened the CSV file in Sublime and looked at the line where the problem occurred. That line contained the following: 2013-12-18,2013-12-18 08:19:15.0,2778,1003,1328,6112,116.68.205.197,http://jobs.example.com/jobapply_confirm.asp?mclcIbl[f^S_d[=am]np_&%604^35db__j^8NUL]21ad]_bacs_%20%60uaic.^M]e%60a]p%60a^1[5a1=^iapa&a1b3%602a=ci_ua8,;Firefox;25.0;Windows XP;;;,1024x768,view,438,2q8cfvt4,Undefined,BD,2q92rpej,returning,1382929479,1387351637,0,0,0 Notice the substring NUL in it? Commented Nov 24, 2014 at 10:07
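To confirm which lines carry a NUL byte without opening the file in an editor, something like the following might help. It is only a sketch in Python 3 (the code in this thread is Python 2); `find_nul_lines` and the sample data are hypothetical names made up for illustration.

```python
import io

def find_nul_lines(stream):
    """Return the 1-based numbers of the lines in a binary stream that contain a NUL byte."""
    return [number for number, line in enumerate(stream, start=1) if b"\x00" in line]

# Hypothetical sample input: the second line contains a NUL byte.
sample = io.BytesIO(b"a,b,c\n1,\x002,3\n4,5,6\n")
nul_lines = find_nul_lines(sample)
print(nul_lines)
```

For piped input, passing `sys.stdin.buffer` instead of the sample stream should scan the raw bytes arriving on STDIN.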

1 Answer


I solved it by catching the csv.Error exception and skipping the offending line:

import sys, csv

def main(argv):
    reader = csv.reader(sys.stdin, delimiter=',')
    lineCount = 0
    errorCount = 0
    while True:
        # keep looping until next() signals the end of the reader (an iterator)
        try:
            lineCount += 1
            line = next(reader)
            print("%d - %s" % (lineCount, line))
        except csv.Error:
            # raised for a malformed line (e.g. one containing a NULL byte); count it and continue
            errorCount += 1
            continue
        except StopIteration:
            # raised when next() reaches the end of the iterator; undo the extra increment
            lineCount -= 1
            break
    print("total line: %d" % lineCount)
    print("total error: %d" % errorCount)

if __name__ == "__main__":
    main(sys.argv)
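If dropping the whole row is not desirable, an alternative is to strip the NUL characters from each line before the csv module sees it, so the row is still parsed. This is a minimal sketch in Python 3, not part of the original answer; `read_rows_without_nul` and the sample data are hypothetical names used only for illustration.

```python
import csv
import io

def read_rows_without_nul(stream, delimiter=","):
    """Yield parsed CSV rows after removing NUL characters from each input line."""
    # Lazily clean each line so the whole input never has to fit in memory.
    cleaned = (line.replace("\0", "") for line in stream)
    for row in csv.reader(cleaned, delimiter=delimiter):
        yield row

# Hypothetical sample input: the second line contains a NUL character.
sample = io.StringIO("a,b,c\n1,2\0x,3\n4,5,6\n")
rows = list(read_rows_without_nul(sample))
for row in rows:
    print(row)
```

Passing `sys.stdin` in place of the sample stream should work the same way for piped input. The trade-off is that the affected field is silently altered rather than skipped, so counting or logging the cleaned lines may still be worthwhile.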