There are many StackOverflow questions about this error when reading from a CSV file, but my problem occurs while reading from STDIN.
Most of the solutions on SO involve tweaking the open() call, which works when you open the CSV file yourself but not when the data arrives through STDIN. So please don't mark this as a duplicate.
My Python code is:

    import sys, csv

    def main(argv):
        # Parse comma-separated rows straight from STDIN
        reader = csv.reader(sys.stdin, delimiter=',')
        for line in reader:
            print line

    if __name__ == "__main__":
        main(sys.argv)

and the returned error is:

    Traceback (most recent call last):
      File "mapper.py", line 19, in <module>
        main(sys.argv)
      File "mapper.py", line 4, in main
        for line in reader:
    _csv.Error: line contains NULL byte

It would be enough for me to simply skip any line that contains a NULL byte inside the for loop, if that is possible.
I run the script as cat log.csv | python mapper.py. I opened the CSV file in Sublime and looked at the line number where the problem occurred. That line contained the following:

2013-12-18,2013-12-18 08:19:15.0,2778,1003,1328,6112,116.68.205.197,http://jobs.example.com/jobapply_confirm.asp?mclcIbl[f^S_d[=am]np_&%604^35db__j^8NUL]21ad]_bacs_%20%60uaic.^M]e%60a]p%60a^1[5a1=^iapa&a1b3%602a=ci_ua8,;Firefox;25.0;Windows XP;;;,1024x768,view,438,2q8cfvt4,Undefined,BD,2q92rpej,returning,1382929479,1387351637,0,0,0

Notice the substring NUL in it?
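
To make the "just ignore that line" idea concrete, here is a minimal sketch of one way it might be done: filter the stream through a generator so the csv module never sees a line containing a NUL character. The names skip_nul_lines and rows_without_nul are hypothetical helpers, not part of the csv module, and the sketch assumes that dropping the whole physical line is acceptable and that no quoted field spans several lines (otherwise dropping one line could corrupt the neighbouring record).

```python
import csv


def skip_nul_lines(lines):
    """Yield only the lines that contain no NUL character.

    Hypothetical helper: `lines` is any iterable of text lines,
    for example sys.stdin.
    """
    for line in lines:
        if "\x00" not in line:
            yield line


def rows_without_nul(stream, delimiter=","):
    """Return a csv reader over `stream` with NUL-containing lines dropped.

    Hypothetical helper: csv.reader accepts any iterable of strings,
    so it can consume the filtering generator directly.
    """
    return csv.reader(skip_nul_lines(stream), delimiter=delimiter)
```

In mapper.py this would presumably replace the csv.reader(sys.stdin, delimiter=',') call with rows_without_nul(sys.stdin). If the rest of the line should be kept rather than skipped, the generator could yield line.replace("\x00", "") instead of filtering.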