
This should have been such an easy thing... but I can't for the life of me figure out how to parse a CSV file that doesn't seem to have a specific encoding.

File.open(Rails.root.join('data', 'mike/test-csv.csv'), 'rb') { |f| f.read }
=> "ID,\x00Q\x00u\x00a\x00n\x00t\x00i\x00t\x00y\n\x006\x00e\x005\x004\x009\x001\x00e\x007\x00-\x007\x00f\x001\x005\x00-\x004\x001\x007\x00d\x00-\x00a\x004\x000\x003\x00-345\x00,\x00\x005\x000\x00.\x000\x000\x000\x000\x000\x000\x000\x000\x00\n"

Here's a gist of it; I can't figure out a way to post the specific CSV.

All I get from checking the encoding of the file is that it's in binary format. Any thoughts on how I could get it into a normal CSV?

Note: This is a downloaded CSV, so converting it to another encoding by opening it in Excel and exporting (or something like that) is not an option :)

Thanks!

Updating with attempted solution 1:

path = Rails.root.join('data', 'mike/test-csv.csv')
CSV.read(path, headers: true, encoding: 'utf-8').each do |d|
  puts d
end
Result: 6e5491e7-7f15-417d-a403-345,50.00000000

While this looks correct, it ONLY works with puts. For example:

CSV.read(path, headers: true, encoding: 'utf-8').map { |row| row }
=> [#<CSV::Row "ID":"\u00006\u0000e\u00005\u00004\u00009\u00001\u0000e\u00007\u0000-\u00007\u0000f\u00001\u00005\u0000-\u00004\u00001\u00007\u0000d\u0000-\u0000a\u00004\u00000\u00003\u0000-345\u0000" "\u0000Q\u0000u\u0000a\u0000n\u0000t\u0000i\u0000t\u0000y":"\u0000\u00005\u00000\u0000.\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u0000">]

CSV.read(path, headers: true, encoding: 'utf-8').map(&:to_s)
=> ["\u00006\u0000e\u00005\u00004\u00009\u00001\u0000e\u00007\u0000-\u00007\u0000f\u00001\u00005\u0000-\u00004\u00001\u00007\u0000d\u0000-\u0000a\u00004\u00000\u00003\u0000-345\u0000,\u0000\u00005\u00000\u0000.\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u00000\u0000\n"]

It's unfortunately still not the correct string :(
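
A likely explanation for the difference: puts writes the NUL characters to the terminal as-is, and most terminals draw nothing for them, while irb shows return values via inspect, which escapes them as \u0000. Here is a minimal sketch of that, using a shortened made-up string modeled on the output above (not the real file contents):

```ruby
# Shortened sample with embedded NUL characters (an assumption for
# illustration, modeled on the Quantity value shown above).
raw = "\u00005\u00000\u0000.\u00000\u0000"

# puts sends the NULs to the terminal unchanged; they are usually invisible,
# so the text only looks clean.
puts raw

# inspect escapes non-printable characters, so the NULs become visible.
puts raw.inspect

# Counting and deleting the NUL characters confirms what is really in the string.
nul_count = raw.count("\u0000")
cleaned   = raw.delete("\u0000")
puts "#{nul_count} NUL characters, cleaned: #{cleaned.inspect}"
```

So the string is the same in both cases; only the way it is displayed differs.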

Final Solution (via @ashmaroli below):

path = Rails.root.join('data', 'mike/test-csv.csv')
csv_text = ''

File.open(path, 'r') do |csv|
  csv.each_line do |line|
    csv_text << line.gsub(/\u0000/, '')
  end
end

CSV.parse(csv_text, headers: true).map { |row| row }

Result:

[#<CSV::Row "ID":"6e5491e7-7f15-417d-a403-345" "Quantity":"50.00000000">]

Github Gist

Download Example CSV File

1 Answer

path = Rails.root.join('data', 'mike/test-csv.csv')
file = ""

File.open(path, 'r') do |csv|
  csv.each_line do |line|
    file << line.gsub(/\u0000/, '')
  end
end
print file
print file.inspect # same as above just wraps the string in a
                   # single line with "\n" chars
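
The same idea can be written more compactly by reading the raw bytes once and deleting the NUL bytes before parsing. This is only a sketch: parse_csv_without_nuls is a hypothetical helper name, the sample bytes are a shortened imitation of the question's file, and it assumes that whatever remains after stripping is valid UTF-8.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: read the file as raw bytes, drop every NUL byte,
# then hand the remaining text to CSV.parse.
def parse_csv_without_nuls(path)
  bytes = File.binread(path)             # binary read, no transcoding
  text  = bytes.delete("\x00")           # remove all NUL bytes
  text.force_encoding(Encoding::UTF_8)   # assumption: the rest is valid UTF-8
  CSV.parse(text, headers: true)
end

# Shortened sample modeled on the bytes shown in the question (not the real file).
sample = "ID,\x00Q\x00u\x00a\x00n\x00t\x00i\x00t\x00y\n" \
         "\x006\x00e\x005\x00-345\x00,\x00\x005\x000\x00.\x000\x00\n"

# Write the sample to a temporary file, parse it, and collect rows as hashes.
rows = Tempfile.create("nul-sample") do |f|
  f.binmode
  f.write(sample)
  f.flush
  parse_csv_without_nuls(f.path).map(&:to_h)
end

p rows
```

Note that this simply discards the NUL bytes rather than decoding the file from a known encoding, so it is a workaround that fits files like this one, not a general fix for encoding problems.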

6 Comments

thanks for the comment! Unfortunately, this doesn't work :( I'll update my post with the results
if the puts command outputs the right string, then perhaps the following should work as well: CSV.read(path, {:headers => true, :encoding => 'utf-8'}).map(&:to_s)
appreciate the effort! Just pasted that output as an edit to my post. Seems to still have an encoding on it. The gist I pasted in there should have the same encoding (if the github gist maintains it) if you want to test that in irb?
Also added a link to the actual file in there too!
added another update.. this time, using File and gsub