
I need to import CSV data from inside a zip file into my Product model using the activerecord-import and rubyzip gems.

This code works (it downloads the zip and prints the name of each entry, here the CSV file):

desc "Import products data from web"
task import_product: :environment do
    url = "https://example.com"
    dir = "db/example_zip.zip"

    File.open(dir, "wb") do |f|
        f.write HTTParty.get(url).body
    end

    Zip::File.open(dir) do |zip|
        zip.each do |entry|
            puts entry.name
        end
    end
end

Inside the zip.each loop, I tried this:

items = []
CSV.foreach(entry, headers: true) do |row|
  items << Item.new(row.to_h)
end
Item.import(items)

I get the following error: TypeError: no implicit conversion of Zip::Entry into String

The import code is based on this tutorial: https://mattboldt.com/importing-massive-data-into-rails/

What is the best way to refresh my product model data with this csv? Do I have to read the file into memory (entry.get_input_stream.read) or save the file then import it?

Thanks for your help

2 Answers

The exception TypeError: no implicit conversion of Zip::Entry into String is raised because the CSV.foreach method expects a file path (a String) as its argument, but you passed it a Zip::Entry object instead.
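To see the difference in isolation, here is a minimal sketch that needs neither Rails nor rubyzip. FakeEntry is a hypothetical stand-in for Zip::Entry, and the StringIO plays the role of an IO-like stream; the CSV content is made up for the demo.

```ruby
require "csv"
require "stringio"

# FakeEntry is a hypothetical stand-in for Zip::Entry: an object that is
# neither a String path nor an IO.
FakeEntry = Struct.new(:name)

csv_text = "name,price\nWidget,9.99\nGadget,19.50\n"

# CSV.foreach tries to open its argument as a file path, so a non-String
# object fails with a TypeError before any parsing happens.
error_class =
  begin
    CSV.foreach(FakeEntry.new("products.csv"), headers: true) { |row| row }
    nil
  rescue TypeError => e
    e.class
  end

# CSV.parse works on a String that is already in memory.
rows_from_string = CSV.parse(csv_text, headers: true).map(&:to_h)

# CSV.new also accepts an IO-like object and reads rows from it.
rows_from_io = CSV.new(StringIO.new(csv_text), headers: true).map(&:to_h)
```

So the fix is to hand CSV either the entry's contents as a String or a real file path, never the entry object itself.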

You can simply read each entry's content from the zip file directly into memory:

Zip::File.open(dir) do |zip|
  zip.each do |entry|
    items = []
    CSV.new(entry.get_input_stream.read, headers: true).each do |row|
      items << Item.new(row.to_h)
    end
    Item.import(items)
  end
end

Or, if the CSV file is too big to read in one go, you can extract the decompressed files to disk and then use CSV.foreach to load them:

Zip::File.open(dir) do |zip|
  zip.each do |entry|
    csv_file = File.join(File.dirname(dir), entry.name)
    entry.extract(csv_file)
    items = []
    CSV.foreach(csv_file, headers: true) do |row|
      items << Item.new(row.to_h)
    end
    Item.import(items)
  end
end
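Note that both versions above still collect every Item in the items array before importing. If that array itself gets too large, one option is to import in batches. The following is only a sketch under stated assumptions: the StringIO stands in for the extracted CSV file, import_batch is a hypothetical stand-in for Item.import, and the batch size is deliberately tiny.

```ruby
require "csv"
require "stringio"

BATCH_SIZE = 2 # tiny for the demo; a value such as 1_000 is more typical

# StringIO stands in for an IO-like source such as the extracted CSV file.
io = StringIO.new("name,price\nA,1\nB,2\nC,3\nD,4\nE,5\n")

# Hypothetical stand-in for Item.import: it only records each batch size.
imported_batch_sizes = []
import_batch = ->(attribute_hashes) { imported_batch_sizes << attribute_hashes.size }

# CSV includes Enumerable, so each_slice yields the rows in groups while
# the source is being read, instead of building one big array first.
CSV.new(io, headers: true).each_slice(BATCH_SIZE) do |rows|
  import_batch.call(rows.map(&:to_h))
end
```

In the rake task, the StringIO would presumably become File.open(csv_file), and the lambda call would become something like Item.import(rows.map { |row| Item.new(row.to_h) }).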



Finally, here is my code to download a zip file and import its data into my Product model:

require 'zip'
require 'csv'
require 'httparty'
require 'active_record'
require 'activerecord-import'

namespace :affiliate_datafeed do
    desc "Import products data from Awin"
    task import_product_awin: :environment do
        url = "https://productdata.awin.com"
        dir = "db/affiliate_datafeed/awin.zip"

        File.open(dir, "wb") do |f| 
            f.write HTTParty.get(url).body
        end

        products = []

        # The block form closes the archive once the CSV has been read.
        Zip::File.open(dir) do |zip_file|
            entry = zip_file.glob('*.csv').first
            csv_text = entry.get_input_stream.read

            CSV.parse(csv_text, headers: true).each do |row|
                products << Product.new(row.to_h)
            end
        end
        Product.import(products)
    end
end

The next question is: how do I update the products table only when a product doesn't exist yet or when its last_updated field has a newer date? What is the best way to refresh a large database? Thanks
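One possible starting point, sketched here in plain Ruby, is to filter the CSV rows before importing them. This is not a definitive answer: the product_id column, the sample data, and the existing hash are assumptions made for illustration. In Rails, the hash could come from something like Product.pluck(:product_id, :last_updated).to_h.

```ruby
require "csv"
require "time"

# Hypothetical snapshot of what is already stored: product key => last_updated.
existing = {
  "1" => Time.parse("2024-01-01 00:00:00 UTC"),
  "2" => Time.parse("2024-01-01 00:00:00 UTC")
}

# Made-up feed data; product_id is an assumed unique key column.
csv_text = <<~DATA
  product_id,name,last_updated
  1,Widget,2024-01-01 00:00:00 UTC
  2,Gadget,2024-02-15 00:00:00 UTC
  3,Gizmo,2024-02-20 00:00:00 UTC
DATA

# Keep a row only if the product is unknown or the feed has a newer date.
rows_to_import = CSV.parse(csv_text, headers: true).select do |row|
  known = existing[row["product_id"]]
  known.nil? || Time.parse(row["last_updated"]) > known
end

changed_ids = rows_to_import.map { |row| row["product_id"] }
```

Here product 1 is skipped as unchanged, product 2 is kept because its date is newer, and product 3 is kept because it is new. The rows that already exist would still need to be written as updates rather than inserts; activerecord-import has an on_duplicate_key_update option for that, though how it behaves depends on the database adapter, so check its documentation for your setup.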
