0

I've got a CSV file where there a column fill with Array data, written as ['foo', 'bar', 'baz'] in the file.

Manufacturer;EngineCount;ImagesURL
Boeing,2,['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']
Airbus,4,['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']

I can't find a way to import it as an Array in MongoDB (4.2+), it's always imported as string "['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']". Maybe it's not possible with .csv file.

So I found a solution on stackoverflow where we update the fields to parse them as an array using JSON.parse function. But I new to MongoDB and I can't success to update the field with old value in MongoDB :

I tried db.planes.updateMany({}, {$set: {ImagesURL:JSON.parse("$ImagesURLS")}}).
It's return error SyntaxError: Unexpected token $ in JSON at position 0 Where is my mistake ?

Thank you.

1 Answer 1

1

Option 1: Transform the input abit and use the option --useArrayIndexFields as follow:

input.csv: ( remove the brackets and add index in the ImagesURL header per array field )

 Manufacturer,EngineCount,ImagesURL.0,ImagesURL.1,ImagesURL.2
 Boeing,2,'https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3'
 Airbus,4,'https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3'

And do:

 mongoimport --db=myDB --col=planes --type=csv --headerline --useArrayIndexFields --file=input.csv

And you will have the documents imported as:

{Manufacturer:"Boeing",EngineCount:2,ImagesURL:['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']}

{Manufacturer:"Airbus",EngineCount:4,ImageURL:['https://url.com/image1', 'https://url.com/image2', 'https://url.com/image3']}

Option 2: Update after the import based on field type is string

// the document inserted as string: "["https://url.com/image1","https://url.com/image2"]"

   var ops = [];
   db.planes.find({ "ImagesURL": { "$type": 2} }).forEach(doc => {
   var ImagesURL = doc.ImagesURL.split(',').map( e => e.replace(/"|\[|\]|\\/gm,'').toString() );
   ops.push({
"updateOne": {
  "filter": { "_id": doc._id },
  "update": { "$set": { "ImagesURL": ImagesURL } }
  }
 });

     if ( ops.length >= 100 ) {
        db.planes.bulkWrite(ops);
        ops = [];
        }             
  });

if ( ops.length > 0 ) {
 db.planes.bulkWrite(ops);
 ops = [];
}
Sign up to request clarification or add additional context in comments.

3 Comments

I tried the first solution, changing my CSV format but it doesn't work. My csv use semicolon separator (because there commas in the csv datas), and it seem that mongoimport can't use another separator than comma (instead of Mongo Compass that I used). I will try to bypass it. Also, --col isn't recognised by mongoimport, should be --collection
In this case you can use another trick , you can replace the semicolon with tabs and import with the tsv option: tr ";" "\t" < file.csv | mongoimport --type tsv ...
It works using the .tsv files. Thank you.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.