
Scenario:

I have imported data from SQL Server to HDFS. The data is stored in the HDFS directory as multiple files:

part-m-00000
part-m-00001
part-m-00002
part-m-00003

Question:

When reading this stored data from the HDFS directory, do we have to read all the files (part-m-00000, 00001, 00002, 00003), or just part-m-00000? I ask because when I read the data, some of it appeared to be missing. Is that expected, or did I miss something?

3 Answers


You need to read all the files, not just part-m-00000. There are multiple files because Sqoop runs the import as a map-only MapReduce job, splitting the work across several mappers, and each mapper writes its output to a separate part file.
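For example, you can read every part file in one go by globbing on part-m-*, or merge the whole directory into a single local file. This is a minimal sketch; /user/hadoop/mytable is a hypothetical placeholder for wherever your Sqoop import landed.

$ hdfs dfs -cat /user/hadoop/mytable/part-m-*

$ hdfs dfs -getmerge /user/hadoop/mytable ./mytable.txt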


Sqoop runs the import with no reducers. As a result, there is no consolidation of the part files produced by the mappers, so you will see as many part files as the number of mappers you set in the Sqoop command with -m 4 or --num-mappers 4. If you run the import with -m 1, it will create only one part file:

$ sqoop import --connect jdbc:mysql://localhost/db --username <username> --table <table-name> -m 1
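To confirm how many part files an import produced, you can simply list the target directory; /user/hadoop/mytable below is again a hypothetical stand-in for your import path. With -m 4 you would expect part-m-00000 through part-m-00003, typically alongside a _SUCCESS marker.

$ hdfs dfs -ls /user/hadoop/mytable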


If your result set is large, the output is stored in chunks (multiple part files). If you want to read all of those files from the CLI, execute the command below, where <import-dir> stands for your HDFS import directory.

$ hdfs dfs -cat <import-dir>/part-m-*

This will give you the complete result, with no parts missing.
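As a sanity check against the "missing data" concern, you can compare the total row count across all part files with the row count in the source table; this is a sketch, with <import-dir> and <table-name> as placeholders for your actual HDFS path and SQL Server table.

$ hdfs dfs -cat <import-dir>/part-m-* | wc -l

If that count matches SELECT COUNT(*) FROM <table-name> on SQL Server, nothing was lost; a mismatch usually means some part files were skipped when reading.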
