Merging a Dataframe with an empty Dataframe having a column header in Python

Question

I want to merge two dataframes: one is an empty dataframe that has only column headers, and the other is a dataframe of size 18 x 600.

What I tried:

import pandas as pd

userQuestionVector1 = pd.read_csv("embedding1_3.csv")
userQuestionVector2 = pd.read_csv("embedding2_3.csv")
userQuestionVector = pd.concat([userQuestionVector1, userQuestionVector2], axis=1)
new_df = pd.DataFrame(columns=[vector])
df_userQuestionVector = new_df.append(userQuestionVector)
print(df_userQuestionVector)

Here, vector is a list of 600 strings:

['word2vec_q1_1', 'word2vec_q1_2', 'word2vec_q1_3', ..., 'word2vec_q1_300', 'word2vec_q2_1', ..., 'word2vec_q2_300']

The dimension of new_df is 0 x 600.

userQuestionVector1 and userQuestionVector2 are each 18 x 300.

The dimension of userQuestionVector is 18 x 600.

The output df_userQuestionVector is 18 x 1200, i.e., the two dataframes end up side by side and the second half is filled with NaN values.

  value1_1 value1_2 value1_3 ... value1_300 string1 string2 string3 ... string300
0 value2_1 value2_2 value2_3 ... value2_300  NaN     NaN     NaN   ...     NaN
1 value3_1 value3_2 value3_3 ... value3_300  NaN     NaN     NaN   ...     NaN
2 value4_1 value4_2 value4_3 ... value4_300  NaN     NaN     NaN   ...     NaN
.   .       .       .            .       .       .            .
.   .       .       .            .       .       .            .

The expected output is 18 x 600, i.e., the rows of userQuestionVector should be placed below the column headers of new_df.

   string1  string2  string3  ... string300
0  value1_1 value1_2 value1_3 ... value1_300
1  value2_1 value2_2 value2_3 ... value2_300
2  value3_1 value3_2 value3_3 ... value3_300
.   .       .       .            .       .    
.   .       .       .            .       .

I also tried:

frames=[new_df, userQuestionVector]
df_userQuestionVector = pd.concat(frames,axis=0)

But this gives me the same result.
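Both attempts behave the same way because append and concat align rows on column labels rather than on column position. The sketch below reproduces the effect on a small scale. The two-column frames and their values are hypothetical stand-ins for the 600-column data, and only pd.concat is used because DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-ins: 2 target column names instead of 600.
vector = ["word2vec_q1_1", "word2vec_q2_1"]
new_df = pd.DataFrame(columns=vector)  # 0 x 2, labels only

# Data whose column labels differ from `vector`, as happens when
# read_csv uses the first data row as the header.
data = pd.DataFrame({"0.11": [0.21, 0.31], "0.12": [0.22, 0.32]})

# concat along axis=0 aligns on column labels, so the result carries
# the union of both label sets and fills the gaps with NaN.
stacked = pd.concat([new_df, data], axis=0)
print(stacked.shape)  # (2, 4)
```

The row count is unchanged, but the column count doubles because none of the labels match.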

How should I solve this problem? Thank you.

What's in vector? Why not just use append with the two dataframes?
– gionni, Commented Jul 17, 2017 at 11:59

@gionni vector is a list of 600 strings. Look at my updated question.
– K. K., Commented Jul 17, 2017 at 12:05

Bharath M Shetty · Accepted Answer · 2017-07-17 12:13:40Z

2

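When reading the CSV files, pass header=None so that the first row is kept as data instead of being used as column names. Without it, the column labels of userQuestionVector do not match vector, and because append and concat align on column labels, the result contains both sets of columns (1200) padded with NaN. Then, instead of creating a new_df dataframe, set the columns of userQuestionVector to vector directly, i.e. change the code to: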

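import pandas as pd

# header=None: treat every line of the file as data
userQuestionVector1 = pd.read_csv("embedding1_3.csv", header=None)
userQuestionVector2 = pd.read_csv("embedding2_3.csv", header=None)
userQuestionVector = pd.concat([userQuestionVector1, userQuestionVector2], axis=1)
# vector is the list of 600 column names from the question
userQuestionVector.columns = vector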
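As a self-contained illustration of the same steps, here is a minimal sketch that runs without the original files. It assumes the CSV files have no header line; the in-memory CSV text, the 3 x 2 shapes, and the names csv1, csv2, part1, part2 and combined are hypothetical stand-ins for the real 18 x 300 inputs.

```python
import io

import pandas as pd

# Hypothetical stand-ins for embedding1_3.csv and embedding2_3.csv:
# header-less CSV text, 3 rows x 2 columns each.
csv1 = "0.11,0.12\n0.21,0.22\n0.31,0.32\n"
csv2 = "0.13,0.14\n0.23,0.24\n0.33,0.34\n"
vector = ["word2vec_q1_1", "word2vec_q1_2", "word2vec_q2_1", "word2vec_q2_2"]

# header=None keeps the first line as data instead of using it as column names.
part1 = pd.read_csv(io.StringIO(csv1), header=None)
part2 = pd.read_csv(io.StringIO(csv2), header=None)

# Place the two frames side by side, then relabel all columns in one step.
combined = pd.concat([part1, part2], axis=1)
combined.columns = vector
print(combined.shape)  # (3, 4)
```

The length of vector has to equal the number of columns in the concatenated frame; otherwise the assignment to columns raises a ValueError.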

Hope this helps.
