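import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def preprocess(numerical, categorical):
    # Fill missing numerical values, then standardize them
    imputer = SimpleImputer()
    x_num = imputer.fit_transform(numerical)
    scaler = StandardScaler()
    x_num = scaler.fit_transform(x_num)
    # One-hot encode the categorical columns
    one_hot = OneHotEncoder()
    x_cat = one_hot.fit_transform(categorical)
    print('X_num Shape : ', x_num.shape)
    print('X_cat Shape : ', x_cat.shape)
    # This is the line that raises the ValueError quoted below
    return np.concatenate((x_num, x_cat), axis=1)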
[Output] X_num Shape : (889, 2)
X_cat Shape : (889, 22)
The error raised at the end is: ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 0 dimension(s)
I want the output to have shape (889, 24).
The last part of the message (the array at index 1 has 0 dimension(s)) makes me think the problem is related to NumPy's one-dimensional arrays of shape (n,), but that shouldn't apply here, because the printed shapes show that both arrays are two-dimensional. I think there's something I'm missing.
I've also tried several other functions (np.hstack, np.vstack, np.column_stack), but they either don't give the desired output or raise this error: ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2 and the array at index 1 has size 1
axis, try with axis = 0

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 0 dimension(s)

np.concatenate((x,y),axis=1) works fine for my dummy array, check the dtypes of your arrays

np.concatenate. Look at np.array(x_cat).shape. There's your 0-d array.

one-hot is producing a sparse matrix (check its default parameters). Either change that sparse setting, make the result dense, or use sparse.hstack. This is a tricky error (I've seen it a few times before), but ultimately it comes down to too casual a reading of the documentation.
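
The suggestions in the last comment can be sketched roughly as follows. This is a minimal illustration rather than a definitive fix: the toy arrays are hypothetical stand-ins for the 889-row data, and the variable names `wrapped_shape`, `dense`, and `stacked` are introduced here only for the example. It first shows why NumPy reports 0 dimensions (wrapping the sparse result with `np.array` yields a zero-dimensional object array), then two ways around it: converting the encoder output to a dense array with `.toarray()`, or keeping everything sparse and using `scipy.sparse.hstack`.

```python
import numpy as np
from scipy import sparse
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the real inputs.
numerical = np.array([[1.0, 20.0], [2.0, np.nan], [3.0, 40.0], [4.0, 10.0]])
categorical = np.array([["a", "x"], ["b", "y"], ["a", "y"], ["c", "x"]])

x_num = StandardScaler().fit_transform(SimpleImputer().fit_transform(numerical))
# With default parameters, OneHotEncoder returns a SciPy sparse matrix.
x_cat = OneHotEncoder().fit_transform(categorical)

# Diagnosis: x_cat.shape looks 2-D, but NumPy cannot unpack the sparse
# object and wraps it in a zero-dimensional object array instead.
wrapped_shape = np.array(x_cat).shape

# Fix 1: make the one-hot result dense, then concatenate as before.
dense = np.concatenate((x_num, x_cat.toarray()), axis=1)

# Fix 2: keep the result sparse and stack with SciPy instead of NumPy.
stacked = sparse.hstack((sparse.csr_matrix(x_num), x_cat), format="csr")

print(sparse.issparse(x_cat), wrapped_shape)
print(dense.shape, stacked.shape)
```

The third option mentioned, changing the encoder's sparse setting, depends on the installed scikit-learn version: to my knowledge the parameter is `sparse_output=False` from version 1.2 onward and `sparse=False` in earlier releases, so check the documentation for the version in use. For a 22-column one-hot block the dense route is usually fine; the sparse route matters more when the number of categories is large.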