How to append numpy array to numpy array of different size?

Question

I have 2 arrays to concatenate:

X_train's shape is (3072, 50000)
y_train's shape is (50000,)

I'd like to concatenate them so I can shuffle the indices all in one go. I have tried the following, but neither works:

np.concatenate([X_train, np.transpose(y_train)])
np.column_stack([X_train, np.transpose(y_train)])

How can I concatenate them?

Concatenate to what? You have the input dimensions; what output dimensions do you want? (From an ML perspective I don't see this making sense.)
– sascha, Commented Feb 5, 2018 at 16:37
@DavidG Yes, thanks! Btw, why do I get (50000,) in the first place? Is that a numpy array? Seems like it's some kind of vector or list, idk. I'm new to numpy.
– Monty _s Flying Circus, Commented Feb 5, 2018 at 16:41
In numpy, 1-d arrays are just as useful as 2-d (or higher) ones.
– hpaulj, Commented Feb 5, 2018 at 17:06

sascha · Accepted Answer · 2018-02-05 16:46:36Z

A recommendation aimed at your underlying task rather than the immediate problem: don't do this!

Assuming X are your samples / observations, y are your targets:

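Just generate a random permutation and index both arrays with it. The originals are not modified, though note that indexing with an integer array returns copies, not views. E.g. (untested):

import numpy as np

X = np.random.random(size=(50000, 3072))
y = np.random.random(size=50000)

perm = np.random.permutation(X.shape[0])  # assuming X.shape[0] == y.shape[0]
X_perm = X[perm]  # fancy indexing: a shuffled copy, not a view
y_perm = y[perm]  # same permutation, so samples and targets stay paired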
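
To see that a shared permutation keeps samples and targets paired, here is a minimal sketch on toy data. The array sizes, the seed, and the names `pairs_aligned` and `is_copy` are assumptions made for illustration, and `np.random.default_rng` is used in place of the legacy `np.random.permutation` call (it needs NumPy 1.17 or later).

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Small stand-ins for the real data: 6 samples with 4 features each.
X = np.arange(24).reshape(6, 4)   # row i starts with the value 4 * i
y = np.arange(6) * 10             # y[i] == 10 * i, so pairing is easy to check

perm = rng.permutation(X.shape[0])
X_perm = X[perm]
y_perm = y[perm]

# Recover the original row index from the first column and compare to the label.
pairs_aligned = np.array_equal(X_perm[:, 0] // 4 * 10, y_perm)

# Integer-array (fancy) indexing allocates new memory instead of returning a view.
is_copy = not np.shares_memory(X, X_perm)

print(pairs_aligned, is_copy)
```

Because both arrays are indexed with the same `perm`, row `k` of `X_perm` and element `k` of `y_perm` always come from the same original sample.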

Reminder: your starting shapes are incompatible with most Python-based ML tools, which usually interpret the axes as:

first dim / rows: samples
second dim / cols: features

The number of samples has to equal the number of target values in y. My example follows this convention, while your X needs a transpose.
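
As a rough sketch of that transpose, assuming the question's layout of (features, samples) and using small stand-in sizes instead of 3072 and 50000 (the names `X_rows`, `X_shuffled` and `y_shuffled` are made up for this example):

```python
import numpy as np

n_features, n_samples = 5, 8  # small stand-ins for 3072 and 50000

# Same layout as in the question: one column per sample.
X_train = np.arange(n_features * n_samples).reshape(n_features, n_samples)
y_train = np.arange(n_samples)

# Transposing gives (n_samples, n_features); .T is a view, so no data is copied here.
X_rows = X_train.T

# Shuffle samples and targets with one shared permutation.
perm = np.random.default_rng(0).permutation(n_samples)
X_shuffled = X_rows[perm]
y_shuffled = y_train[perm]

print(X_rows.shape, X_shuffled.shape, y_shuffled.shape)
```

If the (features, samples) layout has to be kept, indexing the columns with `X_train[:, perm]` should give the same shuffle without the transpose.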

Monty _s Flying Circus · Accepted Answer · 2018-02-05 16:39:45Z

As DavidG said, the problem is that y_train has shape (50000,), so I needed to reshape it into a 2-D row before concatenating:

np.concatenate([X_train, np.reshape(y_train, (1, 50000))])
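
For illustration, a small sketch of why the original attempt fails and the reshape works. The shapes (3, 5) and (5,) are stand-ins for (3072, 50000) and (50000,), and the variable names are assumptions for this example only.

```python
import numpy as np

X = np.zeros((3, 5))   # stand-in for X_train: (features, samples)
y = np.arange(5)       # stand-in for y_train: 1-D, shape (5,)

# Transposing a 1-D array is a no-op, so the shape is still (5,).
print(np.transpose(y).shape)

# Concatenating a 2-D array with a 1-D array raises a ValueError.
try:
    np.concatenate([X, y])
    raised = False
except ValueError:
    raised = True

# Turning y into a single row of shape (1, 5) makes the shapes compatible.
stacked = np.concatenate([X, y.reshape(1, -1)])

# np.vstack promotes 1-D inputs to rows itself, so it gives the same result.
stacked_v = np.vstack([X, y])

print(raised, stacked.shape, stacked_v.shape)
```

Note that `np.concatenate` always allocates a new array and copies both inputs, which may account for part of the slowness on a (3072, 50000) array.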

Still, this evaluated very slowly in Jupyter. If there's a faster answer, I'd be grateful to have it
