
I am trying to do k-means clustering with SciPy, following this tutorial: http://glowingpython.blogspot.no/2012/04/k-means-clustering-with-scipy.html

The problem is that the author uses vstack to build his arbitrary data points, which returns an ndarray. I have two lists, lengths and breadths. How do I combine them into an ndarray so I can use his example?

lengths = [300.0, 300.0, 300.0, 300.0, 303.0, 300.0]
breadths = [9.6, 9.7, 9.8, 10.3, 6.8, 9.4]

1 Answer


NumPy's vstack accepts plain lists directly:

In [23]: np.vstack((lengths, breadths))
Out[23]:
array([[ 300. ,  300. ,  300. ,  300. ,  303. ,  300. ],
       [   9.6,    9.7,    9.8,   10.3,    6.8,    9.4]])

If you want to explicitly convert a list to an array, you can do:

In [24]: np.array(lengths)
Out[24]: array([ 300.,  300.,  300.,  300.,  303.,  300.])

However, in this example kmeans expects each observation as a row and each feature as a column, so you need the transpose: np.vstack((lengths, breadths)).T
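Putting the pieces together, here is a minimal sketch of how the transposed array could be passed to SciPy's kmeans and vq. The choice of two clusters is an assumption made purely for illustration, and the names data, centroids and labels are not from the tutorial.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

lengths = [300.0, 300.0, 300.0, 300.0, 303.0, 300.0]
breadths = [9.6, 9.7, 9.8, 10.3, 6.8, 9.4]

# One row per observation, one column per feature: shape (6, 2)
data = np.vstack((lengths, breadths)).T

# np.column_stack builds the same array without an explicit transpose
assert np.array_equal(data, np.column_stack((lengths, breadths)))

# Two clusters is an arbitrary choice for this illustration
centroids, distortion = kmeans(data, 2)

# Assign each observation to its nearest centroid
labels, _ = vq(data, centroids)

print(data.shape)
print(labels)
```

Because kmeans starts from randomly chosen centroids, the labels can differ between runs.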


3 Comments

Thank you for your answer. Something still seems wrong: his data is an array of points, where each row is one point, e.g. [ [2 2] [2 4] [2 7] ], whereas this result gives two long arrays.
His data is a 2D array, where the columns are the different features (e.g. x and y, or length and breadth in your case). See my update.
I know gratitude should be shown by up-voting and accepting (which I have done), but I also want to express my gratitude in words. Thank you.
