I have a features array that contains values of different types:

>>> features = train_df.values
>>> [x for x in features]

[True,
 array([2, 0, 0, ..., 0, 0, 0]),
 False,
 False,
 17,
 1,
 10,
 array([0, 0, 0, ..., 0, 0, 0])]

I would like to produce a single Python array that contains a concatenation of all of the above features, i.e.

np.array([True, 2, 0, 0, ..., 0, 0, 0, False, False, 17, 1, 10, 0, 0, 0, ..., 0, 0, 0])

My goal is to train sklearn LogisticRegression with the above feature vector. What is the best way to do this in Python?

1 Answer

You could do this with a simple list comprehension.

>>> x
[True, array([2, 0, 0, 0, 0, 0]), False, False, 17, 1, 10, array([0, 0, 0, 0, 0, 0])]

>>> [j for i in x for j in (i if isinstance(i, np.ndarray) else (i, ))]
[True, 2, 0, 0, 0, 0, 0, False, False, 17, 1, 10, 0, 0, 0, 0, 0, 0]

>>> np.array(_, dtype='O')
array([True, 2, 0, 0, 0, 0, 0, False, False, 17, 1, 10, 0, 0, 0, 0, 0, 0], dtype=object)

If you don't add dtype='O', your bools will be cast to integers. It's up to you whether you want that or not. Working with object arrays is usually frowned upon, since they provide no vectorisation or efficiency benefits.
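If casting the bools to numbers is acceptable, which is likely the case when the vector is headed for a numeric model, a purely numeric array avoids the object dtype altogether. The following is a minimal sketch of that alternative, not the approach above: `flatten_features` is a hypothetical helper name, and it assumes every element of the row is either a scalar or a 1-D array.

```python
import numpy as np

def flatten_features(row, dtype=np.float64):
    """Concatenate scalars and 1-D arrays from `row` into one numeric vector.

    Hypothetical helper; assumes each item is a scalar or a 1-D array.
    """
    # np.atleast_1d wraps a scalar such as True or 17 in a length-1 array
    # and returns existing 1-D arrays unchanged.
    parts = [np.atleast_1d(item) for item in row]
    # Concatenate the pieces, then cast so bools become 0.0 / 1.0.
    return np.concatenate(parts).astype(dtype)

x = [True, np.array([2, 0, 0, 0, 0, 0]), False, False, 17, 1, 10,
     np.array([0, 0, 0, 0, 0, 0])]
flat = flatten_features(x)
```

For scikit-learn, the estimator expects a 2-D array of shape (n_samples, n_features), so a single flattened row would need `flat.reshape(1, -1)`, and several rows could be combined with `np.vstack` after flattening each one.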
