Split array at value in numpy

Question

I have a file containing data in the format:

0.0 x1
0.1 x2
0.2 x3
0.0 x4
0.1 x5
0.2 x6
0.3 x7
...

The data consists of multiple datasets, each starting with 0 in the first column (so x1, x2, x3 would be one set and x4, x5, x6, x7 another). I need to plot each dataset separately, so I need to split the data somehow. What would be the easiest way to accomplish this?

I realize I could go through the data line by line and split it every time I encounter a 0 in the first column, but this seems very inefficient.

eat · Accepted Answer · 2011-03-11 15:28:36Z

28

I actually liked Benjamin's answer; a slightly shorter solution would be:

B = np.split(A, np.where(A[:, 0] == 0.)[0][1:])
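
To make the one-liner easier to follow, here is a minimal sketch that spells out the intermediate step. The sample array and the name `starts` are assumptions added for illustration; the sketch also assumes every dataset begins at exactly 0.0 in the first column.

```python
import numpy as np

# Sample data mirroring the question: the first column restarts at 0.0
A = np.array([[0.0, 1], [0.1, 2], [0.2, 3],
              [0.0, 4], [0.1, 5], [0.2, 6], [0.3, 7]])

# Row indices where a dataset begins (hypothetical helper name)
starts = np.where(A[:, 0] == 0.)[0]

# starts includes row 0; splitting there would yield an empty leading piece,
# so only the later start positions are passed to np.split
B = np.split(A, starts[1:])

print(starts.tolist())            # [0, 3]
print([len(part) for part in B])  # [3, 4]
```

Dropping the first index with `[1:]` is what keeps `np.split` from returning an empty array as the first piece.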

2 Comments

Benjamin Over a year ago

If there is one thing I know for sure, it's that no matter what you write in Python, there is always a shorter way of doing it!

eat Over a year ago

@bafcu: I honestly think the credit should really go to Benjamin (instead of me). I was merely 'fine tuning' his answer. Thanks

Benjamin · Accepted Answer · 2011-03-11 15:10:28Z

17

Once you have the data in a long numpy array, just do:

import numpy as np

A = np.array([[0.0, 1], [0.1, 2], [0.2, 3], [0.0, 4], [0.1, 5], [0.2, 6], [0.3, 7], [0.0, 8], [0.1, 9], [0.2, 10]])
B = np.split(A, np.argwhere(A[:,0] == 0.0).flatten()[1:])

which will give you B containing three arrays B[0], B[1] and B[2] (in this case; I added a third "section" to prove to myself that it was working correctly).
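
Since the goal in the question is to plot each dataset separately, the pieces in B can be unpacked column by column. This is only a sketch: the shorter sample array and the name `series` are assumptions, and the plotting call itself is left as a comment.

```python
import numpy as np

A = np.array([[0.0, 1], [0.1, 2], [0.2, 3],
              [0.0, 4], [0.1, 5], [0.2, 6], [0.3, 7]])
B = np.split(A, np.argwhere(A[:, 0] == 0.0).flatten()[1:])

# Pair up the first and second column of every piece (hypothetical name)
series = [(part[:, 0], part[:, 1]) for part in B]

for t, x in series:
    # a plotting call such as matplotlib's plt.plot(t, x) could go here
    print(t.tolist(), x.tolist())
```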

Paul · Accepted Answer · 2011-03-11 15:14:38Z

1

You don't need a Python loop to evaluate the locations of each split. Do a difference on the first column and find where the values decrease.

import numpy

# read the two-column text file into a structured array (fields 'f0', 'f1');
# fname is the path to the data file
arry = numpy.loadtxt(fname, dtype='f8,S2')

# determine where the data "splits" should be
col1 = arry['f0']
diff = numpy.diff(col1)
idxs = numpy.where(diff < 0)[0] + 1

# only loop through the "splits"
strts = [0] + list(idxs)
stops = list(idxs) + [None]
groups = [arry[strt:stop] for strt, stop in zip(strts, stops)]
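
The "find where the values decrease" idea can also be handed to `np.split` directly. The following is a rough sketch on in-memory sample data; the arrays `col1` and `values` and the name `cuts` are assumptions for illustration. Unlike a test for equality with 0.0, it should still work when a dataset does not begin at exactly zero.

```python
import numpy as np

# Sample columns mirroring the question's file (assumed data)
col1 = np.array([0.0, 0.1, 0.2, 0.0, 0.1, 0.2, 0.3])
values = np.array(['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7'])

# np.diff(col1)[i] is col1[i + 1] - col1[i]; a negative entry means
# row i + 1 is the first row of a new dataset
cuts = np.where(np.diff(col1) < 0)[0] + 1

groups = np.split(values, cuts)

print(cuts.tolist())                 # [3]
print([g.tolist() for g in groups])  # [['x1', 'x2', 'x3'], ['x4', 'x5', 'x6', 'x7']]
```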

Hugh Bothwell · Accepted Answer · 2011-03-11 15:09:50Z

0

def getDataSets(fname):
    data_sets = []
    data = []
    prev = None
    with open(fname) as inf:
        for line in inf:
            index, rem = line.strip().split(None, 1)
            index = float(index)  # compare numerically, not as strings
            # a drop in the first column marks the start of a new data set
            if prev is not None and index < prev:
                data_sets.append(data)
                data = []
            data.append(rem)
            prev = index
        data_sets.append(data)
    return data_sets

def main():
    data = getDataSets('split.txt')
    print(data)

if __name__ == "__main__":
    main()

results in

[['x1', 'x2', 'x3'], ['x4', 'x5', 'x6', 'x7']]
