Python/numpy array partitioning

Question

I'm using Python 3.6 and numpy.

From an hdf5 file I read a column of a table that is a 2D array.

Each row of the array holds the IDs of the nodes of a finite element.

The table is structured such that it holds both lower and higher order elements in the same table (which sucks, but is not a degree of freedom I can change)

So the array looks something like this (except that it has potentially millions of rows)

[[1,2,3,4,0,0,0,0],           #<- 4 Node quad data packed with zeros
 [3,4,5,6,0,0,0,0],             
 [7,8,9,10,11,12,13,14],      #<- 8 node quad in the same table as 4 node quad
 [15,16,17,18,19,20,21,22]]

I need to separate this info into two separate arrays - one for the 4 node and one for the 8 node rows.

[[1,2,3,4],          
 [3,4,5,6]]

[[7,8,9,10,11,12,13,14], 
 [15,16,17,18,19,20,21,22]]

Right now I'm iterating over the 2D array, checking the value at index 5 in each row and creating two index arrays - one identifying the 4 node rows and one the 8 node rows.

tet4indices, tet10indices = [], []
for index, element in enumerate(elements):
    if element[5] == 0:
        tet4indices.append(index)
    else:
        tet10indices.append(index)

Then I use index array slicing to get the two arrays

tet4s = elements[tet4indices, 0:4]
tet10s = elements[tet10indices, 0:10]

The above works, but seems kinda ugly.

If anyone has a better solution, I'd be grateful to hear about it.....

Thanks in advance,

Doug

Are the 4 & 8 element rows in separate groups or mixed?

– hpaulj, Commented Dec 14, 2017 at 6:49

The 4 and 8 element rows can be mixed

– max375, Commented Dec 14, 2017 at 7:24

@max375 I added a generic answer which handles your case.

– kmario23, Commented Dec 14, 2017 at 7:41

hpaulj · Accepted Answer · 2017-12-14 07:02:54Z

2

In an array it's easy to find rows where the element at index 5 is 0, or not 0:

In [75]: arr = np.array(alist)
In [76]: arr
Out[76]: 
array([[ 1,  2,  3,  4,  0,  0,  0,  0],
       [ 3,  4,  5,  6,  0,  0,  0,  0],
       [ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])
In [77]: arr[:,5]
Out[77]: array([ 0,  0, 12, 20])
In [78]: eights = np.where(arr[:,5])[0]
In [79]: eights
Out[79]: array([2, 3], dtype=int32)
In [80]: arr[eights,:]
Out[80]: 
array([[ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])
In [81]: fours = np.where(arr[:,5]==0)[0]
In [82]: arr[fours,:]
Out[82]: 
array([[1, 2, 3, 4, 0, 0, 0, 0],
       [3, 4, 5, 6, 0, 0, 0, 0]])

Or with a boolean mask

In [83]: mask = arr[:,5]>0
In [84]: arr[mask,:]
Out[84]: 
array([[ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])
In [85]: arr[~mask,:]
Out[85]: 
array([[1, 2, 3, 4, 0, 0, 0, 0],
       [3, 4, 5, 6, 0, 0, 0, 0]])
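
The mask above selects the right rows but leaves the zero padding on the 4 node rows. A minimal sketch of doing the row selection and the column trim in one indexing step, assuming node IDs are never 0 and the padding always starts at column index 4 (the name is_eight is just illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 0, 0, 0, 0],
                [3, 4, 5, 6, 0, 0, 0, 0],
                [7, 8, 9, 10, 11, 12, 13, 14],
                [15, 16, 17, 18, 19, 20, 21, 22]])

# Assumption: a real node ID is never 0, so a non-zero value in the
# first padding column (index 4) marks an 8 node row.
is_eight = arr[:, 4] != 0

fours = arr[~is_eight, :4]   # 4 node rows, padding columns dropped
eights = arr[is_eight]       # 8 node rows, all columns kept
```

Both results are copies (boolean indexing always copies), so changing them later does not touch arr.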

You are lucky, in a sense, to have this clear 0 marker. Some finite element codes duplicate node numbers to reduce the node count, e.g. [1,2,3,3] for a 3 node element in a 4 node system. But in those cases the rest of the math works fine, even when you merge 2 nodes into one.

edited Dec 14, 2017 at 7:02

answered Dec 14, 2017 at 6:54


max375 Over a year ago

Thanks hpaulj for these solutions. They both seem to have the same speed and reduced the time spent in this function from 6.5 seconds to 1 second for an array of 6,500,000 tet10 elements!

cosmic_inquiry · Accepted Answer · 2017-12-14 07:05:45Z

0

This works for me:

a=np.split(your_numpy_array,[4],1)
tet4s=np.vstack([a[0][i,:] for i in range(len(a[0])) if np.sum(a[1][i,:])==0])
tet10s=np.vstack([np.hstack((a[0][i,:],a[1][i,:])) for i in range(len(a[0])) if np.sum(a[1][i,:])>0])
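
The list comprehensions above visit every row in Python, which could get slow with millions of rows. A possible vectorized sketch of the same split-then-test idea, assuming the padding is all zeros and real node IDs are non-zero (head, tail and padded are illustrative names):

```python
import numpy as np

arr = np.array([[3, 4, 5, 6, 0, 0, 0, 0],
                [7, 8, 9, 10, 11, 12, 13, 14],
                [15, 16, 17, 18, 19, 20, 21, 22],
                [1, 2, 3, 4, 0, 0, 0, 0]])

# Split the columns once: indices 0-3 and indices 4 onward.
head, tail = np.split(arr, [4], axis=1)

# A row is a padded (lower order) element when its whole tail is zero.
padded = ~tail.any(axis=1)

tet4s = head[padded]    # only the four real node IDs
tet10s = arr[~padded]   # full rows for the higher order elements
```

Unlike np.vstack on a list comprehension, this also returns an empty array rather than raising when one of the two groups has no rows.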

answered Dec 14, 2017 at 7:05


kmario23 · Accepted Answer · 2017-12-14 07:49:11Z

This code is generic enough to handle your use case, i.e. even if your rows are mixed up. Examples for both cases are given below.

An example where the rows are in order:

In [41]: arr
Out[41]: 
array([[ 1,  2,  3,  4,  0,  0,  0,  0],
       [ 3,  4,  5,  6,  0,  0,  0,  0],
       [ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])

# extract first half
In [85]: zero_rows = arr[~np.all(arr, axis=1), :]

In [86]: zero_rows
Out[86]: 
array([[1, 2, 3, 4, 0, 0, 0, 0],
       [3, 4, 5, 6, 0, 0, 0, 0]])

# to trim the trailing zeros in all the rows
In [84]: np.apply_along_axis(np.trim_zeros, 1, zero_rows)
Out[84]: 
array([[1, 2, 3, 4],
       [3, 4, 5, 6]])



# to extract second half
In [42]: mask_nzero = np.all(arr, axis=1)

In [43]: arr[mask_nzero, :]
Out[43]: 
array([[ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])

An example where the rows are mixed-up:

In [98]: mixed
Out[98]: 
array([[ 3,  4,  5,  6,  0,  0,  0,  0],
       [ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22],
       [ 1,  2,  3,  4,  0,  0,  0,  0]])

In [99]: zero_rows = mixed[~np.all(mixed, axis=1), :]

In [100]: zero_rows
Out[100]: 
array([[3, 4, 5, 6, 0, 0, 0, 0],
       [1, 2, 3, 4, 0, 0, 0, 0]])

In [101]: np.apply_along_axis(np.trim_zeros, 1, zero_rows)
Out[101]: 
array([[3, 4, 5, 6],
       [1, 2, 3, 4]])

In [102]: mask_nzero = np.all(mixed, axis=1)

In [103]: mixed[mask_nzero, :]
Out[103]: 
array([[ 7,  8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21, 22]])
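
Note that np.apply_along_axis calls np.trim_zeros once per row in Python, and it only works when every trimmed row ends up the same length. A rough alternative sketch that groups rows by how many non-zero entries they hold, assuming the zeros are always trailing padding and node IDs are never 0 (split_by_node_count is a hypothetical helper, not a NumPy function):

```python
import numpy as np

def split_by_node_count(table):
    """Hypothetical helper: map node count -> rows with that many nodes.

    Assumes zeros appear only as trailing padding.
    """
    counts = np.count_nonzero(table, axis=1)
    # For each distinct count n, keep the matching rows and their first n columns.
    return {int(n): table[counts == n, :n] for n in np.unique(counts)}

mixed = np.array([[3, 4, 5, 6, 0, 0, 0, 0],
                  [7, 8, 9, 10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19, 20, 21, 22],
                  [1, 2, 3, 4, 0, 0, 0, 0]])

groups = split_by_node_count(mixed)
tet4s = groups[4]
tet10s = groups[8]
```

The row order inside each group follows the original table, so this handles the mixed-up case without any extra sorting.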
