I am working on a project where I need to deal with a large 3-dimensional array. I was using a NumPy 3D array, but most of my entries are going to be zero, so it wastes a lot of memory. SciPy sparse seems to allow only 2D matrices. Is there any other way I can store a 3D sparse array?
An out-of-the-box implementation still doesn't exist in 2022. – dor00012, Jan 3, 2022
chunky3d works well for sparse 3D data. Documentation is scarce, but there are some neat functionalities in it. – alex, May 14, 2024
5 Answers
scipy.sparse has a number of formats, though only a couple have an efficient set of numeric operations. Unfortunately, those are the harder ones to extend.
dok uses a tuple of the indices as dictionary keys, so that would be easy to generalize from 2D to 3D or more. coo has row, col, and data attribute arrays; conceptually, adding a third depth attribute is easy. lil would probably require lists within lists, which could get messy.
But csr and csc store the array in indices, indptr and data arrays. This format was worked out years ago by mathematicians working on linear algebra problems, along with efficient math operations (especially matrix multiplication). (The relevant paper is cited in the source code.)
So representing 3d sparse arrays is not a problem, but implementing efficient vector operations could require some fundamental mathematical research.
Do you really need the 3d layout to do the vector operations? Could you, for example, reshape 2 of the dimensions into 1, at least temporarily?
Element-by-element operations (*, +, -) work just as well on the data of a flattened array as on the 2D or 3D version. np.tensordot handles nD matrix multiplication by reshaping the inputs into 2D arrays and applying np.dot. Even when np.einsum is used on 3D arrays, the product summation is normally over just one pair of dimensions (e.g. 'ijk,jl->ikl').
3D representation can be conceptually convenient, but I can't think of a mathematical operation that requires it (instead of 2 or 1d).
Overall I think you'll get more speed from reshaping your arrays than from trying to find/implement genuine 3d sparse operations.
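The reshaping idea above can be sketched like this; the shapes are made up for illustration, and the point is only that element-wise math works the same on the folded 2D sparse view:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
# Mostly-zero fake 3D data.
a = rng.normal(size=(3, 4, 5)) * (rng.uniform(size=(3, 4, 5)) > 0.8)

# Fold the last two axes into one so scipy.sparse can hold the data in 2D.
a2d = sparse.csr_matrix(a.reshape(3, 4 * 5))

# Element-wise math happens on the 2D sparse view; reshape back only
# when a dense 3D result is actually needed.
doubled = (a2d * 2).toarray().reshape(3, 4, 5)
assert np.allclose(doubled, 2 * a)
```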
You're right; it doesn't look like there are established tools for working with n-dimensional sparse arrays. If you just need to access elements from the array, there are options using a dictionary keyed on tuples. See:
sparse 3d matrix/array in Python?
If you need to do operations on the sparse 3D matrix, it gets harder; you may have to do some of the coding yourself.
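A minimal sketch of the dictionary-keyed-on-tuples idea (the names and values here are invented for illustration):

```python
# Sparse 3D storage as a plain dict: index tuples map to nonzero values.
dok = {(0, 1, 2): 3.5, (2, 0, 4): -1.0}

def get(d, idx):
    # Absent keys are implicit zeros.
    return d.get(idx, 0.0)

assert get(dok, (0, 1, 2)) == 3.5
assert get(dok, (1, 1, 1)) == 0.0
```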
See the python library: https://sparse.pydata.org/en/stable/construct.html
An example in 2D is given below, copied from the page linked above. However, it also works for more than two dimensions; I am currently working with a variable number of dimensions using this library. One small note: a sparse array with more than two dimensions cannot be converted to the familiar SciPy sparse matrix classes.
import sparse
coords = [[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]
data = [10, 20, 30, 40, 50]
s = sparse.COO(coords, data, shape=(5, 5))
s.todense()
array([[10, 0, 0, 0, 0],
[ 0, 20, 0, 0, 0],
[ 0, 0, 30, 0, 0],
[ 0, 0, 0, 40, 0],
[ 0, 0, 0, 0, 50]])
Take a look at github/sparse or pytorch-documentation/sparse. See the code snippet (the PyTorch one) below.
import numpy as np
import torch
N0,N1,N2 = 3,4,5
np_rng = np.random.default_rng()
np0 = np_rng.normal(size=(N0,N1,N2)) * (np_rng.uniform(size=(N0,N1,N2))>0.8) #some fake 3d data
index = np.stack(np.nonzero(np0)) #(np,int64,(3,nnz))
value = np0[index[0], index[1], index[2]] #(np,float64,nnz)
torch0_coo = torch.sparse_coo_tensor(index, value, (N0,N1,N2))
torch0_coo.shape #(3,4,5)
torch0_coo.dtype #float64
torch0_coo.indices() #(torch,int64,(ndim,nnz))
torch0_coo.values() #(torch,float64,nnz)
You can create your own n-dimensional sparse arrays using np.nditer. The following example creates a dictionary-of-keys (DOK) sparse array, but the underlying method with np.nditer(..., flags=["multi_index"]) can be used to create all kinds of sparse representations using pure numpy:
import numpy as np

# nd_histogram can be any dense n-dimensional numpy array.
with np.nditer(nd_histogram, flags=["multi_index"]) as it:
    sparse_dok = {it.multi_index: cell_value.item() for cell_value in it if cell_value != 0}
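A self-contained run of the same pattern on a small hypothetical 3D array:

```python
import numpy as np

# Hypothetical 3D array with two nonzero cells.
a = np.zeros((2, 3, 4))
a[0, 1, 2] = 5.0
a[1, 0, 0] = -1.0

with np.nditer(a, flags=["multi_index"]) as it:
    # multi_index tracks the current position, so it serves as the key;
    # .item() extracts a Python scalar from the 0-d array nditer yields.
    dok = {it.multi_index: v.item() for v in it if v != 0}

assert dok == {(0, 1, 2): 5.0, (1, 0, 0): -1.0}
```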