
I am pretty new to Python and have been wondering whether there is an easy way to build a sparse n-dimensional array M in Python 3 that meets the following two conditions (along the lines of SciPy's coo_matrix):

  1. M[dim1,dim2,dim3,...] = 1.0
  2. As with a SciPy coo_matrix M, where M.row and M.col give the row and column indices of all non-zero entries, I should be able to get the indices of the non-zero entries along each dimension. In N dimensions this generalizes to something like M.1 for the 1st dimension, M.2 for the 2nd dimension, and so on.

For 2 dimensions, the two conditions look like this:

  1. for u, i in data:
         mat[u, i] = 1.0

  2. def get_triplets(mat):
         return mat.row, mat.col

Can these 2 conditions be generalized to N dimensions? I searched and came across this:

sparse 3d matrix/array in Python?

But there the 2nd condition is not satisfied. In other words, I can't get all the indices along the nth dimension in a vectorized format.

Also this: http://www.janeriksolem.net/sparray-sparse-n-dimensional-arrays-in.html works with Python 2 but not with Python 3.

Is there a way to implement n-dimensional arrays that satisfy the 2 conditions above? Or am I over-complicating things? I appreciate any help with this :)

  • You could certainly create a data structure modeled on either coo (column per dimension) or dok. And you could fill it in a way that meets your conditions. But whether you can do anything useful with it (multiplication, display, etc) without doing a lot of coding is a tougher question. For a start, demonstrate your conditions using the scipy.sparse 2d code. Commented Mar 16, 2017 at 22:53
  • Scipy has different ways to initialize sparse matrices, and they can be converted into each other. But you seem to be looking for a matrix that is sparse but with 1 being the value of any sparse element. That won't work at all, as the optimizations of sparse matrices are based on the fact that sparse cells are zero. Commented Mar 16, 2017 at 22:54
  • @hpaulj, I edited based on your comment. Hope it's a bit clearer now. I am reading your answer now. Commented Mar 16, 2017 at 23:39
  • @RuDevel, I did not mean that. M is a sparse N-dimensional array with non-zero values equal to 1.0 and all other values equal to 0.0. Commented Mar 16, 2017 at 23:45

1 Answer


In the spirit of the coo format I could generate a 3d sparse array representation, with one column per dimension and a final column holding the values:

In [106]: dims = 2,4,6
In [107]: data = np.zeros((10,4),int)
In [108]: data[:,-1] = 1
In [112]: for i in range(3):
     ...:     data[:,i] = np.random.randint(0,dims[i],10)

In [113]: data
Out[113]: 
array([[0, 2, 3, 1],
       [0, 3, 4, 1],
       [0, 0, 1, 1],
       [0, 3, 0, 1],
       [1, 1, 3, 1],
       [1, 0, 2, 1],
       [1, 1, 2, 1],
       [0, 2, 5, 1],
       [0, 1, 5, 1],
       [0, 1, 2, 1]])

Does that meet your requirements? Since the indices are random, it's possible there are some duplicates. sparse.coo_matrix sums duplicates before it converts the array to dense for display, or to csr for calculations.
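For the second condition in the question (one index vector per dimension), slicing the columns of such a table is enough. A minimal sketch, assuming the same layout as `data` above (index columns first, values last); `get_indices` is a hypothetical helper name, not part of NumPy or SciPy:

```python
import numpy as np

# Same layout as `data` above: one column per dimension, last column = values.
table = np.array([[0, 2, 3, 1],
                  [0, 3, 4, 1],
                  [1, 1, 3, 1],
                  [1, 0, 2, 1]])

def get_indices(table):
    """Return one index vector per dimension (hypothetical helper)."""
    coords = table[:, :-1]          # drop the value column
    return tuple(coords[:, k] for k in range(coords.shape[1]))

dim1, dim2, dim3 = get_indices(table)
print(dim1)  # [0 0 1 1]
print(dim2)  # [2 3 1 0]
print(dim3)  # [3 4 3 2]
```

This is the N-d analogue of `mat.row, mat.col` for a 2d coo matrix.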

The corresponding dense array is:

In [130]: A=np.zeros(dims, int)
In [131]: for row in data:
     ...:     A[tuple(row[:3])] += row[-1]

In [132]: A
Out[132]: 
array([[[0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 1],
        [0, 0, 0, 1, 0, 1],
        [1, 0, 0, 0, 1, 0]],

       [[0, 0, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]])

(no duplicates in this case).
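The row-by-row loop can probably be replaced with `np.add.at`, which accumulates repeated indices the same way `+=` does in the loop. A small sketch with made-up data that includes one deliberate duplicate:

```python
import numpy as np

dims = (2, 4, 6)
# Index columns first, value column last; rows 2 and 3 are duplicates on purpose.
table = np.array([[0, 2, 3, 1],
                  [1, 1, 2, 1],
                  [1, 1, 2, 1],
                  [0, 3, 0, 1]])

A = np.zeros(dims, dtype=int)
# tuple(...) turns the (3, nnz) transposed index block into one array per axis.
np.add.at(A, tuple(table[:, :3].T), table[:, 3])

print(A[1, 1, 2])  # 2, the duplicate entries were summed
print(A.sum())     # 4
```

Plain fancy-index assignment (`A[idx] += values`) would count a duplicated index only once, which is why `np.add.at` is used here.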

A 2d sparse matrix using a subset of this data is

In [118]: sparse.coo_matrix((data[:,3],(data[:,1],data[:,2])),(4,6)).toarray()
Out[118]: 
array([[0, 1, 1, 0, 0, 0],
       [0, 0, 2, 1, 0, 1],
       [0, 0, 0, 1, 0, 1],
       [1, 0, 0, 0, 1, 0]])

That's in effect the sum over the first dimension.
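As a quick check of that claim, the following sketch rebuilds the dense array from the `data` shown above and compares its sum over axis 0 with the 2d `coo_matrix`:

```python
import numpy as np
from scipy import sparse

dims = (2, 4, 6)
# The same 10 rows as Out[113]: three index columns, then the value column.
data = np.array([[0, 2, 3, 1], [0, 3, 4, 1], [0, 0, 1, 1], [0, 3, 0, 1],
                 [1, 1, 3, 1], [1, 0, 2, 1], [1, 1, 2, 1], [0, 2, 5, 1],
                 [0, 1, 5, 1], [0, 1, 2, 1]])

# Dense 3d array, filled row by row as in the answer.
A = np.zeros(dims, dtype=int)
for row in data:
    A[tuple(row[:3])] += row[-1]

# 2d sparse matrix built from dimensions 2 and 3 only; duplicates are summed.
M2 = sparse.coo_matrix((data[:, 3], (data[:, 1], data[:, 2])), shape=(4, 6))

print(np.array_equal(M2.toarray(), A.sum(axis=0)))  # True
```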


I'm assuming that

M[dim1,dim2,dim3,...] = 1.0

means the non-zero elements of the array must have a data value of 1.
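Under that reading, both conditions from the question could be wrapped in a small container. The following is only a sketch: `SparseND` and its `indices` method are hypothetical names, and storage is a plain dict keyed by index tuples (closer to dok than coo). Names like `M.1` are not valid Python attributes, so the per-dimension lookup is a method instead:

```python
import numpy as np

class SparseND:
    """Hypothetical minimal N-d sparse container (dict of index tuples)."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        self._items = {}

    def _check(self, index):
        index = tuple(int(i) for i in index)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        if any(not 0 <= i < n for i, n in zip(index, self.shape)):
            raise IndexError("index out of bounds")
        return index

    def __setitem__(self, index, value):
        index = self._check(index)
        if value == 0:
            self._items.pop(index, None)   # keep only non-zero entries
        else:
            self._items[index] = value

    def __getitem__(self, index):
        return self._items.get(self._check(index), 0.0)

    def indices(self, axis):
        """Indices of the non-zero entries along one dimension."""
        return np.array([key[axis] for key in self._items], dtype=int)

M = SparseND((2, 4, 6))
for u, i, t in [(0, 2, 3), (1, 1, 2), (0, 3, 0)]:
    M[u, i, t] = 1.0                       # condition 1

print(M.indices(0), M.indices(1), M.indices(2))  # condition 2
```

This only covers assignment and index retrieval; arithmetic, slicing and conversion to other formats would need a good deal more code.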

Pandas has a sparse data series and data frame format, which allows for a non-zero 'fill' value. I don't know whether the multi-index version can be thought of as higher than 2d. There have been a few SO questions about converting Pandas sparse arrays to and from scipy sparse matrices:

Convert Pandas SparseDataframe to Scipy sparse csc_matrix

http://pandas-docs.github.io/pandas-docs-travis/sparse.html#interaction-with-scipy-sparse
