Fastest way to find first ZERO element of a sparse matrix

Question

When using sparse matrices, it's easy to find non-zero entries quickly (as these are the only elements stored). However, what's the best way to find the first ZERO entry? Approaches like find(X==0,1) or find(~X,1) tend to be slow, as they require negating the whole matrix. It doesn't feel like that should be necessary -- is there a better way?

For instance, naively iterating over the array seems to be slightly faster than using find(X==0,1):

% Create a sparse array [10% filled]
x = sparse(5000,1);
x(randsample(5000,500)) = 1;
nRuns = 1e5; 
% Find the first zero element, nRuns (1e5) times
idx = zeros(nRuns,1);
tic
for n=1:nRuns
    idx(n) = find(x==0,1);
end
toc


%%
% Create a sparse array [10% filled]
x = sparse(5000,1);
x(randsample(5000,500)) = 1;
nRuns = 1e5; 
% Find the first zero element, nRuns (1e5) times
idx = zeros(nRuns,1);
tic
for n=1:nRuns
    for kk = 1:numel(x)
        [ii,jj] = ind2sub(size(x), kk);
        if x(ii,jj)==0; idx(n) = ii + (jj-1)*n; break; end
    end
end
toc

But what is the best way to do this?

Sparse arrays store non-zero elements in order. Just look through those until you find a missing element.
– Cris Luengo, Commented Feb 18 at 0:12
i.e. the naive loop I have above (in the second part of the snippet) -- or do you mean something else? Is that really the fastest way?
– magnesium, Commented Feb 18 at 0:46
find(X==0,1) compares the whole matrix to zero (maybe even producing a full matrix?), then looks for the first non-zero element. In the loop you don’t touch most of the matrix. And it being a sparse matrix, you likely have mostly zero elements (if not, don’t use a sparse matrix), so the loop should terminate really quickly. Note that idx(n) = ii + (jj-1)*n is the same as idx(n) = kk. And x(ii,jj)==0 is the same as x(kk)==0. So removing the call to ind2sub should simplify and hopefully speed up your code.
– Cris Luengo, Commented Feb 18 at 1:37
But I was thinking of looking through the data as stored: a sparse matrix stores the indices of its non-zero elements and their values. You should be able to iterate faster over just the indices of the non-zero elements. Except I don’t know how to get that data in MATLAB. In a MEX-file this would be simple: mathworks.com/help/matlab/apiref/mxgetir.html. Also, indexing into a sparse array is more expensive than indexing into a full array.
– Cris Luengo, Commented Feb 18 at 1:42
How big is your actual array? 5000x1 is tiny, and it's not worth using a sparse array for that. You need much, much larger arrays to make the sparse array overhead worthwhile.
– Cris Luengo, Commented Feb 18 at 6:07
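
The two suggestions in the comments above can be sketched in plain MATLAB. This is a minimal sketch, not code from the thread: the variable names firstZeroLoop and firstZeroGap are made up for illustration, and randperm stands in for randsample only to avoid the Statistics Toolbox dependency. Part (1) is the loop with linear indexing instead of ind2sub. Part (2) relies on find(x) returning the linear indices of the non-zero elements in ascending order, so the first position where they stop matching 1, 2, 3, ... is the first zero.

```matlab
% Sparse column, 10% filled (same setup as the question)
x = sparse(5000, 1);
x(randperm(5000, 500)) = 1;

% (1) Loop with linear indexing, no ind2sub
firstZeroLoop = [];
for kk = 1:numel(x)
    if x(kk) == 0
        firstZeroLoop = kk;
        break;
    end
end

% (2) Look for the first gap in the sorted non-zero indices
nz = find(x);                           % linear indices of non-zeros, ascending
nz = nz(:);                             % find returns a row for row vectors
k  = find(nz ~= (1:numel(nz)).', 1);    % first k with nz(k) ~= k
if ~isempty(k)
    firstZeroGap = k;                   % nz(k) > k, so element k is zero
elseif numel(nz) < numel(x)
    firstZeroGap = numel(nz) + 1;       % elements 1..nnz are all non-zero
else
    firstZeroGap = [];                  % x has no zero element
end
```

Both variables should agree with find(x==0,1). Part (2) only touches the nnz(x) stored entries, which may help when the first zero lies far into a large matrix; whether it beats the other approaches in practice would need to be measured.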

rahnema1 · Accepted Answer · 2025-02-18 20:19:38Z

2

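For arrays whose non-zero entries are all positive, min is probably the fastest solution, because the zeros are then the minimum and min returns the index of the first one: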

x = sparse(5000, 1);
x(randsample(5000, 500)) = 1; 
[~, idx] = min(x);

ismember can also be used:

[~, ind] = ismember(0, x);

edited Feb 18 at 20:19

answered Feb 18 at 7:18

rahnema1


4 Comments

Wolfie Feb 18 at 8:07

Not sure what the overhead of using min(abs(x)) would be, to make this work with any real array.

rahnema1 Feb 18 at 9:58

logical(x) is another way.

Wolfie Feb 18 at 10:14

Indeed, I did some quick benchmarks, and it looks like abs has enough overhead to make the loop (without ind2sub, as suggested by Cris in the comments) faster for a generic array, while min might be quicker for positive arrays.

magnesium Feb 18 at 23:45

Thanks all. So far min is the fastest. I have tested on 1000*1000 sparse logical matrices but will actually end up using matrices up to 30k*30k.
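
The min-based idea from the answer and the abs variant discussed in these comments could be combined as below. This is a hedged sketch rather than the answerer's code: the names v and idx are illustrative, randperm replaces randsample only to avoid a toolbox dependency, x(:) is added on the assumption that the input may be a matrix (min works column by column on matrices), and the check on v covers the case where the array has no zero at all.

```matlab
x = sparse(5000, 1);
x(randperm(5000, 500)) = -1;     % negative entries, so plain min(x) would not land on a zero

[v, idx] = min(abs(x(:)));       % min returns the first occurrence of the smallest magnitude
if full(v) ~= 0
    idx = [];                    % smallest magnitude is non-zero: no zero element exists
end
```

As noted in the comments, the extra abs call has a cost, so this is only worth it when the sign of the entries is not known in advance.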

John Bofarull Guix · Accepted Answer · 2025-02-25 23:31:30Z

I appended a third approach that catches zeros with while loops; in the timings below it reduces the time required by more than a factor of 10.

x = sparse(5000,1);                     % Create sparse array [10% filled]
x(randsample(5000,500)) = 1;
nRuns = 1e5; 

idx = zeros(nRuns,1);                   % Find first zero element, nRuns (1e5) times
tag1=tic;
for n=1:nRuns
    idx(n) = find(x==0,1);
end
t1=toc(tag1)


%%
% x = sparse(5000,1);                     % Create a sparse array [10% filled]
% x(randsample(5000,500)) = 1;
% nRuns = 1e5; 

idx = zeros(nRuns,1);                  % Find the first zero element, nRuns (1e5) times
tag2=tic;
for n=1:nRuns
    for kk = 1:numel(x)
        [ii,jj] = ind2sub(size(x), kk);
        if x(ii,jj)==0
            idx(n) = ii + (jj-1)*size(x,1);   % linear index of (ii,jj), i.e. kk
            break; 
        end
    end
end
t2=toc(tag2)

%%

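tag3=tic;
idx=[];                      % indices of all zero elements found so far
nx_max=numel(x);
nx=1;
% Note: this is a single pass over x that collects every zero position
% (idx(1) is the first zero); it is not repeated nRuns times.
while nx<=nx_max
    while ~x(nx)             % walk along a run of zeros
        idx=[idx nx];        % record the zero before advancing
        nx=nx+1;
        if nx>nx_max
            break;
        end
    end
    nx=nx+1;                 % step past the non-zero element that ended the run
end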

t3=toc(tag3)

Resulting timings, in seconds:

t1 =
   0.563633000000000
t2 =
   0.420241800000000
t3 =
   0.017343400000000

To compare timings, there is no need to generate a different sparse matrix x for each case. On the contrary, using the same matrix x ensures that the comparison is fair, because every method runs on exactly the same data.
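
One way to run such a same-matrix comparison is MATLAB's timeit, which calls a function handle repeatedly and returns a typical run time in seconds. The sketch below is an illustration, not the benchmark used in this answer: the variable names tFind and tMin are made up, randperm replaces randsample only to avoid a toolbox dependency, and no timings are claimed.

```matlab
% Build the test matrix once and reuse it for every method
x = sparse(5000, 1);
x(randperm(5000, 500)) = 1;

tFind = timeit(@() find(x == 0, 1));   % one output requested by default
tMin  = timeit(@() min(x), 2);         % request two outputs so the index is computed too

fprintf('find(x==0,1): %.3g s\nmin(x):       %.3g s\n', tFind, tMin);
```

Because timeit handles warm-up and repetition itself, the hand-written nRuns loops are not needed for this kind of measurement.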
