Is there value in attempting to pre-allocate an array of structs when the size of the fields is variable? For example:

A.x = randn(1,randi(100));
A.y = randn(1,randi(100));

for k = 2:1000
    A(k).x = randn(1,randi(100));
    A(k).y = randn(1,randi(100));
end

I could create the first entry and then use repmat, but MATLAB would still have to deal with the unknown field lengths. In my tests there is little/no improvement compared to just letting it grow dynamically. Incidentally, growing it with brackets (e.g. A = [A nextEntry]) is much slower.

Is there a clever way to do a pre-alloc to speed this up?

  • Maybe this post will help: stackoverflow.com/questions/28664640/… Commented Jun 28, 2016 at 21:32
  • You don't have to initialize the values of the fields, only declare that the fields exist. The values are stored elsewhere in memory. Commented Jun 28, 2016 at 21:35

1 Answer

MATLAB stores a struct array in two parts. The meta-data about the struct (the dimensions, the field names, etc.) is stored in one place in memory, and the contents (values) of each field are stored separately. Pointers to those contents are kept in the meta-data so that the values can be located when requested.

For this reason, you can initialize a struct with all of its contents set to []. You only need to ensure that the initial struct has the correct fields and dimensions, so that there is enough space to store all of the pointers to the data that it will eventually contain.

Then you can fill in the fields as needed: each value is assigned to fresh memory, and its pointer is stored in the meta-data at the pre-allocated location.

A relevant article from Loren's blog

So in your case you can simply pre-allocate your struct with:

A = struct('x', cell(1, 1000), 'y', cell(1, 1000));
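As a quick check of what that call produces, here is a minimal sketch. The variable n is introduced only for illustration; the field names and the size of 1000 come from the question.

```matlab
n = 1000;  % number of elements, as in the question

% Passing cell arrays as field values yields a struct array with the same
% dimensions as the cells; every field starts out as [].
A = struct('x', cell(1, n), 'y', cell(1, n));

disp(size(A));          % dimensions of the struct array
disp(fieldnames(A));    % the declared field names
disp(isempty(A(1).x));  % whether the first element's x field is empty
```

Only the slots for the field pointers exist at this point; no memory has been set aside for the eventual contents.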

And fill it with:

for k = 1:numel(A)
    A(k).x = randn(1, randi(100));
    A(k).y = randn(1, randi(100));
end

As for why growing A using [A newA] is slower: it forces MATLAB to "grow" the meta-data component of the struct on every pass through the loop, and each expansion requires an entire copy of the meta-data to be made.
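If you want to compare the approaches on your own machine, a rough timing sketch along these lines may help. The variable names (B, C, entry, tPre, tIdx, tCat) are made up for illustration, and tic/toc timings are noisy and depend on the MATLAB release, so treat the numbers as indicative only.

```matlab
n = 1000;

% 1) Pre-allocated struct array, filled in place
tic;
A = struct('x', cell(1, n), 'y', cell(1, n));
for k = 1:n
    A(k).x = randn(1, randi(100));
    A(k).y = randn(1, randi(100));
end
tPre = toc;

% 2) Growth by indexed assignment, starting from an empty struct array
tic;
B = struct('x', {}, 'y', {});   % 0x0 struct array that already has both fields
for k = 1:n
    B(k).x = randn(1, randi(100));
    B(k).y = randn(1, randi(100));
end
tIdx = toc;

% 3) Growth by concatenation, as in A = [A nextEntry]
tic;
C = struct('x', {}, 'y', {});
for k = 1:n
    entry.x = randn(1, randi(100));
    entry.y = randn(1, randi(100));
    C = [C entry]; %#ok<AGROW> growth is intentional here
end
tCat = toc;

fprintf('pre-allocated: %.4f s\nindexed growth: %.4f s\nconcatenation: %.4f s\n', ...
    tPre, tIdx, tCat);
```

All three loops produce a 1-by-n struct array with the same fields; only the way the array reaches its final size differs.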


2 Comments

Thanks for the reply. I guess part of my question is "why bother", when MATLAB will have to go find memory for the contents anyway. It can pre-allocate space for the pointers, but the pointer could be to a large array, which it needs to deal with in real time. Or do I have that wrong?
@user2364295 The pointers (stored in the meta-data) are the same size regardless of the size of the data they point to. If you don't reserve space for all of these pointers up-front, MATLAB has to move the meta-data every time you add a new element (which is why you saw a performance hit when you did that). And yes, when you assign data it has to allocate memory for that item, but you don't want to add to that the need to re-allocate the entire meta-data structure. You don't save any time by pre-allocating the items themselves up-front.
