3

Why can't I dynamically initialize a 2d array like initializing a 1d array? For example:

class ClassName
{
    int* array1d;
    int** array2d;

    ClassName(int size1, int size2)
    {
        array1d = new int[size1];        // works
        array2d = new int[size1][size2]; // doesn't work
    }
};

I searched the Internet and found this question, from which I learned that you can only do this if the row length is a constant, like array2d = new int[size1][5].

I want to understand why.

My assumption is that compilers treat a 2D array as an array that holds arrays, so they need to know the size of the inner array type. They allocate memory based on the length of the outer array and the size of the inner array, i.e. (length of outer array) * sizeof(inner array), so the inner array size must be a constant.

Is my assumption correct?

14
  • 4
    Language limitation: it is just not part of the language. But you can allocate 1D memory and apply a multidimensional view on it using std::mdspan. Side note: avoid using "new" in current C++; it is usually a sign you need a container like std::vector. Commented Mar 14 at 14:43
  • 1
    No, do NOT use unbounded "C" arrays or pointers to arrays in C++. You're setting yourself up for countless bugs. There is a reason C++ has std::array/std::vector and now std::mdspan: it is to avoid a LOT of bugs. I really wonder who or what is teaching you C++. Commented Mar 14 at 14:52
  • 1
    That someone or something should really watch this : 2015: Kate Gregory “Stop Teaching C" Commented Mar 14 at 14:54
  • 2
    array2d is not a 2D array (although it can be used to represent one) - but rather a pointer to a pointer to int. Commented Mar 14 at 14:54
  • 2
    @FOXDeveloper In any case, the statement array2d = new int[size1][size2]; is incorrect even if size2 were a constant, because the returned pointer will have type int ( * )[size2] instead of int **. Commented Mar 14 at 15:02

2 Answers

10

Short Explanation

C++'s type system simply doesn't support it. In any case int** is not the correct type for a pointer to a dynamically-allocated 2D array. What you want is int (*)[N]: Pointer to an array of length N. Types only exist at compile-time though, so N must be a compile-time constant for that type to be valid.

In theory, the language could support dynamically-allocated arrays of arrays of dynamic bound, but such a thing can't be done with simple pointers, since data about the size of each dynamic bound has to be stored somewhere so that the array can be traversed. Since C++ has classes, and classes can be used to handle that job, there has never been a need to add language-level support.
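
As a rough illustration of how a class can carry the runtime bounds that a bare pointer cannot, here is a minimal sketch. The name Grid2D and its interface are hypothetical (they are not from the answer), and it assumes row-major storage in a single flat std::vector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a 2D grid whose bounds are both chosen at runtime.
// The bounds are stored as data members, which is exactly the information
// a plain pointer type has no room for.
class Grid2D {
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;  // rows_ * cols_ ints, row-major, zero-initialized

  public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Element access: the row stride (cols_) is read at runtime.
    int& operator()(std::size_t r, std::size_t c)
    {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    const int& operator()(std::size_t r, std::size_t c) const
    {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
};
```

With this sketch, Grid2D g(size1, size2); g(1, 2) = 42; works for any runtime size1 and size2, because the offset is computed from a stored member rather than from the type.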


Longer explanation

First, consider what a 1D array of ints looks like in memory: just a contiguous block of ints:

 ┌───┬───┬─────┬───┐
 │ 0 │ 1 │ ... │ N │
 └───┴───┴─────┴───┘

Now, an int* can point to any of those elements:

 int* array1d
   │
 ┌─▼─┬───┬─────┬───┐
 │ 0 │ 1 │ ... │ N │
 └───┴───┴─────┴───┘

And since the compiler knows how big an int is, you can navigate from element to element simply by offsetting the pointer by that size. That is, to get from array1d to array1d[N], you simply shift the pointer N * sizeof(int) bytes.
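
The offset rule above can be checked with a small sketch. The function name byte_offset is an illustrative assumption; it simply measures how many bytes separate base[0] from base[n]:

```cpp
#include <cassert>
#include <cstddef>

// Returns the distance in bytes between base and base + n.
// The compiler scales n by sizeof(int) because the pointee type is int.
std::ptrdiff_t byte_offset(const int* base, std::size_t n)
{
    const char* first = reinterpret_cast<const char*>(base);
    const char* nth   = reinterpret_cast<const char*>(base + n);
    return nth - first;
}
```

For an int array a, byte_offset(a, 3) equals 3 * sizeof(int), whatever sizeof(int) happens to be on the platform.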


Now, consider how a 2D array is laid out in memory. It's a contiguous block of 1D array elements, each of which is a contiguous block of some number of ints:

┌───────────────────┬───────────────────┬───────────────────┬───────────────────┐
│┌───┬───┬─────┬───┐│┌───┬───┬─────┬───┐│┌─────────────────┐│┌───┬───┬─────┬───┐│
││ 0 │ 1 │ ... │ N │││ 0 │ 1 │ ... │ N │││       ...       │││ 0 │ 1 │ ... │ N ││
│└───┴───┴─────┴───┘│└───┴───┴─────┴───┘│└─────────────────┘│└───┴───┴─────┴───┘│
└───────────────────┴───────────────────┴───────────────────┴───────────────────┘

You've declared array2d to be a pointer to a pointer to an int, so it needs to point to an int*, but there is no int* here for it to point to, so int** must be the wrong type.
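
The type mismatch can be made visible with a couple of compile-time checks. This is only a sketch: the bounds 3 and 4 and the function name first_of_second_row are arbitrary choices for illustration:

```cpp
#include <cassert>
#include <type_traits>

// An int[3][4] decays to a pointer to its first row, i.e. int (*)[4] ...
static_assert(std::is_same<std::decay<int[3][4]>::type, int (*)[4]>::value,
              "a 2D array decays to a pointer to an array");

// ... and a pointer to a row is not convertible to int**.
static_assert(!std::is_convertible<int (*)[4], int**>::value,
              "int (*)[4] does not convert to int**");

// Accepts a pointer to rows of 4 ints and reads the first int of row 1.
int first_of_second_row(int (*rows)[4])
{
    return rows[1][0];
}
```

Passing an int[3][4] to first_of_second_row compiles, whereas passing it to a function taking int** would be rejected by the compiler.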

Let's use something like int (*)[]: pointer to an array of unknown bound of int. That could conceivably point to the first element of our array of arrays of int:

int (*array2d)[]                                                                 
   │                                                                             
┌──▼────────────────┬───────────────────┬───────────────────┬───────────────────┐
│┌───┬───┬─────┬───┐│┌───┬───┬─────┬───┐│┌─────────────────┐│┌───┬───┬─────┬───┐│
││ 0 │ 1 │ ... │ N │││ 0 │ 1 │ ... │ N │││       ...       │││ 0 │ 1 │ ... │ N ││
│└───┴───┴─────┴───┘│└───┴───┴─────┴───┘│└─────────────────┘│└───┴───┴─────┴───┘│
└───────────────────┴───────────────────┴───────────────────┴───────────────────┘

But there's a problem with this setup. The compiler doesn't know how big the array pointed to by array2d is. Thus, when you ask for array2d[1] it doesn't know how far to offset the pointer to find the next element in the array. It can't shift the pointer sizeof(int[]) bytes, because sizeof(int[]) isn't known.

Information about the size of each array element doesn't exist until runtime, but the compiler needs to generate machine code, at compile time, to do the correct offset.

And so, only arrays of arrays of constant bound can be allocated dynamically via the new keyword. It then returns an int (*)[N] (or int (*)[N][M], etc). Since N is a compile-time constant, the compiler knows how far to offset the pointer to find each subsequent element in the array.
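
To make this concrete, here is a minimal sketch in which only the outer bound is a runtime value. The names kCols, Row, make_rows and row_stride_bytes are assumptions made for illustration, and the inner bound of 5 is arbitrary:

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCols = 5;  // inner bound: must be a compile-time constant
using Row = int[kCols];           // the element type of the outer array

// Allocates `rows` rows of kCols ints each. `rows` may be a runtime value.
// The new-expression yields int (*)[kCols], which is the same type as Row*.
Row* make_rows(std::size_t rows)
{
    return new int[rows][kCols]();  // () zero-initializes every int
}

// Distance in bytes between one row and the next. It is known at compile
// time because sizeof(Row) is known at compile time.
std::ptrdiff_t row_stride_bytes(Row* p)
{
    return reinterpret_cast<char*>(p + 1) - reinterpret_cast<char*>(p);
}
```

The caller owns the result and must release it with delete[]. In real code a container, as suggested in the comments, avoids that manual step.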

3

I would discourage use of raw pointers to manage dynamically allocated memory and use the standard library container std::vector that was provided for this purpose:

#include <vector>

class ClassName {
    std::vector<int> array1d;
    std::vector<std::vector<int>> array2d;

  public:
    ClassName(int size, int size1, int size2) :
        array1d(size),
        array2d(size1, std::vector<int>(size2))
    {}
};

In C++23 you can even use std::mdspan to create a 2D view of a flat vector, which is allocated more efficiently:

#include <mdspan>
#include <vector>

class ClassName {
    std::vector<int> array1d;
    std::vector<int> array2d;
    std::mdspan<int, std::dextents<int, 2>> view2d;

  public:
    ClassName(int size, int size1, int size2) :
        array1d(size),
        array2d(size1 * size2),
        view2d(array2d.data(), size1, size2)
    {}
};

See on Compiler Explorer.

3 Comments

Yes, using a vector is indeed a better solution, but you didn't address the OP's actual question at all. They want to know why their original code doesn't work, and you didn't even try to answer that.
@RemyLebeau thanks. I think the other answer does a good job addressing that. I don't think I should need to duplicate that effort. Given that this is an XY problem, instead I thought it would be valuable to show how this could be done in a more idiomatic way in C++, given that future readers will likely be looking for a solution rather than just an explanation on why their less-than-ideal approach isn't very easy to express within the limitations of the language.
"I think the other answer does a good job addressing that" - yes, but that's not the point. "I thought it would be valuable to show how this could be done in a more idiomatic way in C++" - that's fine. But this is a Q&A site. There can be multiple responses, but they should always be first-and-foremost an answer to the OP's question. Additional suggestions can be added on to the reply, but be sure to address the OP's concern, too.
