Convert an object array of object arrays to a two dimensional array of object

Question

I have a third party library returning an object array of object arrays that I can stuff into an object[]:

object[] arr = myLib.GetData(...);

The resulting array consists of object[] entries, so you can think of the return value as some kind of recordset with the outer array representing the rows and the inner arrays containing the field values where some fields might not be filled (a jagged array). To access the individual fields I have to cast like:

int i = (int) ((object[])arr[row])[col];//access a field containing an int

Now as I'm lazy I want to access the elements like this:

int i = (int) arr[row][col];

To do this I use the following Linq query:

object[] result = myLib.GetData(...);
object[][] arr = result.Select(o => (object[])o ).ToArray();

I tried using a simple cast like object[][] arr = (object[][])result; but that fails with a runtime error.
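A minimal sketch of why the direct cast fails, assuming the library really allocates a plain object[] whose elements merely happen to be object[] instances. The class and method names (CastDemo, DirectCastWorks) are made up for illustration. The cast compiles, but at runtime it only succeeds when the array object itself was created as object[][]:

```csharp
using System;

public static class CastDemo
{
    // Hypothetical helper: reports whether a direct cast to object[][] succeeds.
    public static bool DirectCastWorks(object[] source)
    {
        try
        {
            object[][] cast = (object[][])source;
            return cast != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Runtime type is object[]; this is the shape assumed for the library result.
        object[] loose = new object[] { new object[] { 1, 2 }, new object[] { 3 } };

        // Runtime type is object[][], only viewed as object[] through array covariance.
        object[] covariant = new object[][] { new object[] { 1, 2 }, new object[] { 3 } };

        Console.WriteLine(DirectCastWorks(loose));     // False
        Console.WriteLine(DirectCastWorks(covariant)); // True
    }
}
```

So a per-element cast (Select or Cast<object[]>) is needed whenever the outer array was not created as object[][] in the first place.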

Now, my questions:

Is there a simpler way of doing this? I have the feeling that some nifty cast should do the trick?
Also I am worried about performance as I have to reshape a lot of data just to save me some casting, so I wonder if this is really worth it?

EDIT: Thank you all for the speedy answers.
@James: I like your answer wrapping up the culprit in a new class, but the drawback is that I always have to do the Linq wrapping when taking in the source array and the indexer needs both row and col values int i = (int) arr[row, col]; (I need to get a complete row as well like object[] row = arr[row];, sorry didn't post that in the beginning).
@Sergiu Mindras: Like James, I feel the extension method is a bit dangerous, as it would apply to all object[] variables.
@Nair: I chose your answer for my implementation, as it does not need using the Linq wrapper and I can access both individual fields using int i = (int) arr[row][col]; or an entire row using object[] row = arr[row];
@quetzalcoatl and @Abe Heidebrecht: Thanks for the hints on Cast<>().

Conclusion: I wish I could choose both James' and Nair's answer, but as I stated above, Nair's solution gives me (I think) the best flexibility and performance. I added a function that will 'flatten' the internal array using the above Linq statement because I have other functions that need to be fed with such a structure.

Here is how I (roughly) implemented it (taken from Nair's solution):

    using System.Linq;

    public class CustomArray
    {
        private object[] data;

        public CustomArray(object[] arr) { data = arr; }

        //get a row of the data
        public object[] this[int index]
        { get { return (object[]) data[index]; } }

        //get a field from the data
        public object this[int row, int col]
        { get { return ((object[])data[row])[col]; } }

        //get the array as 'real' 2D - Array
        public object[][] Data2D()
        {//this could be cached in case it is accessed more than once
            return data.Select(o => (object[])o ).ToArray();
        }

        static void Main()
        {
            var ca = new CustomArray(new object[] { 
                      new object[] {1,2,3,4,5 },
                      new object[] {1,2,3,4 },
                      new object[] {1,2 } });
            var row = ca[1]; //gets a full row
            int i = (int) ca[2,1]; //gets a field
            int j = (int) ca[2][1]; //gets me the same field
            object[][] arr = ca.Data2D(); //gets the complete array as 2D-array
        }

    }

So - again - thank you all! It always is a real pleasure and enlightenment to use this site.

The most expensive operation here is the unboxing from object to int (and other types), which seems inevitable since your lib only returns object[]. Are you sure that it doesn't provide a typed interface?
– Andre Calil, Commented Jun 26, 2013 at 14:22
What does var[] arr = myLib.GetData(...); give you in this case?
– Bit, Commented Jun 26, 2013 at 14:28
@Andre: The returned data is made up of different types, and no, there is no typed interface, as the function basically returns the result of a select statement that can contain many differently typed fields.
– AstaDev, Commented Jun 27, 2013 at 8:56

James · Accepted Answer · 2013-06-26 14:33:48Z

7

You could create a wrapper class to hide the ugly casting e.g.

public class DataWrapper
{
    private readonly object[][] data;

    public DataWrapper(object[] data)
    {
        this.data = data.Select(o => (object[])o ).ToArray();
    }

    public object this[int row, int col]
    {
        get { return this.data[row][col]; }
    }
}

Usage

var data = new DataWrapper(myLib.GetData(...));
int i = (int)data[row, col];

There is also the opportunity to make the wrapper generic, e.g. DataWrapper<int>. However, I wasn't sure whether your data would all be of the same type; returning object keeps it generic enough for you to decide which cast is needed.
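As a rough sketch of a middle ground between a fully generic wrapper and returning object: the type parameter can sit on the accessor method instead of the class, so each call names the field type it expects. TypedDataWrapper, Get<T> and Row are hypothetical names, not part of the original answer, and the sample data is made up.

```csharp
using System;

public class TypedDataWrapper
{
    private readonly object[] rows;

    public TypedDataWrapper(object[] rows)
    {
        this.rows = rows;
    }

    // Returns the whole row; throws InvalidCastException if it is not an object[].
    public object[] Row(int row)
    {
        return (object[])rows[row];
    }

    // Casts (or unboxes) a single field to the type requested by the caller.
    public T Get<T>(int row, int col)
    {
        return (T)Row(row)[col];
    }

    public static void Main()
    {
        var data = new TypedDataWrapper(new object[]
        {
            new object[] { 1, "two", 3.0 },
            new object[] { 4 }
        });

        int i = data.Get<int>(0, 0);
        string s = data.Get<string>(0, 1);
        Console.WriteLine(i + " " + s);
    }
}
```

This keeps mixed-type rows usable and moves the cast into one place, at the cost of the caller still having to know each field's type.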

edited Jun 26, 2013 at 14:33

answered Jun 26, 2013 at 14:24

James


6 Comments

Andre Calil Over a year ago

An idea: with your current solution, every time the user calls data[1, 1], an unboxing will be computed. So, why don't you convert the object[] to object[][] ahead of time, using the code the OP presented?

quetzalcoatl Over a year ago

I'd argue about that unboxing. If the items are read many times, it will in fact speed up the overall usage. But if the items are just read once and immediately processed, the pre-unboxing will hurt performance, with possibly higher memory usage, for no real gain. Think about a data stream generated on the fly as the data is fetched from the database. Iterate over and cache millions of object[] just to avoid unboxing them twice? That's an optimization that should be strictly tailored to the exact use case. Please do not suggest it "just because it's better".

James Over a year ago

@Quetzalcoatl it's a fair point, however, going on the assumption that the OP is going to be reading all the information then it's probably the right way to go. Let me update the solution so it's flexible for both scenarios...

quetzalcoatl Over a year ago

I meant read-once (not worth the effort) versus read-many-times (worth the effort N times over). I've noticed that the top-level data object is an object[], so all the data is in memory already, but caching effectively doubles the memory for the top-level array. That's the only warning I wanted to add! As I already wrote, I like that solution.

James Over a year ago

@Quetzalcoatl "But, if the set of items are just read once and immediately processed, the pre-unboxing will hit the performance" - surely if all the items are going to be read at least once the pre-unboxing would be better? Otherwise you would be unboxing per index.


S.N · Accepted Answer · 2013-06-26 14:30:47Z

A few similar answers have already been posted. This one differs only in that it lets you access the elements like this:

int i = (int) arr[row][col];

To demonstrate the idea

   public class CustomArray
        {
            private object[] _arr;
            public CustomArray(object[] arr)
            {
                _arr = arr;
            }

            public object[] this[int index]
            {
                get
                {
                    // This indexer is very simple: it returns the row at the
                    // given index from the internal array, cast to object[].
                    return (object[]) _arr[index];
                }
            }
            static void Main()
            {
                var c = new CustomArray(new object[] { new object[] {1,2,3,4,5 }, new object[] {1,2,3,4 }, new object[] {1,2 } });
                var a =(int) c[1][2]; //here a will be 3 (row 1 is {1,2,3,4}, column 2)
            }

        }

quetzalcoatl · Accepted Answer · 2013-06-26 14:28:02Z

(1) This could probably be done in a short and easy form with the dynamic keyword, but you'll lose compile-time checking. Considering that you already use object[], that's a small price:

dynamic results = obj.GetData();
object something = results[0][1];

I've not checked it with a compiler though.
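A small self-contained sketch of the dynamic approach, for anyone who wants to try it. DynamicDemo and ReadField are hypothetical names, the sample array stands in for the library result, and the project is assumed to reference the C# runtime binder (Microsoft.CSharp), which dynamic requires:

```csharp
using System;

public static class DynamicDemo
{
    // Reads one int field; both index operations are bound at runtime,
    // so a wrong shape or type only shows up as a runtime exception.
    public static int ReadField(object[] source, int row, int col)
    {
        dynamic rows = source;
        return (int)rows[row][col];
    }

    public static void Main()
    {
        object[] raw =
        {
            new object[] { 1, 2, 3, 4, 5 },
            new object[] { 1, 2, 3, 4 },
            new object[] { 1, 2 }
        };

        Console.WriteLine(ReadField(raw, 1, 2)); // 3
    }
}
```

Runtime binding adds overhead on every access, so this trades speed and compile-time safety for shorter syntax.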

(2) instead of Select(o => (type)o) there's a dedicated Cast<> function:

var viaSelect = items.Select(o => (object[])o).ToArray();
var viaCast = items.Cast<object[]>().ToArray();

They are almost the same. I'd guess that Cast is a bit faster, but again, I've not checked that.

(3) Yes, reshaping in that way will affect performance somewhat, depending mostly on the number of items: the more elements you have, the larger the impact. That's mostly related to .ToArray, as it enumerates all the items and creates an additional array. Consider this:

var results = ((object[])obj.GetData()).Cast<object[]>();

The 'results' here are of type IEnumerable<object[]>, and the difference is that it will be enumerated lazily: the extra iteration over all elements is gone, the temporary extra array is gone, and the overhead is minimal, similar to manually casting every element, which you'd do anyway. But you lose the ability to index into the topmost array. You can loop/foreach over it, but you cannot index it with [123].
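To make the lazy variant concrete, here is a minimal sketch with made-up sample data; LazyRows, AsRows and SumInts are hypothetical names, and the fields are assumed to be ints purely for the demonstration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyRows
{
    // No copy is made here: each element is cast only when it is enumerated.
    public static IEnumerable<object[]> AsRows(object[] source)
    {
        return source.Cast<object[]>();
    }

    // Walks the rows once and adds up every field (assumed to be boxed ints).
    public static int SumInts(object[] source)
    {
        int total = 0;
        foreach (object[] row in AsRows(source))
        {
            foreach (object field in row)
            {
                total += (int)field;
            }
        }
        return total;
    }

    public static void Main()
    {
        object[] raw = { new object[] { 1, 2, 3 }, new object[] { 4, 5 } };
        Console.WriteLine(SumInts(raw)); // 15
    }
}
```

This fits the read-once case; if rows need to be indexed repeatedly, materializing with ToArray once is likely the better trade.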

EDIT:

James's wrapper approach is probably the best in terms of overall performance. I also like it the most for readability, though that's personal opinion; others may prefer LINQ. I'd suggest James's wrapper.

Sergiu Mindras · Accepted Answer · 2013-06-27 11:01:29Z

1

You could use an extension method:

static int getValue(this object[] arr, int col, int row)
{
    return (int) ((object[])arr[row])[col];
}

And retrieve by

int requestedValue = arr.getValue(col, row);

No idea for arr[int x][int y] syntax.

EDIT

Thanks James for your observation

You can use a nullable int so you don't get an exception when casting.

So, the method will become:

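static int? getIntValue(this object[] arr, int col, int row)
{
    try
    {
        // "as" requires a nullable target type; it yields null if the field is not an int
        return ((object[])arr[row])[col] as int?;
    }
    catch
    {
        // e.g. index out of range, or the row is not an object[]
        return null;
    }
}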

And can be retrieved by

int? requestedValue = arr.getIntValue(col, row);

This way you get a nullable int, and any exception thrown during the lookup results in null being returned.

edited Jun 27, 2013 at 11:01

answered Jun 26, 2013 at 14:29

Sergiu Mindras


1 Comment

James Over a year ago

I'd argue that this is a misuse of an extension method. It should be generic enough to use on any instance of object[]; in this scenario you are assuming every object[] will contain inner object[] rows whose fields are of type int.

Abe Heidebrecht · Accepted Answer · 2013-06-26 14:27:17Z

0

You can use the LINQ Cast operator instead of Select...

object[][] arr = result.Cast<object[]>().ToArray();

This is a little less verbose, but should be nearly identical performance-wise. Another way is to do it manually:

object[][] arr = new object[result.Length][];
for (int i = 0; i < arr.Length; ++i)
    arr[i] = (object[])result[i];

answered Jun 26, 2013 at 14:27

Abe Heidebrecht
