In C#, I have an array of structs and I need to assign values to each. What is the most efficient way to do this? I could assign each field, indexing the array for each field:

array[i].x = 1;
array[i].y = 1;

I could construct a new struct on the stack and copy it to the array:

array[i] = new Vector2(1, 2);

Is there another way? I could call a method and pass the struct by ref, but I'd guess the method call overhead would not be worth it.

In case the struct size matters, the structs in question have 2-4 fields of type float or byte.
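For reference, the by-ref idea can be sketched two ways: a method taking a `ref` parameter, and a ref local (which needs C# 7.0 or later, so it may not be available in every toolchain). The `Vector2` below is a hypothetical stand-in with public float fields `x` and `y`, and `Set` and `AssignDemo` are names made up for illustration. This only shows that all three forms write the same data; it says nothing about which is fastest.

```csharp
using System;

// Hypothetical stand-in for the third-party struct in the question.
public struct Vector2
{
    public float x, y;
    public Vector2(float x, float y) { this.x = x; this.y = y; }
}

public static class AssignDemo
{
    // Assumed helper: writes both fields through a ref parameter.
    public static void Set(ref Vector2 v, float x, float y)
    {
        v.x = x;
        v.y = y;
    }

    public static Vector2[] Run()
    {
        var array = new Vector2[3];

        // 1. Pass the array element by ref to a method.
        Set(ref array[0], 1f, 2f);

        // 2. Ref local (C# 7.0+): index once, then write the fields in place.
        ref Vector2 slot = ref array[1];
        slot.x = 3f;
        slot.y = 4f;

        // 3. Constructor assignment, for comparison.
        array[2] = new Vector2(5f, 6f);

        return array;
    }
}
```

Whether the `Set` call is inlined is up to the JIT compiler, so the method-call overhead may or may not show up in practice.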

In some cases I need to assign the same values to multiple array entries, e.g.:

Vector2 value = new Vector2(1, 2);
array[i] = value;
array[i + 1] = value;
array[i + 2] = value;
array[i + 3] = value;

Does this change which approach is more efficient?
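On runtimes that have `Span<T>` (.NET Core 2.1 and later, or the System.Memory package; this is an assumption about the environment and may not hold for older Unity/Mono), the repeated assignment can also be written as a range fill. A minimal sketch, again using a hypothetical `Vector2` stand-in and a made-up `FillDemo` class. It is not measured here, so treat it as an alternative to benchmark rather than a known win.

```csharp
using System;

// Hypothetical stand-in for the third-party struct in the question.
public struct Vector2
{
    public float x, y;
    public Vector2(float x, float y) { this.x = x; this.y = y; }
}

public static class FillDemo
{
    public static Vector2[] Run()
    {
        var array = new Vector2[8];
        var value = new Vector2(1f, 2f);
        int i = 2;

        // Equivalent to assigning array[i] .. array[i + 3] one by one.
        array.AsSpan(i, 4).Fill(value);

        return array;
    }
}
```

For only four entries, the four explicit assignments may well be just as fast; the fill form mostly pays off in readability or for longer runs.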

I understand this is quite low level, but I'm doing it millions of times and I'm curious.

Edit: I slapped together a benchmark:

const int loopCount = 100000000; // iterations of the inner loop
this.array = new Vector2[100];
Vector2[] array = this.array;
for (int i = 0; i < 1000; i++) {
    long startTime, endTime;
    startTime = DateTime.Now.Ticks;
    for (int x = 0; x < loopCount; x++) {
        array[0] = new Vector2(1,2);
        array[1] = new Vector2(3,4);
        array[2] = new Vector2(5,6);
        array[3] = new Vector2(7,8);
        array[4] = new Vector2(9,0);
        array[5] = new Vector2(1,2);
        array[6] = new Vector2(3,4);
        array[7] = new Vector2(5,6);
        array[8] = new Vector2(7,8);
        array[9] = new Vector2(9,0);
    }
    endTime = DateTime.Now.Ticks;
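    // Note: a DateTime tick is 100 ns, so this value is ticks per iteration; multiply by 100 for nanoseconds.
    double ns = ((double)(endTime - startTime)) / ((double)loopCount);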
    Debug.Log(ns.ToString("F"));
}

This reported ~0.77ns and another version which indexed and assigned the struct fields gave ~0.24ns, FWIW. It appears the array index is cheap compared to the struct stack allocation and copy. Might be interesting to see the performance on a mobile device.

Edit2: Dan Bryant's answer below is why I didn't write a benchmark to begin with: it's too easy to get wrong.
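A somewhat more careful timing loop could use `System.Diagnostics.Stopwatch` instead of `DateTime.Now`, with an untimed warm-up pass so JIT compilation is not counted. The sketch below is only that: `Vector2`, `AssignBenchmark`, and the method names are hypothetical, and the JIT compiler is still free to reorder or drop redundant stores, so the numbers should be read with suspicion. A dedicated harness such as BenchmarkDotNet handles warm-up and statistics more rigorously.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for the third-party struct in the question.
public struct Vector2
{
    public float x, y;
    public Vector2(float x, float y) { this.x = x; this.y = y; }
}

public static class AssignBenchmark
{
    // Converts a stopped Stopwatch reading to nanoseconds per iteration.
    static double NsPerIteration(Stopwatch sw, int loopCount)
    {
        return sw.Elapsed.TotalMilliseconds * 1000000.0 / loopCount;
    }

    public static double TimeConstructor(Vector2[] array, int loopCount)
    {
        var sw = Stopwatch.StartNew();
        for (int n = 0; n < loopCount; n++)
        {
            array[0] = new Vector2(1f, 2f);
            array[1] = new Vector2(3f, 4f);
        }
        sw.Stop();
        return NsPerIteration(sw, loopCount);
    }

    public static double TimeFields(Vector2[] array, int loopCount)
    {
        var sw = Stopwatch.StartNew();
        for (int n = 0; n < loopCount; n++)
        {
            array[0].x = 1f; array[0].y = 2f;
            array[1].x = 3f; array[1].y = 4f;
        }
        sw.Stop();
        return NsPerIteration(sw, loopCount);
    }

    // Runs each variant once untimed (warm-up), then returns the timed results.
    public static double[] Run(int loopCount)
    {
        var array = new Vector2[100];
        TimeConstructor(array, 1000);
        TimeFields(array, 1000);
        return new[] { TimeConstructor(array, loopCount), TimeFields(array, loopCount) };
    }
}
```

Calling `AssignBenchmark.Run(100000000)` returns the two averages in nanoseconds per iteration, constructor version first.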

  • Time-efficient? Why don't you write a test program and profile it? Commented Apr 24, 2014 at 20:11
  • @O.R.Mapper Because there is a myriad of ways to screw up a microbenchmark. Commented Apr 24, 2014 at 20:28
  • @JonB I imagine it could be benchmarked, though I personally am not familiar with writing microbenchmarks in C#. It's not just about optimizing, I'm curious about the efficiency for academic reasons. Commented Apr 24, 2014 at 20:33
  • A related question here is the impact of making the value type mutable in the first place; this has a cost in terms of development risk, due to value type subtleties and various ways that you can mutate copies of values mistakenly rather than the value you intended to mutate. In many cases, this engineering (time and money) cost may exceed the equivalent cost associated with a small performance loss. Commented Apr 24, 2014 at 20:39
  • @DanBryant That is a good point. I agree immutable is almost always better and in that case the only option is the struct constructor. In my particular case the structs are from a 3rd party library out of my control. Commented Apr 24, 2014 at 20:43
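To make the "mutating a copy" risk from the comments concrete, here is a small sketch. `Vector2` is a hypothetical mutable struct and `CopyPitfall` is a made-up class name. An array element is accessed in place, but a local variable or a `List<T>` indexer hands back a copy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable struct, standing in for the third-party type.
public struct Vector2
{
    public float x, y;
    public Vector2(float x, float y) { this.x = x; this.y = y; }
}

public static class CopyPitfall
{
    public static float[] Run()
    {
        var array = new Vector2[1];

        // Array indexing refers to the element itself, so this writes in place.
        array[0].x = 1f;

        // Copying to a local first means later writes touch only the copy.
        Vector2 copy = array[0];
        copy.x = 99f;

        // A List<T> indexer returns a copy; "list[0].x = 5f;" does not compile (CS1612).
        var list = new List<Vector2> { new Vector2(1f, 2f) };
        Vector2 item = list[0];
        item.x = 5f;    // changes only the local
        list[0] = item; // the whole struct must be written back

        // array[0].x is still 1, copy.x is 99, list[0].x is 5 after the write-back.
        return new[] { array[0].x, copy.x, list[0].x };
    }
}
```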

1 Answer

I was curious about the first case (field assignment vs. constructor call), so I made a release build and attached the debugger after JIT compilation to see the disassembly. The (x86) code looks like this:

            var array = new Vector2[10];
00000000  mov         ecx,191372h 
00000005  mov         edx,0Ah 
0000000a  call        FFF421C4 
0000000f  mov         edx,eax 

            array[i].x = 1;
00000011  cmp         dword ptr [edx+4],0 
00000015  jbe         0000003E 
00000017  lea         eax,[edx+8] 
0000001a  fld1 
0000001c  fstp        qword ptr [eax] 
            array[i].y = 1;
0000001e  fld1 
00000020  fstp        qword ptr [edx+10h] 

            array[i] = new Vector2(1, 1);
00000023  add         edx,8 
00000026  mov         eax,edx 
00000028  fld1 
0000002a  fld1 
0000002c  fxch        st(1) 
0000002e  fstp        qword ptr [eax] 
00000030  fstp        qword ptr [eax+8] 

One thing worth noting is that the 'constructor call' is inlined when using a release build outside the debugger, so, in principle, there should be no difference between setting fields or calling the constructor. That said, the jitter did some interesting things here.

For the 'constructor' version, it loaded both values onto the floating-point stack and then stored them to the structure memory back to back (fld1, fld1, fstp, fstp). It also emitted an fxch (exchange), which is a bit silly since both slots contain the constant value 1, but that's not exactly a high-priority optimization target for most applications, I'd assume.

For the 'individual fields' version, it only used one slot on the FPU stack, by splitting up the writes (fld1, fstp, fld1, fstp). I'm not an x86/x64 assembly guru, so I don't know which ordering is more efficient in terms of execution time. Any difference is probably quite minuscule, though, since the primary potential overhead (the constructor method call) is inlined out.

1 Comment

@NateS, note that this behavior may not be representative of all constructor vs. field assignment cases. The optimizing jitter can do all sorts of things to your code to try to wring various obscure benefits from the CPU pipeline. Inlining is not guaranteed and various re-orderings of operations can occur. In particular, you can get cases where reads and writes occur in different orders than you wrote them in C#, as long as the jitter sees it shouldn't make a difference to the logic.
