C# Memory optimization for large arrays

Question

Here are two pieces of code, in C++ and C#, doing absolutely the same thing:

#include <stdio.h>
int main(int argc, char *argv[]) {
  char p[1000000];
  unsigned int i,j;
  unsigned long long s=0;
  for(i=2;i<1000000;i++) p[i]=1;
  for(i=2;i<500000;) {
    for(j=2*i;j<1000000;j+=i) p[j]=0;
    for(i++;!p[i];i++);
  }
  for(i=3,s=2;i<1000000;i+=2) if(p[i]) s+=i;
  printf ("%lld\n",s);
  return 0;
}

time: 0.01s memory: 2576 kB

C#
http://ideone.com/baXYm

using System;

namespace ConsoleApplication4
{
    internal class Program
    {
        private  static void Main(string[] args)
        {
            var p = new byte[1000000];
            ulong i, j;
            double s = 0;
            for(i=2;i<1000000;i++) 
                p[i]=1;

            for(i=2;i<500000;) 
            {
                for(j=2*i;j<1000000;j+=i) 
                    p[j]=0;
                for(i++;p[i]==0;i++);
            }

            for(i=3,s=2;i<1000000;i+=2) 
                if(p[i]!=0) s+=i;

            Console.WriteLine(s);
        }
    }
}

time: 0.05s mem: 38288 kB

How can I improve the C# code to prove that C# can be as fast as C++ to my colleague?

As you can see, the C# execution time is 5 times larger, and the memory consumption is 15 times larger.

It's probably worth noting a few things before delving into it: your arrays are different. The array in the C/C++ example is on the stack; in C#, it goes on the heap. Your i and j variables in C# have a larger footprint, and may require more effort from the processor if it's a 32-bit processor. Use uint, assuming (fairly, considering the iteration count) that unsigned int is 4 bytes in the C/C++ example. p[0] and p[1] are uninitialized and thus questionable in the C/C++ example, but 0 in the C# example.
– pickypg, Commented May 21, 2011 at 23:47
Plus s is a double in the C# example and a ulong (effectively) in the C/C++ example. Integer arithmetic will practically always be faster than floating point.
– pickypg, Commented May 21, 2011 at 23:53
The "C++ code example" of yours is written in pure C. So before proving anything it might help to learn something first.
– Öö Tiib, Commented May 21, 2011 at 23:54
You didn't time the printing, did you? That doesn't really count. :P
– user541686, Commented May 21, 2011 at 23:54
@pickypg: not to mention less error-prone. @Dimitry: as you can see the two are not doing "absolutely the same thing", only similar things.
– R. Martinho Fernandes, Commented May 21, 2011 at 23:54
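
The differences raised in the comments above can be made concrete. Below is a minimal C sketch, not the asker's code, that removes them: the sieve lives on the heap (as `new byte[...]` does in C#), `p[0]` and `p[1]` get defined values, the loop counters are 32-bit, and the sum is a 64-bit unsigned integer rather than a double. The function name `sum_primes_below` and its return-0-on-failure convention are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sum of all primes strictly below `limit`.
   Returns 0 when there are no such primes or when allocation fails. */
uint64_t sum_primes_below(uint32_t limit)
{
    if (limit <= 2) return 0;

    /* Heap allocation, mirroring `new byte[limit]` in the C# version. */
    unsigned char *p = malloc(limit);
    if (p == NULL) return 0;

    memset(p, 1, limit);
    p[0] = p[1] = 0;                 /* defined values, unlike the stack array */

    for (uint32_t i = 2; i <= limit / 2; i++) {
        if (!p[i]) continue;         /* skip composites */
        /* 64-bit j so that j += i cannot wrap around */
        for (uint64_t j = 2ull * i; j < limit; j += i) p[j] = 0;
    }

    uint64_t s = 0;                  /* integer sum, not double */
    for (uint32_t i = 2; i < limit; i++) if (p[i]) s += i;

    free(p);
    return s;
}
```

Calling `sum_primes_below(1000000)` and printing the result with `%llu` (after a cast to `unsigned long long`) reproduces the computation both programs in the question perform.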

Darin Dimitrov · Accepted Answer · 2011-05-21 23:45:12Z

10

Compile and run in Release mode. I get exactly 0.01s from the C# version when built and run in Release mode. As far as memory consumption is concerned you are comparing apples to oranges. A managed environment will consume more memory as it is hosting the CLR and the Garbage Collector which don't come without cost.

answered May 21, 2011 at 23:45

Darin Dimitrov


2 Comments

v00d00 Over a year ago

Measurements are done by the Ideone server. If its measurements include the memory used by the runtime, then it becomes clear where the additional 32 MB of memory comes from.

Stack Overflow is garbage Over a year ago

I get your meaning, but I'd still consider it apples to apples. In both cases, you're comparing the memory consumption of the entire process. That's fair. .NET uses more memory, and a comparison of memory usage ought to reflect that.

Community · Accepted Answer · 2017-05-23 12:26:57Z

6

How to GREATLY Increase the Performance of your C# Code

Go "unsafe" (unmanaged) for that... every time you're doing someSortOfArray[i], the .NET framework is doing all kinds of neat-o things (such as out of bounds checking) which take up time.

That's really the whole point of going unmanaged (and then using pointers and doing myPointer++).

Just to clarify, if you go unmanaged and then still do a for-loop and do someArray[i], you've saved nothing.

Another S.O. question that may help you: True Unsafe Code Performance

Disclaimer

By the way, I'm not saying to do this all the time, but rather as an answer for THIS specific question only.
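
To make the cost being discussed concrete, here is a rough model in C (the language of the question's first listing) of what a bounds-checked `someArray[i]` amounts to, next to the pointer walk this answer recommends. It is an illustration only: the names `checked_get`, `checked_sum` and `pointer_sum` are hypothetical, the real CLR check throws an `IndexOutOfRangeException` rather than calling `abort()`, and the JIT may remove the check entirely in some loops.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Model of a managed-style access: every index is compared with the length. */
static unsigned char checked_get(const unsigned char *a, size_t len, size_t i)
{
    if (i >= len) abort();           /* stand-in for an out-of-range exception */
    return a[i];
}

/* Indexed loop: one range test per element access. */
uint64_t checked_sum(const unsigned char *a, size_t len)
{
    uint64_t s = 0;
    for (size_t i = 0; i < len; i++) s += checked_get(a, len, i);
    return s;
}

/* Pointer walk: one increment per element, no per-access range test. */
uint64_t pointer_sum(const unsigned char *a, size_t len)
{
    uint64_t s = 0;
    for (const unsigned char *q = a, *end = a + len; q != end; q++) s += *q;
    return s;
}
```

Both functions return the same sum. An optimizing C compiler will usually prove the check redundant in a loop like this, which is the same kind of elimination the comments on this answer attribute to the JIT.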

edited May 23, 2017 at 12:26


answered May 21, 2011 at 23:55

Timothy Khouri


4 Comments

R. Martinho Fernandes Over a year ago

The out of bounds checks can be skipped by the JIT because it never goes out of bounds.

v00d00 Over a year ago

Can I allocate a 1 MB array on the stack, and read data from there?

R. Martinho Fernandes Over a year ago

@Dmitry: there's stackalloc (probably the only place in the C# spec that mentions this "stack" thingy people keep talking about :). If there is one meg available on the stack, you can.

phoog Over a year ago

@Martinho - the 32 bit jitter only eliminates the bounds check under certain circumstances. For example, if you store the array's length in a local variable and then check the loop variable against that, you are going to be doing bounds checking. The 64-bit jitter behaves differently in this regard (with less optimization). (This comes from an article I read several months ago that I unfortunately can't find at the moment.)

Stack Overflow is garbage · Accepted Answer · 2011-05-21 23:55:37Z

How can I improve the C# code to prove that C# can be as fast as C++ to my colleague?

You can't. There are legitimate areas where C++ is fundamentally faster than C#. But there are also areas where C# code will perform better than the equivalent C++ code. They're different languages with different strengths and weaknesses.

But as a programmer, you really ought to base your decisions on logic.

Logic dictates that you should gather information first, and then decide based on that.

You, on the contrary, made the decision first, and then looked for information to support it. That may work if you're a politician, but it's not a good way to write software.

Don't go hunting for proof that C# is faster than C++. Instead, examine which option is faster in your case.

In any case, if you want to prove that X can be as fast as Y, you have to do it the usual way: make X as fast as Y. And as always, when doing performance tuning, a profiler is your best friend. Find out exactly where the additional time is being spent, and then figure out how to eliminate it.

Memory usage is a lost cause though. .NET simply uses more memory, for several reasons:

it has a bigger runtime library which must be present in the process' address space
.NET objects have additional members not present in C++ classes, so they use more memory
the garbage collector means that you'll generally have some amount of "no-longer-used-but-not-yet-reclaimed" memory lying around. In C++, memory is typically released immediately. In .NET it isn't. .NET is based on the assumption that memory is cheap (which is typically true)

Haymo Kutschbach · Accepted Answer · 2012-02-23 16:27:58Z

3

Just a note on your timing: you don't show how you measured the execution times. One can expect a noticeable startup overhead for .NET applications. So if you care about the execution time of the loops only, you should run the inner loops several (many) times, skip the first one or two iterations, measure the remaining iterations, and compute the average.

I would expect the results to be more similar then. However, as always when targeting 'peak performance', precautions regarding memory management are important. Here, it would probably be sufficient to avoid 'new' inside the measured functions: reuse p[] in each iteration.
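
The procedure described above (warm up, then time many runs and average, with the buffer allocated once) can be sketched as follows. The sketch is in C to match the question's first listing; in C# the same structure would typically use `System.Diagnostics.Stopwatch` instead of `clock()`. The names `sieve_once`, `average_seconds` and `last_sum` are hypothetical, and `clock()` measures CPU time, which is an assumption about what you want to compare.

```c
#include <stdint.h>
#include <time.h>

#define LIMIT 1000000u

static unsigned char p[LIMIT];       /* allocated once, reused by every run */
volatile uint64_t last_sum;          /* keeps the compiler from discarding the work */

/* One full sieve pass over the reused buffer. */
void sieve_once(void)
{
    for (uint32_t i = 0; i < LIMIT; i++) p[i] = (i >= 2);
    for (uint32_t i = 2; i <= LIMIT / 2; i++) {
        if (!p[i]) continue;         /* skip composites */
        for (uint32_t j = 2 * i; j < LIMIT; j += i) p[j] = 0;
    }
    uint64_t s = 0;
    for (uint32_t i = 2; i < LIMIT; i++) if (p[i]) s += i;
    last_sum = s;
}

/* Hypothetical helper: run `work` `warmup` times untimed, then return the
   mean CPU seconds per call over `runs` timed calls. */
double average_seconds(void (*work)(void), int warmup, int runs)
{
    if (runs <= 0) return 0.0;
    for (int k = 0; k < warmup; k++) work();
    clock_t start = clock();
    for (int k = 0; k < runs; k++) work();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / runs;
}
```

For example, `average_seconds(sieve_once, 2, 100)` skips two runs and averages the next hundred, so process startup and first-run effects stay out of the reported figure.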

answered Feb 23, 2012 at 16:27

Haymo Kutschbach


Brendan Long · Accepted Answer · 2011-05-21 23:49:06Z

1

The memory usage may be related to garbage collection. In Java, memory usage is intentionally high -- garbage collection only happens when you need more memory. This is for speed reasons, so it would make sense that C# does the same thing. You shouldn't do this in release code, but to show how much memory you're actually using, you can call GC.Collect() before measuring memory usage. Do you really care how much memory it's using though? It seems like speed is more important. And if you have memory limits, you can probably set the amount of memory that your program will use before garbage collecting.

answered May 21, 2011 at 23:49

Brendan Long


2 Comments

user492238 Over a year ago

"you can probably set the amount of memory that your program will use before garbage collecting" how should we achieve this?

Brendan Long Over a year ago

@user492238 - Maybe it's not possible.
