4

I have been learning to program, and I chose C++ and C# as my first languages. More specifically, I have an old C book that someone was kind enough to lend me, and I'm using it to learn C#. I use Visual Studio Express and write in C++ and C#. One area that interests me is direct memory management, which I am trying to learn in order to optimize my code. However, I am struggling to do it properly and to see any real performance improvement. For example, here is my code in C#:

unsafe static void Main(string[] args)
{
    int size = 300000;
    char[] numbers = new char[size];

    for (int i = 0; i < size; i++)
    {
        numbers[i] = '7';
    }

    DateTime start = DateTime.Now;

    fixed (char* c = &numbers[0])
    {
        for (int i = 0; i < 10000000; i++)
        {
            int number = myFunction(c, 100000);
        }
    }

    /*char[] c = numbers;  // commented out C# non-pointer version same 
          speed as C# pointer version
    {
        for (int i = 0; i < 10000000; i++)
        {
            int number = myFunction(c, 100000);
        }
    }*/

    TimeSpan timeSpan = DateTime.Now - start;
    Console.WriteLine(timeSpan.TotalMilliseconds.ToString());
    Console.ReadLine();
}

static int myFunction(ref char[] numbers, int size)
{
    return size * 100;
}

static int myFunction(char[] numbers, int size)
{
    return size * 100;
}

unsafe static int myFunction(char* numbers, int size)
{
    return size * 100;
}

No matter which of the three methods I call, I get the same execution speed. I'm also still trying to wrap my head around the difference between using ref and using a pointer, but that's probably something that will take time and practice.

What I don't understand, however, is that I am able to produce a very significant performance difference in C++. Here is what I came up with when I attempted to approximate the same code in C++:

/*int myFunction(std::string* numbers, int size)  // C++ pointer version commented 
     out is much faster than C++ non-pointer version
{
    return size * 100;
}*/

int myFunction(std::string numbers, int size) // value version
{
    return size * 100;
}

int _tmain(int argc, _TCHAR* argv[])
{
    int size = 100000;
    std::string numbers = "";
    for (int i = 0; i < size; i++)
    {
        numbers += "777";
    }

    clock_t start = clock();

    for (int i = 0; i < 10000; i++)
    {
        int number = myFunction(numbers, 100000);
    }

    clock_t timeSpan = clock() - start;

    std::cout << timeSpan;
    char c;
    std::cin >> c;

    return 0;
}

Can anyone tell me why my C# code isn't benefiting from my use of references or pointers? I've been reading about this online, but I'm stuck.

12
  • 1
    Just a tip; use /* and */ to comment everything between them instead of having to use // on every line. Commented May 9, 2014 at 5:25
  • 4
    In real-world C#, you will probably never have to use unsafe code except for some cases of platform interop, and possibly image manipulation. C# was not intended to be used for managing memory directly, and thus provides features such as garbage collection out of the box. Because of that, in most cases, performance in it won't benefit from direct memory manipulation, simply because it does not compile directly to machine code and still executes on top of the CLR. On the other hand, C++ compiles directly to machine code and executes your code pretty much exactly as-is, so memory optimizations work. Commented May 9, 2014 at 5:25
  • 2
    C# objects of reference types are passed by reference by default, so even if it looks like one is being passed by value with a big, expensive copy, it is not: only the reference is passed, and no copy of the object is made. Commented May 9, 2014 at 5:31
  • 2
    I don't think 100000 iterations of a simple multiplication in a very tight loop is enough to even register anything in terms of execution time. You don't even do anything with the arrays/pointers so what exactly are you timing? Commented May 9, 2014 at 5:43
  • 2
    Do not use a C book to learn C#; that will only be confusing. The languages' syntax is somewhat similar, but a crapton of things are different. Commented May 9, 2014 at 6:34

2 Answers

7

C# already generates pointers without you explicitly declaring them. Every reference to a reference-type object, like your numbers variable, is in fact a pointer at runtime. Every argument you pass with the ref or out keyword is in fact a pointer at runtime as well. The exact C equivalent of your ref array argument is char**, or char*& in C++. There's no difference in C#.

So you don't see any difference in speed because the code that actually executes is the same.

It doesn't stop there, either: you never actually do anything with the array. The method you call disappears at runtime, much as it does with a C or C++ compiler, because the optimizer inlines it. And since you don't use the array argument, no code is generated for it either.

Pointers become useful for speeding programs up when you use them to actually address memory. You can index the array through a pointer and be sure that you'll never pay for the array bounds check. In many cases you won't pay for it in normal usage either, since the jitter optimizer is fairly smart about removing the checks when it knows that the indexing is always safe. Indexing through a pointer is what makes the usage unsafe: you can readily scribble into parts of memory that don't belong to the array and corrupt the GC heap that way. The pointers used for an object reference or a ref argument are never unsafe.

The only real way to see any of this is to look at the generated machine code, using Debug + Windows + Disassembly. It is important that you allow the code to still be optimized even though you are debugging it, or you won't see the optimizations. Be sure to run the Release build and, in Tools + Options, Debugging, General, untick the "Suppress JIT optimization on module load" option. Some familiarity with machine code is required to make sense of what you see.


4 Comments

Not quite the same. ref char[] is a pointer-to-pointer, the other two are single pointers.
It's also worth pointing out that in the C++ example, a std::string is being passed by value into the function. It's not the same as passing around a pointer or reference to the data, but passing the entire data contents.
It's going to take me some time to think about the points you brought up. One thing I can't figure out, though, is your statement that "The method you call disappears at runtime, much like it does in a C or C++ compiler". If the method disappears at runtime, why do I see a difference in the C++ program's execution speed between passing by pointer and passing by value? To make sure I'm not misrepresenting my question: I'm not comparing the C# execution speed to the C++ execution speed. I'm comparing the value vs. pointer difference in C# to the one in C++.
Passing objects by value in C++ is expensive because it has to make a copy of the object. All non-pointer types in C++ are value types. That is impossible to do in C#: reference types like an array or string are always passed by reference.
2

The problem is that you aren't measuring what you think you're measuring. I can read your code and see immediately why you would get this result or that result, and it's not just a matter of pointers versus no pointers. There are lots of other factors at play, or potentially at play. The various comments reflect this.

For what it's worth, the main reason one C++ call is much slower than the other is that the slow one copies a std::string and the fast one does not. The C# examples do not have anything like that order of difference between them.

My suggestion is that, as a bright but early-stage programmer, you focus first on becoming a better programmer. Don't worry about "optimising" until you know what you're trying to achieve.

When you're ready to really understand this problem, you will have to study the generated code. In the case of C#, that's MSIL, together with whatever it JITs into on the particular platform. In the case of C++, that's Intel opcodes for whatever processor you're targeting. Until you know what MSIL, JIT and opcodes are, it will be hard to explain exactly why you get the results you do.
