14

I have a solution consisting of a number of C# projects. It was written in C# to get it operational quickly. Garbage collections are starting to become an issue—we are seeing some 100 ms delays that we'd like to avoid.

One thought was to re-write it in C++, project by project. But if you combine C# with unmanaged C++, will the threads in the C++ projects also be frozen by garbage collections?

UPDATE

Thanks for your replies. This is, in fact, an app where 100 ms might be significant. It was probably a poor decision to build it in C#, but it was essential that it be up and running quickly at the time.

Right now, we're using Windows' Multimedia Timers to fire an event every 5 ms. We do see some 100+ ms gaps, and we've confirmed by checking the GC counters that these always occur during a collection. Optimization is on; built in Release mode.
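
For anyone wanting to reproduce this kind of check, here is a minimal self-contained sketch (names and the allocation loop are illustrative, not our actual code) of correlating observed gaps with the GC counters via `GC.CollectionCount`:

```csharp
using System;
using System.Diagnostics;

class GcGapCheck
{
    static void Main()
    {
        // Snapshot the counters before the timing-sensitive loop.
        int gen0Before = GC.CollectionCount(0);
        int gen2Before = GC.CollectionCount(2);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 200000; i++)
        {
            var junk = new byte[1024]; // stand-in for per-tick allocation
            if (sw.ElapsedMilliseconds > 100)
            {
                // A long gap: check whether collections happened during it.
                Console.WriteLine("gap: gen0 +{0}, gen2 +{1}",
                    GC.CollectionCount(0) - gen0Before,
                    GC.CollectionCount(2) - gen2Before);
                sw.Restart();
            }
        }

        // ~200 MB of short-lived allocations will have forced gen0 collections.
        Console.WriteLine("gen0 collections: {0}",
            GC.CollectionCount(0) - gen0Before);
    }
}
```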

7
  • 1
    The .Net garbage collection should cause major problems. You're probably doing something wrong; can you give us more detail? Commented Mar 2, 2010 at 3:06
  • 2
    @SLaks: I believe you meant that it should not cause problems. ;) Commented Mar 2, 2010 at 3:12
  • @SLaks: "collection should cause"...do you mean "collection should NOT cause"? Commented Mar 2, 2010 at 3:12
  • 1
    I agree with SLaks. You need to start looking into optimization and not re-engineering in C++. Commented Mar 2, 2010 at 3:21
  • 4
    Those who say "garbage collection shouldn't be an issue" are generally coming from the perspective of 100ms not being an issue. The fact is that in certain cases where dependable minimal latency is required--such as in real-time trading (my job)--garbage collection is an issue because it introduces an element of unpredictability into execution times that may need to be exact. @Michael: Have you looked into resource pooling? It can be a very effective weapon against garbage collection. Commented Mar 2, 2010 at 18:05

7 Answers

6

I work as a .NET developer at a trading firm where, like you, we care about 100 ms delays. Garbage collection can indeed become a significant issue when dependable minimal latency is required.

That said, I don't think migrating to C++ is going to be a smart move, mainly due to how time-consuming it would be. Garbage collection occurs after a certain amount of memory has been allocated on the heap over time. You can substantially mitigate this issue by minimizing the amount of heap allocation your code performs.

I'd recommend trying to spot methods in your application that are responsible for significant amounts of allocation. Anywhere objects are constructed is going to be a candidate for modification. A classic approach to fighting garbage collection is utilizing resource pools: instead of creating a new object every time a method is called, maintain a pool of already-constructed objects, borrowing from the pool on every method call and returning the object to the pool once the method has completed.
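
As a sketch of the pooling approach (class and member names here are illustrative, not from your code; `ConcurrentBag` requires .NET 4.0 — a `Stack<T>` behind a lock would serve on 3.5):

```csharp
using System;
using System.Collections.Concurrent;

// Minimal object pool: reuse instances instead of allocating per call,
// so the GC sees far less garbage during steady-state operation.
class Pool<T> where T : new()
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();

    public T Rent()
    {
        T item;
        return _items.TryTake(out item) ? item : new T();
    }

    public void Return(T item) => _items.Add(item);
}

class Buffer { public byte[] Data = new byte[4096]; }

class Demo
{
    static readonly Pool<Buffer> BufferPool = new Pool<Buffer>();

    static void ProcessTick()
    {
        Buffer buf = BufferPool.Rent();   // borrow instead of new
        try
        {
            // ... fill and use buf.Data ...
        }
        finally
        {
            BufferPool.Return(buf);       // give it back for reuse
        }
    }
}
```

After warm-up, each `ProcessTick` call allocates nothing; the same `Buffer` instances cycle through the pool.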

Another no-brainer involves hunting down any ArrayList, HashTable, or similar non-generic collections in your code that box/unbox value types, leading to totally unnecessary heap allocation. Replace these with List<T>, Dictionary<TKey, TValue>, and so on wherever possible (here I am specifically referring to collections of value types such as int, double, long, etc.). Likewise, look out for any methods you may be calling which box value type arguments (or return boxed value types).
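
For illustration, here is the boxing difference in a self-contained sketch:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class BoxingDemo
{
    static void Main()
    {
        // Non-generic: each int is boxed into a fresh heap object.
        ArrayList boxed = new ArrayList();
        for (int i = 0; i < 1000; i++)
            boxed.Add(i);              // boxes i -> heap allocation
        int first = (int)boxed[0];     // unboxes on the way out

        // Generic: ints are stored inline in the backing array, no boxing.
        // Pre-sizing also avoids garbage from internal array regrowth.
        List<int> unboxed = new List<int>(1000);
        for (int i = 0; i < 1000; i++)
            unboxed.Add(i);

        Console.WriteLine(first == unboxed[0]); // prints "True"
    }
}
```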

These are just a couple of relatively small steps you can take to reduce your garbage collection count, but they can make a big difference. With enough effort it can even be possible to completely (or at least nearly) eliminate all generation 2 garbage collections during the continuous-operation phase (everything except startup and shutdown) of your application. And I think you'll find that generation 2 collections are the real heavy hitters.

Here's a paper outlining one company's efforts to minimize latency in a .NET application through resource pooling, in addition to a couple of other methods, with great success:

Rapid Addition leverages Microsoft .NET 3.5 Framework to build ultra-low latency FIX and FAST processing

So to reiterate: I would strongly recommend investigating ways to modify your code so as to cut down on garbage collection over converting to an entirely different language.


2 Comments

It sounds like resource pooling, profiling, and possibly upgrading to .NET 4.0 for the background GC is a good way to go instead of C++. Thanks for your help! The white paper you linked to is great. Do you know of anything that mentions more specifics (along the lines of the one function call they mention in the paper that does boxing and unboxing behind the scenes)? That is, is there a list of best practices for avoiding GCs, besides resource pooling? Perhaps a complete list of functions not to call?
@Michael: I wish there were. Unfortunately Rapid Addition has not made their standard "do not call" list public, nor would I expect Microsoft to publicize such a list (especially since one would expect it to change over time as implementations of the listed methods become modified). Your best bet, as others have said, is to profile your code and find for yourself those places where unexpected levels of memory allocation may be occurring. In case you aren't already, I also strongly recommend using perfmon to monitor your GC count while debugging your app. It can be quite illuminating.
4

First, have you tried profiling things to see if you could optimize your memory usage? A good place to start is with the CLR profiler (works with all CLRs up to 3.5).

Rewriting everything in C++ is an incredibly drastic change just for the sake of a small performance hit -- this is like fixing a paper cut by amputating your hand.

1 Comment

Thank you for the reply. C# was probably not the right choice here, but we had to have this up and running by day X. And it was decided that there was no way we could write it in C++ quickly enough. But now that that deadline is met and we're running, we have a lot more breathing room to make changes. So re-writing it project by project in C++ is one option that we're considering. We see some 100ms delays that are definitely caused by GCs. And we're considering this re-write only if we can't eliminate these through profiling, pre-allocating, etc.
4

Are you certain that those 100ms delays are due to the GC? I would make VERY sure that the GC really is your problem before you spend a lot of time, effort, and money rewriting the thing in C++. Combining managed code with unmanaged code also presents its own problems, as you have to deal with marshalling between those two contexts. That will add its own performance drain, and your net gain could quite likely end up being zero.

I would profile your C# application and narrow down exactly where your 100ms delays are coming from. This tool might be helpful:

How To: Use CLR Profiler

A word on the GC

Another word about the .NET GC (or really any GC, for that matter). This is not said nearly often enough, but it is a critical factor in successfully writing code with a GC:

Having a Garbage Collector does not mean you don't have to think about memory management!

Writing optimal code that plays nicely with the GC requires less effort and hassle than writing C++ code that behaves well with an unmanaged heap... but you still have to understand the GC and write code that cooperates with it. You can't completely ignore memory management; you have to worry about it less, but you still have to think about it. Writing GC-friendly code is a critically important factor in achieving performant code that does not CREATE memory management problems.

The following article should also be helpful, as it outlines the fundamental behavior of the .NET GC (valid through .NET 3.5; it's quite likely that this article is no longer completely accurate for .NET 4.0, as there have been some critical changes to its GC... for one, it no longer has to block .NET threads while a collection occurs):

Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework

6 Comments

Thank you for your reply. Yes, unfortunately, every time there's a delay, the GC count confirms that there was a collection. We've never had a delay without a collection also occurring. I hadn't heard about that feature in .NET 4.0 (not blocking .NET threads while a GC occurs). If that's true, that could really save us. Do you have a reference on it? Thanks!
Here's one: geekswithblogs.net/sdorman/archive/2008/11/07/…. This is great, this might really help us out. I haven't tried the .NET 4.0 beta yet. Has anyone else tried to compare the GC performance in real world tests?
I believe .NET 4.0 is out of beta... isn't the VS2010 launch event happening in the next couple of weeks? I've been using the new 2010/4.0 stuff for months on a trial/experimental basis, and it's pretty rock solid. I haven't done anything low-level with the new GC, but I haven't had any issues with it either.
Just out of curiosity... are the collections that are pausing your application gen2 collections? Or are they gen0/1 collections? I would be very surprised if gen0/1 collections were pausing your application; however, I would not be so surprised if a gen2 collection did. If you are getting a lot of gen2 collections, that might be a problem that could be resolved with some optimization.
It's mostly the Gen2 collections. Thanks for your help, it sounds like the .NET 4 gc along with profiling and optimizations is the way to go rather than C++.
3

The CLR GC does not suspend threads running unmanaged code during a collection. If the native code calls into managed code, or returns to managed code then it may be affected by a collection (like any other managed code).

3 Comments

Thanks for your reply. I suspected that that was the case, but I've found some references that say conflicting things. Do you have a reference to confirm that?
@Michael: The references you've found might be related to unmanaged COM interop stuff, which won't block GC, IIRC. But regular unmanaged calls will.
@Michael msdn.microsoft.com/en-us/magazine/bb985011.aspx If you think about it, there actually isn't a safe way to suspend a thread running native code to do a GC anyway. The CLR has no way of knowing whether the native code is holding a resource required to perform the GC, resulting in deadlock (this could happen, for instance, when you start involving the hosting APIs).
1

If 100 ms is an issue, I assume your code is mission critical. Mixing managed and unmanaged code incurs interop overhead for every call between the managed AppDomain and unmanaged space.

The GC is very well optimized, so before doing that, try to profile your code and refactor it. If you are concerned about the GC, try playing with thread priorities, minimize object creation, and cache data whenever possible. Turn on the "Optimize code" setting in your project properties, too.
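
To illustrate "minimize object creation and cache the data" (a hypothetical handler, not the asker's code): reuse one scratch buffer across calls rather than allocating a fresh array on every 5 ms tick.

```csharp
using System;

class TickHandler
{
    // Allocated once, reused on every call; the GC never sees
    // per-tick garbage from this buffer.
    private readonly double[] _scratch = new double[512];

    public double Process(int count)
    {
        // Without caching this would be: var scratch = new double[512];
        // i.e. fresh heap garbage on every 5 ms tick.
        double sum = 0;
        for (int i = 0; i < count && i < _scratch.Length; i++)
        {
            _scratch[i] = i * 0.5;
            sum += _scratch[i];
        }
        return sum;
    }
}
```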

4 Comments

Thanks for your reply. Do you have any data on how much the interop overhead actually is? And which method of doing it is best?
Generally the overhead is negligible if you are just calling a method; see my answer here stackoverflow.com/questions/2309383/… Overhead becomes significant when you start marshaling data between managed and unmanaged types, such as strings and arrays. See this Microsoft link for more info: msdn.microsoft.com/en-us/library/ms998551.aspx
What about calling code in a .dll written in C++ using the DllImport with the calling convention set to Cdecl? If it only returns values and structs and IntPtrs?
Returning structs may have more overhead than IntPtr or plain values. Just to share: I read in a recent post that a basic interop call (without marshalling overhead considered) takes about 10-30 instructions, which is nothing on a modern CPU that processes millions of instructions per second.
1

One thought was to re-write it in C++, project by project. But if you combine C# with unmanaged C++, will the threads in the C++ projects also be frozen by garbage collections?

Not if the C++ code is running on different threads. The C++ heap and the managed heap are different things.

On the other hand, if your C++ code does a lot of new/delete, you will still see allocation stalls in the C++ code as the heap becomes fragmented. These stalls are likely to be much worse than what you see in C# code precisely because there is no GC: when the heap needs to be cleaned up, it just happens inside the call to new or delete.

If you really have a tight performance requirement, then you need to plan on not doing any memory allocation from the general heap inside your time critical code. In practice that means this will be more like C code than C++ code, or using special memory pools and placement new.

3 Comments

Thanks for your reply. Do you have any references that confirm that the C++ threads won't, in fact, be frozen by GCs? I've found some conflicting references, and I'm just trying to confirm one way or the other.
@Michael: Sorry, no references. Just general knowledge; I've done realtime coding in C++ for the last decade. The Windows C/C++ heap doesn't move active objects or delay freeing of objects — thus no GC. But that doesn't mean the cost of a new/delete call can't vary enormously from call to call; the only way to avoid the uncertainty is to get all of your memory allocation taken care of before you go into your time-critical code.
Thanks for your help. You're right that moving to C++ won't fix things automatically since the new/delete time can still vary a lot. Best to optimize the C# and move all the new calls out of time critical areas first before considering a full re-write.
1

.NET 4.0 has what's called Background Garbage Collection, which is different from Concurrent Garbage Collection (which may be what is causing your issue). Jason Olson talks about it with Carl Franklin and Richard Campbell on .NET Rocks Episode #517. You can view the transcript here. It's on page 5.

I'm not completely sure if just upgrading to the 4.0 Framework will solve your problem, but I imagine it would be well worth your time looking into it before rewriting everything in C++.

1 Comment

This is great, thank you. It looks like .NET 4.0 and some profiling is probably the way to go instead of a re-write.
