2

What is overhead in term of parallel and concurrent programming (Haskell)?

However, even in a purely functional language, automatic parallelization is thwarted by an age-old problem: To make the program faster, we have to gain more from parallelism than we lose due to the overhead of adding it, and compile-time analysis cannot make good judgments in this area. An alternative approach is to use runtime profiling to find good candidates for parallelization and to feed this information back into the compiler. Even this, however, has not been terribly successful in practice.

(quoted from Simon Marlow's book Parallel and Concurrent Programming in Haskell)

What are some examples in Haskell?

2
  • The book actually elaborates on that quite a bit, have read all of it yet? Commented Jul 12, 2017 at 7:05
  • Not yet, I'll keep reading. Commented Jul 12, 2017 at 7:16

2 Answers 2

4

In any system, a thread takes resources. You have to store the state of that thread somewhere. It takes time to create the thread and set it running. Now GHC uses lightweight "green threads", which are much less expensive than OS threads. But they still cost something.

If you were to (for example) spawn a new thread for every single add, subtract, multiply and divide... well, the work to spawn a new thread has to be at least several dozen machine instructions, whereas a trivial arithmetic operation is probably a single instruction. Queuing the work as sparks takes even less work than spawning a whole new thread, but even that isn't as cheap as just doing the operation on the current thread.

Basically the cost of the work you want to do in parallel has to exceed the cost of arranging to do it in parallel. (Whether that's launching an OS thread or a green thread or queuing a spark or whatever.) GHC has all sorts of stuff to lower the cost, but it's still not free.

Sign up to request clarification or add additional context in comments.

Comments

1

You have to understand that threads are resources. They do not come for free. In other words: when you create a thread (independent of the language) then you have to make system calls, the OS has to create a thread instance, and so on. Threads have state - which changes over time; so some kind of thread management happens in the background.

And of course, when you end up with more threads than the underlying hardware can support - then the system will have to switch threads from time to time. Of course, that is not as expensive as switching full blown processes, but it still means that registers need to be saved (or restored), your hardware caches might be affected, and so on.

3 Comments

GHC's threads are much cheaper than OS threads, but they still use some memory (1KB?) and there's a little scheduling overhead.
So on the downside that raises the question if those threads can make us of the underlying hardware then.
@GhostCat GHC multiplexes multiple lightweight green threads over a smaller number of real OS threads. (Number of OS threads defaults to the number of CPU cores.)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.