205

Quick question: what is the compiler flag to allow g++ to spawn multiple instances of itself in order to compile large projects quicker (for example 4 source files at a time for a multi-core CPU)?

7
  • Will it really help? All my compile jobs are I/O bound rather than CPU bound. Commented Jan 6, 2009 at 13:28
  • 5
    Even if they are I/O bound you can probably keep the I/O load higher when the CPU heavy bits are happening (with just one g++ instance there will be lulls) and possibly gain I/O efficiencies if the scheduler has more choice about what to read from disk next. My experience has been that judicious use of make -j almost always results in some improvement. Commented Aug 22, 2011 at 9:35
  • 1
    @BrianKnoblauch But on my machine (real or in VirtualBox), it's CPU bound; I found via the 'top' command that the CPU is busy while compiling. Commented Jul 19, 2013 at 9:07
  • 1
    Even if they are I/O bound, we can use gcc's flag '-pipe' to reduce pain. Commented Jul 19, 2013 at 9:09
  • just saw this in google: gcc.gnu.org/onlinedocs/libstdc++/manual/… Commented Jun 25, 2014 at 0:30

10 Answers

271

You can do this with make; with GNU make it is the -j flag (this will also help on a uniprocessor machine).

For example if you want 4 parallel jobs from make:

make -j 4

You can also run gcc in a pipe with

gcc -pipe

This will pipeline the compile stages, which will also help keep the cores busy.

If you have additional machines available too, you might check out distcc, which will farm compiles out to those as well.


11 Comments

Your -j number should be 1.5x the number of cores you have.
Thanks. I kept trying to pass "-j#" to gcc via CFLAGS/CPPFLAGS/CXXFLAGS. I had completely forgotten that "-j#" was a parameter for GNU make (and not for GCC).
Why does the -j option for GNU Make need to be 1.5x the number of CPU cores?
The 1.5 number is because of the noted I/O bound problem. It is a rule of thumb. About 1/3 of the jobs will be waiting for I/O, so the remaining jobs will be using the available cores. A number greater than the cores is better and you could even go as high as 2x. See also: Gnu make -j arguments
@JimMichaels It could be because dependencies are badly set within your project, (a target starts building even if its dependencies are not ready yet) so that only a sequential build ends up being successful.
47

There is no such flag, and having one runs against the Unix philosophy of having each tool perform just one function and perform it well. Spawning compiler processes is conceptually the job of the build system. What you are probably looking for is the -j (jobs) flag to GNU make, a la

make -j4

Or you can use pmake or similar parallel make systems.

3 Comments

"Unix pedantry is not helpful" Good thing it wasn't pedantry then, anonymous editor. Rolled back. Reviewers please pay more attention to what you're doing.
despite the claim of non-pedantry, gcc is getting a flag -fparallel-jobs=N Better tell the GCC devs they're doing it wrong.
14

If using make, invoke it with -j. From man make:

  -j [jobs], --jobs[=jobs]
       Specifies the number of jobs (commands) to run simultaneously.  
       If there is more than one -j option, the last one is effective.
       If the -j option is given without an argument, make will not limit the
       number of jobs that can run simultaneously.

And most notably, if you want to script or detect the number of cores available (which can vary a lot if you run in many environments), you can use the ubiquitous Python function cpu_count():

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.cpu_count

Like this:

make -j $(python3 -c 'import multiprocessing as mp; print(int(mp.cpu_count() * 1.5))')

If you're asking why 1.5 I'll quote user artless-noise in a comment above:

The 1.5 number is because of the noted I/O bound problem. It is a rule of thumb. About 1/3 of the jobs will be waiting for I/O, so the remaining jobs will be using the available cores. A number greater than the cores is better and you could even go as high as 2x.

3 Comments

Most Linux users will likely prefer the shorter: make -j`nproc` with nproc in GNU Coreutils.
If you're using an SSD, I/O isn't going to be as much of an issue. Just to build on Ciro's comment above, you can do this: make -j $(( $(nproc) + 1 )) (make sure you put spaces where I have them).
Nice suggestion using python, on systems where nproc isn't available, e.g. in manylinux1 containers, it saves additional time by avoiding running yum update/yum install.
12

make will do this for you. Investigate the -j and -l switches in the man page. I don't think g++ is parallelizable.

1 Comment

+1 for mentioning -l option ( does not start a new job unless all previous jobs did terminate ). Otherwise it seems that the linker job begins with not all object files built (as some compilations are still ongoing), so that the linker job fails.
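A minimal sketch of the two switches together, against a throwaway one-target Makefile (the Makefile here is purely illustrative): -j caps the job count, while -l additionally holds off starting new jobs while the system load average exceeds the given value.

```shell
# Toy Makefile so -j/-l have something to drive.
dir=$(mktemp -d); cd "$dir"
printf 'all:\n\t@echo built\n' > Makefile
# Run up to nproc jobs, but start no new job while load average exceeds nproc.
make -j "$(nproc)" -l "$(nproc)"    # prints "built"
```

Note that -l does not serialize the link step by itself; the reliable fix for the linker problem described above is to declare the object files as prerequisites of the link target, so make orders them correctly even under -j.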
12

People have mentioned make but bjam also supports a similar concept. Using bjam -jx instructs bjam to build up to x concurrent commands.

We use the same build scripts on Windows and Linux and using this option halves our build times on both platforms. Nice.

Comments

7

distcc can also be used to distribute compiles not only on the current machine, but also on other machines in a farm that have distcc installed.
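A typical distcc invocation looks like the following sketch; the host names are hypothetical, and this assumes distcc is installed on both the local machine and the listed hosts. Since remote machines absorb part of the work, the -j value can exceed the local core count.

```shell
# Hypothetical host list; replace with your actual build machines.
export DISTCC_HOSTS="localhost host1 host2"
# Route compiles through distcc by overriding the compiler variables.
make -j12 CC="distcc gcc" CXX="distcc g++"
```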

2 Comments

+1, distcc is a useful tool to have in one's arsenal for large builds.
Looks like there are a few that work "like" distcc as well: stackoverflow.com/questions/5374106/distributed-make/…
6

You can use make -j$(nproc). This builds the project with the make build system, with multiple jobs running in parallel.

For example, if your system has 4 CPU cores, running make -j$(nproc) would instruct make to run 4 jobs concurrently, one on each CPU core, speeding up the build process.

You can also see how many cores you have by running: echo $(nproc)

1 Comment

Note that nproc doesn't always yield optimal performance. My machine has four cores but compilation runs faster with -j2.
5

I'm not sure about g++, but if you're using GNU Make then "make -j N" (where N is the number of jobs make can create) will allow make to run multiple g++ jobs at the same time (so long as the files do not depend on each other).

2 Comments

No, N is not the number of threads! Many people misunderstand that: -j N tells make how many processes to spawn at once, not threads. That's why it is not as performant as MS cl -MT (really multithreaded).
what happens if N is too large? E.g. can -j 100 break the system or is N merely an upper bound that is not required to achieve?
3

GNU parallel

I was making a synthetic compilation benchmark and couldn't be bothered to write a Makefile, so I used:

sudo apt-get install parallel
ls | grep -E '\.c$' | parallel -t --will-cite "gcc -c -o '{.}.o' '{}'"

Explanation:

  • {.} takes the input argument and removes its extension
  • -t prints out the commands being run to give us an idea of progress
  • --will-cite removes the request to cite the software if you publish results using it...

parallel is so convenient that I could even do a timestamp check myself:

ls | grep -E '\.c$' | parallel -t --will-cite "\
  if ! [ -f '{.}.o' ] || [ '{}' -nt '{.}.o' ]; then
    gcc -c -o '{.}.o' '{}'
  fi
"

xargs -P can also run jobs in parallel, but it is a bit less convenient to do the extension manipulation or run multiple commands with it: Running multiple commands with xargs

Parallel linking was asked at: Can gcc use multiple cores when linking?

Fun note: parsing of context-free grammars can be reduced to boolean matrix multiplication, e.g. https://www.ps.uni-saarland.de/courses/seminar-ws06/papers/07_franziska_ebert.pdf so maybe it would also be theoretically possible to speed up single-file parsing for large files. Likely not of much practical use, but a fun fact.

Tested in Ubuntu 18.10.

2 Comments

compilation can be reduced to matrix multiplication - sounds implausible, unless the matrices are absolutely huge so the total amount of work would be much larger than the way compilers actually do things.
@PeterCordes compilation was not the right word, more precisely, parsing of CFG grammars: ps.uni-saarland.de/courses/seminar-ws06/papers/… Likely useless in practice for real languages, but fun fact.
1

There has been work to make gcc use multiple cores, see this link: https://gcc.gnu.org/wiki/ParallelGcc, but it seems to be in internal stages, not yet usable, and it also seems not to have moved in the last 5 years. I'd not say that's too worrisome, since compiler coding is not trivial and there are not many people who can do it.

Looking through the readme you'll see that many things have already been investigated, but it would still need a few makeovers to really unlock the performance benefits (e.g. see what they write about locks that could still be removed).

But at least it has been tried, so your flag may exist at some point in the future ;-)

Comments
