distcc

If you’ve never heard of it, you should really check out distcc. It’s a (fairly painless!) system that distributes compilation across multiple machines. Basically, it acts as a wrapper around gcc/g++/etc. and farms the files to be compiled out to designated servers on the network; each one compiles its share, then sends the result back to the main host. The nice thing is that it can also run several compiler instances at once on the same machine. This alone can give you a little boost, and probably a bigger one on a multicore machine, since the Linux scheduler spreads running processes across the available cores.

It also works really well with makefiles. Just pass the “-j x” parameter to make, where x is the number of concurrent compilations you want. They recommend something like total_cpus * 2 + 1, I think. If you’re just compiling a single file, you can call distcc directly instead of gcc.
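Here’s a rough sketch of how that can look on the command line. The function names are made up for illustration, the CPU count is something you supply yourself, and it assumes your makefile honours the CC and CXX variables, which not every project does.

```shell
# Rule of thumb quoted above: jobs = total CPUs * 2 + 1.
# $1 is the total CPU count across all build hosts (you supply it).
distcc_jobs() {
    echo $(( $1 * 2 + 1 ))
}

# Run a distributed build in the current directory.
# Assumption: the makefile uses $(CC) and $(CXX) to invoke the compiler.
distcc_make() {
    make -j "$(distcc_jobs "$1")" CC="distcc gcc" CXX="distcc g++"
}

# Compile a single C file through distcc instead of calling gcc directly.
# $1 is the source file, e.g. hello.c (a placeholder name).
distcc_one() {
    distcc gcc -c "$1" -o "${1%.c}.o"
}

distcc_jobs 2    # prints 5
```

With these defined, “distcc_make 2” would run make -j 5 with distcc placed in front of gcc and g++.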

The setup differs per distro, and it isn’t hard at all to build it from source. In Ubuntu, just install it via apt-get by issuing “sudo apt-get install distcc”. Then, edit your configuration file (/etc/default/distcc). Here are the contents of mine:

# Defaults for distcc initscript
# sourced by /etc/init.d/distcc

#
# should distcc be started on boot?
#
# STARTDISTCC="true"

STARTDISTCC="true"

#
# Which networks/hosts should be allowed to connect to the daemon?
# You can list multiple hosts/networks separated by spaces.
# Networks have to be in CIDR notation, f.e. 192.168.1.0/24
# Hosts are represented by a single IP Adress
#
# ALLOWEDNETS="127.0.0.1"

ALLOWEDNETS="127.0.0.1 192.168.0.1/24"

#
# Which interface should distccd listen on?
# You can specify a single interface, identified by it's IP address, here.
#
# LISTENER="127.0.0.1"

LISTENER=""

#
# You can specify a (positive) nice level for the distcc process here
#
# NICE="10"

NICE="10"

#
# Enable Zeroconf support?
# If enabled, distccd will register via mDNS/DNS-SD.
# It can then automatically be found by zeroconf enabled distcc clients
# without the need of a manually configured host list.
#
# ZEROCONF="true"

ZEROCONF="true"

Then, just edit /etc/distcc/hosts to contain the list of all the hosts you want to compile on. Of course, distcc will have to be set up on each of them, and your IP will have to be in their “allow” lists (ALLOWEDNETS in the file above). List all the hosts on the same line, separated by spaces. If you need a per-user host list, you can create ~/.distcc/hosts and format it the same way. That file is checked first and used if it exists.
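As a small sketch of the hosts file format, the helper below writes all hosts on one line separated by spaces. The function name and the addresses in the usage comment are placeholders, not anything distcc ships with.

```shell
# Write a distcc hosts file: all hosts on one line, separated by spaces.
# $1 is the directory to write into; the remaining arguments are the hosts.
write_distcc_hosts() {
    dir=$1
    shift
    mkdir -p "$dir"
    echo "$*" > "$dir/hosts"
}

# Example with placeholder addresses (replace with your own build machines):
# write_distcc_hosts "$HOME/.distcc" localhost 192.168.0.10 192.168.0.11
```

Running the commented example would leave ~/.distcc/hosts containing the single line “localhost 192.168.0.10 192.168.0.11”.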

I did a test with it building squid from source, and here are my results (as reported by the time command):

No distcc
--------------------
real    7m44.374s
user    6m27.300s
sys     0m49.175s

With distcc (make -j 5)
--------------------
real    6m23.053s
user    4m29.625s
sys     0m41.711s

What really matters is user+sys. This is the amount of time the local CPU spent on the build. The real value is how much actual time passed, including time in which the CPU was working on something else. Without distcc, user+sys comes to 436.475 seconds; with distcc, it comes to 311.336 seconds. So, we basically saved (1 - 311.336 / 436.475) * 100 = 28.67%.
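The arithmetic can be rechecked from the time output above with a couple of small helpers. This is just a sketch using awk; the function names are made up for illustration. Summing user and sys for the run without distcc gives 436.475 seconds, and the wall-clock comparison is included only for reference.

```shell
# Add up `time`-style values such as 6m27.300s and print the total in seconds.
sum_seconds() {
    printf '%s\n' "$@" | awk -F'[ms]' '{ total += $1 * 60 + $2 } END { printf "%.3f\n", total }'
}

# Print the percentage saved when going from $1 seconds to $2 seconds.
percent_saved() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", (1 - after / before) * 100 }'
}

cpu_before=$(sum_seconds 6m27.300s 0m49.175s)   # user + sys without distcc
cpu_after=$(sum_seconds 4m29.625s 0m41.711s)    # user + sys with distcc
percent_saved "$cpu_before" "$cpu_after"        # prints 28.67

# Same comparison on wall-clock (real) time:
percent_saved "$(sum_seconds 7m44.374s)" "$(sum_seconds 6m23.053s)"   # prints 17.51
```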

That’s quite a bit! What’s also nice is that you can get a cool graphical monitor. Just do “sudo apt-get install distccmon-gnome”. The command’s name is also distccmon-gnome. Here’s a screenshot:

[Screenshot: distccmon-gnome]
