Make your console fly with Parallel Processing
I reckon this is kind of a niche thing, but interesting nonetheless.
Our current project at ustwo™ is a PureMVC-multicore application. Sounds really fancy, but it’s only a SWF loading SWFs.
To compile those SWFs we have some scripts using Ant, MTASC, SWFMill, Rhino and other tools. Updating from SVN and compiling 18 modules was taking around 90-100 seconds (that’s updating and compiling the whole project; you can compile single modules much faster).
Keep in mind that our compilation process is a little bit more than calling MTASC. We are running some pre-processing, generating exclude files and some other trickery to gain extra performance (we target mobile devices).
Anyway. Got a hint from one of our developers, so I did some research to find a way to speed up compilation. Since most of us now have dual-core machines we should be able to parallelize some of the work, right? Indeed we can.
Finding PPSS was quite easy, but understanding how it works was a little more complicated. I’m not going to bore you to death with the nitty-gritty, so this is the flow that suited us best:
* In bash, parse the list of module folders and create a txt file where each line calls a script with the path of one module as its parameter. One line per module:
    /path/to/script.sh path/to/module0
    /path/to/script.sh path/to/module1
    /path/to/script.sh path/to/module2
    …
* Feed that txt file to PPSS like this:
    ppss -f moduleList.txt -c 'bash $ITEM'
That’s where the magic happens. PPSS runs the calls to the compilation script in parallel across both cores. When you run it you can see both of them go all the way up to 80-90% usage, which is kind of the point.
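For the curious, the first step of that flow could look something like the sketch below. It is not our actual script: `build_module_list`, its parameter names and the throwaway demo folders are made up for illustration, and it assumes every module lives in its own sub-folder of a single modules directory.

```shell
#!/usr/bin/env bash
# Sketch: build the PPSS input file, one "script + module path" line per module.
# build_module_list and its parameter names are hypothetical.

build_module_list() {
    local modules_dir=$1   # folder holding one sub-folder per module (assumption)
    local build_script=$2  # per-module compilation script
    local list_file=$3     # output file to feed to PPSS

    : > "$list_file"       # start from an empty file
    local module
    for module in "$modules_dir"/*/; do
        [ -d "$module" ] || continue   # skip if the glob matched nothing
        # Strip the trailing slash and write "script module" on one line
        printf '%s %s\n' "$build_script" "${module%/}" >> "$list_file"
    done
}

# Demo on a throwaway directory tree
demo_root=$(mktemp -d)
mkdir -p "$demo_root/modules/module0" "$demo_root/modules/module1" "$demo_root/modules/module2"
build_module_list "$demo_root/modules" /path/to/script.sh "$demo_root/moduleList.txt"
cat "$demo_root/moduleList.txt"
```

The resulting file is what gets handed to `ppss -f`. Since `bash $ITEM` splits each line on whitespace, a sketch like this only works as long as the paths contain no spaces.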
We took some metrics and found a 40% speed improvement, sometimes even more. If you are a compulsive compiler like yours truly, this saves you quite some time.
Going from the serial approach to the parallel one wasn’t straightforward, mostly because I had to split the main script into several scripts, and that caused some issues due to my limited bash skills. This is what I learnt:
* If you execute a script from another, the child doesn’t have access to the variables defined by the parent unless you export them…
* … but arrays don’t get exported.
* Also, you can’t “escalate” exported vars from children to parents. The trick only goes from parent to children, since each child process works on its own copy of the environment.
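The three points above can be checked with a small experiment. This is only a sketch: the variable names are invented, `bash -c` stands in for the child script, and flattening the array into a string is just one possible workaround, not necessarily what our scripts do.

```shell
#!/usr/bin/env bash
# Sketch of the scoping rules, using `bash -c` as a stand-in for a child script.
# All variable names are hypothetical.

plain_var="not exported"
export shared_var="exported"
modules=(module0 module1)
export modules                          # accepted, but arrays never reach the child
export modules_joined="${modules[*]}"   # workaround: flatten the array into a string

# 1. The child only sees what the parent exported
child_plain=$(bash -c 'echo "${plain_var:-unset}"')
child_shared=$(bash -c 'echo "${shared_var:-unset}"')

# 2. The array is missing in the child; the flattened string can be split again
child_array=$(bash -c 'echo "${modules[1]:-unset}"')
child_joined=$(bash -c 'read -r -a m <<< "$modules_joined"; echo "${m[1]}"')

# 3. An export made in the child never travels back to the parent
bash -c 'export from_child="set in child"'
parent_view=${from_child:-unset}

echo "plain in child:  $child_plain"    # unset
echo "shared in child: $child_shared"   # exported
echo "array in child:  $child_array"    # unset
echo "joined in child: $child_joined"   # module1
echo "child to parent: $parent_view"    # unset
```

If a child needs to hand something back, printing it and capturing the output in the parent with `$(...)` is the usual way around the third rule.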
Anyway. Some more command line black magic under my belt, which is great. The console is a very, very powerful tool that can simplify and standardize daily tasks, which is a must when you are on a team of 10 devs. Not that bash is the nicest programming language (actually, it’s pretty ugly), but its ubiquity makes learning it worthwhile.
January 27th, 2010 at 21:56
Hi there,
Nice to hear you’ve found some use in PPSS. Tip:
You could also fill ‘modulelist.txt’ with this:
path/to/module0
And then execute
or
Cheers!