Make your console fly with Parallel Processing

I reckon this is kind of a niche thing, but interesting nonetheless.

Our current project at ustwo™ is a PureMVC-multicore application. Sounds really fancy, but it’s only a SWF loading SWFs.

To compile those SWFs we have some scripts using Ant, MTASC, SWFMill, Rhino and other tools. Updating from SVN and compiling 18 modules was taking around 90-100 seconds (that’s updating and compiling the whole project; single modules compile much faster).

Keep in mind that our compilation process is a little bit more than calling MTASC. We are running some pre-processing, generating exclude files and some other trickery to gain extra performance (we target mobile devices).

Anyway. I got a hint from one of our developers, so I did some research to find a way to speed up compilation time. Since most of us now have dual-core machines we should be able to parallelize some of the work, right? Indeed we can.

Finding PPSS was quite easy, but understanding how it works was a little bit more complicated. I’m not going to bore you to death with the nitty-gritty, so this is the flow that suited us best:

* In bash, walk the list of module folders and write a txt file in which every line calls the compile script with the path of one module as its parameter. One line per module:

  /path/to/script.sh path/to/module0
  /path/to/script.sh path/to/module1
  /path/to/script.sh path/to/module2

* Feed that txt file to PPSS like this:

  ppss -f moduleList.txt -c 'bash $ITEM'

That’s where the magic happens. PPSS parallelizes each call to the compilation script using both CPUs. When you run it you can see they go all the way up to 80-90% usage, which is kind of the point.
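The two steps above can be sketched roughly like this. It is a minimal sketch, not our actual build: the temp folder, the fake compile script and the module names are all made up so the example runs anywhere. The `xargs -P` line is only a stand-in for machines without PPSS (it is not how PPSS works internally, and `-P` is a GNU/BSD extension rather than POSIX); the PPSS call from the post is kept as a comment.

```shell
#!/usr/bin/env bash
# Sketch of the two steps above. All paths and names are made up for the demo.
set -eu

work_dir="$(mktemp -d)"                 # hypothetical project root
list_file="$work_dir/moduleList.txt"
build_script="$work_dir/fake_build.sh"  # stand-in for the real compile script

# Stand-in compile script: it only drops a marker file next to the module folder.
cat > "$build_script" <<'EOF'
#!/usr/bin/env bash
echo "built $1" > "$1.done"
EOF

# Demo module folders.
mkdir -p "$work_dir/modules/module0" "$work_dir/modules/module1" "$work_dir/modules/module2"

# Step 1: one "script module" line per module folder.
: > "$list_file"
for dir in "$work_dir"/modules/*/; do
    printf '%s %s\n' "$build_script" "${dir%/}" >> "$list_file"
done

# Step 2: run the lines two at a time.
# With PPSS this is the call from the post: ppss -f "$list_file" -c 'bash $ITEM'
# The xargs line is only a stand-in: -P 2 caps the number of parallel jobs,
# -L 1 hands one input line (script plus module path) to each bash call.
xargs -P 2 -L 1 bash < "$list_file"

done_files=("$work_dir"/modules/*.done)
echo "modules built: ${#done_files[@]}"
```

Note that both `bash $ITEM` and `xargs -L 1` split each line on whitespace, so this layout assumes paths without spaces.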

We took some metrics and found a 40% speed improvement, sometimes even more. If you are a compulsive compiler like yours truly, this saves you quite some time.

Going from the serial approach to the parallel approach wasn’t straightforward, mostly because I had to split the main script into several scripts and that caused some issues due to my bash programming limitations. This is what I learnt:

* If you execute a script from another, the child doesn’t have access to the variables defined by the parent unless you export them…
* … but arrays don’t get exported.
* Also, you can’t “escalate” exported vars from children to parents. The trick only goes from parent to children: each child gets its own copy of the environment, so whatever it changes never reaches the parent.
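A quick way to see those three points for yourself is the sketch below. The variable names are invented for the demo, and the last part shows one common workaround for arrays (flattening to a string), which is an assumption on my side and only works while the items contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch of the three observations above; all variable names are made up.

DEMO_PLAIN="not exported"
export DEMO_SHARED="exported"
DEMO_MODULES=(module0 module1)
export DEMO_MODULES       # bash accepts this, but the array is still not passed on

# Points 1 and 2: a child process only sees exported, non-array variables.
child_view="$(bash -c 'echo "PLAIN=[${DEMO_PLAIN:-}] SHARED=[${DEMO_SHARED:-}] MODULES=[${DEMO_MODULES[*]:-}]"')"
echo "$child_view"

# Point 3: changes made in the child stay in the child.
bash -c 'export DEMO_SHARED="changed by child"'
echo "parent still sees DEMO_SHARED=$DEMO_SHARED"

# Workaround sketch for arrays: flatten to a string, split it again in the child.
export DEMO_MODULES_LIST="${DEMO_MODULES[*]}"
child_count="$(bash -c 'read -r -a mods <<< "$DEMO_MODULES_LIST"; echo "${#mods[@]}"')"
echo "child rebuilt an array of $child_count items"
```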

Anyway. Some more command line black magic under my belt, which is great. The console is a very, very powerful tool that can simplify and standardize daily tasks, which is a must when you are on a team of 10 devs. Not that it is the nicest programming language (actually, it’s pretty ugly), but its ubiquity makes learning it worthwhile.

One Response to “Make your console fly with Parallel Processing”

  1. Louwrentius Says:

    Hi there,

    Nice to hear you’ve found some use in PPSS. Tip:

    You could also fill 'modulelist.txt' with this:

    path/to/module0

    And then execute

    ppss.sh -f modulelist.txt -c './path/to/script.sh '

    or

    ppss.sh -f modulelist.txt -c './path/to/script.sh "$ITEM"'

    Cheers!
