Sunday, 14 March 2010

GHC, LLVM, A Couple Shootout Entries: Faster for Free

After seeing a post by Alp Mestanogullari, I thought it might be interesting to see how the new LLVM backend to GHC is working out. The instructions in Alp's post are quite thorough, but they assume that you patch LLVM yourself. This is no longer necessary, as the LLVM project has accepted the required patch, which provides a new calling convention for GHC.

Thus, the instructions have to be changed a little:
  1. Get LLVM from SVN, compile/install it.
  2. Get GHC from the darcs repository.
  3. Patch it with both Don's and David's patches (found in the blog post mentioned at the top). These provide the LLVM backend and the GHC command-line options for invoking certain optimizations through LLVM.
  4. Apply the following change:

    hunk ./compiler/llvmGen/Llvm/Types.hs 527

    - show CC_Fastcc = "fastcc"
    + show CC_Fastcc = show (CC_Ncc 10)
  5. Compile GHC. You may need to roll back to a known-good date; the last time I checked, the most recent source didn't compile. I think 2010-02-25 was fine.

I would also refer to Alp's post for other details, as he gives much more time to this process than I do here.

Anyhow, once this was done I tried to run a couple of the shootout programs, and found that the LLVM backend seemed to hold up. Admittedly this is a very limited test, just testing the waters:

  • Mandelbrot -- old: 2.723s, LLVM: 2.641s
  • NBody -- old: 2.093s, LLVM: 2.029s

The only LLVM options used were -std-compile-opts and -O3. Note that these were run with reduced parameters: 4000 for mandelbrot and 5000000 for nbody. The times are averages over 5 runs of each binary. This shows that LLVM gives us a speedup of 3.08% and 3.15% respectively in these two limited cases. Even if the cases are small and few, it's still very encouraging to see that it's possible to get a bit more performance, essentially free of cost! Hopefully the LLVM backend work will continue to improve and pay off in the future!
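
For anyone who wants to redo the arithmetic, here is a minimal sketch of how the speedup figures can be computed from the averaged times. It assumes that "speedup" means (old / new - 1) * 100, which is my reading and not something the numbers above spell out; the names speedupPercent and timings are made up for illustration. Since the times listed are rounded to the millisecond, the result may differ slightly in the last digit from the percentages quoted.

```haskell
module Main where

import Text.Printf (printf)

-- | Percentage speedup of a new runtime relative to an old one.
-- Assumption: speedup is defined as (old / new - 1) * 100.
speedupPercent :: Double -> Double -> Double
speedupPercent old new = (old / new - 1) * 100

-- | Benchmark name, old backend time, LLVM backend time (seconds),
-- copied from the averaged results listed in the post.
timings :: [(String, Double, Double)]
timings =
  [ ("Mandelbrot", 2.723, 2.641)
  , ("NBody",      2.093, 2.029)
  ]

main :: IO ()
main = mapM_ report timings
  where
    report :: (String, Double, Double) -> IO ()
    report (name, old, new) =
      printf "%-10s old: %.3fs  LLVM: %.3fs  speedup: %.2f%%\n"
             name old new (speedupPercent old new)
```

Dividing by the new time treats the LLVM build as the baseline; dividing by the old time instead would give the fraction of runtime saved, which comes out a little smaller.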