Blogged by Ujihisa. Standard methods of programming and thoughts including Clojure, Vim, LLVM, Haskell, Ruby and Mathematics written by a Japanese programmer. github/ujihisa

Monday, October 31, 2011

Optimizer Comparisons -- Skip Nops

In Haskell

The following Haskell code is supposed to just output "hello".

main = print $ times 1000000000000 id "hello"

times 0 _ x = x
times n f x = times (n-1) f (f x)

The times function I defined applies the given function many times.

  • times 3 f x is...
    • times 2 f (f x)
    • times 1 f (f (f x))
    • times 0 f (f (f (f x)))
    • f (f (f x))

id, defined in Prelude, just returns the given argument. You may want to assume the GHC internal optimizer to eliminate any id call. In other words, the Haskell code in the very beginning is supposed to be transformed just to main = print "hello".

Unfortunately, no it doesn't, even with -O2 option.

$ ghc -O2 a.hs -o a
$ ./a
... (long long hours) ..

In C

I just translated the original Haskell code to C.

#include "stdio.h"

char *id(char *x) {
  return x;
char *times(long long n, char *(*f)(char *), char *x) {
  if (n == 0) {
    return x;
  } else {
    return times(n-1, f, f(x));

int main(int argc, char const* argv[]) {
  puts(times(1000000000000LL, id, "hello"));
  return 0;

I tried it with GCC and Clang.

GCC on Mac:

  • i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)

    $ gcc -O2 a.c -o a
    $ ./a
    ... (long long hours) ..

The assembly code for the main function with -O2 -m32 option is the following.

  pushl %ebp
  movl  %esp, %ebp
  pushl %ebx
  subl  $20, %esp
  call  L13
  popl  %ebx
  leal  LC0-"L00000000001$pb"(%ebx), %eax
  movl  %eax, 8(%esp)
  leal  _id-"L00000000001$pb"(%ebx), %eax
  movl  %eax, 4(%esp)
  movl  $-727379968, (%esp)
  call  _times
  movl  %eax, (%esp)
  call  _puts
  xorl  %eax, %eax
  addl  $20, %esp
  popl  %ebx

You see it's calling _times for sure.


$ clang -O2 a.c -o a
$ ./a

This gives you the message immediately!

The below is the generated LLVM IR. You can easily tell that it just calls puts function directly. Note that the @.str const is just "hello" ending with null.

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind uwtable ssp {
  %call1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @.str, i64 0, i64 0)) nounwind
  ret i32 0

GCC on Linux:

  • gcc 4.4

      pushl %ebp
      movl  %esp, %ebp
      andl  $-16, %esp
      subl  $16, %esp
      movl  $.LC0, (%esp)
      call  puts
      xorl  %eax, %eax

It ran like the one with Clang!

When the big number isn't long long

I wrote the first argument of times in C as int at first.

I got warning messages from both GCC and Clang that the value, 1000000000000, was too big for int. In the case, GCC on Linux made the following assembly code.

  pushl %ebp
  movl  %esp, %ebp
  jmp   .L11

It's just an infinite loop! That actually makes sense because any int values never go to non-int value 1000000000000.


  • GHC doesn't eliminate id in optimization
  • GCC eliminates it except for Apple GCC 4.2
  • Clang eliminates id
  • Some optimizations can conflict if the code depends on undefined behaviour


  1. I want to know what the smallest modification for times function can prevents such optimization.

  2. smallest-type modification: change -O2 to -O0

    smallest-range modification: change id() to cause a side-effect

    smallest-type modification2: use apple gcc 4.2