O'Caml vs Haskell - is a strict language faster?

A couple of years ago I had a student doing a project to compare Haskell with Clean for performance. He developed a translator (Hacle) from one language to the other, so the same program could be directly compared for measurement. The initial results were promising - it looked like Clean always produced faster code. But a more careful comparison over more test data showed a mixed picture, some were faster, some slower. Because the languages are so similar (both lazy) this seemed like a good test of one compiler team's skills against another.

So this year, I have a student doing something similar, only with O'Caml and Haskell. Again, there will be a translator from Haskell, and again we are looking to see if Xavier is just a better hacker than SPJ. :-) However, there is more to it this time. To what extent is laziness a time loss, or a time gain? The idea is that there are two possible translations from Haskell to O'Caml - one preserving the structure and semantics of the code, which is then simply evaluated strictly instead of lazily; and the other simulating non-strictness in O'Caml, so the code has the same termination behaviour as well as its semantics. I think everyone would expect O'Caml to win the first challenge, because of its reputation for very fast code. Even so, I suspect there may be cases where laziness is a better strategy, and Haskell might just do well. But how about lazy O'Caml vs lazy Haskell? That's the interesting question for me.

Surely in the second one Haskell will win, because it natively understands laziness, whereas whatever lazy construct is added to O'Caml will probably obscure the intent - and hence add more overhead. Also I'd be very surprised if O'Caml can perform strictness analysis on a programmed encoding of laziness!
My biggest problem with Haskell is that I haven't quite figured out how to write code without space leaks. Anything which uses a non-trivial amount of data and arithmetic or list shuffling seems to give me grief. Can anyone recommend a good reference for how a person should write code to prevent these situations? The usual cure proposed to heal these problems is profiling and "seq" after the program is complete, but wonder if it wouldn't be better to tackle the issue before it becomes a problem. In fact, I currently find it less time consuming to port the misbehaving program to a strict language. In order to avoid space leaks I don't use Haskell in programs where I think I'll have data structures bigger than 1/1000th of my memory and anything involving +,*,/, etc. So from my perspective, a good candidate for your benchmark would be a numerical test operating on medium amounts of data. (P.S. Undoubtedly I'm suffering from some sort of call-by-need brain damage caused by years of exposure to the call-by-value paradigm.)
I think that the fact that a Haskell-to-Clean translator produces programs wich performance similar to originals proves that Haskell-to-Clean translator produces programs wich performance similar to originals.

Take a look at the Programming Language Shootout. The results shows that Haskell is an order of magnitude slower and more memory hungry. If this isn't the case, then why someone hasn't improved Haskell's benchmark programs to work as fast as Clean programs?
I think that Haskell to Clean translator cannot introduce such concepts as unique arrays or strictness annotations?
malcolm wrote ... compare Haskell with Clean... the languages are so similar (both lazy) ...
I share lanfor's confusion about this, in my very limited experience we can stop Clean programs being lazy by allowing the strictness analyzer to do its work, and when that isn't enough by adding explicit strictness annotations.
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?