Wednesday, July 18, 2007

Haskell Considered Wacky

Just to show how randomly my thoughts wander at times, I was thinking of a problem in our code at work, and how some particular stuff really needed to be encapsulated into some functions, and I got to thinking about these fancy functional languages that the bloggers have been talking about these days, and I went and found a tutorial on Haskell and started reading it for a bit.

At first the purity and cleanliness that it described was quite alluring to me. No side effects. No destructive updates. Beautiful mathematical recursive definitions of functions. This stuff is cool. But wait, how do you do some simple IO? I wondered. The tutorial read my mind and replied:

If you are familiar with books on other (imperative) languages, you might be wondering why you haven’t seen many of the standard programs written in tutorials of other languages (like ones that ask the user for his name and then says “Hi” to him by name).

The language is smart, and so is the tutorial! Yes! Let's read on:

The reason for this is simple: Being a pure functional language, it is not entirely clear how one should handle operations like user input.

Okaaay... I can see what you are saying. Once you break out of the nice idealized world that is Functional Programming, you have to deal with the Real World, which might not give you the same deterministic output for the same input every time. But ultimately all programs have to do something useful for us here in the real world, don't they? There must be a solution to this, even in Haskell. Reading on:

The solution to this was found in the depths of category theory, a branch of formal mathematics: monads. We’re not yet ready monads to talk about monads formally, but for now, think of them simply as a convenient way to express operations like input/output.

I nearly laughed out loud when I read this. It was a shocked laughter. Shocked and delighted in some ways. Delighted that the evil mathematical geniuses behind this thing had stayed so true to themselves, yet still shocked at the pure dogmatism. "We can't express input and output in our perfectly functional language unless there is some math behind it. Let us delve deeper into Mathematics to find our answer!" In the end, that's just awesome that something was found, even in the depths of category theory. Amazing. Insane. Maybe I'm just reading too much out of one tutorial author's understanding of things...

Anyway, at this point I was sharply reminded of what I had just posted this morning about the software philosophers not being directly helpful to my firmware endeavors. Haskell was most definitely a far cry from the direct control over the real world that firmware has to have. But there is something to be learned in the theory. And I love learning this stuff, why else would I have picked up the tutorial? Maybe just so I could laugh. No seriously, I think it's good to be exposed to this stuff. I think our code could very much benefit from fewer side effects from our functions. The real world is tricky enough to deal with on its own without non-deterministic code trying to control it. Let's all pretend our firmware only allows Haskell-like "pure" functions a little more and see what happens. Just don't let it get too out of hand ;-)


Anonymous said...

"We can't express input and output in our perfectly functional language unless there is some math behind it. Let us delve deeper into Mathematics to find our answer!"

Well, sort of.

What's going on with I/O (and the modeling of any sort of mutable state in general) in Haskell is the type system. There's a type for effects, and the way those types behave guarantees that the side-effects happen in a safe, predictable way.

The Haskell designers believe in having a type system that's proven not to fall apart in odd cases, and it just turns out to have taken some artifacts from category theory to make that sort of proof.

Tassilo said...

The main problem with IO in purely functional programming languages is, that they guarantee that any expression in a function will only evaluated if it is really needed and at most once. That's part of the lazy evaluation rules.

So if you have a function that has ((2 + x)*21) several times in it, that expression is evaluated only once and all occurences use that result. (Technically the program is parsed into an abstract syntax graph, not a syntax tree.) So you cannot make real assumptions on the evaluation order of expressions unless they depend on each other's return value.

Now let's say you have an IO-function getLine that reads a line from standard input, and a function whose contents is only getLine followed by an additional getLine. According the lazy evaluation rules the second getLine evaluates to the same value as the first. That's the problem with IO in purely functional languages. IO depends heavily on the sequenced evaluation order and side effects -- all things that are strictly against the main thoughts of functional programming.