An apologia for lazy evaluation

Laziness is the defining feature of Haskell as a programming language. Every other quality, such as higher-order functions, the Hindley-Milner type system, or being based on the lambda calculus, is present in other languages. In fact, Haskell was born as a committee-initiated language at the end of the 80's to avoid a proliferation of non-strict languages, so that everyone could put their effort behind the same cart.

If you take some time and look at what HN, Stack Overflow or Reddit have to say, the general sentiment of non-Haskell programmers on laziness can be summarized in a single word: anxiety. It seems that space leaks due to laziness are unavoidable and that Haskell programmers simply choose to live with them. The methods for detecting them come from dynamic analysis of compiled programs. There is no general advice on how to avoid them apart from experience; and even that is not enough, as expert Haskell programmers still trip over them from time to time. What hope do new Haskell programmers have then?

I think this sentiment can be summarized in two general statements that I have seen pop up on the web:

  1. Lazy/non-strict evaluation in functional programming languages will unavoidably lead to space leaks and, consequently, to general performance degradation. You cannot rely on such a system in production. Strict functional languages are free from these concerns and should be preferred.

  2. Laziness by default is a mistake, but we can recover the benefits of laziness with an explicit iterator concept as in Rust or Python. No surprising space leaks that way.

I think these are steel-manned versions of the statements I have seen; it is not my intention to straw-man my way through this apologia. But before discussing these points we need to talk about value systems.

Differences in values separate programming language communities

There was a presentation some years ago at a PL conference that made the following statement:

The real distinction between programming languages and their communities comes from the different orderings of their value systems

I could not find the video on YouTube nor Google (has Google gotten worse?). If someone knows it, please send me a message. (edit: it was the great Bryan Cantrill, as always)

So, for example, what separates the C++ community from other language communities? The extreme emphasis on performance. That is why there is so much insistence on zero-cost abstractions. Lisp communities care more about syntactic extensibility than performance, and so on.

Haskell can be plenty fast if you put in the effort, but those Haskell programs won't look like the normal ones you encounter. Take a look at the Haskell entries in the language shootout benchmarks: they are written with mutable arrays and explicit recursion everywhere. The translation to a C-like version is almost mechanical in most cases.

For the Haskell community, having the utmost single-core performance in everyday code is not a priority: we prefer compositional and correct/reliable code over fast code. With this in mind, we can approach the statements from the previous section.

Isn't the presence of space leaks a matter of correctness and reliability too?

That might seem to be the case from afar, but once you start writing Haskell and experiencing these space leaks, you will notice that:

  1. 90% of the space leaks you write end up adding a tiny, mostly unnoticeable amount of memory usage to your functions. Think of thunks like (1 + 2) that are subjected to demand analysis under optimization.

  2. 1-2% of them are serious enough to require profiling your code with cost centres.

There is a power-law distribution in the amount of memory these space leaks consume, so the grand majority of the space leaks you will write are a performance concern, not a correctness/reliability concern.

It is not a natural law that the distribution of space-leak severity looks like this; it has required countless hours of profiling and optimization on the part of GHC developers. The presence of a demand analyser and of fusion optimizations in GHC pushes the space leaks of category (2) to be even more rare. But it is true that for the remaining space leaks of category (2), learning how to profile and insert cost centres is a necessity, sorry.
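
As a minimal sketch of the benign category (a toy example of mine, not something taken from GHC itself): the classic lazy accumulator builds a chain of (+) thunks that demand analysis typically collapses under optimization, and a bang pattern makes the strictness explicit.

    {-# LANGUAGE BangPatterns #-}

    -- Lazy accumulator: without optimization, `acc` grows as a chain of
    -- (+) thunks like ((0 + 1) + 2) + 3 + ...
    sumTo :: Int -> Int
    sumTo n = go 0 [1 .. n]
      where
        go acc []       = acc
        go acc (x : xs) = go (acc + x) xs

    -- Explicitly strict accumulator: the bang forces `acc` at each step,
    -- which is roughly what GHC's demand analysis does for you under -O.
    sumTo' :: Int -> Int
    sumTo' n = go 0 [1 .. n]
      where
        go !acc []       = acc
        go !acc (x : xs) = go (acc + x) xs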

In summary: space leaks are mostly a performance concern, and for the rare cases where they are not, you need to learn to profile in order to manage them. The Haskell community lives with this, because its main focus is not the performance loss but the compositional benefits.

The compositional benefits

In a strict functional language, you program differently than in a lazy one. In a strict FP language, you end up writing more explicitly recursive functions (hopefully with tail calls) instead of using higher-order functions that express your intent. And if you do use higher-order functions, you need to take into consideration whether they are tail recursive, otherwise you can end up with stack safety issues. In lazy languages, your intent is better expressed through higher-order functions, as you can rely on guarded (co)recursion and laziness to avoid stack issues. This helps with the correctness goal.
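
As a rough sketch of what guarded (co)recursion buys you, here is map', a hand-rolled version of the Prelude's map: the recursive call sits behind a (:) constructor, so no work piles up until the consumer demands the tail.

    -- Guarded (co)recursion: each recursive call is "guarded" by a constructor,
    -- so it is only forced when the consumer demands that part of the list.
    map' :: (a -> b) -> [a] -> [b]
    map' _ []       = []
    map' f (x : xs) = f x : map' f xs

    -- Safe even on an infinite input, because the consumer drives the recursion:
    smallExample :: [Int]
    smallExample = take 3 (map' (+ 1) [1 ..])   -- [2,3,4]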

But more than that, you end up with compositional benefits. In a lazy language, function composition (whether with (.) or with nested function calls) has the nice property that the consumer of the pipeline drives the reduction, instead of the producer as in a strict language. And in a pure language with inductive data types, you end up with a bunch of tiny pipelines. Let's work with the following example to see what this buys us:

let y = (f . g . h . i) x

In a strict language, evaluation would proceed by fully evaluating the output of i x as an inductive data structure, then passing that complete structure all at once to h, which repeats the process until reaching f. Even if f could short-circuit the evaluation in some edge case, all the intermediate values have already been computed by the time it gets the chance.

In a lazy language, evaluation proceeds by demand and data dependencies. If we want the value y, we need to reduce the pipeline, and for that we need f to produce a value. That will probably introduce a demand on the result of g, but not necessarily: f could be satisfied with just part of the inductive data structure.
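
For a concrete instance of such a pipeline (using standard Prelude functions in place of the abstract f, g, h and i), the consumer take 3 drives the reduction, so only a finite prefix of the infinite producer is ever evaluated:

    -- The consumer (take 3) pulls elements through the pipeline on demand;
    -- filter and map each do just enough work to satisfy it.
    y :: [Int]
    y = (take 3 . map (* 2) . filter even) [1 ..]   -- [4, 8, 12]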

This has a bunch of interesting consequences:

  • For library authors, you can provide big product data types and let the consumers pick just the subparts they care about. This is what people mean by "lazy languages have products, strict ones have sums".
  • You can define new control structures by controlling which parts of the structure you choose to demand. Things like the Alternative type class would not be possible without laziness.
  • With inductive data types, the garbage collector has an easier time, as it can collect unneeded or already-consumed parts of the structure as they come and go. Also, subparts that were never generated (because they were still under a thunk) do not add pressure to the nursery of the GC. Less garbage is a win.
  • Some asymptotics are improved in composed code. The well-known example head . sort runs in O(n) time (see the sketch after this list).
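
As a sketch of that last point, here is the classic (inefficient but illustrative) lazy quicksort. Because the sorted list is produced lazily, head . qsort only forces enough work to find the smallest element, roughly O(n) comparisons on average, instead of performing the full sort:

    qsort :: Ord a => [a] -> [a]
    qsort []       = []
    qsort (p : xs) = qsort [x | x <- xs, x < p] ++ (p : qsort [x | x <- xs, x >= p])

    -- `head` demands only the first cell of the result, so only the left
    -- partitions are ever sorted; the right partitions stay as thunks.
    minimumViaSort :: Ord a => [a] -> a
    minimumViaSort = head . qsort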

You could say: so what? Just don't create the unnecessary parts in the first place. But once you collaborate with other programmers, massaging the output of their functions becomes a major concern, and having a system that respects data dependencies for you becomes a godsend.

So, to summarize: laziness supercharges ordinary function composition to the point that it becomes the primary way you organize your systems in Haskell. A weaker form of function composition would mean that other patterns, such as imperative or object-oriented programming, could be seen as alternatives when organizing a system. This, plus being able to use higher-order functions without extra care, is why Haskellers prefer lazy evaluation.

What about explicit iterators like in Rust?

These iterators are explicit state machines with a next() function. They are quite nice. As a programming concept, though, they are one more concept you have to learn. With lazy evaluation, every inductive data structure pulls double duty: as a data structure and as a control structure.

This means that a library author implicitly provides yield points at each constructor, and you, the consumer of the library, can take them apart. With iterators, this process is made explicit through the .iter() or .into_iter() functions.
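
To make the comparison concrete, staying in Haskell, here is a hypothetical Rust-style iterator encoded as an explicit next step, next to a plain lazy list where every (:) already acts as a yield point:

    -- A hand-rolled, Rust-style iterator: an explicit next() state machine.
    newtype Iter a = Iter { next :: Maybe (a, Iter a) }

    fromList :: [a] -> Iter a
    fromList []       = Iter Nothing
    fromList (x : xs) = Iter (Just (x, fromList xs))

    -- With a lazy list, none of that machinery is needed: `countUp` "yields"
    -- at every cons cell, and the consumer decides how many cells to force.
    countUp :: Integer -> [Integer]
    countUp n = n : countUp (n + 1)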

In an imperative language with statements, this is actually pretty great. But in a functional language with just expressions and pure functions, I prefer the merging of concerns that inductive (co)data types give you. It is a whole dimension I simply don't have to care about.

Although there is a nitpick here: in the presence of effects, you also need a streaming library in functional languages. But that is a separate matter.

But what about those space leaks that affect the correctness of a program?

Most of my Haskell code ends up being pipelines, even when it is not written explicitly as a bunch of (.) everywhere. I have found a pattern I call "being a good consumer" that helps avoid the major space leaks of category (2) I have encountered. In a future post I will discuss this concept and link to some PRs that implement the fixes it suggests.
