A beginner with Haskell is bound to be exposed, usually too early, to some fancy code illustrating either a cool feature of the language, or of its underlying theoretical foundations.
Among others, there is the one-liner lazy Fibonacci definition fibs = 1:1:zipWith (+) fibs (tail fibs)
. Another one, the topic of this post, is foldl f a bs = foldr (\b g x -> g (f x b)) id bs a
.
Unfortunately for the beginner, the explanation he or she could find on, say, the Haskell Wiki, shares the problem of many intermediate Haskell tutorials: they explain moderately complex concepts in terms of more arcane ones.
So here I will attempt to provide a justification of the equation above using fairly simple tools (higher order functions being the most difficult ones).
Building intuition
Before even getting to a definition of foldl
in terms of foldr
, there are a few questions we should ask:
- is it even possible?
- how are these two functions different
- what should change to make them similar
So first, let’s look at each function. The definitions below are very similar to the ones used in GHC (as you can see here when looking at the source of foldl
and foldr
).
First foldl
:
1 2 3 |
|
foldl
is tail recursive: it uses the a
parameter as an accumulator.
Then foldr
:
1 2 3 |
|
foldr
is body recursive. This is why it is lazy and able to work on infinite list. If the function f
is a constructor that is lazy in it’s second argument, then it is possible to examine some of the output before evaluating the recursive call.
So far, the two functions appear different. Let’s try and look at how they process their arguments.
I define a pair of data types to play the role of the function f
. That way, I can look at the structure of calls. I need two because foldl
and foldr
have different types for f
1 2 3 4 5 |
|
Fl
is a data type whose constructor :<
can be used in foldl
. Fr
has constructor :>
and can be used in foldr
.
Using them:
1 2 3 4 |
|
It seems that, to translate from foldl
to foldr
, there are two problems:
- the function
f
infoldr
uses its parameters in reverse order from thef
infoldl
. Perhaps something thatflip
could help with? - the list is iterated over in the opposite order. A task for
reverse
?
Let’s see:
1 2 3 4 |
|
So yes, it is indeed possible to write foldl
in terms of foldr
: foldl f a bs = foldr (flip f) a $ reverse bs
. This answer the first question, and along the way we have collected some hints on how to do it (I mean, beside using flip
and reverse
as above)
Planning the solution
Let’s look again at foldl
:
1 2 |
|
This time I pass flip (:>)
as the function argument. This is to make :>
compatible with foldl
. :>
is really just like flip (:<)
, but it gives a new perspective: :>
has type b -> a -> a
, which is the same as saying that (:>) b
has type a -> a
(I apply partially. b :>
is the same as (:>) b
).
In point-free notation, the result above is identical to (:>) 5 . (:>) 4 . (:>) 3 . (:>) 2 . (:>) 1 $ Er
:
1 2 |
|
(I’m always happy when ghci agrees with me).
This is great for two reasons:
- while we cannot say anything about the original
f
function (here represented as:>
), the function composition.
operator is associative, meaning we can put the parenthesis where we (orfoldr
) wants. In other words, if we manipulate functions(:>) b
instead of values, and combine them with composition, we don’t have to care about the fact thatfoldr
andfoldl
nest expressions differently. - The
a
parameter, represented here byEr
, is removed from the iteration. Asfoldr
andfoldl
use this parameter differently, if we can abstract it away, this is another difference that disappear.
So the solution can be built on two concepts:
- use
flip f
instead off
so that we can operate on functionsa -> a
instead of valuesa
- use composition over these functions to combine them, then apply the result to the
a
parameter to get the answer.
Building the solution
First, I introduce a new function, foldl_alt
, that is supposed to implement foldl
. The definition of foldl_alt
is then rewritten until foldr
appears:
1
|
|
First, let’s handle the base case:
1
|
|
This is easily shown true:
1 2 3 |
|
The recursive case is simple as well (by induction):
1 2 |
|
Assuming foldl f a bs == foldl_alt f bs a
, let’s show by induction that foldl f a (b:bs) == foldl_alt f (b:bs) a
:
1 2 3 4 5 6 |
|
So foldl_alt
as currently defined is identical to foldl
. The recursive case can be further rewritten as:
1 2 3 |
|
Here the composition operator is replaced by comp'
, which just takes its argument in reversed order. This is done to show the similarity of this current foldl_alt
with the recursive case of foldr
. Indeed, foldl_alt f
is identical to foldr m id
for some function m
(for Mystery):
1 2 |
|
Now I can use induction to show that foldl_alt f bs == foldr m id bs
implies foldl_alt f (b:bs) == foldr m id (b:bs)
, and compute m
at the same time:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
When using the induction hypothesis, I replaced m
with (\b' g a -> g (f a b'))
. This is because this function is independent of any specific b
: by construction it would be the same at every step of the induction (except the base case, where it is not use, and therefore can be anything we need it to be).
b'
as a parameter to the function is bound to b
as the value in the list. I use different names to make it clear they’re different, but of course inside the function I could use b
as the variable scope is limited to the function.
For the same reason, I replace the a
above by x
, as I need a
to represent the original a
parameter (once again I could use a
for both). This gives:
1
|
|
Ok, that was long. I hope it was clear (or clearer).
Next time I’ll show how to implement foldr
as foldl
(minus the laziness).