# Seven Languages in Seven Weeks Haskell Day 2

Today introduces the functional aspects of Haskell: higher order functions, partial application of functions and lazy evaluation.

Higher order functions should no longer be surprising: many languages have that, even if Haskell other features make them very easy to use.

Partial application is such a feature. A function can be passed some, but not all its arguments, meaning that it is still a function, not a value. It reduces the number of anonymous functions one needs to write when using higher order functions.

Lazy evaluation is something very unique (among the lazy languages, only Haskell is somewhat mainstream). Clojure has lazy lists, which is cool, but lazy evaluation applies to everything. A piece of data can refer to itself in its definition, as long as the part that is needed can be evaluated before the part it depends on.

For instance, a canonical definition of the Fibonacci sequence is

The `fibs` list is the list of Fibonacci numbers. It starts with 1, 1, then the list of itself summed with its own tail… but the 3rd number depends on the first and the second, so its ok. By the time we need to compute the 4th, the 3rd is already known, and so on.

It takes but an instant to compute the 100000th number in the sequence:

Which brings me to a remark on the book: why on earth is `fibNth` defined the way it is? That function exists, and is called `!!`. The code in the book is convoluted, does not need that many parenthesis, and even if you have a problem with `!!`, there is no need to use both `take` and `drop` if you’re going to take the `head` of the result (`take` will make a copy of the list for no good reason).

## Exercises

In general I tried to avoid standard functions that implement a significant portion of the intended behaviour. So I didn’t use `sort` in my sort function, or `read` in parsing, …

### Simple sort

A good sort algorithm is always tricky, but insertion sort is simple enough and easy to express with pattern matching. My implementation has the same signature as the standard `sort` function. It expects is arguments to have the class `Ord`s, which guarantees they can be compared.

Testing it:

### Sort using comparison function

Sort using a specific comparison function is not harder. The standard implementation uses `Data.Ord.Ordering` to replace `>` by the comparison result `GT`. My implementation has the same signature as the standard `sortBy`, but still uses the insertion sort as in `my_sort`.

Testing it (using `compare` on the absolute value):

### Parse string into number

Parsing is not hard; to do it I break the string into a integral part, and the fractional part. Both are then cleaned to remove non digits.

The integral part is parsed left to right (with `foldl`), each time multiplying the already parsed number by 10 before adding the current number.

The fractional part is parsed right to left (with `foldr`), dividing the already parsed number by 10 before adding the current number.

Note the use of `fromIntegral` function. This is used to convert and integral number (`Int`, `Integer`, …) into any type of number. This is necessary to be allowed to divide the results and add the fractional part.

The use of fractional arithmetic makes this function less effective than `read`.

Testing:

### Lazy sequences

Once again, nothing difficult. Haskell notation pretty much reads as a specification of the problem:

Testing:

### Partial application

Notice the use of partial application of operators: if you wrap the operator and its argument in parenthesis (they are needed here), you have a function that takes the missing argument. The missing argument can be the left one as see here.

Testing:

## Challenges

### Greatest Common Denominator

I must have missed something, because that was hardly a challenge. I just implemented the Euclidean algorithm:

Testing:

`my_gcd` agrees with the standard `gcd` function.

### Lazy prime number sequences

This one was a bit trickier, yet an implementation that closely follows the Sieve of Eratosthenes algorithm is fairly short.

I first need a difference function that works on infinite lists: I manage this by taking into account the fact that the lists are always sorted. The `minus` just compares the first item of its arguments, so it can work linearly on both of them. Note that this function is not able to work on finite lists, but in this context there is no need to.

The implementation follows the proposed optimizations: it puts 2 in the prime number list right from the start, and skips other even numbers. It also start filtering at `p*p`, as smaller multiples have been filtered already (being a multiple of smaller prime numbers).

The implementation is very slow, but can compute the first 1000 prime numbers.

This turns out to be the first implementation on the Prime Number generator page on the Haskell wiki. Other implementations are much smarter and faster.

### Breaking string into lines

The exercise description seems to be missing something: a line length. So I have added that to the functions.

Breaking into words is best done with `words`, but I implemented my version. I actually started with a first abstraction, not really necessary here, that splits a sequence based on a predicate (items that return true for the predicate are all removed). Then `my_words` is just calling that function with `isSpace` as the predicate.

To combine words back into lines, I used two small functions: one (`accumUntil`) builds a line one word at a time, and stops when the line is too long. It starts with a word as the first tentative line, to make sure that a line is not empty even if a word is too long to fit.

The other function (`loop`) uses the previous one to build a list of lines until the list of words is empty.

Testing (splitting a long paragraph into lines of at most 72 characters):

I used `unlines` to group the split lines back into a single string separated by newlines and `putStrLn` to print the result.

### Justify text

The general structure of the `justify` functions is the same:

• compute the maximum length of all the lines
• for each line, compute the difference between the line length and the maximum line length
• insert spaces in the right location (depending on the kind of justification)

Each justification is a specific function. First `pad` is a small utility that creates a string of spaces of the required length.

`right` and `left` uses the above strategy to add spaces left and right, respectively. `center` adds half left, and half right. Of course `left` does not do anything visible, it just adds spaces to make each line the same length.

`both` is more complex, as it inserts spaces between words. The strategy is naive (actual algorithms include dynamic programming to balance the amount of space), but effective.

The general idea is to spread the missing space between words. For this I follow these steps:

• split the line into words using the code from the `Split` module;
• compute the number of interval (the count of the words minus 1). As I’m going to put the spaces between words, this interval also count as missing spaces (see next step). I refer to this amount as `iter`;
• divide the number of missing spaces (difference between maximum line length and effective line length plus the interval): this is the amount of space I should add between each word to add up to the right amount, if I could add fractional space
• multiply each item in `[1..inter]` by the fractional space amount as computed above.
• iterate over the list from previous step:
• compute the nearest integer of the current item (note that by construction, the nearest integer of the last item is exactly the amount of missing space);
• the difference between this integer and the amount of space allocated so far (this amount is zero at the start, of course)
• add a padding (using the `pad` function) to a list of spaces, and update the amount of space allocated before the next iteration
• then `zipWith` the list of spaces with the list of words, and recreate the line with `concat`.

The algorithm above is for a single line, but when justifying a whole paragraph, the last line should be left justified. So the `justify_both` applies the `both` justification to all but the last line, and `left` to the last line.

Testing (the full test text is not reproduce here to save space):

### Number lines

I finished with this one, as I reused some functions defined in the module `Justify` above.

This is much simpler than justifying. I need to know the number of digits I would need (which depends on the number of lines). Then I can right justify the line number and add it left of each line.

Testing:

## Wrapping up Day 2

As I knew Haskell already, this was not too taxing. I had fun with the justify challenge, trying to come up with a reasonable way to insert the right amount of space at the right place.

Dealing with types was also mostly painless. I had a couple of errors when trying to compile, but every time the location was well reported and the fix easy to figure out.