In this article I will try to provide what I lacked when I was learning Haskell.
Why should you care about learning Haskell?
You will learn far more than just a new language.
By learning Haskell you will discover many new concepts you have probably never heard of.
This article is not intended to be easy.
It will certainly be a bit hard to follow.
If you can't follow me, you will find far better and longer treatments in "Learn You a Haskell" and "Real World Haskell".
Try to follow me until the end.
Hopefully, you'll be rewarded by having learned a lot of new concepts.
This article contains three parts.
- Introduction: a fast short example to show Haskell can be friendly.
- Basic Haskell: Haskell syntax, and some essential notions.
- Hard Part:
    - Functional style; an example from imperative to functional
    - Types; a standard binary tree example
    - Purity and IO; how the Haskell solution is incredible.
    - Monads; incredible how we can generalize
- Other links.
> Note: Each time you see a separator with a filename ending in `.lhs`, you can click the filename to get the file. If you save the file as `filename.lhs`, you can run it with
Also, while learning Haskell, it _really_ doesn't matter much if you don't understand syntax details. If you come across a `>>=`, `<$>`, `<-` or any other weird symbol, just ignore it and follow the flow of the code.
But just before that, we should verify that static typing really works as expected:
<div class="codehighlight">
<code class="haskell">
f :: Num a => a -> a -> a
f x y = x*x + y*y
main = print (f 3 2.4)
</code>
</div>
It works because `3` is a valid representation both for Fractional numbers like `Float` and for `Integer`. As `2.4` is a Fractional number, `3` is then also interpreted as a Fractional number.
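To make the role of the literal `3` more concrete, here is a small sketch that uses the same `f` at two different types. The names `asInt` and `asDouble` are only illustrative and are not part of the original example.

```haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y

-- The annotation makes both literals Int values.
asInt :: Int
asInt = f 3 2

-- 2.4 needs a Fractional type, so 3 is read as a Double as well.
asDouble :: Double
asDouble = f 3 2.4

main :: IO ()
main = do
  print asInt
  print asDouble
```

The literal is the same in both calls; only the type required by the context changes.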
But if we force our function to work with different types, it will fail:
<div class="codehighlight">
<code class="haskell">
f :: Num a => a -> a -> a
f x y = x*x + y*y
x :: Int
x = 3
y :: Float
y = 2.4
main = print (f x y)
</code>
</div>
The compiler complains.
The two parameters must have the same type.
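One way to make this program compile is to convert explicitly. The sketch below assumes you want the result as a `Float` and uses the standard `fromIntegral` function; the name `result` is only illustrative.

```haskell
f :: Num a => a -> a -> a
f x y = x*x + y*y

x :: Int
x = 3

y :: Float
y = 2.4

-- fromIntegral converts the Int to the numeric type required by the
-- context (here Float), so both arguments of f share one type.
result :: Float
result = f (fromIntegral x) y

main :: IO ()
main = print result
```

The conversion is visible in the code, which is the point: nothing changes type behind your back.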
If you believe this is a bad idea, and that the compiler should convert from one type to another for you, you should really watch this great (and funny) video:
If you are not used to it, you should practice a bit.
I would like to introduce another higher order function: `(.)`.
The `(.)` function corresponds to mathematical function composition.
<div class="codehighlight">
<code class="haskell">
(f . g . h) x ⇔ f (g (h x))
</code>
</div>
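As a quick check of this equivalence, here is a minimal sketch with three made-up functions (`addOne`, `double` and `square` are illustrative names, not taken from the article).

```haskell
addOne :: Int -> Int
addOne n = n + 1

double :: Int -> Int
double n = n * 2

square :: Int -> Int
square n = n * n

-- Composition applies the rightmost function first.
composed :: Int -> Int
composed = addOne . double . square

-- The same computation with explicit nesting.
nested :: Int -> Int
nested n = addOne (double (square n))

main :: IO ()
main = print (map composed [1, 2, 3] == map nested [1, 2, 3])
```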
We can take advantage of this operator to η-reduce our function a bit more, that is, to drop its explicit argument:
<div class="codehighlight">
<code class="haskell">
-- Version 9
import Data.List
evenSum :: Integral a => [a] -> a
evenSum = (foldl' (+) 0) . (filter even)
</code>
</div>
Also, there already exists a `sum` function.
<div class="codehighlight">
<code class="haskell">
-- Version 10
import Data.List
evenSum :: Integral a => [a] -> a
evenSum = sum . (filter even)
</code>
</div>
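If you want to convince yourself that dropping the argument does not change the function, the following sketch puts three variants side by side. The names `evenSumExplicit`, `evenSum9` and `evenSum10` are only labels chosen for this comparison.

```haskell
import Data.List (foldl')

-- The list argument l is named explicitly.
evenSumExplicit :: Integral a => [a] -> a
evenSumExplicit l = foldl' (+) 0 (filter even l)

-- Version 9: the same function with the argument dropped.
evenSum9 :: Integral a => [a] -> a
evenSum9 = foldl' (+) 0 . filter even

-- Version 10: foldl' (+) 0 replaced by sum.
evenSum10 :: Integral a => [a] -> a
evenSum10 = sum . filter even

main :: IO ()
main = print (evenSumExplicit [1..10], evenSum9 [1..10], evenSum10 [1..10])
```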
What power did we gain by using `foldl'`?
There are no longer different cases to test; the definition feels more like a mathematical function.
And it becomes far easier to compose the function with other ones.
Suppose we want to modify our function slightly.
We want to get the sum of all even squares of the elements of the list.
~~~
[1,2,3,4] ~> [1,4,9,16] ~> [4,16] ~> 20
~~~
Updating version 10 is extremely easy:
<div class="codehighlight">
<code class="haskell">
squareEvenSum = sum . (filter even) . (map (^2))
</code>
</div>
We simply had to add another "transformation function".
~~~
map (^2) [1,2,3,4] ⇔ [1,4,9,16]
~~~
The main advantage is that you didn't have to modify the _inside_ of the function definition; you just had to use another function.
You encapsulate the function, and you can use a "pipe-like" notation and way of thinking.
Not having to open the pipe to modify the behaviour of your program quickly becomes a huge help when reasoning about it.
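To see the pipe at work, here is a small sketch that exposes what flows between the stages. The helper `stages` is invented for this illustration; `squareEvenSum` is the function defined above.

```haskell
squareEvenSum :: Integral a => [a] -> a
squareEvenSum = sum . filter even . map (^2)

-- Illustrative helper: the intermediate result after each stage.
stages :: [Integer] -> ([Integer], [Integer], Integer)
stages l = (squared, evens, total)
  where
    squared = map (^2) l
    evens   = filter even squared
    total   = sum evens

main :: IO ()
main = do
  print (stages [1, 2, 3, 4])
  print (squareEvenSum [1, 2, 3, 4])
```

Each stage only knows about its own input and output, which is why adding `map (^2)` at the end of the chain required no change to the other two.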
Modifying version 1 is left as an exercise for the reader.
If you believe we have reached the end of generalization, then know you are very wrong. For example, there is a way to use this function not only on lists but on any recursive type. If you want to know how, I suggest you read this quite fun article: [Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire by Meijer, Fokkinga and Paterson](http://eprints.eemcs.utwente.nl/7281/01/db-utwente-40501F46.pdf).
This example should show you how great pure functional programming is.
Unfortunately, pure functional programming isn't well suited to all usages.
Or at least such a language hasn't been found yet.
One of the great powers of Haskell is the ability to create DSLs
(Domain Specific Languages),
making it easy to change the programming paradigm.
In fact, Haskell is also great when you want to write imperative style
programs. Understanding this was really hard for me when learning Haskell,
because a lot of effort is spent explaining how superior the functional
approach is. Then, when you tackle the imperative style of Haskell, it
is hard to understand why and how.
But before talking about this Haskell super-power, we must talk about another
Typically, you can re-create lists, but with a more verbose syntax:
~~~
data List a = Empty | Cons a (List a)
~~~
If you really want to use an easier syntax, you can use an infix name for constructors.
~~~
infixr 5 :::
data List a = Nil | a ::: (List a)
~~~
The number after `infixr` is the precedence.
If you want to be able to print (`Show`), read (`Read`), test equality (`Eq`) and compare (`Ord`) your new data structure, you can tell Haskell to derive the appropriate functions for you.
<div class="codehighlight">
<code class="haskell">
infixr 5 :::
data List a = Nil | a ::: (List a)
              deriving (Show,Read,Eq,Ord)
</code>
</div>
When told to derive `Show`, Haskell creates a `show` function for you.
We'll see soon how you could use your own `show` function.
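Here is a short sketch of what the derived instances give you in practice. The value `sample` is made up for the illustration.

```haskell
infixr 5 :::
data List a = Nil | a ::: (List a)
              deriving (Show, Read, Eq, Ord)

-- Because of infixr, this reads as 1 ::: (2 ::: (3 ::: Nil)).
sample :: List Int
sample = 1 ::: 2 ::: 3 ::: Nil

main :: IO ()
main = do
  print sample                          -- derived Show
  print (read (show sample) == sample)  -- derived Read and Eq
  print (sample == Nil)                 -- derived Eq
  print ((1 ::: Nil) < sample)          -- derived Ord
```

The derived `Ord` compares constructors in declaration order first (`Nil` before `:::`), then the fields from left to right.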
Why did we use some strange syntax, and what exactly is this `IO` type?
It looks a bit like magic.
For now let's just forget about all the pure part of our program, and focus
on the impure part:
<div class="codehighlight">
<code class="haskell">
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers (separated by comma):"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser
main :: IO ()
main = do
  list <- askUser
  print $ sum list
</code>
</div>
The first noticeable thing:
the structure of these functions is very similar to that of an imperative language.
The fact is, Haskell is powerful enough to let you build functions that make code look imperative.
For example, if you wish, you could create a `while` in Haskell.
In fact, for dealing with `IO`, imperative style is generally more appropriate.
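As a rough idea of what such a `while` could look like, here is a minimal sketch. The functions `while` and `sumTo` are hypothetical helpers written for this illustration; only `Data.IORef` comes from the standard library.

```haskell
import Data.IORef (newIORef, readIORef, modifyIORef')

-- Hypothetical helper: run body again and again while cond yields True.
while :: IO Bool -> IO () -> IO ()
while cond body = do
  continue <- cond
  if continue
    then body >> while cond body
    else return ()

-- Sum the integers from 1 to n with mutable references, imperative style.
sumTo :: Int -> IO Int
sumTo n = do
  counter <- newIORef 1
  total   <- newIORef 0
  while (fmap (<= n) (readIORef counter)) $ do
    k <- readIORef counter
    modifyIORef' total (+ k)
    modifyIORef' counter (+ 1)
  readIORef total

main :: IO ()
main = sumTo 10 >>= print
```

Note that `while` is an ordinary function here, not a language construct.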
But you also see there are some slight differences.
The notation is a bit strange.
This is where the beauty of how Haskell handles IO lies.
Imagine you want to write a pure language.
But a completely pure language would be of little use in real life.
Without effects, you couldn't print anything on a screen, read the user input, etc.
You can imagine that, in a standard impure language, there is a hidden global variable.
For example, you could write something to a file.
Somebody else could modify this file.
And you could later read the content of the file.
Each time something changes in the external world, it is as if a global variable had changed its value.
This global variable can be represented as a World state.
Now, to have a pure language that is still useful, you could simply state that executing your program means evaluating the main function, which has the following type:
~~~
main :: World -> World
~~~
This means that, instead of having a global variable accessible by all functions of your program,
`main` is given as a parameter an id representing the state of the World it can access.
And it will certainly make some changes to it.
In reality, the real type is closer to
~~~
main :: World -> ((),World)
~~~
The `()` type is the unit type: it has a single value, also written `()`.
Nothing to see here.
Now let's write our main function:
~~~
main w0 =
    let (list,w1) = askUser w0 in
    let (x,w2) = print (sum list) w1 in
    (x,w2)
~~~
Also remember, the order of evaluation is generally not fixed in Haskell.
For example, to evaluate `f a b`, you generally have many choices:
- first eval `a` then `b` then `f a b`
- first eval `b` then `a` then `f a b`
- eval `a` and `b` in parallel then `f a b`
This is possible because we are working in a pure language.
Now, if you look at the main function, it is clear you must eval the first
line before the second one since, to evaluate the second line you have
to get a parameter given by the evaluation of the first line.
This trick works nicely.
At each step, the compiler will provide a pointer to a new real world id.
Under the hood, `print` will evaluate as:
- print something on the screen
- modify the id of the world
- evaluate as `((),new world id)`.
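The three steps above can be mimicked with a toy model. This is not how GHC really implements `IO`; the sketch assumes the world can be reduced to an id plus the text shown on the screen, and the names `World`, `worldId`, `screen` and `printW` are invented for the illustration.

```haskell
-- Toy stand-in for the real world: an id plus everything printed so far.
data World = World
  { worldId :: Int
  , screen  :: [String]
  } deriving (Show, Eq)

-- Toy version of print: returns () together with a new world.
printW :: Show a => a -> World -> ((), World)
printW x w =
  ( ()
  , w { screen  = screen w ++ [show x]  -- "print" something on the screen
      , worldId = worldId w + 1         -- modify the id of the world
      }
  )

main :: IO ()
main = do
  let w0      = World 0 []
      (_, w1) = printW (42 :: Int) w0
      (_, w2) = printW "hello" w1
  print w2
```

The second call can only be evaluated once `w1` is known, which is exactly the ordering argument made above.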
Now, if you look at the style of the main function, it is clearly awkward.
Let's try to do the same with the `askUser` function:
~~~
askUser :: World -> ([Integer],World)
~~~
The type has changed because we will modify the "World". We simulate this by
returning a world value different from the input "World" value.
This way we remain "pure" in the language.
You could write a completely pure implementation and it would work.
In the real world, the evaluation will have some side effect each time a function
returns a world value different from the one it received.
Before:
<div class="codehighlight">
<code class="haskell">
askUser :: IO [Integer]
askUser = do
  putStrLn "Enter a list of numbers:"
  input <- getLine
  let maybeList = getListFromString input in
      case maybeList of
          Just l  -> return l
          Nothing -> askUser
</code>
</div>
After:
<div class="codehighlight">
<code class="haskell">
askUser w0 =
    let (_,w1)     = putStrLn "Enter a list of numbers:" w0
        (input,w2) = getLine w1
        (l,w3)     = case getListFromString input of
                      Just l   -> (l,w2)
                      Nothing  -> askUser w2
    in
        (l,w3)
</code>
</div>
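The "after" version cannot be run as is, because the real `putStrLn` and `getLine` do not take a world argument. Here is a runnable sketch of the same idea on a toy world. Everything in it is an assumption made for the illustration: the `World` record, the `putStrLnW` and `getLineW` stand-ins, and this particular definition of `getListFromString`, which is not shown in this section.

```haskell
-- Toy world: lines the user will type, and lines printed so far.
data World = World { inputs :: [String], outputs :: [String] }
  deriving (Show, Eq)

putStrLnW :: String -> World -> ((), World)
putStrLnW s w = ((), w { outputs = outputs w ++ [s] })

-- Assumption: reading from an exhausted input yields an empty line.
getLineW :: World -> (String, World)
getLineW w = case inputs w of
  []     -> ("", w)
  (l:ls) -> (l, w { inputs = ls })

-- Assumed helper: parse "1,2,3" as a list of integers.
getListFromString :: String -> Maybe [Integer]
getListFromString str = case reads ("[" ++ str ++ "]") of
  [(xs, "")] -> Just xs
  _          -> Nothing

-- Same structure as above: each step receives the previous world.
askUser :: World -> ([Integer], World)
askUser w0 =
  let (_, w1)     = putStrLnW "Enter a list of numbers:" w0
      (input, w2) = getLineW w1
  in  case getListFromString input of
        Just l  -> (l, w2)
        Nothing -> askUser w2

main :: IO ()
main = print (askUser (World ["foo", "1,2,3"] []))
```

With the inputs `"foo"` then `"1,2,3"`, the first line is rejected and the prompt is recorded twice in the final world.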
This is similar, but awkward. All these `let ... in`. Even if Haskell
lets you remove most of them, it's still awkward.
The lesson is: a naive IO implementation in a pure functional language is awkward!
Fortunately, some have found a better way to handle this problem.
We see a pattern.
Each line is of the form:
~~~
let (y,w') = action x w in
~~~
This holds even if, for some lines, the first argument `x` isn't needed.
The output type is a pair, `(answer, newWorldValue)`.
Each function `f` must have a type of the form:
~~~
f :: World -> (a,World)
~~~
Not only this, but we can also notice that we always use them
following this general pattern:
~~~
let (y,w1) = action1 x w0 in
let (z,w2) = action2 y w1 in
...
~~~
Now, we will do a magic trick. We will make the world variable "disappear".
We will `bind` the two lines. Let's define the `bind` function.
~~~
bind :: (World -> (a,World))
        -> (a -> (World -> (b,World)))
        -> (World -> (b,World))
~~~
`(World -> (a,World))` is the type of an IO action, like `getLine`, printing something, etc. Now let's rename it for more clarity.
~~~
type IO a = World -> (a, World)
~~~
Some examples of functions:
~~~
getLine :: IO String
print :: Show a => a -> IO ()
~~~
`getLine` is an IO action which takes a world as parameter, then returns a pair `(String,World)`.
This can be stated as: `getLine` is of type `IO String`.
We can also see it as an IO action which will return a String "embedded inside an IO".
The function `print` is also interesting.
It takes one argument which can be shown.
In fact it takes two arguments.
The first is the value to print and the other is the state of the world.
It then returns a pair of type `((),World)`.
This means it changes the world state, but doesn't give back any data.
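We simplify the bind type:
~~~
bind :: IO a
        -> (a -> IO b)
        -> IO b
~~~
The function `bind` takes two arguments: an action, and a function which builds the next action from the result of the first one.
The type is quite intimidating. But stay with me here.
Consider lines like: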
~~~
let (x,w1) = action1 w0 in
let (y,w2) = action2 x w1 in
(y,w2)
~~~
On the first line, `action1` is of type `(World -> (a,World))`.
On the second line, `action2` is of type `(a -> (World -> (b,World)))`.
`bind`:
- takes as first argument a function similar to all lines, which returns an `(a,World)`
- takes as second argument a function which takes an `a` and returns a line which returns a `(b,World)`
- returns a line which returns a `(b,World)`.
~~~
(bind action1 action2) w0 =
    let (x, w1) = action1 w0
        (y, w2) = action2 x w1
    in  (y, w2)
~~~
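To check that this definition type-checks and behaves as described, here is a runnable sketch on a toy world. The `World` record and the `getLineW` and `printW` stand-ins are assumptions made for the illustration; only `bind` follows the definition above.

```haskell
-- Toy world: lines still to be read, and lines printed so far.
data World = World { inputs :: [String], outputs :: [String] }
  deriving (Show, Eq)

getLineW :: World -> (String, World)
getLineW w = case inputs w of
  []     -> ("", w)
  (l:ls) -> (l, w { inputs = ls })

printW :: Show a => a -> World -> ((), World)
printW x w = ((), w { outputs = outputs w ++ [show x] })

-- bind runs the first action, feeds its result to the second one,
-- and passes the intermediate world along.
bind :: (World -> (a, World))
     -> (a -> (World -> (b, World)))
     -> (World -> (b, World))
bind action1 action2 w0 =
  let (x, w1) = action1 w0
      (y, w2) = action2 x w1
  in  (y, w2)

-- Read one line and print it, without naming any intermediate world.
echo :: World -> ((), World)
echo = bind getLineW (\l -> printW l)

main :: IO ()
main = print (echo (World ["hello"] []))
```

The only world that appears by name is the initial one passed to `echo`.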
The idea is to hide the World argument with this function. Let's go:
As an example, imagine we wanted to simulate:
~~~
let (line1,w1) = getLine w0 in
let ((),w2) = print line1 w1 in
((),w2)
~~~
Now, using the bind function:
~~~
(res,w2) = (bind getLine (\l -> print l)) w0
~~~
As `print l` is of type `(World -> ((),World))`, we know `res = ()` (the unit value).
If you didn't see what was magic here, let's try with three lines this time.
~~~
let (line1,w1) = getLine w0 in
let (line2,w2) = getLine w1 in
let ((),w3) = print (line1 ++ line2) w2 in
((),w3)
~~~
Which is equivalent to:
~~~
(res,w3) = (bind getLine (\line1 ->
             bind getLine (\line2 ->
               print (line1 ++ line2)))) w0
~~~
Did you notice something?
Yes, no temporary World variable is used anywhere anymore!
This is _MA_. _GIC_.
We can make things look better. Let's call bind `(>>=)`, which is an infix function.
An infix function works like `(+)`: `3 + 4` ⇔ `(+) 3 4`.
~~~
(res,w3) = (getLine >>=
             (\line1 -> getLine >>=
               (\line2 -> print (line1 ++ line2)))) w0
~~~
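Here is a runnable sketch of this infix version on a toy world. To avoid clashing with the real `IO`, the type synonym is called `WorldIO` here, and the Prelude's own `(>>=)` is hidden so that the operator can be redefined; `World`, `getLineW`, `printW` and `twoLines` are invented names.

```haskell
import Prelude hiding ((>>=))

-- Toy world: lines still to be read, and lines printed so far.
data World = World { inputs :: [String], outputs :: [String] }
  deriving (Show, Eq)

-- Called IO in the text; renamed to avoid clashing with the real IO.
type WorldIO a = World -> (a, World)

getLineW :: WorldIO String
getLineW w = case inputs w of
  []     -> ("", w)
  (l:ls) -> (l, w { inputs = ls })

printW :: Show a => a -> WorldIO ()
printW x w = ((), w { outputs = outputs w ++ [show x] })

-- The bind function from the text, written as an infix operator.
infixl 1 >>=
(>>=) :: WorldIO a -> (a -> WorldIO b) -> WorldIO b
(action1 >>= action2) w0 =
  let (x, w1) = action1 w0
  in  action2 x w1

-- Read two lines and print their concatenation.
twoLines :: WorldIO ()
twoLines =
  getLineW >>=
    (\line1 -> getLineW >>=
      (\line2 -> printW (line1 ++ line2)))

main :: IO ()
main = print (twoLines (World ["foo", "bar"] []))
```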
Ho Ho Ho! Happy Christmas Everyone!
Haskell provides some syntactic sugar for us:
~~~
do
  y <- f
  z <- g y
  t <- h y z
  ...
~~~
Is replaced by:
~~~
(f >>= (\y ->
g y >>= (\z ->
h y z >>= (\t ->
...
))))
~~~
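To try this sugar on the toy model, the world-passing function has to be wrapped in a `newtype` with a `Monad` instance, because `do` notation relies on the `Monad` class. The sketch below is one possible way to do it, not the real definition of `IO`; `WorldIO`, `runWorldIO`, `getLineW` and `printW` are invented names.

```haskell
-- Toy world: lines still to be read, and lines printed so far.
data World = World { inputs :: [String], outputs :: [String] }
  deriving (Show, Eq)

newtype WorldIO a = WorldIO { runWorldIO :: World -> (a, World) }

instance Functor WorldIO where
  fmap f (WorldIO act) = WorldIO $ \w0 ->
    let (x, w1) = act w0 in (f x, w1)

instance Applicative WorldIO where
  pure x = WorldIO $ \w -> (x, w)
  WorldIO actF <*> WorldIO actX = WorldIO $ \w0 ->
    let (f, w1) = actF w0
        (x, w2) = actX w1
    in  (f x, w2)

-- This >>= is the bind function from the text, wrapped in the newtype.
instance Monad WorldIO where
  WorldIO act >>= next = WorldIO $ \w0 ->
    let (x, w1) = act w0
    in  runWorldIO (next x) w1

getLineW :: WorldIO String
getLineW = WorldIO $ \w -> case inputs w of
  []     -> ("", w)
  (l:ls) -> (l, w { inputs = ls })

printW :: Show a => a -> WorldIO ()
printW x = WorldIO $ \w -> ((), w { outputs = outputs w ++ [show x] })

-- With do notation ...
sugared :: WorldIO ()
sugared = do
  line1 <- getLineW
  line2 <- getLineW
  printW (line1 ++ line2)

-- ... and the same program with explicit >>=.
desugared :: WorldIO ()
desugared =
  getLineW >>= (\line1 ->
  getLineW >>= (\line2 ->
  printW (line1 ++ line2)))

main :: IO ()
main = do
  let w0 = World ["foo", "bar"] []
  print (runWorldIO sugared w0 == runWorldIO desugared w0)
```

Both versions produce the same result and the same final world, which is what the rewriting rule above promises.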
Which is perfect for IO.
Now we also just need a way to remove the last statement containing a World value.