hwp-book/3_Intermediate.org
Yann Esposito (Yogsototh) b3063945d2
minor comments
2018-06-22 00:15:52 +02:00


Haskell for the working programmer

THIS IS A WORK IN PROGRESS

CONTRIBUTORS

This part is the real beginning of the book.

The reader should have basic Haskell knowledge but need not be fluent in it. So I prefer named functions and avoid using many operators.

In the same spirit, I tend to prefer explicit parentheses over (.), ($) and currying.

For a Haskell newcomer, the first form is easier to read than the second:

myFunc aMiddleware aHandler aRequest =
  aMiddleware (aHandler aRequest)

myFunc m h x = m $ h x

The part on which there will probably be no consensus is this:

As the target audience is not beginner programmers but rather Haskell beginners, I use another prelude for this part to prevent the first basic mistakes. I might even consider using the Strict pragma so the reader is in a less foreign environment. Note that Strict doesn't make all of Haskell strict; it just makes it strict where it should be for most usages. I also imagine we would enable a lot of common pragmas such as OverloadedStrings.

So, to start with, let's say we use Protolude with many pragmas enabled by default.

There are two intermediate parts:

  1. The first part is about writing basic programs meant to be contained in a single file and that should use few dependencies. For that, I would tend to use stack scripts.

  2. In the second part we create a few minor projects, so the workflow is a bit more complex. To minimize friction with the tooling I would recommend using hpack. First, it's YAML, and everybody knows YAML; second, it minimizes the number of manipulations needed when adding a new Haskell module.

In that part, there should be a section explaining how to find the information needed to program: how to find and use a package, where to find the documentation, how to read it, etc. Also, give some tricks, like pointing to Hayoo and Hoogle.

TODO Intermediate

In this part of the book, we'll use simple examples. Thus, instead of going directly to a full project structure, we'll focus on the language. Each file can be treated as a single executable script.

For example:

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude

main = putText "Hello World!"

The first lines are simply there to set up the correct execution environment. The real program starts after them. Once stack is installed (see the Install a dev environment section), if you put that content in a file named hello.hs then you can launch it with:

> chmod +x hello.hs
> ./hello.hs

The first launch can take a little while because stack downloads all the dependencies. The advantage of this form of distribution is that it is a quasi self-contained executable. That makes it a good fit for minimal examples.

But after a short introduction we'll use full projects.

We'll start with examples, and every notion will be introduced as it appears. If you feel confident, feel free to skip some descriptions and explanations.

TODO Short Examples / Scripts

TO-CLEAN Guess a number

TO-CLEAN Print and read things

Now let's modify the code of main to print things. First, comment out the import line for Lib. Haskell comments are -- until the end of the line, or {- .... -} for multiline comments. Without this comment you'll get a warning that the import is unused. And by default we compile with the -Werror flag, which tells GHC that compilation should fail on warnings as well as on errors.

The default template aims at a professional environment and has more restrictions in order to maximize confidence in quality.

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude

main = putText "Hello, world!"

Simple and natural. Now let's ask your name.

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude

main = do
 putText "What is your name?"
 name <- getLine
 putText ("Hello " <> name <> "!")

We can try that in the REPL (GHCi). You should be able to start it from your editor. For example, in Spacemacs I can load the current buffer (open file) in the REPL with SPC m s b.

You can also start the REPL in a terminal with stack ghci and then load the module with :l hello_name.hs. The :l is a shortcut for :load.

> stack ghci

Warning: No local targets specified, so ghci will not use any options from your package.yaml / *.cabal files.

         Potential ways to resolve this:
         * If you want to use the package.yaml / *.cabal package in the current directory, use stack init to create a new stack.yaml.
         * Add to the 'packages' field of ~/.stack/global-project/stack.yaml

Configuring GHCi with the following packages:
GHCi, version 8.2.2: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /private/var/folders/bp/_8thkcjd4k3g81mpxtkq44h80000gn/T/ghci70782/ghci-script
Prelude> :l hello_name.hs
[1 of 1] Compiling Main             ( hello_name.hs, interpreted ) [flags changed]
Ok, one module loaded.
*Main> main
What is your name?
Yann
Hello Yann!

You can also simply run it from the command line:

> ./hello_name.hs
What is your name?
Yann
Hello Yann!

OK simple enough.

But let's take a moment to understand a bit more what's going on.

We started with the do keyword. It is syntactic sugar that helps in combining multiple lines easily. Let's take a look at the type of each part.

putText :: Text -> IO ()

It means that putText is a function that takes a Text as parameter and returns an IO (). Mainly, IO () means that it returns () (nothing) while doing some IO or side effect. The side effect here is writing the text to the standard output.

putText "What is your name?" :: IO ()

So yes, this line performs some IO but returns nothing significant.

name <- getLine

The function getLine reads a line from standard input and provides it as a Text. If you look at the type of getLine you have:

getLine :: IO Text

That means that to retrieve and manipulate the Text returned in an "IO context" you can use the <- notation. So in the code the type of name is Text.

More generally if foo :: IO a then when you write

do
  x <- foo :: IO a

Then the type of x is a.

Finally the last line:

  putText ("Hello " <> name <> "!")

putText takes a Text as argument, so ("Hello " <> name <> "!") :: Text.

(<>) is the infix operator equivalent to the function mappend. Here are equivalent ways to write the same thing:

"Hello " <> name <> "!"
"Hello " `mappend` name `mappend` "!"

mappend "Hello " (mappend name "!")
(<>) "Hello " ((<>) name "!")

So in Haskell, if the name of your function is made of letters and digits, it is a prefix function. If its name is made of symbol characters, it is considered to be an infix operator.

You can use a named function as infix if you put backticks (`) around its name. And you can use an operator as prefix if you put it inside parentheses.
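The prefix/infix rule above can be tried with a tiny sketch. The names plus and (|+|) are hypothetical, chosen only for illustration, and the example uses the standard Prelude rather than Protolude so that it runs without any extra package.

```haskell
-- A named function: prefix by default.
plus :: Int -> Int -> Int
plus a b = a + b

-- A user-defined operator (hypothetical name): infix by default.
(|+|) :: Int -> Int -> Int
a |+| b = a + b

main :: IO ()
main = do
  print (plus 1 2)     -- named function, prefix
  print (1 `plus` 2)   -- same function, infix thanks to the backticks
  print (1 |+| 2)      -- operator, infix
  print ((|+|) 1 2)    -- same operator, prefix thanks to the parentheses
```

All four lines compute the same sum; only the notation differs.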

You should have noticed a pattern here, which is really important: each line of a do block has a type of the form IO a.

main = do
  putText "What is your name?"      :: IO ()
  name <- getLine                   :: IO Text
  putText ("Hello " <> name <> "!") :: IO ()

So whenever you have an error message, try to think about the type of your expression.

Another very important aspect to notice: the type of "Hello " <> name <> "!" is Text, not IO Text. This is because this expression can be evaluated purely, without any side effect.

Here we see a clear distinction between the pure part of our code and the impure part.

☞ Pure vs Impure (function vs procedure)

That is one of the major differences between Haskell and other languages. Haskell provides a set of functions that are considered to have side effects. Those functions are given a type of the form IO a.

And the type system restricts the way you can manipulate functions with type IO a.

So, first thing that might be counterintuitive: if an expression has a type of IO a, it means that it potentially performs a side effect and "returns" something of type a.

And we never want to perform a side effect during a pure evaluation. This is why you can't write something like:

-- DOESN'T COMPILE
main = do
   putText ("Hello " <> getLine <> "!")

You need to cross the IO barrier to get the value back after the evaluation. This is why you NEED to use the <- notation. This way, whether a piece of code can potentially perform a side effect is always explicit.
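One way to make the pure/impure split visible is to move the pure part into its own function. This is only a sketch: the name greeting is hypothetical, and it uses the standard Prelude (String, putStrLn, ++) instead of Protolude's Text, putText and <> so that it runs without extra packages.

```haskell
-- Pure: no IO in the type, the same input always gives the same output.
greeting :: String -> String
greeting name = "Hello " ++ name ++ "!"

-- Impure: the IO in the type says that side effects may happen here.
main :: IO ()
main = do
  putStrLn "What is your name?"
  name <- getLine            -- name :: String, taken out of an IO String
  putStrLn (greeting name)   -- the pure function is called from within IO
```

The pure function can be tested without any input or output, which is the practical benefit of keeping the two parts apart.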

TO-CLEAN Strings in Haskell digression

Generally, working with strings is something you do at the beginning of learning a programming language. It is straightforward. In Haskell you have many different choices when dealing with strings, depending on the context. But let's just say that 95% of the time, you'll want to use strict Text.

Here are all the possible choices:

  • String: just a list of Char, a very inefficient representation,
  • Text: UTF-16 strings, which can be lazy or strict,
  • ByteString: raw arrays of bytes, which can also be lazy or strict.

That is already 5 different choices. And there is another package that provides other string types: in Foundation the strings are UTF-8.

Hmmm… so many choices.

A rule of thumb is to never use String for anything serious. Use Text most of the time because it supports encodings. Use ByteString if you need efficient byte arrays.

By using Protolude, we naturally don't use String.
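To give a feel for the difference between characters and bytes, here is a minimal sketch that converts between the three families using the text and bytestring packages (both ship with GHC). The helper names charCount and utf8Size are hypothetical, and the sketch uses the standard Prelude rather than Protolude.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Number of characters, going through strict Text.
charCount :: String -> Int
charCount = T.length . T.pack

-- | Number of bytes once the text is encoded as UTF-8 in a strict ByteString.
utf8Size :: String -> Int
utf8Size = B.length . TE.encodeUtf8 . T.pack

main :: IO ()
main = do
  let word = "héllo"
  print (charCount word)  -- 5 characters
  print (utf8Size word)   -- 6 bytes: the é takes two bytes in UTF-8
```

The two counts differ as soon as the text contains non-ASCII characters, which is why Text is used for text and ByteString for raw bytes.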

TO-CLEAN Guess my age program

So far so good. But the logic part of the code should live in a library in the src/ directory, because that part is easier to test.

The src-exe/Main.hs file should be very minimalist, so now let's replace its content with:

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude

guess :: IO ()
guess = undefined

main :: IO ()
main = do
  guess
  putText "Thanks for playing!"

We know that the type of guess must be IO (). We don't know yet what the code will be, so I just used undefined. This way the program already typechecks.

The next step is to define the guess function.

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
import Protolude

guess :: IO ()
guess = guessBetween 0 120

guessBetween :: Integer -> Integer -> IO ()
guessBetween minAge maxAge = do
  let age = (maxAge + minAge) `div` 2
  if minAge == maxAge
    then putText ("You are " <> show minAge)
    else do
      putText ("Are you younger than " <> show age <> "?")
      answer <- getLine
      case answer of
        "y" -> guessBetween minAge (age - 1)
        _ ->  guessBetween (if age == minAge then age + 1 else age) maxAge

main :: IO ()
main = do
  guess
  putText "Thanks for playing!"

From there, we defined the guess function to call the guessBetween function with the two parameters 0 and 120, to guess an age between 0 and 120.

And the guessBetween function is a classic recursive function. At each step we ask whether the user is younger than some age.

The let keyword lets you introduce pure values in between IO ones, so age = (maxAge + minAge) `div` 2 is mostly straightforward. Note that we manipulate Integer values, which means `div` is the integer division, so 3 `div` 2 == 1.

We see that, working in IO, you can put print statements in the middle of your code. Also note that we used a recursive function. In most imperative programming languages, explicit loops are preferred to recursive functions for efficiency reasons. That shouldn't be the case in Haskell.

In Haskell recursive functions are the natural way to program things.

Important Remarks to note:

  • To test equality we use the (==) operator.
  • Haskell is lazy, so the age value is only computed if needed. If you are in the case where minAge == maxAge, the age value is not evaluated.
  • In Haskell the if .. then .. else .. form always has an else branch. There is no implicit "no result" value in Haskell. Each expression needs to return something explicitly, even if it is the empty tuple ().
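The laziness remark can be checked with a small pure sketch. The function safeRatio is hypothetical and not part of the guessing program; it assumes the default lazy evaluation (no Strict pragma) and the standard Prelude.

```haskell
-- | Integer division that returns 0 when the divisor is 0.
safeRatio :: Integer -> Integer -> Integer
safeRatio total count =
  -- ratio is only a recipe here; forcing it with count == 0 would raise
  -- a "divide by zero" exception.
  let ratio = total `div` count
  in if count == 0
       then 0       -- ratio is never evaluated in this branch
       else ratio   -- ratio is evaluated only now

main :: IO ()
main = do
  print (safeRatio 3 2)   -- integer division: 1
  print (safeRatio 10 0)  -- 0, and no exception is raised
```

As with age in guessBetween, the let binding is written before the test but only computed in the branch that uses it.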

So now here we go:

> stack build
> stack exec -- guess-exe
Are you younger than 60?
y
Are you younger than 29?
n
Are you younger than 44?
y
Are you younger than 36?
n
Are you younger than 39?
n
Are you younger than 41?
y
Are you younger than 39?
n
You are 40
Bye!

We see that we can still make the program better. For example, the same question is asked twice in that run. Still, it works.
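One possible way to avoid repeating a question is to choose the pivot so that both answers strictly shrink the interval. The sketch below is not the code used above: the function pivots is hypothetical and purely simulates the questions for a known age, rounding the pivot up and keeping it in the upper half on a "no" answer.

```haskell
-- | The ages p asked in "Are you younger than p?" for a given secret age,
-- when searching in the inclusive interval [lo, hi].
pivots :: Integer -> Integer -> Integer -> [Integer]
pivots secret lo hi
  | lo >= hi     = []                               -- one candidate left
  | secret < mid = mid : pivots secret lo (mid - 1) -- "y": keep [lo, mid - 1]
  | otherwise    = mid : pivots secret mid hi       -- "n": keep [mid, hi]
  where
    -- Rounded up, so lo < mid <= hi and both branches get strictly smaller.
    mid = (lo + hi + 1) `div` 2

main :: IO ()
main = print (pivots 40 0 120)
```

For a secret age of 40 this asks about 60, 30, 45, 37, 41, 39 and 40, each exactly once.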

TO-CLEAN Guess a random number

Let's write another, slightly more complex example. Instead of the program guessing somebody's age, it will be the role of the user to guess a random number chosen by the program.

First we'll need to generate random numbers. To that end we'll use the random package as a new dependency.

You can get more information either on hackage or on stackage:

Hackage is the official place to publish public Haskell libraries. Stackage works in conjunction with stack; mainly, it takes care of maintaining a list of package versions that work together. That means that all packages in an LTS (Long Term Support) release can be used together without any build conflict.

Now let's use that package. Notice the added --package random argument.

We'll start by writing a guessNumber function:

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
   --package random
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}

import Protolude

import System.Random (randomRIO)

...

-- | Choose a random number and ask the user to find it.
guessNumber :: IO ()
guessNumber = do
  n <- randomRIO (0,100)
  putText "I've choosen a number bettween 0 and 100"
  putText "Can you guess which number it was?"
  guessNum 0 n

-- | Given a number of try the user already made and the number to find
-- ask the user to find it.
guessNum :: Int -> Int -> IO ()
guessNum nbTry nbToFound = undefined

So for now we just focus on how to get a random number:

   do
     n <- randomRIO (0::Int,100)
     -- do stuff with n

You NEED to use the <- notation inside a do block. If you try to use let n = randomRIO (0,100) instead, it will fail because the types won't match: n would be the IO action itself, not the number it produces.

And that's it!

Now to write the guessNum function, we'll write a classical recursive function:

-- | Given a number of try the user already made and the number to find
-- ask the user to find it.
guessNum :: Int -> Int -> IO ()
guessNum nbTry nbToFound = do
  putText "What is your guess?"
  answer <- getLine
  let guessedNumber = readMaybe (toS answer)
  case guessedNumber of
    Nothing -> putText "Please enter a number"
    Just n ->
      if n == nbToFound
        then putText ("You found it in " <> show (nbTry + 1) <> " tries.")
        else do
          if n < nbToFound
          then putText "Your answer is too low, try a higher number"
          else putText "Your answer is too high, try a lower number"
          guessNum (nbTry + 1) nbToFound

TODO Let's explain each line of that function.
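As a starting point for that explanation, the decision logic of guessNum can be isolated in a pure function. This is a sketch under assumptions: the Verdict type and the judge function are hypothetical names, and it uses readMaybe from Text.Read in base on a String, rather than Protolude's readMaybe and toS.

```haskell
import Text.Read (readMaybe)

-- | Possible outcomes of comparing a guess with the number to find.
data Verdict = TooLow | TooHigh | Found
  deriving (Eq, Show)

-- | Parse the raw answer and compare it with the target.
-- Returns Nothing when the answer is not a number.
judge :: Int -> String -> Maybe Verdict
judge target answer =
  case readMaybe answer of
    Nothing -> Nothing
    Just n
      | n < target -> Just TooLow
      | n > target -> Just TooHigh
      | otherwise  -> Just Found

main :: IO ()
main = mapM_ (print . judge 87) ["50", "90", "87", "abc"]
```

The case on a Maybe value mirrors the Nothing / Just n branches of guessNum; only the printing and the recursion remain in IO.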

The full program is then:

#!/usr/bin/env stack
{- stack script
   --resolver lts-11.6
   --install-ghc
   --package protolude
   --package random
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}

import Protolude

import System.Random (randomRIO)

main :: IO ()
main = guessNumber

-- | Choose a random number and ask the user to find it.
guessNumber :: IO ()
guessNumber = do
  n <- randomRIO (0,100)
  putText "I've choosen a number bettween 0 and 100"
  putText "Can you guess which number it was?"
  guessNum 0 n

-- | Given a number of try the user already made and the number to find
-- ask the user to find it.
guessNum :: Int -> Int -> IO ()
guessNum nbTry nbToFound = do
  putText "What is your guess?"
  answer <- getLine
  let guessedNumber = readMaybe (toS answer)
  case guessedNumber of
    Nothing -> putText "Please enter a number"
    Just n ->
      if n == nbToFound
        then putText ("You found it in " <> show (nbTry + 1) <> " tries.")
        else do
          if n < nbToFound
          then putText "Your answer is too low, try a higher number"
          else putText "Your answer is too high, try a lower number"
          guessNum (nbTry + 1) nbToFound

which once executed:

> ./guess_number.hs
I've choosen a number bettween 0 and 100
Can you guess which number it was?
What is your guess?
50
Your answer is too low, try a higher number
What is your guess?
75
Your answer is too low, try a higher number
What is your guess?
90
Your answer is too high, try a lower number
What is your guess?
83
Your answer is too low, try a higher number
What is your guess?
87
You found it in 5 tries.

TO-CLEAN What did we learn so far?

If you have followed so far, you should be able to "reproduce" the examples and make minimal changes. But I am certain that it is still difficult to make other changes. It is time to learn some general principles. I know it might be a bit repetitive, but it's important to be certain to absorb this information.

A generic function of type IO a, typically main (where a is ()), should look like:

f :: IO a
f = do
    α <- f1
    β <- f2
    γ <- f3
    δ <- f4
    f5

where each expression fi is of type IO a for some a. You can use any of the values α, β, etc. as a parameter of the following expressions. To be valid, the last expression must have the same type as f, so here f5 :: IO a.

Now if I give you the following functions:

  • getLine :: IO Text, which reads a line from stdin.
  • putText :: Text -> IO (), which writes a line to stdout.

With that you have the ability to read stdin and print things.

  • if τ then f1 else f2, where τ :: Bool and the types of f1 and f2 must be the same. Generally this is denoted by: :type f1 ~ :type f2, and that type is also the type of the entire if ‥ then ‥ else ‥ expression.
  • You can compare things that can be compared with <, <=, >, >=, ==, /= (different).
  • You can concatenate things that can be concatenated (like Text) with <>.
  • You can transform things, in particular numbers, into Text with show.

So that is a small number of components, but they are all composable. And so far that is all we needed to write our first programs.

Haskell libs will provide you with a lot more base functions but also a lot more composition functions.

TODO Command Line Application

Another thing you might want to do early on is to retrieve the arguments of a command line application.

TO-CLEAN Basic

The simplest way to retrieve the parameters of a command line is to use the getArgs function.

getArgs :: IO [String]

Here is a minimal example.

#!/usr/bin/env stack
-- stack --resolver lts-11.6 script
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NoImplicitPrelude #-}
import Protolude
import System.Environment (getArgs)

main :: IO ()
main = do
  arguments <- getArgs
  case head arguments of
    Just filename -> putText ("The first argument is: " <> toS filename)
    Nothing -> die "Please enter a filename"

> ./cmdline-basic.sh foo
The first argument is: foo
> ./cmdline-basic.sh
Please enter a filename

If you have a very basic command line, that could be good enough. But if you plan to have more things to configure, consider using a library to parse options.
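Before reaching for a library, basic argument handling can also be written as a pure function over the list returned by getArgs. The sketch below is only an illustration: parseArgs and its error messages are hypothetical, and it uses getArgs and die from base with the standard Prelude instead of Protolude.

```haskell
import System.Environment (getArgs)
import System.Exit (die)

-- | Accept exactly one argument, the filename; otherwise explain the problem.
parseArgs :: [String] -> Either String FilePath
parseArgs [filename] = Right filename
parseArgs []         = Left "Please enter a filename"
parseArgs _          = Left "Too many arguments"

main :: IO ()
main = do
  arguments <- getArgs
  case parseArgs arguments of
    Right filename -> putStrLn ("The first argument is: " ++ filename)
    Left message   -> die message  -- prints to stderr and exits with failure
```

Pattern matching on the list makes each accepted shape of the command line explicit, and parseArgs can be tested without running the program.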

TODO Option Parsing

For that we will use the optparse-generic package.

TODO File Access

TODO Daemons & Logging

TODO Intermediate

TO-CLEAN Stack template

☞ As this is a first project, a lot of new concepts will be introduced. Don't be discouraged by that.

Let's create a project with a sane and modern file organisation.

I made a stack template largely inspired by the tasty-travis template. It provides a bootstrap for organizing your application with tests, benchmarks and continuous integration.

This template provides a file organisation for your projects.

To jump into programming you could theoretically just download the binary of the main Haskell compiler, GHC, to your computer and compile each file with ghc myfile.hs. But let's face it: that is not suitable for a real project, which needs more information attached to it.

So let's start with a sane professional organisation for your files.

TODO modify the URL to use a better URL: torrent / IPFS
stack new guess https://git.io/vbpej

After that, this should generate a new guess directory with the following files:

> tree
.
├── CHANGELOG.md
├── LICENSE
├── README.md
├── Setup.hs
├── guess.cabal
├── package.yaml
├── src
│   └── Lib.hs
├── src-benchmark
│   └── Main.hs
├── src-doctest
│   └── Main.hs
├── src-exe
│   └── Main.hs
├── src-test
│   └── Main.hs
├── stack.yaml
└── tutorial.md

5 directories, 13 files

Most of your source code should be in the src directory. Generally, src-exe should contain minimal code, just enough to handle the main function that starts your application. We'll talk about the other parts later in the book, but most other files should be quite straightforward.

Edit the file src-exe/Main.hs

The file contains:

import Protolude

import Lib (inc)

main :: IO ()
main = print (inc 41)

To compile and run it:

> stack build
> stack exec -- guess-exe
42

So that program prints 42 and stops.

TODO DB Access

NoSQL (Redis looks easy)
Stream DB (Kafka or NATS, etc…)
SQL (SQLite & Postgres)

Not sure about that part. Perhaps this should move to the Production section.

TODO REST API

TODO Servant
TODO JSON manipulation
TODO Swagger-UI

TODO Intermediate Conclusion

This should conclude a part where the reader has already gained a lot of knowledge. They should now be mostly autonomous. Still, the next section will provide a lot of advice.

Congratulations for going this far. Now you should be able to work in Haskell at least as well as in any other programming language.

Now there are different directions:

  • learn more libraries
  • learn to optimise code to make it as fast as C
  • learn to understand details of the compilation and Haskell
  • learn tips and tricks
  • learn more about abstractions and type classes
  • learn parallel and concurrent programming
  • learn to deploy like a pro using nix

The order in which to learn all those things can be very different depending on your needs.