---
title: What Makes Haskell Unique
---
### What Makes Haskell Unique
* Michael Snoyman
* VP of Engineering, FP Complete
* F(by) 2017
---
## Why uniqueness matters
* Programmers have lots of options
* Need to know what distinguishes programming languages
* Need to understand what makes Haskell different from other languages
----
## Is Haskell functional?
* What even is a functional language?
* Lax definition
* First class functions
* Higher order functions
* Wait... is C functional?
----
__Haskell may be functional, but that doesn't make it unique__
----
## Let's Describe Haskell
* Functional
* Statically typed
* Pure
* Lazy
* Strongly typed
* Green threads
* Native executables
* Garbage collected
* Immutability
----
__It's the combination of these features that makes Haskell unique__
* Example: purity + strong typing + functional style leads to:
* Easy to write
* Easy to read
* Easy to modify
* Efficient
* We'll get to this later
* Now: lots of examples!
* Different here is usually better, but some downsides
---
## Async I/O and Concurrency
What's wrong with this?
```
json1 := httpGet(url1)
json2 := httpGet(url2)
useJsonBodies(json1, json2)
```
* Hint: it's in the title of this slide
* Ties up an entire system thread on blocking I/O calls
* We want to be more efficient with resources, so...
----
## Callbacks
```
httpGetA(url1, |json1| =>
httpGetA(url2, |json2| =>
useJsonBodies(json1, json2)
)
)
```
* Aka "callback hell"
* Lots of techniques to work around it, e.g. promises/futures
* "Oh, promises form a monad!" Not even going there today :)
----
## Asynchronous Haskell version
```haskell
json1 <- httpGet url1
json2 <- httpGet url2
useJsonBodies json1 json2
```
* But that looks just like the blocking code! Exactly
* Runtime converts to async system calls
* Runtime schedules threads
* Sleeps when waiting for data
* Wake them up when data is available
* Not only Haskell: Erlang and Go do this too
* Therefore....
----
----
## Concurrency
* Why wait for `url1` before starting `url2`?
* Need to fork threads, write to mutable variables, do some locking
* Or be awesome
----
## Canceling
* We only want one of the responses
* Take whichever comes first
```
promise1 := httpGet(url1)
promise2 := httpGet(url2)
result := newMutex()
promise1.andThen(|json1| =>
result.set(json1)
promise2.cancel())
promise2.andThen(|json2| =>
result.set(json2)
promise1.cancel())
useJsonBody(result.get())
```
----
## Canceling in Haskell
```haskell
eitherJson <- race
(httpGet url1)
(httpGet url2)
case eitherJson of
Left json1 -> useJsonBody1 json1
Right json2 -> useJsonBody2 json2
```
* More than just a well designed API
* Depends on asynchronous exceptions
* Cancel any other running thread
----
## Not just about I/O
* Thread scheduling, sleeping, killing works for CPU bound tasks too!
* Don't need to worry about a heavy computation starving other threads
* No need to offload your heavy tasks to a different microservice, do
it all in Haskell
```haskell
let tenSeconds = 10 * 1000 * 1000
timeout tenSeconds expensiveComputation
```
----
## Summary: concurrency and async I/O
Advantages
Cheap threads
Simple API
Highly responsive
Disadvantages
Complicated runtime system
Need to be aware of async exceptions when writing code
---
## Immutability and purity
* Most languages default to mutable values
* Haskell differs in two ways:
* Immutable by default, explicit kind of mutability
* Mutating is an effect, tracked by the type system
----
## Mutable Haskell
Impossible
```haskell
let mut total = 0
loop i =
if i > 1000000
then total
else total += i; loop (i + 1)
in loop 1
```
Real and tedious
```haskell
total <- newIORef 0
let loop i =
if i > 1000000
then readIORef total
else do
modifyIORef total (+ i)
loop (i + 1)
loop 1
```
----
## Better Haskell
Recursive and immutable, much better!
```haskell
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
```
But why does this matter?
----
## Reasoning about code
Guess the output
```
// scores.txt
Alice,32
Bob,55
Charlie,22
func main() {
results := readResultsFromFile("results.txt")
printScoreRange(results)
print("First result was by: " + results[0].name)
}
func printScoreRange(results: Vector) {
...
}
```
----
## Expected output
```
Lowest: 22
Highest: 55
First result was by: Alice
```
----
## What's in printScoreRange?
```
func printScoreRange(results: Vector) {
results.sortBy(|result| => result.score)
print("Lowest: " + results[0].score)
print("Highest: " + results[results.len() - 1].score)
}
```
Actual output:
Lowest: 22
Highest: 55
First result was by: Charlie
Non-local changes broke our guessed result
----
----
## Do it in Haskell
```haskell
main :: IO ()
main = do
results <- readResultsFromFile "results.txt"
printScoreRange results
putStrLn $ "First result was by: " ++ name (head results)
printScoreRange :: [TestResult] -> IO ()
printScoreRange results = do
let results' = sortBy score results
putStrLn $ "Lowest: " ++ show (score (head results'))
putStrLn $ "Highest: " ++ show (score (last results'))
```
* Impossible for `printScoreRange` to modify results
* `printScoreRange` sorts into a local copy
* Have to think about less when writing `main`
----
## Data races
```haskell
main :: IO ()
main = do
results <- readResultsFromFile "results.txt"
concurrently_ printFirstResult printScoreRange
printFirstResult results =
putStrLn $ "First result was by: " ++ name (head results)
printScoreRange results = do
let results' = sortBy score results
putStrLn $ "Lowest: " ++ show (score (head results'))
putStrLn $ "Highest: " ++ show (score (last results'))
```
* Concurrent data accesses? No problem!
* Concurrent data writes? Impossible!
* We'll come back to mutable, multithreaded data
----
## Mutability when needed
* In place, mutable algorithms can be much faster
* Example: sorting a vector with only pure transformations is __slow__
* Haskell's answers:
1. You can still have mutable data structures if you want them,
but they're __explicit__
2. Temporary mutable copy, then freeze it
----
## Freeze!
```haskell
sortMutable :: MutableVector a -> ST (MutableVector a)
sortMutable = ... -- normal sorting algorithm
sortImmutable :: Vector a -> Vector a
sortImmutable orig = runST $ do
mutable <- newMutableVector (length orig)
copyValues orig mutable
sort mutable
freeze mutable
```
* `ST` is for temporary, local mutability
* Cannot be affected by the outside world, and cannot affect it
* Keeps functional guarantee: same input ==> same output
----
## Summary: immutability and purity
Advantages
Easier to reason about code
Avoid many cases of data races
Functions are more reliable, returning the same output for the same input
Disadvantages
Lots of ceremony if you actually want mutation
Some runtime performance hit for mutable algorithms
----
## Concurrent Mutation
What's wrong with this code?
```
runServer (|request| => {
from := accounts.lookup(request.from)
to := accounts.lookup(request.to)
accounts.set(request.from, from - request.amt)
accounts.set(request.to, to + request.amt)
})
```
Looks reasonable, but...
```
Thread 1: receive request: Alice gives $25
Thread 2: receive request: Alice receives $25
Thread 1: lookup that Alice has $50
Thread 2: lookup that Alice has $50
Thread 1: set Alice's account to $25
Thread 2: set Alice's account to $75
```
----
## Locking
```
runServer (|request| => {
accounts.lock(request.from)
accounts.lock(request.to)
// same code as before
accounts.unlock(request.from)
accounts.unlock(request.to)
})
```
```
Thread 1: receive request: $50 from Alice to Bob
Thread 2: receive request: $50 from Bob to Alice
Thread 1: lock Alice
Thread 2: lock Bob
Thread 1: try to lock Bob, but can't, so wait
Thread 2: try to lock Alice, but can't, so wait
```
__Deadlock!__
NOTE: Typical solution to this is to use locking, but it leads to other problems
----
## Software Transactional Memory
```haskell
runServer $ \request -> atomically $ do
let fromVar = lookup (from request) accounts
toVar = lookup (to request) accounts
origFrom <- readTVar fromVar
writeTVar fromVar (origFrom - amt request)
origTo <- readTVar toVar
writeTVar toVar (origTo + amt request)
```
* Looks like it has a race condition
* But STM ensures transactions are atomic
* No explicit locking required
* `TVar` is an example of explicit mutation
* Alternatives: `IORef`, `MVar`
NOTE:
* There are helper functions to make this shorter
* Want to make a point with the longer code
* STM will automatically retry when needed
----
## The role of purity
STM retries if a transaction isn't atomic. How many Bitcoins will I
buy?
```haskell
atomically $ do
buyBitcoins 3 -- side effects on my bank account
modifyTVar myBitcoinCount (+ 3)
```
* Trick question! Code doesn't compile
* `atomically` only allows side effects on `TVar`s
* Other side effects (like my bank account) are disallowed
* Safe for runtime to retry thanks to purity
NOTE:
* `buyBitcoins` needs to go to an exchange and spend $100,000
* Due to retry, this code could spend $10m
* This is where purity steps in
----
## Summary of STM
Advantages
Makes concurrent data modification much easier
Bypass many race conditions and deadlocks
Disadvantages
Depends on purity to work at all
Not really a disadvantage, you're already stuck with purity in Haskell
Not really any other disadvantages, so just use it!
---
## Laziness
*A double edged sword*
Let's revisit our previous summing example
```haskell
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
```
There are two problems with this code:
1. There's a major performance bug in it
2. It's much more cumbersome than it should be
NOTE: Kind of cheeky to hold off on laziness this long
----
## Space leaks
Consider `let foo = 1 + 2`
* `foo` isn't `3`, it's an instruction for how to create `3`
* `foo` is a _thunk_ until it's evaluated
* Storing thunks is more expensive than simple types like `Int`s
* Which values are evaluated in our `loop`?
```haskell
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
```
NOTE:
* The bane of laziness is space leaks, which you may have hard
about. Need to understand how laziness is implemented.
* Explain why `i` is forced and `total` is not
* Builds a tree, lots of CPU and memory pressure
----
## Explicit strictness
Need to tell Haskell compiler to evaluate `total`. Bang!
```haskell
let loop i !total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
```
* Needing to do this is a downside of Haskell's laziness
* But do we get any benefits in return?
NOTE:
* Can be explicit about what needs to be evaluated
* This is one approach, there are others
* Just added a `!`
----
## Looping (1)
Let's write our `loop` in an imperative language:
```
total := 0
for(i := 1; i <= 1000000; i++) {
total += i
}
```
Or just the evens
```
total := 0
for(i := 1; i <= 1000000; i++) {
if (isEven(i)) {
total += i
}
}
```
----
## Looping (2)
Now add up the values modulus 13 (for some weird reason)
```
total := 0
for(i := 1; i <= 1000000; i++) {
if (isEven(i)) {
total += i % 13
}
}
```
* Each modification is fine
* Getting harder to see the forest for the trees
* If our logic was more complicated, code reuse would be an issue
NOTE: Example of more complicated use case, writing a lookahead parser
----
## Some better Haskell
Our original recursive implementation sucked
```haskell
let loop i !total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
```
But this is great
sum [1..1000000]
Doesn't it allocate 8mb of ints?
Nope, laziness!
Just a thunk telling us how to get the rest of the list
----
## Composable Haskell
Just the evens?
```haskell
sum (filter even [1..1000000])
```
Modulus 13?
```haskell
sum (map (`mod` 13) (filter even [1..1000000]))
```
* Easy and natural to compose functions in a lazy context
* Avoids doing unnecessary work or using too much memory
NOTE:
* Never using more than a few machine words
* Other GHC optimizations avoid allocating any thunks
* Not covering that today
* Mixed bag, but functional+lazy=declarative, performant
----
## Short circuiting for free
In most languages, `&&` and `||` short circuit
```
foo() && bar()
```
* `bar` only called if `foo` returns `true`
In Haskell: we get that for free from laziness:
```haskell
False && _ = False
True && x = x
True || _ = True
False || x = x
```
See also: `and`, `or`, `all`, `any`
----
### Other downsides
* Laziness means an exception can be hiding in any thunk
* Aka partial functions: `head []`
* Also, some inefficient functions still available, `foldl` vs
`foldl'`
NOTE:
* Generally partial functions are frowned upon
* But they're still in the language
----
## Summary of laziness
Advantages
More composable code
Get efficient results with high level code
Short-circuiting no longer a special case
Disadvantages
Need to worry about space leaks
Exceptions can be hiding in many places
Bad functions still linger
----
## Side note; other languages
* Laziness is very similar to features in other languages
* Python generators
* Rust iterators
* In Haskell, it's far more prevalent since it affects how all code works
* However, you can get a lot of the benefits of Haskell with these techniques
---
## Other examples
* Too much to talk about in 40 minutes!
* Two other topics I wanted to touch on
* Feel free to ask me about these at breaks
----
## Parser (and other) DSLs
* Operator overloading!
* Abstractions like `Alternative` a natural fit
* `parseXMLElement <|> parseXMLText`.
* Able to reuse huge number of existing library functions,
* `optional`, `many`, `foldMap`
* General purpose `do`-notation is great
----
## Parser example
```haskell
data Time = Time Hour Minutes Seconds (Maybe AmPm)
data AmPm = Am | Pm
parseAmPm :: Parser Time
parseAmPm = Time
<$> decimal
<*> (":" *> decimal)
<*> (":" *> decimal)
<*> optional (("AM" $> Am) <|> ("PM" $> Pm))
```
c/o [@queertypes](https://twitter.com/queertypes/status/941064338848100352)
----
## Advanced techniques
* Free monads
* Monad transformer stacks
* Lens, conduit, pipes, ...
* Lots of ways to do things in Haskell!
* It's a plus and a minus
* Recommendation: choose a useful subset of Haskell and its libraries,
and define some best practices
---
## Conclusion
* Haskell combines a lot of uncommon features
* Very few of those features are unique
* Combining those features allows you to write code very differently
than in other languages
* If you want readable, robust, easy to maintain code: I think it's a
great choice
* Be aware of the sharp edges: they do exist!
----
## Questions?
Thanks everyone!