From 75027de06a83b2f061e0321d77484f18c7d5d204 Mon Sep 17 00:00:00 2001 From: Michael Snoyman Date: Fri, 15 Dec 2017 12:15:37 +0300 Subject: [PATCH] What makes Haskell Unique slides --- reveal/what-makes-haskell-unique.md | 806 ++++++++++++++++++++++++++++ 1 file changed, 806 insertions(+) create mode 100644 reveal/what-makes-haskell-unique.md diff --git a/reveal/what-makes-haskell-unique.md b/reveal/what-makes-haskell-unique.md new file mode 100644 index 0000000..2238cfc --- /dev/null +++ b/reveal/what-makes-haskell-unique.md @@ -0,0 +1,806 @@ +--- +title: What Makes Haskell Unique +--- + +### What Makes Haskell Unique + +* Michael Snoyman +* VP of Engineering, FP Complete +* F(by) 2017 + +
+ +--- + +## Why uniqueness matters + +* Programmers have lots of options +* Need to know what distinguishes programming languages +* Need to understand what makes Haskell different from other languages + +---- + +## Is Haskell functional? + +* What even is a functional language? +* Lax definition + * First class functions + * Higher order functions +* Wait... is C functional? + +---- + +__Haskell may be functional, but that doesn't make it unique__ + +---- + +## Let's Describe Haskell + +* Functional +* Statically typed +* Pure +* Lazy +* Strongly typed +* Green threads +* Native executables +* Garbage collected +* Immutability + +---- + +__It's the combination of these features that makes Haskell unique__ + +* Example: purity + strong typing + functional style leads to: + * Easy to write + * Easy to read + * Easy to modify + * Efficient +* We'll get to this later +* Now: lots of examples! +* Different here is usually better, but some downsides + +--- + +## Async I/O and Concurrency + +What's wrong with this? + +``` +json1 := httpGet(url1) +json2 := httpGet(url2) +useJsonBodies(json1, json2) +``` + +* Hint: it's in the title of this slide +* Ties up an entire system thread on blocking I/O calls +* We want to be more efficient with resources, so... + +---- + +## Callbacks + +``` +httpGetA(url1, |json1| => + httpGetA(url2, |json2| => + useJsonBodies(json1, json2) + ) +) +``` + +* Aka "callback hell" +* Lots of techniques to work around it, e.g. promises/futures +* "Oh, promises form a monad!" Not even going there today :) + +---- + +## Asynchronous Haskell version + +```haskell +json1 <- httpGet url1 +json2 <- httpGet url2 +useJsonBodies json1 json2 +``` + +* But that looks just like the blocking code! Exactly +* Runtime converts to async system calls +* Runtime schedules threads + * Sleeps when waiting for data + * Wake them up when data is available +* Not only Haskell: Erlang and Go do this too + * Therefore.... + +---- + + + +---- + +## Concurrency + +* Why wait for `url1` before starting `url2`? +* Need to fork threads, write to mutable variables, do some locking +* Or be awesome + +```haskell +(json1, json2) <- concurrently + (httpGet url1) + (httpGet url2) +useJsonBodies json1 json2 +``` + +* Cheap green thread implementation +* Wonderful `async` library +* Builds on the async I/O system + +---- + +## Canceling + +* We only want one of the responses +* Take whichever comes first + +``` +promise1 := httpGet(url1) +promise2 := httpGet(url2) +result := newMutex() +promise1.andThen(|json1| => + result.set(json1) + promise2.cancel()) +promise2.andThen(|json2| => + result.set(json2) + promise1.cancel()) +useJsonBody(result.get()) +``` + +---- + +## Canceling in Haskell + +```haskell +eitherJson <- race + (httpGet url1) + (httpGet url2) +case eitherJson of + Left json1 -> useJsonBody1 json1 + Right json2 -> useJsonBody2 json2 +``` + +* More than just a well designed API +* Depends on asynchronous exceptions +* Cancel any other running thread + +---- + +## Not just about I/O + +* Thread scheduling, sleeping, killing works for CPU bound tasks too! +* Don't need to worry about a heavy computation starving other threads +* No need to offload your heavy tasks to a different microservice, do + it all in Haskell + +```haskell +let tenSeconds = 10 * 1000 * 1000 +timeout tenSeconds expensiveComputation +``` + +---- + +## Summary: concurrency and async I/O + +
+ +

Advantages

+ + + +

Disadvantages

+ + + +
+ +--- + +## Immutability and purity + +* Most languages default to mutable values +* Haskell differs in two ways: + * Immutable by default, explicit kind of mutability + * Mutating is an effect, tracked by the type system + +---- + +## Mutable Haskell + +Impossible + +```haskell +let mut total = 0 + loop i = + if i > 1000000 + then total + else total += i; loop (i + 1) + in loop 1 +``` + +Real and tedious + +```haskell +total <- newIORef 0 +let loop i = + if i > 1000000 + then readIORef total + else do + modifyIORef total (+ i) + loop (i + 1) +loop 1 +``` + +---- + +## Better Haskell + +Recursive and immutable, much better! + +```haskell +let loop i total = + if i > 1000000 + then total + else loop (i + 1) (total + i) + in loop 1 0 +``` + +But why does this matter? + +---- + +## Reasoning about code + +Guess the output + +``` +// scores.txt +Alice,32 +Bob,55 +Charlie,22 + +func main() { + results := readResultsFromFile("results.txt") + printScoreRange(results) + print("First result was by: " + results[0].name) +} + +func printScoreRange(results: Vector) { + ... +} +``` + +---- + +## Expected output + +``` +Lowest: 22 +Highest: 55 +First result was by: Alice +``` + +---- + +## What's in printScoreRange? + +``` +func printScoreRange(results: Vector) { + results.sortBy(|result| => result.score) + print("Lowest: " + results[0].score) + print("Highest: " + results[results.len() - 1].score) +} +``` + +Actual output: + +``` +Lowest: 22 +Highest: 55 +First result was by: Charlie +``` + +Non-local changes broke our guessed result + +---- + + + +---- + +## Do it in Haskell + +```haskell +main :: IO () +main = do + results <- readResultsFromFile "results.txt" + printScoreRange results + putStrLn $ "First result was by: " ++ name (head results) + +printScoreRange :: [TestResult] -> IO () +printScoreRange results = do + let results' = sortBy score results + putStrLn $ "Lowest: " ++ show (score (head results')) + putStrLn $ "Highest: " ++ show (score (last results')) +``` + +* Impossible for `printScoreRange` to modify results +* `printScoreRange` sorts into a local copy +* Have to think about less when writing `main` + +---- + +## Data races + +```haskell +main :: IO () +main = do + results <- readResultsFromFile "results.txt" + concurrently_ printFirstResult printScoreRange + +printFirstResult results = + putStrLn $ "First result was by: " ++ name (head results) + +printScoreRange results = do + let results' = sortBy score results + putStrLn $ "Lowest: " ++ show (score (head results')) + putStrLn $ "Highest: " ++ show (score (last results')) +``` + +* Concurrent data accesses? No problem! +* Concurrent data writes? Impossible! +* We'll come back to mutable, multithreaded data + +---- + +## Mutability when needed + +* In place, mutable algorithms can be much faster +* Example: sorting a vector with only pure transformations is __slow__ +* Haskell's answers: + 1. You can still have mutable data structures if you want them, + but they're __explicit__ + 2. Temporary mutable copy, then freeze it + +---- + +## Freeze! + +```haskell +sortMutable :: MutableVector a -> ST (MutableVector a) +sortMutable = ... -- normal sorting algorithm + +sortImmutable :: Vector a -> Vector a +sortImmutable orig = runST $ do + mutable <- newMutableVector (length orig) + copyValues orig mutable + sort mutable + freeze mutable +``` + +* `ST` is for temporary, local mutability +* Cannot be affected by the outside world, and cannot affect it +* Keeps functional guarantee: same input ==> same output + +---- + +## Summary: immutability and purity + +
+ +Advantages + +
    +
  • Easier to reason about code
  • +
  • Avoid many cases of data races
  • +
  • Functions are more reliable, returning the same output for the same input
  • +
+ +Disadvantages + +
    +
  • Lots of ceremony if you actually want mutation
  • +
  • Some runtime performance hit for mutable algorithms
  • +
+ +
+ +---- + +## Concurrent Mutation + +What's wrong with this code? + +``` +runServer (|request| => { + from := accounts.lookup(request.from) + to := accounts.lookup(request.to) + accounts.set(request.from, from - request.amt) + accounts.set(request.to, to + request.amt) +}) +``` + +``` +Thread 1: receive request: Alice gives $25 +Thread 2: receive request: Alice receives $25 +Thread 1: lookup that Alice has $50 +Thread 2: lookup that Alice has $50 +Thread 1: set Alice's account to $25 +Thread 2: set Alice's account to $75 +``` + +---- + +## Locking + +``` +runServer (|request| => { + accounts.lock(request.from) + accounts.lock(request.to) + // same code as before + accounts.unlock(request.from) + accounts.unlock(request.to) +}) +``` + +``` +Thread 1: receive request: $50 from Alice to Bob +Thread 2: receive request: $50 from Bob to Alice +Thread 1: lock Alice +Thread 2: lock Bob +Thread 1: try to lock Bob, but can't, so wait +Thread 2: try to lock Alice, but can't, so wait +``` + +__Deadlock!__ + +---- + +## Software Transactional Memory + +```haskell +runServer $ \request -> atomically $ do + let fromVar = lookup (from request) accounts + toVar = lookup (to request) accounts + origFrom <- readTVar fromVar + writeTVar fromVar (origFrom - amt request) + origTo <- readTVar toVar + writeTVar toVar (origTo + amt request) +``` + +* Looks like it has a race condition +* But STM ensures transactions are atomic +* No explicit locking required +* `TVar` is an example of explicit mutation +* Alternatives: `IORef`, `MVar` + +---- + +## The role of purity + +STM retries if a transaction isn't atomic. How many Bitcoins will I +buy? + +```haskell +atomically $ do + buyBitcoins 3 -- side effects on my bank account + + modifyTVar myBitcoinCount (+ 3) +``` + +* Trick question! Code doesn't compile +* `atomically` only allows side effects on `TVar`s +* Other side effects (like my bank account) are disallowed +* Safe for runtime to retry thanks to purity + +---- + +## Summary of STM + +
+ +

Advantages

+ +
    +
  • Makes concurrent data modification much easier
  • +
  • Bypass many race conditions and deadlocks
  • +
+ +

Disadvantages

+ +
    +
  • Depends on purity to work at all
  • +
  • Not really a disadvantage, you're already stuck with purity in Haskell
  • +
  • Not really any other disadvantages, so just use it!
  • +
+ +
+ +--- + +## Laziness + +*A double edged sword* + +Let's revisit our previous summing example + +```haskell +let loop i total = + if i > 1000000 + then total + else loop (i + 1) (total + i) + in loop 1 0 +``` + +There are two problems with this code: + +1. There's a major performance bug in it +2. It's much more cumbersome than it should be + +---- + +## Space leaks + +Consider `let foo = 1 + 2` + +* `foo` isn't `3`, it's an instruction on how to create `3` +* `foo` is a _thunk_ until it's evaluated +* Storing thunks is more expensive than simple types like `Int`s +* Which values are evaluated in our `loop`? + +```haskell +let loop i total = + if i > 1000000 + then total + else loop (i + 1) (total + i) + in loop 1 0 +``` + +---- + +## Explicit strictness + +Need to tell Haskell compiler to evaluate `total`. Bang! + +```haskell +let loop i !total = + if i > 1000000 + then total + else loop (i + 1) (total + i) + in loop 1 0 +``` + +* Needing to do this is a downside of Haskell +* But do we get any benefits in return? + +---- + +## Looping (1) + +Let's write our `loop` in an imperative language: + +``` +total := 0 +for(i := 1; i <= 1000000; i++) { + total += i +} +``` + +Or just the evens + +``` +total := 0 +for(i := 1; i <= 1000000; i++) { + if (isEven(i)) { + total += i + } +} +``` + +---- + +## Looping (2) + +Now add up the values modulus 13 (for some weird reason) + +``` +total := 0 +for(i := 1; i <= 1000000; i++) { + if (isEven(i)) { + total += i % 13 + } +} +``` + +* Each modification is fine +* Getting harder to see the forest for the trees +* If our logic was more complicated, code reuse would be an issue + +---- + +## Some better Haskell + +Our original recursive implementation sucked + +```haskell +let loop i !total = + if i > 1000000 + then total + else loop (i + 1) (total + i) + in loop 1 0 +``` + +But this is great + +```haskell +sum [1..1000000] +``` + +* Doesn't is allocate 8mb of ints? +* Nope, laziness! +* Just a thunk telling us how to get the rest of the list + +---- + +## Composable Haskell + +Just the evens? + +```haskell +sum (filter even [1..1000000]) +``` + +Modulus 13? + +```haskell +sum (map (`mod` 13) (filter even [1..1000000])) +``` + +* Easy and natural to compose functions in a lazy context +* Avoids doing unnecessary work or using too much memory + +---- + +## Short circuiting for free + +In most languages, `&&` and `||` short circuit + +``` +foo() && bar() +``` + +* `bar` only called if `foo` returns `true` + +In Haskell: we get that for free from laziness: + +```haskell +False && _ = False +True && x = x + +True || _ = True +False || x = x +``` + +See also: `and`, `or`, `all`, `any` + +---- + +### Other downsides + +* Laziness means an exception can be hiding in any thunk +* Aka partial functions: `head []` +* Also, some inefficient functions still available, `foldl` vs + `foldl'` + +---- + +## Summary of laziness + +
+ +

Advantages

+ +
    +
  • More composable code
  • +
  • Get efficient results with high level code
  • +
  • Short-circuiting no longer a special case
  • +
+ +

Disadvantages

+ +
    +
  • Need to worry about space leaks
  • +
  • Exceptions can be hiding in many places
  • +
  • Bad functions still linger
  • +
+ +
+ +---- + +## Side note; other languages + +* Laziness is very similar to features in other languages +* Python generators +* Rust iterators +* In Haskell, it's far more prevalent since it affects how all code works +* However, you can get a lot of the benefits of Haskell with these techniques + +--- + +## Other examples + +* Too much to talk about in 40 minutes! +* Two other topics I wanted to touch on +* Feel free to ask me about these at breaks + +---- + +## Parser (and other) DSLs + +* Operator overloading! +* Abstractions `Alternative` a natural fit + * `parseXMLElement <|> parseXMLText`. +* Able to reuse huge number of existing library functions, + e.g. `optional`, `many` +* General purpose `do`-notation is great + +---- + +## Parser example + +```haskell +data Time = Time Hour Minutes Seconds (Maybe AmPm) +data AmPm = Am | Pm + +parseAmPm :: Parser Time +parseAmPm = Time + <$> decimal + <*> (":" *> decimal) + <*> (":" *> decimal) + <*> optional (("AM" $> Am) <|> ("PM" $> Pm)) +``` + +c/o [@queertypes](https://twitter.com/queertypes/status/941064338848100352) + +---- + +## Advanced techniques + +* Free monads +* Monad transformer stacks +* Lens, conduit, pipes, ... +* Lots of ways to do things in Haskell! +* It's a plus and a minus +* Recommendation: choose a useful subset of Haskell and its libraries, + and define some best practices + +--- + +## Conclusion + +* Haskell combines a lot of uncommon features +* Very few of those features are unique +* Combining those features allows you to write code very differently + than in other languages +* If you want readable, robust, easy to maintain code: I think it's a + great choice +* Be aware of the sharp edges: they do exist! + +---- + +## Questions? + +Thanks everyone!