% Haskell for Shell Scripting
% Gabriel Gonzalez
% September 3, 2015
Before class
If you haven't installed ghc, visit:
... and install a GHC distribution appropriate to your operating system.
To test your Haskell installation, run these commands from a terminal:
$ echo 'main = putStrLn "Hello, world!"' > hello.hs
$ runhaskell hello.hs
Hello, world!
Install the shell scripting library using these commands:
$ cabal update
$ cabal install turtle-1.2.1
Outline
- Haskell overview
- Subroutines
- Types
- Use ghci as a shell
- Type signatures
- String formatting
- Streams
- Pipes
- Folds
- Patterns
I'm hosting slides on GitHub so that people can follow along locally
Overview of Haskell
Haskell is a purely functional language with strong and static types
- Purely functional means side effect order is not tied to evaluation order
- Strong types are fine-grained (e.g. FilePath/Time/Name vs String)
- Static types catch errors at compile time
Haskell can be either interpreted or compiled to a native binary
Haskell is a managed language, providing garbage collection, concurrency, and transactional shared memory:
- Garbage collection is efficient (throughput measured in GB / s)
- Concurrency uses green-threads and is efficient (can handle 1M threads)
- Transactional memory simplifies race-free concurrent code
Big disadvantages of Haskell
- No JVM backend
- Beginners can't easily reason about performance
- Built-in record syntax is clumsy
- Most language features are libraries, which hampers discoverability
- Culture of abstraction astronauts (myself included)
Comparing Haskell to Scala
Similarities:
- Static types
- Strong types
- Functional
- Automatic memory management
Differences:
- Haskell is not object-oriented
- Haskell is not a JVM language
- Haskell has a faster startup time (10 ms compiled, < 1 second interpreted)
- Haskell compiles to native code
Comparing Haskell to Python
Similarities:
- Lightweight syntax
- Significant whitespace (with optional curly braces)
- Procedural
- Automatic memory management
Differences:
- Haskell is statically typed (unless you enable -fdefer-type-errors)
- Haskell is strongly typed
- Haskell compiler/interpreter not pre-installed on most Unix-like systems
- Haskell compiles to native code
Why use Haskell for shell scripting?
Haskell has light-weight syntax and fast start-up times
Haskell code is easy to refactor and maintain
Hello, world!
Save this to example.hs:
#!/usr/bin/env runhaskell
-- #!/bin/bash
{-# LANGUAGE OverloadedStrings #-} --
--
import Turtle --
--
main = echo "Hello, world!" -- echo Hello, world!
... then run the example script:
$ chmod u+x example.hs
$ ./example.hs
Hello, world!
Create a native binary
$ ghc -O2 example.hs
$ ./example
Hello, world!
Use Haskell interactively
$ ghci -v0
Prelude> :set -XOverloadedStrings
Prelude> import Turtle
Prelude Turtle> echo "Hello, world!"
Hello, world!
Prelude Turtle> 2 + 2
4
Prelude Turtle> let f x = x + x
Prelude Turtle> f 2
4
Prelude Turtle> :quit
Load code into the REPL
$ ghci -v0 example.hs
*Main> main
Hello, world!
*Main> :quit
Exercise
What do you think this code does?
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
say = echo
main = say "Hello, world!"
Questions?
Values
#!/usr/bin/env runhaskell
-- #!/bin/bash
{-# LANGUAGE OverloadedStrings #-} --
--
import Turtle --
--
str = "Hello, world!" -- STR='Hello, world!'
--
main = echo str -- echo $STR
$ ./example.hs
Hello, world!
str is immutable (analogous to Scala's val)
Why do you think Haskell defaults to immutability?
Order of definitions does not matter
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = echo str
str = "Hello, world!"
You need main
Modify your program to eliminate main:
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
echo "Hello, world!"
You will get this error message if you run the program:
example.hs:7:1: Parse error: naked expression at top level
The top level of a Haskell program is declarative and only allows definitions
You cannot execute code at the top level
The runtime only executes main!
Subroutines
Use do to create a subroutine that runs more than one command:
Using significant whitespace:
#!/usr/bin/env runhaskell
-- #!/bin/bash
{-# LANGUAGE OverloadedStrings #-} --
--
import Turtle --
--
main = do                           --
    echo "Line 1"                   -- echo Line 1
    echo "Line 2"                   -- echo Line 2
$ ./example.hs
Line 1
Line 2
You can opt out of significant whitespace
main = do
    { echo "Line 1"
    ; echo "Line 2"
    }

main = do {
    echo "Line 1";
    echo "Line 2";
    }

main = do { echo "Line 1"; echo "Line 2" }
Storing results
#!/usr/bin/env runhaskell
-- #!/bin/bash
import Turtle --
--
main = do                           --
    dir  <- pwd                     -- DIR=$(pwd)
    time <- datefile dir            -- TIME=$(date -r $DIR)
    print time                      -- echo $TIME
$ ./example.hs
2015-09-01 23:56:03.245 UTC
Why not this?
main = print(datefile(pwd))
Difference between (=) and (<-)
(<-) is overloaded; in this context it means "store the subroutine's result"
(=) is not overloaded; equating two things means they are interchangeable
Example of overloading (<-):
Prelude> do { x <- [1, 2]; y <- [3, 4]; return (x, y) }
[(1,3),(1,4),(2,3),(2,4)]
do/(<-)/return is analogous to for/(<-)/yield in Scala:
scala> for { x <- Seq(1, 2); y <- Seq(3, 4) } yield (x, y)
res0: Seq[(Int, Int)] = List((1,3), (1,4), (2,3), (2,4))
... or LINQ's from/select in C#:
List<int> xs = new List<int> { 1, 2 };
List<int> ys = new List<int> { 3, 4 };
var result =
    from x in xs
    from y in ys
    select new Tuple<int, int>(x, y);
Nesting subroutines
#!/usr/bin/env runhaskell
-- #!/bin/bash
import Turtle --
--
datePwd = do                        -- datePwd() {
    dir    <- pwd                   --     DIR=$(pwd)
    result <- datefile dir          --     RESULT=$(date -r $DIR)
    return result                   --     echo $RESULT
                                    -- }
main = do                           --
    time <- datePwd                 -- TIME=$(datePwd)
    print time                      -- echo $TIME
Same result:
$ ./example.hs
2015-09-01 23:56:03.245 UTC
Unnecessary return
You can simplify this:
datePwd = do                        -- datePwd() {
    dir    <- pwd                   --     DIR=$(pwd)
    result <- datefile dir          --     RESULT=$(date -r $DIR)
    return result                   --     echo $RESULT
                                    -- }
... to this:
datePwd = do                        -- datePwd() {
    dir <- pwd                      --     DIR=$(pwd)
    datefile dir                    --     date -r $DIR
                                    -- }
The return value of a subroutine is the return value of its last command
return
return does not break from the surrounding subroutine
return is just a command whose return value is its argument
do x <- return expr                 -- X=EXPR
   command x                        -- command $X

-- Same as:
do let x = expr                     -- X=EXPR
   command x                        -- command $X

-- Same as:
command expr                        -- command EXPR
return is the only case where (<-) and (=) behave the same way
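The equivalence above can be tried out with a small sketch that needs only the Prelude. The names viaReturn, viaLet, direct, and notAnEarlyExit are made up for this illustration and do not come from the slides or from Turtle.

```haskell
-- Three equivalent ways to feed an expression to a later command.
-- Only the Prelude is used; every name here is illustrative.

viaReturn :: IO Int
viaReturn = do
    x <- return (2 + 2)   -- `return` hands its argument straight back
    return (x * 10)

viaLet :: IO Int
viaLet = do
    let x = 2 + 2         -- `let` names the expression directly
    return (x * 10)

direct :: IO Int
direct = return ((2 + 2) * 10)

-- `return` does not exit the block: the later line still runs
-- and decides the result of the whole subroutine.
notAnEarlyExit :: IO Int
notAnEarlyExit = do
    _ <- return (1 :: Int)
    return 2

main :: IO ()
main = do
    a <- viaReturn
    b <- viaLet
    c <- direct
    print (a, b, c)       -- prints (40,40,40)
    d <- notAnEarlyExit
    print d               -- prints 2
```

All three subroutines produce the same value, and notAnEarlyExit yields 2 rather than 1 because the result of a do block is the result of its last command.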
Single-command subroutines
main = do echo "Hello, world!"
-- Same as:
main = echo "Hello, world!"
do is only necessary if you want to chain multiple commands together
Exercise
What do you think this code does?
main = do
    let x = print 1
    print 2
Questions?
Types
What happens if we use echo instead of print?
#!/usr/bin/env runhaskell
import Turtle
main = do
    dir  <- pwd
    time <- datefile dir
    echo time -- This used to be: print time
$ ./example.hs
example.hs:8:10:
Couldn't match expected type `Text' with actual type `UTCTime'
In the first argument of `echo', namely `time'
In a stmt of a 'do' block: echo time
In the expression:
do { dir <- pwd;
time <- datefile dir;
echo time }
Type-directed development - REPL
main = do
    dir  <- pwd
    time <- datefile dir
    echo time -- This used to be: print time
$ ghci -v0
Prelude> import Turtle
Prelude Turtle> :type pwd
pwd :: IO Turtle.FilePath
Prelude Turtle> :type datefile
datefile :: Turtle.FilePath -> IO UTCTime
Prelude Turtle> :type echo
echo :: Text -> IO ()
Prelude Turtle> :type print
print :: Show a => a -> IO ()
Type-directed development - Documentation
Visit:
https://hackage.haskell.org/package/turtle
repr
Use repr to render a human-readable representation of a value as Text:
-- This behaves like Python's `repr` function
repr :: Show a => a -> Text
print is (conceptually) the same as echo + repr:
print x = echo (repr x)
Basic types
Int
Double
Text
(a, b)
[a]
a -> b
IO a
FilePath
ExitCode
UTCTime
Exercise
What are the types of x, y, and z?
(Assume all string literals are Text and all numeric literals are Ints)
x = ("123", 4)
y = [2, 3]
z a = 1 + a
Answers
x :: (Text, Int)
x = ("123", 4)
y :: [Int]
y = [2, 3]
z :: Int -> Int
z a = 1 + a
Questions?
Customize ghci
Create a .ghci file in your current directory that looks like this:
:set -XOverloadedStrings
import Turtle
This automatically runs the above two commands every time you run ghci
ghci searches the current directory and your home directory for a .ghci file
Use ghci like a shell
$ ghci -v0
Prelude Turtle> view (ls ".")
FilePath "/Users/ggonzalez/.bash_history"
FilePath "/Users/ggonzalez/.bash_profile"
FilePath "/Users/ggonzalez/.bashrc"
...
FilePath "/Users/ggonzalez/workspace"
Prelude Turtle> cd "/tmp"
Prelude Turtle> pwd
FilePath "/private/tmp"
Prelude Turtle> touch "foo.txt"
Prelude Turtle> testfile "foo.txt"
True
Prelude Turtle> rm "foo.txt"
Prelude Turtle> testfile "foo.txt"
False
Prelude Turtle> test<TAB>
testdir testfile
Prelude Turtle> testdir "/tmp/<TAB>
.vbox-ggonzalez-ipc
KSOutOfProcessFetcher.0.r55jifrBu08ZlGAfPLYXKgYad4c=
launch-0kuyez
...
sync-dottools.stdout.log
ghci auto-print
ghci implicitly prints any value that is not a subroutine
Prelude Turtle> 2 + 2
4
Prelude Turtle> "123" <> "456" -- (<>) concatenates strings
"123456"
The behavior is the same as if we had explicitly called print:
Prelude Turtle> print (2 + 2)
4
Prelude Turtle> print ("123" <> "456")
"123456"
Shell commands
Prelude Turtle> shell "true" empty
ExitSuccess
Prelude Turtle> shell "false" empty
ExitFailure 1
Prelude Turtle> shell "ls | wc -l" empty
5
ExitSuccess
Use proc if you want safer command templating:
Prelude Turtle> -- ls /tmp /usr
Prelude Turtle> proc "ls" ["/tmp", "/usr"] empty
/tmp:
KSOutOfProcessFetcher.0.r55jifrBu08ZlGAfPLYXKgYad4c=
...
/usr:
X11 bin lib local share
X11R6 include libexec sbin standalone
ExitSuccess
Exercise
Within ghci:
- Create a directory named dir1
- Rename dir1 to dir2
- Delete dir2
Answers
Prelude Turtle> mkdir "dir1"
Prelude Turtle> mv "dir1" "dir2"
Prelude Turtle> rmdir "dir2"
Questions?
Type signatures
#!/usr/bin/env runhaskell
import Turtle
--         +----- A subroutine ...
--         |
--         |  +-- ... that returns `UTCTime`
--         |  |
--         v  v
datePwd :: IO UTCTime
datePwd = do
    dir <- pwd
    datefile dir

--      +----- A subroutine ...
--      |
--      |  +-- ... that returns an empty value (i.e. `()`)
--      |  |
--      v  v
main :: IO ()
main = do
    time <- datePwd
    print time
Machine-checked documentation
str :: Int -- Oops!
str = "Hello, world!"
main :: IO ()
main = echo str
$ ./example.hs
example.hs:8:7:
No instance for (IsString Int)
arising from the literal `"Hello, world!"'
Possible fix: add an instance declaration for (IsString Int)
In the expression: "Hello, world!"
In an equation for `str': str = "Hello, world!"
example.hs:11:13:
Couldn't match expected type `Text' with actual type `Int'
In the first argument of `echo', namely `str'
In the expression: echo str
In an equation for `main': main = echo str
OverloadedStrings
Anything that implements IsString can be represented by a string literal
Examples we've seen so far:
- FilePath
- Text
- ???
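To make "implements IsString" concrete, here is a minimal sketch that uses only the base library. The UserName type and the greet function are hypothetical names invented for this example; they are not part of Turtle.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- A hypothetical wrapper type, introduced only for this example.
newtype UserName = UserName String
    deriving (Show, Eq)

-- Implementing IsString tells the compiler how to turn a string
-- literal into a UserName when OverloadedStrings is enabled.
instance IsString UserName where
    fromString = UserName

greet :: UserName -> String
greet (UserName n) = "Hello, " ++ n ++ "!"

main :: IO ()
main = putStrLn (greet "gabriel")  -- the literal is read as a UserName
```

Because greet expects a UserName, the compiler picks the UserName instance for the literal "gabriel"; the same mechanism lets one literal stand for a Text or a FilePath depending on context.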
Reverse the error
str :: Text
str = 4
main :: IO ()
main = echo str
$ ./example.hs
example.hs:8:7:
No instance for (Num Text)
arising from the literal `4'
Possible fix: add an instance declaration for (Num Text)
In the expression: 4
In an equation for `str': str = 4
Num
Anything that implements Num can be represented by a numeric literal
Examples we've seen so far:
- Int
- Double
- ???
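The same idea applies to numeric literals. The sketch below, using only the Prelude, defines a hypothetical Score type (not from the slides) and gives it a Num instance so that plain integer literals can be read as Score values.

```haskell
-- A hypothetical numeric wrapper, introduced only for this example.
newtype Score = Score Integer
    deriving (Show, Eq)

-- `fromInteger` is the method that gives integer literals their meaning.
instance Num Score where
    Score a + Score b = Score (a + b)
    Score a * Score b = Score (a * b)
    negate (Score a)  = Score (negate a)
    abs    (Score a)  = Score (abs a)
    signum (Score a)  = Score (signum a)
    fromInteger n     = Score n

total :: Score
total = 250 + 199  -- both literals are read as Score values

main :: IO ()
main = print total  -- prints: Score 449
```

The type signature on total is what steers the literals toward Score; without it they would default to Integer.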
Types clarify documentation
shell
    :: Text         -- Command line
    -> Shell Text   -- Standard input (as lines of `Text`)
    -> IO ExitCode  -- Exit code of the shell command

proc
    :: Text         -- Program
    -> [Text]       -- Arguments
    -> Shell Text   -- Standard input (as lines of `Text`)
    -> IO ExitCode  -- Exit code of the shell command
Type inference
Haskell (almost always) does not require type annotations
Type signatures are for the benefit of the programmer, not the compiler
Example:
Prelude Turtle> let addAsText x y = repr (x + y)
Prelude Turtle> :type addAsText
addAsText :: (Show a, Num a) => a -> a -> Text
Prelude Turtle> addAsText 2 3
"5"
No need to annotate argument types
No need to specify interfaces
No need to specify generic type parameters
Exercise
Use the compiler to infer the type of this function:
swap (x, y) = (y, x)
Answer
Prelude Turtle> :type swap
swap :: (t1, t) -> (t, t1)
Questions?
Exit codes
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
import Turtle
main = do
    let cmd = "false"
    x <- shell cmd empty
    case x of
        ExitSuccess   -> return ()
        ExitFailure n -> die (cmd <> " failed with exit code: " <> repr n)
This always prints an error message since false always fails:
$ ./example.hs
example.hs: user error (false failed with exit code: 1)
String formatting
We can replace this:
cmd <> " failed with exit code: " <> repr n
... with printf-style formatting:
format (s%" failed with exit code: "%d) cmd n
The compiler infers the number and types of arguments from the format string:
Prelude Turtle> :type format (s%" failed with exit code: "%d)
format (s%" failed with exit code: "%d) :: Text -> Int -> Text
Exercise
What do you think this prints out?
Prelude Turtle> format ("A "%s%" string that takes "%d%" arguments") "format" 2
The Format type
A format string is not Text!
Prelude Turtle> :type format
format :: Format Text r -> r
So what is going on here?
Prelude Turtle> format "I take 0 arguments"
"I take 0 arguments"
Format implements IsString
(%) :: Format b c -> Format a b -> Format a c
"A " :: Format a a
s :: Format a (Text -> a)
" string that takes " :: Format a a
d :: Format a (Int -> a)
" arguments" :: Format a a
"A "%s%" string that takes "%d%" arguments" :: Format a (Text -> Int -> a)
format ("A "%s%" string that takes "%d%" arguments") :: Text -> Int -> Text
You can build your own format specifiers!
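As one possible illustration, the sketch below builds a specifier with makeFormat from Turtle.Format, which wraps a function from a value to Text. The names yesNo and report are hypothetical, and Data.Text.IO.putStrLn is used for output only so the example does not depend on which argument type echo takes in a given turtle version.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle
import qualified Data.Text.IO as Text

-- A hypothetical specifier that renders a Bool as "yes" or "no".
yesNo :: Format r (Bool -> r)
yesNo = makeFormat (\b -> if b then "yes" else "no")

-- The format string now expects a Text followed by a Bool.
report :: Text -> Bool -> Text
report = format ("File "%s%" exists: "%yesNo)

main :: IO ()
main = Text.putStrLn (report "foo.txt" True)
-- prints: File foo.txt exists: yes
```

The custom specifier composes with (%) exactly like the built-in s and d, so the compiler still infers the number and types of arguments from the format string.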
OverloadedStrings
Examples we've seen so far:
- FilePath
- Text
- Format
- ???
Questions?
Streams
You've already encountered at least one stream: the ls command
Prelude Turtle> :type ls
ls :: Turtle.FilePath -> Shell Turtle.FilePath
A "Shell a" is a stream of "a"s
Streams are not subroutines, so you can't run them directly within ghci:
Prelude Turtle> ls "/tmp"
<interactive>:2:1:
No instance for (Show (Shell Turtle.FilePath))
arising from a use of `print'
Possible fix:
add an instance declaration for (Show (Shell Turtle.FilePath))
In a stmt of an interactive GHCi command: print it
ghci tries to print the Shell stream, but fails because Shell does not implement Show
view
The view command is the simplest way to display a Shell stream:
view :: Show a => Shell a -> IO ()
view prints every element of the stream:
Prelude Turtle> view (ls "/tmp")
FilePath "/tmp/.X11-unix"
FilePath "/tmp/.X0-lock"
FilePath "/tmp/pulse-PKdhtXMmr18n"
FilePath "/tmp/pulse-xHYcZ3zmN3Fv"
FilePath "/tmp/tracker-gabriel"
FilePath "/tmp/pulse-PYi1hSlWgNj2"
FilePath "/tmp/orbit-gabriel"
FilePath "/tmp/ssh-vREYGbWGpiCa"
FilePath "/tmp/.ICE-unix"
The empty stream
empty :: Shell a
The empty stream emits nothing:
Prelude Turtle> view empty -- Outputs nothing
Prelude Turtle>
In other words:
view empty = return ()
The singleton stream
return :: a -> Shell a
return builds a singleton stream that emits exactly one element:
1 :: Int
return 1 :: Shell Int
Prelude Turtle> view (return 1)
1
In other words:
view (return x) = print x
Embedding subroutines
liftIO :: IO a -> Shell a
liftIO transforms a subroutine into a singleton stream:
pwd :: IO Turtle.FilePath
liftIO pwd :: Shell Turtle.FilePath
Prelude Turtle> view (liftIO pwd)
FilePath "/tmp"
In other words:
view (liftIO io) = do x <- io
                      print x
Concatenate streams
(<|>) :: Shell a -> Shell a -> Shell a
(<|>) concatenates two streams together to build a new stream:
Prelude Turtle> view (return 1 <|> return 2)
1
2
In other words:
view (xs <|> ys) = do view xs
                      view ys
A more complex Shell stream
Prelude Turtle> view (ls "/tmp" <|> liftIO home <|> ls "/usr" <|> return "/lib")
FilePath "/tmp/.X11-unix"
FilePath "/tmp/.X0-lock"
FilePath "/tmp/pulse-PKdhtXMmr18n"
FilePath "/tmp/pulse-xHYcZ3zmN3Fv"
FilePath "/tmp/tracker-gabriel"
FilePath "/tmp/pulse-PYi1hSlWgNj2"
FilePath "/tmp/orbit-gabriel"
FilePath "/tmp/ssh-vREYGbWGpiCa"
FilePath "/tmp/.ICE-unix"
FilePath "/Users/ggonzalez"
FilePath "/usr/lib"
FilePath "/usr/src"
FilePath "/usr/sbin"
FilePath "/usr/include"
FilePath "/usr/share"
FilePath "/usr/games"
FilePath "/usr/local"
FilePath "/usr/bin"
FilePath "/lib"
Reasoning about streams
view (ls "/tmp" <|> liftIO home <|> ls "/usr" <|> return "/lib")
... is the same as:
do view (ls "/tmp")
   dir <- home
   print dir
   view (ls "/usr")
   print "/lib"
Shell implements IsString
Prelude Turtle> view "123"
"123"
Prelude Turtle> view (return "123") -- Same thing
"123"
Prelude Turtle> view ("123" <|> "456")
"123"
"456"
Prelude Turtle> view (return "123" <|> return "456") -- Same thing
"123"
"456"
OverloadedStrings
Examples seen so far:
- FilePath
- Text
- Format
- Shell
- ???
select
You can build a Shell stream from a list:
select :: [a] -> Shell a
Example:
Prelude Turtle> view (select [1, 2, 3])
1
2
3
Loops
We can use select to loop within a Shell:
#!/usr/bin/env runhaskell
-- #!/bin/bash
{-# LANGUAGE OverloadedStrings #-} --
--
import Turtle --
--
example :: Shell ()                 --
example = do                        --
    x <- select [1, 2]              -- for x in 1 2; do
    y <- select [3, 4]              --     for y in 3 4; do
    liftIO (print (x, y))           --         echo \(${x},${y}\);
                                    --     done;
main = sh example                   -- done
This prints every combination of x and y:
$ ./example.hs
(1,3)
(1,4)
(2,3)
(2,4)
The sh utility
sh is like view, except that it doesn't print any elements:
view :: Show a => Shell a -> IO ()
sh :: Shell a -> IO ()
Looping over arbitrary Shells
You can loop over things other than select:
Prelude Turtle> -- for file in /tmp/*; do echo $file; done
Prelude Turtle> sh (do file <- ls "/tmp"; liftIO (print file))
FilePath "/tmp/.X11-unix"
FilePath "/tmp/.X0-lock"
FilePath "/tmp/pulse-PKdhtXMmr18n"
FilePath "/tmp/pulse-xHYcZ3zmN3Fv"
FilePath "/tmp/tracker-gabriel"
FilePath "/tmp/pulse-PYi1hSlWgNj2"
FilePath "/tmp/orbit-gabriel"
FilePath "/tmp/ssh-vREYGbWGpiCa"
FilePath "/tmp/.ICE-unix"
In fact, that is how view is implemented:
view :: Show a => Shell a -> IO ()
view s = sh (do { x <- s; liftIO (print x) })
Questions?
stdout
stdout :: Shell Text -> IO ()
stdout s = sh (do
    txt <- s
    liftIO (echo txt) )
Standard out writes each Text element of the stream to a separate line:
Prelude Turtle> stdout "Line 1"
Line 1
Prelude Turtle> stdout ("Line 1" <|> "Line 2")
Line 1
Line 2
stdin
stdin :: Shell Text
stdin streams lines from standard input:
#!/usr/bin/env runhaskell
-- #!/bin/bash
{-# LANGUAGE OverloadedStrings #-} --
--
import Turtle --
--
main = stdout stdin -- cat
stdin keeps producing lines until hitting EOF:
$ ./example.hs
ABC<Enter>
ABC
Test<Enter>
Test
42<Enter>
42
<Ctrl-D>
(&)
If you prefer to read left-to-right, you can use the infix (&) operator:
(&) :: a -> (a -> b) -> b
x & f = f x
main = stdin & stdout
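The effect of (&) is not specific to streams. The following sketch uses the (&) exported by Data.Function in the base library (available since GHC 7.10), which has the same definition as the one shown above; the names nested and piped are made up for this example.

```haskell
import Data.Char (toUpper)
import Data.Function ((&))

-- Right-to-left: the innermost call runs first.
nested :: String
nested = unwords (map (map toUpper) (words "read left to right"))

-- Left-to-right: each stage feeds the next, like a shell pipeline.
piped :: String
piped =
    "read left to right"
        & words
        & map (map toUpper)
        & unwords

main :: IO ()
main = do
    putStrLn nested  -- prints: READ LEFT TO RIGHT
    putStrLn piped   -- prints: READ LEFT TO RIGHT
```

Both definitions compute the same value; (&) only changes the reading order so that data flows in the direction a Unix pipeline would.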
input and output
input :: FilePath -> Shell Text
output :: FilePath -> Shell Text -> IO ()
Run these examples:
Prelude Turtle> output "file.txt" ("Test" <|> "ABC" <|> "42")
Prelude Turtle> stdout (input "file.txt")
Test
ABC
42
Or left-to-right:
Prelude Turtle> "Test" <|> "ABC" <|> "42" & output "file.txt"
Prelude Turtle> input "file.txt" & stdout
Test
ABC
42
inshell
inshell
    :: Text        -- Command line
    -> Shell Text  -- Standard input to feed to program
    -> Shell Text  -- Standard output produced by program
Prelude Turtle> output "ls.txt" (inshell "ls" empty)
Prelude Turtle> stdout (input "ls.txt")
.X11-unix
.X0-lock
...
.ICE-unix
Prelude Turtle> output "awk.txt" (inshell "awk '{ print $1 }'" "123 456")
Prelude Turtle> stdout (input "awk.txt")
123
inshell (Left-to-right)
Prelude Turtle> "123 456" & inshell "awk '{ print $1 }'" & output "awk.txt"
Prelude Turtle> input "awk.txt" & stdout
123
inproc
inproc
    :: Text        -- Program
    -> [Text]      -- Arguments
    -> Shell Text  -- Standard input to feed to program
    -> Shell Text  -- Standard output produced by program
Prelude Turtle> stdout (inproc "awk" ["{ print $1 }"] "123 456")
123
Exercise
Build the following pipeline within the REPL:
- Use input to read in example.hs
- Use inshell/inproc to number the lines with the Unix nl utility
- Use output to write the result to numbered.txt
The result should be equivalent to this Unix command:
$ nl < example.hs > numbered.txt
Answer
Prelude Turtle> input "example.hs" & inproc "nl" [] & output "numbered.txt"
Questions?
Folds
Use a Fold to reduce the stream to a single value:
Prelude Turtle> import qualified Control.Foldl as Fold
Prelude Turtle Fold> fold (ls "/tmp") Fold.length
9
Prelude Turtle Fold> fold (ls "/tmp") Fold.head
Just (FilePath "/tmp/.X11-unix")
You can combine folds:
Prelude Turtle Fold> let minMax = (,) <$> Fold.minimum <*> Fold.maximum
Prelude Turtle Fold> fold (select [1..10]) minMax
(Just 1,Just 10)
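Combining folds this way can also be tried on plain lists, without a Shell stream. The sketch below assumes the foldl package (the Control.Foldl module imported above) and runs folds with its own Fold.fold; the names average and summary are invented for this example.

```haskell
import qualified Control.Foldl as Fold

-- Combine two folds into one that computes the mean in a single pass.
average :: Fold.Fold Double Double
average = (/) <$> Fold.sum <*> Fold.genericLength

-- Combine three folds, including the combined one above, into a tuple.
summary :: Fold.Fold Double (Maybe Double, Maybe Double, Double)
summary = (,,) <$> Fold.minimum <*> Fold.maximum <*> average

main :: IO ()
main = do
    print (Fold.fold average [1, 2, 3, 4])  -- prints: 2.5
    print (Fold.fold summary [1, 2, 3, 4])  -- prints: (Just 1.0,Just 4.0,2.5)
```

A combined fold is itself a Fold, so it can be reused as a building block, and the input is still traversed only once.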
Exercise
What are the types of:
- fold
- Fold.length
- Fold.head
Answer
fold :: Shell a -> Fold a b -> IO b
Fold.length :: Fold a Int
Fold.head :: Fold a (Maybe a)
ls "/tmp" :: Shell Turtle.FilePath
fold :: Shell a -> Fold a b -> IO b
fold (ls "/tmp") :: Fold Turtle.FilePath b -> IO b
fold (ls "/tmp") Fold.length :: IO Int
Fold implements Num
>>> fold (select [1..10]) Fold.sum
55
>>> fold (select [1..10]) (1 + 2 * Fold.sum)
111
>>> fold (select [1..10]) (Fold.length + Fold.sum)
65
>>> fold (select [1..10]) 5
5
Examples so far:
- Int
- Double
- Fold
Questions?
Patterns
You can transform streams using Unix-like utilities, like grep:
Prelude Turtle> stdout (input "file.txt")
Test
ABC
42
Prelude Turtle> stdout (grep "ABC" (input "file.txt"))
ABC
However, the first argument of grep is not a string!
grep :: Pattern a -> Shell Text -> Shell Text
grep matches against a Pattern, which implements IsString
Comparison to regular expressions
Here is how to translate regular expression idioms to patterns:
Regex      Pattern
=========  ===================
"string"   "string"
.          dot
e1 e2      e1 <> e2
e1 | e2    e1 <|> e2
e*         star e
e+         plus e
e*?        selfless (star e)
e+?        selfless (plus e)
e{n}       count n e
e?         option e
[xyz]      oneOf "xyz"
[^xyz]     noneOf "xyz"
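As a small worked translation, the sketch below turns the regular expression v[0-9]+\.[0-9]+ into a Pattern using rows from the table and checks it with Turtle's match function. The names version and isVersion are hypothetical and exist only for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle

-- Regex: v[0-9]+\.[0-9]+
-- Literals stay literals, + becomes `plus`, juxtaposition becomes (<>).
version :: Pattern Text
version = "v" <> plus digit <> "." <> plus digit

-- `match` returns every way the pattern matches the whole string,
-- so a non-empty result means the string is accepted.
isVersion :: Text -> Bool
isVersion txt = not (null (match version txt))

main :: IO ()
main = do
    print (isVersion "v1.2")  -- prints: True
    print (isVersion "1.2")   -- prints: False
```

No escaping is needed for the dot: inside a string literal it is an ordinary character, and the wildcard is the separate dot pattern.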
Pattern examples
Prelude Turtle> -- grep '^[[:digit:]]\+$' file.txt
Prelude Turtle> stdout (grep (plus digit) (input "file.txt"))
42
Prelude Turtle> -- grep '^[[:digit:]]\+\|Test$' file.txt
Prelude Turtle> stdout (grep (plus digit <|> "Test") (input "file.txt"))
Test
42
Patterns match the entire string by default
To match the interior of the string, use has:
Prelude Turtle> -- grep B file.txt
Prelude Turtle> stdout (grep (has "B") (input "file.txt"))
ABC
prefix and suffix match the beginning or end of a string, respectively:
Prelude Turtle> -- grep '^A' file.txt
Prelude Turtle> stdout (grep (prefix "A") (input "file.txt"))
ABC
Prelude Turtle> -- grep 'C$' file.txt
Prelude Turtle> stdout (grep (suffix "C") (input "file.txt"))
ABC
match
match :: Pattern a -> Text -> [a]
Prelude Turtle> match ("can" <|> "cat") "cat"
["cat"]
Prelude Turtle> match ("can" <|> "cat") "dog"
[]
Prelude Turtle> match (decimal `sepBy` ",") "1,2,3"
[[1,2,3]]
Prelude Turtle> match (prefix (decimal `sepBy` ",")) "1,2,3"
[[1,2,3],[1,2],[1],[]]
Patterns can do more than regular expressions
bit :: Pattern Bool
bit = (do { "0"; return False }) <|> (do { "1"; return True })
portableBitMap :: Pattern [[Bool]]
portableBitMap = do
    "P1"
    spaces1
    width  <- decimal
    spaces1
    height <- decimal
    -- `height` rows, each holding `width` bits
    count height (count width (do { spaces1; bit }))
Prelude Turtle> match (prefix portableBitMap) "P1\n2 2\n0 0\n1 0\n"
[[[False,False],[True,False]]]
P1
2 2
0 0
1 0
Real parsing example
{-# LANGUAGE OverloadedStrings #-}
import Turtle
import Data.Time
entry :: Text
entry = "2015-03-27 10:25:40+0000 [-] 10.45.209.121 ..."
pattern = do
    year   <- decimal
    "-"
    month  <- decimal
    "-"
    day    <- decimal
    " "
    hour   <- decimal
    ":"
    minute <- decimal
    ":"
    second <- decimal
    let d = fromGregorian year month day
    let t = TimeOfDay hour minute second
    return (d, t)
Patterns are typed
$ ghci -v0 pattern.hs
*Main Turtle> :type pattern
pattern :: Pattern (Day, TimeOfDay)
*Main Turtle> match (prefix pattern) entry
[(2015-03-27,10:25:40),(2015-03-27,10:25:04)]
Exercise
Create a pattern that parses two integers stored in a string representation of a tuple:
tuple :: Pattern (Int, Int)
tuple = ???
Such that you get this result when you use it:
>>> match tuple "(3,4)"
[(3,4)]
Answer
tuple :: Pattern (Int, Int)
tuple = do
    "("
    x <- decimal
    ","
    y <- decimal
    ")"
    return (x, y)
Questions?
Backup utility example
{-# LANGUAGE OverloadedStrings #-}
import Turtle
import Prelude hiding (FilePath)
parser = (,) <$> argPath "src" "Source directory"
             <*> argPath "dst" "Destination directory"

backup file = do
    exists <- testfile file
    when exists (do
        let backupFile = file <.> "bak"
        backup backupFile
        mv file backupFile )

main = do
    (src, dest) <- options "Backup a directory" parser
    sh (do
        inFile <- lstree src
        Just suffix <- return (stripPrefix src inFile)
        let outFile = dest </> suffix
        backup outFile
        echo (format ("Copying "%fp%" to "%fp) inFile outFile)
        cp inFile outFile )
    echo "Done!"
Command line usage
$ ./backup --help
Backup a directory
Usage: backup SRC DST
Available options:
-h,--help Show this help text
SRC Source directory
DST Destination directory
$ ./backup a/ b/
Copying a/1 to b/1
Copying a/2 to b/2
$ ls b/
1 2
$ ./backup a/ b/
Copying a/1 to b/1
Copying a/2 to b/2
$ ls b/
1 1.bak 2 2.bak
Conclusions
You can use Haskell as a "better Bash", getting types for free without slow startup times or heavyweight syntax.
Visit https://hackage.haskell.org/package/turtle for more extensive documentation on the shell scripting library we used today