<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=1024, user-scalable=no">
<title>Parsec Presentation by Yann Esposito</title>
<!-- Required stylesheet -->
<link rel="stylesheet" href="core/deck.core.css">
<!-- Extension CSS files go here. Remove or add as needed. -->
<link rel="stylesheet" href="extensions/goto/deck.goto.css">
<link rel="stylesheet" href="extensions/menu/deck.menu.css">
<link rel="stylesheet" href="extensions/navigation/deck.navigation.css">
<link rel="stylesheet" href="extensions/status/deck.status.css">
<link rel="stylesheet" href="extensions/hash/deck.hash.css">
<!-- <link rel="stylesheet" href="extensions/scale/deck.scale.css"> -->
<!-- Transition theme. More available in /themes/transition/ or create your own. -->
<!-- <link rel="stylesheet" href="themes/transition/fade.css"> -->
<!-- Style theme. More available in /themes/style/ or create your own. -->
<!-- <link rel="stylesheet" href="themes/style/web-2.0.css"> -->
<link rel="stylesheet" href="themes/style/y/main.css" />
<link rel="stylesheet" href="themes/style/y/solarized.css" />
<!-- Required Modernizr file -->
<script src="modernizr.custom.js"></script>
<script>
function gofullscreen(){
  var body=document.getElementById('body');
  // Prefer the standard API, then fall back to vendor-prefixed variants.
  var request = body.requestFullscreen ||
                body.webkitRequestFullScreen ||
                body.mozRequestFullScreen ||
                body.msRequestFullscreen;
  if (request) {
    request.call(body);
  }
  return false;
}
</script>
</head>
<body id="body" class="deck-container">
<div style="display:none">
\(\newcommand{\F}{\mathbf{F}}\)
\(\newcommand{\E}{\mathbf{E}}\)
\(\newcommand{\C}{\mathcal{C}}\)
\(\newcommand{\D}{\mathcal{D}}\)
\(\newcommand{\id}{\mathrm{id}}\)
\(\newcommand{\ob}[1]{\mathrm{ob}(#1)}\)
\(\newcommand{\hom}[1]{\mathrm{hom}(#1)}\)
\(\newcommand{\Set}{\mathbf{Set}}\)
\(\newcommand{\Mon}{\mathbf{Mon}}\)
\(\newcommand{\Vec}{\mathbf{Vec}}\)
\(\newcommand{\Grp}{\mathbf{Grp}}\)
\(\newcommand{\Rng}{\mathbf{Rng}}\)
\(\newcommand{\ML}{\mathbf{ML}}\)
\(\newcommand{\Hask}{\mathbf{Hask}}\)
\(\newcommand{\Cat}{\mathbf{Cat}}\)
\(\newcommand{\fmap}{\mathtt{fmap}}\)
</div>
<!-- Begin slides. Just make elements with a class of slide. -->
<section class="slide">
<div style="text-align:center; position:absolute; top: 2em; font-size: .9em; width: 100%">
<h1 style="position: relative;">Parsec</h1>
<author><em class="base01">by</em> Yann Esposito</author>
<div style="font-size:.5em">
<twitter>
<a href="http://twitter.com/yogsototh">@yogsototh</a>,
</twitter>
<googleplus>
<a href="https://plus.google.com/117858550730178181663">+yogsototh</a>
</googleplus>
</div>
<div class="base01" style="font-size: .5em; font-weight: 400; font-style:italic">
<div class="button" style="margin: .5em auto;border: solid 2px; padding: 5px; width: 8em; border-radius: 1em; background:rgba(255,255,255,0.05);" onclick="javascript:gofullscreen();">ENTER FULLSCREEN</div>
HTML presentation: use arrows, space to navigate.
</div>
</div>
</section>
<section class="slide">
<h2 id="parsing">Parsing</h2>
<p>From Latin <em>pars (ōrātiōnis)</em>, meaning part (of speech).</p>
<ul>
<li><strong>analysing a string of symbols</strong></li>
<li>according to the rules of a <strong>formal grammar</strong>.</li>
</ul>
</section>
<section class="slide">
<h2 id="parsing-in-programming-languages">Parsing in Programming Languages</h2>
<p>By increasing complexity:</p>
<table>
<thead>
<tr class="header">
<th align="left">Method</th>
<th align="left">Typical Example</th>
<th align="left">Output Data Structure</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">Splitting</td>
<td align="left">CSV</td>
<td align="left">Array, Map</td>
</tr>
<tr class="even">
<td align="left">Regexp</td>
<td align="left">email</td>
<td align="left">+ Fixed Layout Tree</td>
</tr>
<tr class="odd">
<td align="left">Parser</td>
<td align="left">Programming language</td>
<td align="left">+ Most Data Structures</td>
</tr>
</tbody>
</table>
</section>
<section class="slide">
<h2 id="parser-culture">Parser <span class="and">&amp;</span> culture</h2>
<p>In Haskell, parsers are really easy to use.</p>
<p>Generally:</p>
<ul>
<li>In most languages: <strong>split</strong> then <strong>regexp</strong> then <strong>parse</strong></li>
<li>In Haskell: <strong>split</strong> then <strong>parse</strong></li>
</ul>
</section>
<section class="slide">
<h2 id="parsing-example">Parsing Example</h2>
<p>From String:</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">(<span class="dv">1</span><span class="fu">+</span><span class="dv">3</span>)<span class="fu">*</span>(<span class="dv">1</span><span class="fu">+</span><span class="dv">5</span><span class="fu">+</span><span class="dv">9</span>)</code></pre>
<p>To data structure:</p>
<p><img src="parsec/img/mp/AST.png" alt="AST" /><br /></p>
</section>
<section class="slide">
<h2 id="parsec">Parsec</h2>
<blockquote>
<p>Parsec lets you construct parsers by combining higher-order combinators to create larger expressions.</p>
<p>Combinator parsers are written and used within the same programming language as the rest of the program.</p>
<p>The parsers are first-class citizens of the language [...]</p>
</blockquote>
</section>
<section class="slide">
<h2 id="parser-libraries">Parser Libraries</h2>
<p>In reality there are many choices:</p>
<table>
<tbody>
<tr class="odd">
<td align="left">attoparsec</td>
<td align="left">fast</td>
</tr>
<tr class="even">
<td align="left">bytestring-lexing</td>
<td align="left">fast</td>
</tr>
<tr class="odd">
<td align="left">Parsec 3</td>
<td align="left">powerful, nice error reporting</td>
</tr>
</tbody>
</table>
</section>
<section class="slide">
<h2 id="haskell-remarks-1">Haskell Remarks (1)</h2>
<p>Spaces are meaningful: a space means function application.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">f x <span class="co">-- ⇔ f(x) in C-like languages</span>
f x y <span class="co">-- ⇔ f(x,y)</span></code></pre>
</section>
<section class="slide">
<h2 id="haskell-remarks-2">Haskell Remarks (2)</h2>
<p>Don't mind strange operators (<code>&lt;*&gt;</code>, <code>&lt;$&gt;</code>).<br />Consider them like separators, typically commas.<br />They are just here to deal with types.</p>
<p>Informally:</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">toto <span class="fu">&lt;$&gt;</span> x <span class="fu">&lt;*&gt;</span> y <span class="fu">&lt;*&gt;</span> z
<span class="co">-- ⇔ toto x y z</span>
<span class="co">-- ⇔ toto(x,y,z) in C-like languages</span></code></pre>
</section>
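<section class="slide">
<h2 id="haskell-remarks-2-sketch">Haskell Remarks (2): a sketch</h2>
<p>A minimal sketch of the informal reading above, using <code>Maybe</code> instead of a parser so that it runs with <code>base</code> alone. The names <code>toto</code>, <code>plain</code>, <code>wrapped</code> and <code>missing</code> are only illustrative.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">-- toto is an ordinary three-argument function (hypothetical name).
toto :: Int -&gt; Int -&gt; Int -&gt; Int
toto x y z = x + y * z

-- Plain application, like toto(1,2,3) in C-like languages.
plain :: Int
plain = toto 1 2 3

-- The same call, but every argument lives inside Maybe.
wrapped :: Maybe Int
wrapped = toto &lt;$&gt; Just 1 &lt;*&gt; Just 2 &lt;*&gt; Just 3

-- If one argument is missing, the whole result is missing.
missing :: Maybe Int
missing = toto &lt;$&gt; Just 1 &lt;*&gt; Nothing &lt;*&gt; Just 3</code></pre>
<p>Here <code>wrapped</code> is <code>Just 7</code>, the wrapped version of <code>plain</code>, and <code>missing</code> is <code>Nothing</code>. With parsers, a missing argument becomes a parse error.</p>
</section>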
<section class="slide">
<h2 id="a-parsec-example">A Parsec Example</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell">whitespaces <span class="fu">=</span> many (oneOf <span class="st">&quot;\t &quot;</span>)
number <span class="fu">=</span> many1 digit
symbol <span class="fu">=</span> oneOf <span class="st">&quot;!#$%&amp;|*+-/:&lt;=&gt;?@^_~&quot;</span></code></pre>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="st">&quot; \t &quot;</span> <span class="co">-- whitespaces on &quot; \t &quot;</span>
<span class="st">&quot;&quot;</span> <span class="co">-- whitespaces on &quot;32&quot;</span>
<span class="st">&quot;32&quot;</span> <span class="co">-- number on &quot;32&quot;</span>
<span class="co">-- number on &quot; \t 32 &quot;</span>
<span class="st">&quot;number&quot;</span> (line <span class="dv">1</span>, column <span class="dv">1</span>)<span class="fu">:</span>
unexpected <span class="st">&quot; &quot;</span>
expecting digit</code></pre>
</section>
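<section class="slide">
<h2 id="a-parsec-example-sketch">A Parsec Example: running it</h2>
<p>A sketch of how the results above could be reproduced, assuming the <code>parsec</code> package is installed. The helper <code>run</code> and the source name <code>"example"</code> are assumptions made for this illustration, not part of Parsec.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">import Text.Parsec (ParseError, digit, many, many1, oneOf, parse)
import Text.Parsec.String (Parser)

whitespaces :: Parser String
whitespaces = many (oneOf "\t ")

number :: Parser String
number = many1 digit

-- run is a hypothetical helper: apply a parser to an input string.
-- "example" is only the source name shown in error messages.
run :: Parser a -&gt; String -&gt; Either ParseError a
run parser input = parse parser "example" input</code></pre>
<p>In GHCi, <code>run whitespaces "32"</code> gives <code>Right ""</code> and <code>run number "32"</code> gives <code>Right "32"</code>, while <code>run number " \t 32 "</code> gives a <code>Left</code> carrying the error message.</p>
</section>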
<section class="slide">
<h2 id="comparison-with-regexp-parsec">Comparison with Regexp (Parsec)</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">IP</span> <span class="fu">=</span> <span class="dt">IP</span> <span class="dt">Word8</span> <span class="dt">Word8</span> <span class="dt">Word8</span> <span class="dt">Word8</span>
ip <span class="fu">=</span> <span class="dt">IP</span> <span class="fu">&lt;$&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number
number <span class="fu">=</span> <span class="kw">do</span>
x <span class="ot">&lt;-</span> read <span class="fu">&lt;$&gt;</span> many1 digit
guard (<span class="dv">0</span> <span class="fu">&lt;=</span> x <span class="fu"><span class="and">&amp;</span><span class="and">&amp;</span></span> x <span class="fu">&lt;</span> <span class="dv">256</span>)
return (fromIntegral x)</code></pre>
</section>
<section class="slide">
<h2 id="comparison-with-regexp-perl-regexp">Comparison with Regexp (Perl Regexp)</h2>
<pre class="sourceCode perl"><code class="sourceCode perl"><span class="co"># remark: 888.999.999.999 is accepted</span>
\b\d{<span class="dv">1</span>,<span class="dv">3</span>}\.\d{<span class="dv">1</span>,<span class="dv">3</span>}\.\d{<span class="dv">1</span>,<span class="dv">3</span>}\.\d{<span class="dv">1</span>,<span class="dv">3</span>}\b
<span class="co"># exact but difficult to read</span>
\b(?:(?:<span class="dv">25</span>[<span class="dv">0-5</span>]|<span class="dv">2</span>[<span class="dv">0-4</span>][<span class="dv">0-9</span>]|[<span class="dv">01</span>]?[<span class="dv">0-9</span>][<span class="dv">0-9</span>]?)\.){<span class="dv">3</span>}
(?:<span class="dv">25</span>[<span class="dv">0-5</span>]|<span class="dv">2</span>[<span class="dv">0-4</span>][<span class="dv">0-9</span>]|[<span class="dv">01</span>]?[<span class="dv">0-9</span>][<span class="dv">0-9</span>]?)\b</code></pre>
<p>Also, regexps are <em>unityped</em> by nature: a match is always a string.</p>
</section>
<section class="slide">
<h2 id="monadic-style">Monadic style</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">number ::</span> <span class="dt">Parser</span> <span class="dt">String</span>
number <span class="fu">=</span> many1 digit
<span class="ot">number&#39; ::</span> <span class="dt">Parser</span> <span class="dt">Int</span>
number&#39; <span class="fu">=</span> <span class="kw">do</span>
    <span class="co">-- digitString :: String</span>
    digitString <span class="ot">&lt;-</span> many1 digit
    return (read digitString)</code></pre>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="st">&quot;32&quot;</span><span class="ot"> ::</span> [<span class="dt">Char</span>] <span class="co">-- number on &quot;32&quot;</span>
<span class="dv">32</span><span class="ot"> ::</span> <span class="dt">Int</span> <span class="co">-- number&#39; on &quot;32&quot;</span></code></pre>
</section>
<section class="slide">
<h2 id="combining-monadic-style-s-asb-ε">Combining Monadic style (S = aSb | ε)</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell">s <span class="fu">=</span> (<span class="kw">do</span>
a <span class="ot">&lt;-</span> string <span class="st">&quot;a&quot;</span>
mid <span class="ot">&lt;-</span> s
b <span class="ot">&lt;-</span> string <span class="st">&quot;b&quot;</span>
return (a <span class="fu">++</span> mid <span class="fu">++</span> b)
<span class="fu">&lt;|&gt;</span> string <span class="st">&quot;&quot;</span></code></pre>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="st">&quot;&quot;</span> <span class="co">-- s on &quot;&quot;</span>
<span class="st">&quot;aaabbb&quot;</span> <span class="co">-- s on &quot;aaabbb&quot;</span>
<span class="st">&quot;aabb&quot;</span> <span class="co">-- s on &quot;aabbb&quot;</span>
<span class="co">-- s on &quot;aaabb&quot;</span>
<span class="dt">S</span> (line <span class="dv">1</span>, column <span class="dv">4</span>)<span class="fu">:</span>
unexpected end <span class="kw">of</span> input
expecting <span class="st">&quot;b&quot;</span></code></pre>
</section>
<section class="slide">
<h2 id="combining-applicative-style-s-asb-ε">Combining Applicative style (S = aSb | ε)</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell">s <span class="fu">=</span> concat3 <span class="fu">&lt;$&gt;</span> string <span class="st">&quot;a&quot;</span> <span class="fu">&lt;*&gt;</span> s <span class="fu">&lt;*&gt;</span> string <span class="st">&quot;b&quot;</span>
    <span class="fu">&lt;|&gt;</span> string <span class="st">&quot;&quot;</span>
  <span class="kw">where</span>
    concat3 x y z <span class="fu">=</span> x <span class="fu">++</span> y <span class="fu">++</span> z</code></pre>
</section>
<section class="slide">
<h2 id="applicative-style-usefull-with-data-types">Applicative style is useful with data types</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">IP</span> <span class="fu">=</span> <span class="dt">IP</span> <span class="dt">Int</span> <span class="dt">Int</span> <span class="dt">Int</span> <span class="dt">Int</span>
parseIP <span class="fu">=</span> <span class="dt">IP</span> <span class="fu">&lt;$&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number <span class="fu">&lt;*</span> char <span class="ch">&#39;.&#39;</span> <span class="fu">&lt;*&gt;</span>
number
monadicParseIP <span class="fu">=</span> <span class="kw">do</span>
d1 <span class="ot">&lt;-</span> number
char <span class="ch">&#39;.&#39;</span>
d2 <span class="ot">&lt;-</span> number
char <span class="ch">&#39;.&#39;</span>
d3 <span class="ot">&lt;-</span> number
char <span class="ch">&#39;.&#39;</span>
d4 <span class="ot">&lt;-</span> number
return (<span class="dt">IP</span> d1 d2 d3 d4)</code></pre>
</section>
<section class="slide">
<h2 id="write-number-correctly">Write number correctly</h2>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="ot">number ::</span> <span class="dt">Parser</span> <span class="dt">Int</span>
number <span class="fu">=</span> <span class="kw">do</span>
x <span class="ot">&lt;-</span> read <span class="fu">&lt;$&gt;</span> many1 digit
guard (<span class="dv">0</span> <span class="fu">&lt;=</span> x <span class="fu"><span class="and">&amp;</span><span class="and">&amp;</span></span> x <span class="fu">&lt;</span> <span class="dv">256</span>) <span class="fu">&lt;?&gt;</span>
<span class="st">&quot;Number between 0 and 255 (here &quot;</span> <span class="fu">++</span> show x <span class="fu">++</span> <span class="st">&quot;)&quot;</span>
return (fromIntegral x)</code></pre>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="fu">&gt;&gt;&gt;</span> test parseIP <span class="st">&quot;parseIP&quot;</span> <span class="st">&quot;823.32.80.113&quot;</span>
<span class="st">&quot;parseIP&quot;</span> (line <span class="dv">1</span>, column <span class="dv">4</span>)<span class="fu">:</span>
unexpected <span class="st">&quot;.&quot;</span>
expecting digit or <span class="dt">Number</span> between <span class="dv">0</span> and <span class="dv">255</span> (here <span class="dv">823</span>)</code></pre>
</section>
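<section class="slide">
<h2 id="write-number-correctly-sketch">Write number correctly: a complete sketch</h2>
<p>The pieces above can be assembled into one module, assuming the <code>parsec</code> package. The slides call <code>test</code> without showing its definition, so the version below is an assumption: it only forwards to <code>parse</code> and returns the result instead of printing it.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">import Control.Monad (guard)
import Text.Parsec (ParseError, char, digit, many1, parse, (&lt;?&gt;))
import Text.Parsec.String (Parser)

data IP = IP Int Int Int Int deriving (Show, Eq)

-- Read the digits as an Integer first, so a long digit string
-- cannot overflow before the range check.
number :: Parser Int
number = do
    x &lt;- (read &lt;$&gt; many1 digit :: Parser Integer)
    guard (0 &lt;= x &amp;&amp; x &lt; 256) &lt;?&gt;
        "Number between 0 and 255 (here " ++ show x ++ ")"
    return (fromIntegral x)

parseIP :: Parser IP
parseIP = IP &lt;$&gt;
    number &lt;* char '.' &lt;*&gt;
    number &lt;* char '.' &lt;*&gt;
    number &lt;* char '.' &lt;*&gt;
    number

-- test is an assumed helper (parser, source name, input).
test :: Parser a -&gt; String -&gt; String -&gt; Either ParseError a
test parser name input = parse parser name input</code></pre>
<p>With these definitions, <code>test parseIP "parseIP" "192.168.0.1"</code> gives <code>Right (IP 192 168 0 1)</code>, and <code>test parseIP "parseIP" "823.32.80.113"</code> gives a <code>Left</code> carrying the error.</p>
</section>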
<section class="slide">
<h2 id="so">So</h2>
<ul>
<li>combination of simple parsers</li>
<li>error messages with <code>(&lt;?&gt;)</code></li>
<li>embed result in data type using Applicative style</li>
<li>not shown: combine the parser with another monad</li>
</ul>
<p>Time to do something cool.</p>
</section>
<section class="slide">
<h2 id="a-simple-dsl">A Simple DSL</h2>
<p>Let's write a minimal DSL</p>
</section>
<section class="slide">
<h2 id="useful-definitions">Useful definitions</h2>
<p><code>try</code> tries to parse and backtracks if it fails.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">(<span class="fu">&lt;||&gt;</span>) parser1 parser2 <span class="fu">=</span> try parser1 <span class="fu">&lt;|&gt;</span> parser2</code></pre>
<p><code>lexeme</code> skips the whitespace around a parser.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">lexeme parser <span class="fu">=</span> whitespaces <span class="fu">*&gt;</span> parser <span class="fu">&lt;*</span> whitespaces</code></pre>
</section>
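<section class="slide">
<h2 id="useful-definitions-sketch">Useful definitions: why <code>try</code>?</h2>
<p>A small sketch of what the backtracking buys, assuming the <code>parsec</code> package. The names <code>noBacktrack</code> and <code>withBacktrack</code> are hypothetical and exist only for this comparison.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell">import Text.Parsec (parse, string, try, (&lt;|&gt;))
import Text.Parsec.String (Parser)

-- Without try: on "ac", string "ab" consumes 'a' before failing
-- on 'c', so the second alternative is never attempted.
noBacktrack :: Parser String
noBacktrack = string "ab" &lt;|&gt; string "ac"

-- The operator from the previous slide: rewind the input
-- when the first parser fails, then try the second one.
(&lt;||&gt;) :: Parser a -&gt; Parser a -&gt; Parser a
(&lt;||&gt;) parser1 parser2 = try parser1 &lt;|&gt; parser2

withBacktrack :: Parser String
withBacktrack = string "ab" &lt;||&gt; string "ac"</code></pre>
<p>In GHCi, <code>parse noBacktrack "" "ac"</code> gives a <code>Left</code>, while <code>parse withBacktrack "" "ac"</code> gives <code>Right "ac"</code>.</p>
</section>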
<section class="slide">
<h2 id="data-structure">Data Structure</h2>
<p>Remember: from text to data structure.</p>
<pre class="sourceCode haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Tree</span> <span class="fu">=</span> <span class="dt">Node</span> <span class="dt">Element</span> [<span class="dt">Tree</span>]</code></pre>
</section>
<!-- End slides. -->
<!-- Begin extension snippets. Add or remove as needed. -->
<!-- deck.navigation snippet -->
<a href="#" class="deck-prev-link" title="Previous">&#8592;</a>
<a href="#" class="deck-next-link" title="Next">&#8594;</a>
<!-- deck.status snippet -->
<p class="deck-status">
<span class="deck-status-current"></span>
/
<span class="deck-status-total"></span>
</p>
<!-- deck.goto snippet -->
<form action="." method="get" class="goto-form">
<label for="goto-slide">Go to slide:</label>
<input type="text" name="slidenum" id="goto-slide" list="goto-datalist">
<datalist id="goto-datalist"></datalist>
<input type="submit" value="Go">
</form>
<!-- deck.hash snippet -->
<a href="." title="Permalink to this slide" class="deck-permalink">#</a>
<!-- End extension snippets. -->
<!-- Required JS files. -->
<script src="jquery-1.7.2.min.js"></script>
<script src="core/deck.core.js"></script>
<!-- Extension JS files. Add or remove as needed. -->
<script src="extensions/hash/deck.hash.js"></script>
<script src="extensions/menu/deck.menu.js"></script>
<script src="extensions/goto/deck.goto.js"></script>
<script src="extensions/status/deck.status.js"></script>
<script src="extensions/navigation/deck.navigation.js"></script>
<!-- <script src="extensions/scale/deck.scale.js"></script> -->
<!-- Initialize the deck. You can put this in an external file if desired. -->
<script>
$(function() {
$.deck('.slide');
});
</script>
<!-- Y theme -->
<script src="js/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script src="js/highlight/highlight.pack.js"></script>
<script>
hljs.initHighlightingOnLoad();
</script>
</body>
</html>