description: Overview of peeling back layers of abstraction in GHC Haskell
first-written: 2015-02-24
last-updated: 2015-02-24
last-reviewed: 2015-02-24
---
The point of this chapter is to help you peel back some of the layers of
abstraction in Haskell coding, with the goal of understanding things like
primitive operations, evaluation order, and mutation. Some concepts covered
here are generally "common knowledge" in the community, while others are less
well understood. The goal is to cover the entire topic in a cohesive manner. If
a specific section seems like it's not covering anything you don't already
know, skim through it and move on to the next one.
While this chapter is called "Primitive Haskell," the topics are very much
GHC-specific. I avoided calling it "Primitive GHC" for fear of people assuming
it was about the internals of GHC itself. To be clear: these topics apply to
anyone compiling their Haskell code using the GHC compiler.
Note that we will not be fully covering all topics here. There is a "further
reading" section at the end of this chapter with links for more details.
## Let's do addition
Let's start with a really simple question: tell me how GHC deals with the
expression `1 + 2`. What *actually* happens inside GHC? Well, that's a bit of a
trick question, since the expression is polymorphic. Let's instead use the more
concrete expression `1 + 2 :: Int`.
The `+` operator is actually a method of [the `Num` type class](http://www.stackage.org/haddock/lts-1.0/base-4.7.0.2/Prelude.html#t:Num), so we need to look at [the `Num Int` instance](http://www.stackage.org/haddock/lts-1.0/base-4.7.0.2/src/GHC-Num.html#Num):
```haskell
instance Num Int where
    I# x + I# y = I# (x +# y)
```
Huh... well *that* looks somewhat magical. Now we need to understand both the
`I#` constructor and the `+#` operator (and what's with the hashes all of a
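sudden?). If we [do a Hoogle
search](http://www.stackage.org/snapshot/lts-1.0/hoogle?q=I%23), we can easily
find the relevant docs: `Int` is an ordinary data type whose `I#` constructor
wraps an `Int#`, and `+#` adds two `Int#` values. The hashes mark unboxed types
and the primitive operations on them, which the GHC documentation describes as
follows: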
> Most types in GHC are boxed, which means that values of that type are
> represented by a pointer to a heap object. The representation of a Haskell
> `Int`, for example, is a two-word heap object. An unboxed type, however, is
> represented by the value itself, no pointers or heap allocation are involved.
See those docs for more information on distinctions between boxed and unboxed
types. It is vital to understand those differences when working with unboxed
values. However, we're not going to go into those details now. Instead, let's
sum up what we've learnt so far:
* `Int` addition is just normal Haskell code in a typeclass.
* `Int` itself is a normal Haskell datatype.
* GHC provides `Int#` and `+#` as an unboxed `long int` and addition on that type, respectively. These are exported by `GHC.Prim`, but the real implementation is "inside" GHC.
* An `Int` contains an `Int#`, which is an unboxed type.
* Addition of `Int`s takes advantage of the `+#` primop.
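
If you want to check these points yourself, here is a minimal sketch. It
assumes a GHC with the `MagicHash` extension and imports the primitives through
`GHC.Exts`, which re-exports them. The name `addInt` is made up for this
example; it simply restates the `+` method from the `Num Int` instance as a
standalone function.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#))

-- Hypothetical helper (not part of base): unwrap both boxed Ints,
-- add the raw Int# values with the +# primop, then re-box the result.
addInt :: Int -> Int -> Int
addInt (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = do
    print (addInt 1 2)       -- prints 3
    print (1 + 2 :: Int)     -- prints 3, via the Num Int instance
    print (I# (1# +# 2#))    -- prints 3, using unboxed literals directly
```

Note that the hash suffix on names and literals (`I#`, `+#`, `1#`) is only
accepted by the parser when `MagicHash` is turned on.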
## More addition
Alright, we understand basic addition! Let's make things a bit more
complicated. Consider the program:
```haskell
main :: IO ()
main = do
    let x = 1 + 2
        y = 3 + 4
    print x
    print y
```
We know for certain that the program will first print `3`, and then print `7`.
But let me ask you a different question. Which operation will GHC perform
first: `1 + 2` or `3 + 4`? If you guessed `1 + 2`, you're *probably* right, but
not necessarily! Thanks to referential transparency, GHC is fully within its
rights to rearrange evaluation of those expressions and add `3 + 4` before
`1 + 2`. Since neither expression depends on the result of the other, we
know that it is irrelevant which evaluation occurs first.
Note: This is covered in much more detail on the GHC wiki's evaluation order page.
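
One way to poke at this yourself is with `Debug.Trace.trace`, which writes a
message to stderr at the moment a value is forced. The sketch below is only an
experiment, not a statement about what GHC will do: `tracedAdd` is a
hypothetical helper, and the order in which the two "evaluating" messages
appear may change with compiler version and optimization flags. What does not
change is that `3` is printed before `7`.

```haskell
import Debug.Trace (trace)

-- Hypothetical helper: add two Ints, and report on stderr when the
-- result is actually evaluated.
tracedAdd :: String -> Int -> Int -> Int
tracedAdd label a b = trace ("evaluating " ++ label) (a + b)

main :: IO ()
main = do
    let x = tracedAdd "1 + 2" 1 2
        y = tracedAdd "3 + 4" 3 4
    print x
    print y
```

Try compiling it with and without `-O2` and compare where the trace messages
land relative to the printed numbers.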
That raises the question: if GHC is free to rearrange evaluation like that, how
could I say above that the program will always print `3` before printing `7`?
After all, it doesn't appear that `print y` uses the result of `print x` at
all, so why not rearrange the calls? To answer that, we
again need to unwrap some layers of abstraction. First, let's evaluate and
inline `x` and `y` and get rid of the `do`-notation sugar. We end up with the
program:
```haskell
main = print 3 >> print 7
```
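In case the jump from the `do` block to that one-liner feels too quick, here is
the rewrite spelled out in steps. This is a sketch: the names `original`,
`desugared`, and `inlined` are made up for illustration, and I've added `Int`
annotations so each definition stands on its own.

```haskell
-- The program as originally written, using do-notation.
original :: IO ()
original = do
    let x = 1 + 2 :: Int
        y = 3 + 4 :: Int
    print x
    print y

-- do-notation removed: a let statement becomes a let expression, and
-- consecutive statements are joined with >>.
desugared :: IO ()
desugared =
    let x = 1 + 2 :: Int
        y = 3 + 4 :: Int
    in  print x >> print y

-- x and y evaluated and inlined.
inlined :: IO ()
inlined = print (3 :: Int) >> print (7 :: Int)

-- Run all three versions one after another.
main :: IO ()
main = original >> desugared >> inlined
```

Each of the three actions prints `3` and then `7`.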
We know that `print 3` and `print 7` each have type `IO ()`, so the `>>` operator being used comes from the `Monad IO` instance. Before we can understand that, though, we need to look at [the definition of `IO` itself](http://www.stackage.org/haddock/lts-1.0/ghc-prim-0.3.1.0/src/GHC-Types.html#IO):
```haskell
newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
```
We have a few things to understand about this line. Firstly,