A three-stage program you definitely want to write
Posted on February 14, 2019
Writing programs explicitly in stages gives you guarantees that abstraction will be removed. A guarantee that the optimiser most certainly does not give you.
After spending the majority of my early 20s inside the optimiser, I decided enough was enough and it was time to gain back control over how my programs were partially evaluated.
So in this post I’ll give an example of how I took back control and eliminated two levels of abstraction for an interpreter by writing a program which runs in three stages.
Enter: An applicative interpreter for Hutton’s razor.
data Expr = Val Int | Add Expr Expr
eval :: Applicative m => Expr -> m Int
eval (Val n) = pure n
eval (Add e1 e2) = (+) <$> eval e1 <*> eval e2
Written simply at one level, there are two levels of abstraction which could be failed to be eliminated.
- If we statically know the expression we can eliminate
Expr
. - If we statically know which
Applicative
then we can remove the indirection from the typeclass.
Using typed Template Haskell we’ll work out how to remove both of these layers.
Eliminating the Expression
First we’ll have a look at how to stage the program just to eliminate the expression without discussion the application fragment. This is a two-stage program.
module Two where
import Language.Haskell.TH
data Expr = Val Int | Add Expr Expr
eval :: Expr -> TExpQ Int
eval (Val n) = [|| n ||]
eval (Add e1 e2) = [|| $$(eval e1) + $$(eval e2) ||]
The eval function takes an expression and generates code which unrolls the expression that needs to be evaluated.
Splicing in eval
gives us a chain of additions which are computed at run-time.
$$(eval (Add (Val 1) (Val 2)))
=> 1 + 2
By explicitly separating the program into stages we know that there will be no mention of Expr
in the resulting program.
Eliminating the Applicative Functor
That’s good. Eliminating the Expr
data type was easy. We’ll have to work a bit more to eliminate the applicative.
In the first stage, we will eliminate the expression in the same manner but instead of producing an Int
, we will produce a SynApplicative
which is a syntactic representation of an applicative. This allows us to inspect the structure of the program in the second stage and remove that overhead as well.
data SynApplicative a where
Return :: WithCode a -> SynApplicative a
App :: SynApplicative (a -> b) -> SynApplicative a -> SynApplicative b
data WithCode a = WithCode { _val :: a, _code :: TExpQ a }
WithCode
is a wrapper which pairs a value with a code fragment which was used to produce that value.
If you notice in the earlier example, this wasn’t necessary when it was known that we needed to persist an Int
, as there is a Lift
instance for Int
. However, in general, not all values can be persisted so using WithCode
is more general and flexible, if a bit more verbose.
elimExpr
eliminates the first layer of abstraction and returns code which generates a SynApplicative
.
elimExpr :: Expr -> TExpQ (SynApplicative Int)
elimExpr (Val n) = [|| Return (WithCode n (liftT n)) ||]
elimExpr (Add e1 e2) =
[|| Return (WithCode (+) codePlus)
`App` $$(elimExpr e1)
`App` $$(elimExpr e2) ||]
liftT :: Lift a => a -> TExpQ a
liftT = unsafeTExpCoerce . lift
codePlus = [|| (+) ||]
In the case for Add
we encounter a situation where we would have liked to use nested brackets to persist the value of [|| (+) ||]
. Instead you have to lift it to the top level and then persist that identifier.
Next, it’s time to provide an interpreter to remove the abstraction of the applicative. In order to do this, we need to provide a dictionary which will be used to give the interpretation of the applicative commands.
data ApplicativeDict m =
ApplicativeDict
{ _return :: (forall a . WithCode (a -> m a)),
_ap :: (forall a b . WithCode (m (a -> b) -> m a -> m b))
}
WithCode
is necessary again as it will be used to generate a program so it’s necessary to know how to implement the methods.
elimApplicative
:: SynApplicative a
-> ApplicativeDict m
-> TExpQ (m a)
elimApplicative (Return v) d@ApplicativeDict{..}
= [|| $$(_code _return) $$(_code v) ||]
elimApplicative (App e1 e2) d@ApplicativeDict{..}
= [|| $$(_code _ap) $$(elimApplicative e1 d) $$(elimApplicative e2 d) ||]
This interpretation is very boring as it just amounts to replacing all the constructors with their implementations. However, it is exciting that we have guaranteed the removal of the overhead of the applicative abstraction.
Running the Splice
Now that we’ve written two functions independently to to eliminate the two layers, they need to be combined together. This is the birth of our three-stage program.
import Three
elim :: Identity Int
elim = $$(elimApplicative $$(elimExpr (Add (Val 1) (Val 2))) identityDict)
identityDict = ApplicativeDict{..}
where
_return = WithCode Identity [|| Identity ||]
_ap = WithCode idAp [|| idAp ||]
idAp :: Identity (a -> b) -> Identity a -> Identity b
idAp (Identity f) (Identity a) = Identity (f a)
elim
is the combination of elimApplicative
and elimExpr
. The nested splices indicate that the program is more than two levels.
Using -ddump-splices
we can have a look at the program that gets generated.
Test.hs:10:30-59: Splicing expression
elimExpr (Add (Val 1) (Val 2))
======>
((Return ((WithCode (+)) codePlus)
`App` Return ((WithCode 1) (liftT 1)))
`App` Return ((WithCode 2) (liftT 2)))
Test.hs:10:11-73: Splicing expression
elimApplicative $$(elimExpr (Add (Val 1) (Val 2))) identityDict
======>
(idAp ((idAp (Identity (+))) (Identity 1))) (Identity 2)
Both steps appear in the debug output with the code which was produced at each step. Notice that we had very precise control over what code was generated and that functions like idAp
are not inlined. In this case, the compiler will certainly inline idAp
and so on but in general it might be useful to generate code which contains calls to GHC.Exts.inline
to force even recursive functions to be inlined once.
Conclusion
In general, splitting your program up into stages is quite difficult so mechanisms like type class specialisation will be easier to achieve. In controlled situations though, staging gives you the guarantees you need.