Each rule has a name, enclosed in double quotes. The name itself has
no significance at all. It is only used when reporting how many times the rule fired.
There may be zero or more rules in a RULES pragma.
Layout applies in a RULES pragma. Currently no new indentation level
is set, so you must lay out your rules starting in the same column as the
enclosing definitions.
Each variable mentioned in a rule must either be in scope (e.g. map),
or bound by the forall (e.g. f, g, xs). The variables bound by
the forall are called the pattern variables. They are separated
by spaces, just like in a type forall.
A pattern variable may optionally have a type signature.
If the type of the pattern variable is polymorphic, it must have a type signature.
For example, here is the foldr/build rule:
"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
foldr k z (build g) = g k z
Since g has a polymorphic type, it must have a type signature.
The left hand side of a rule must consist of a top-level variable applied
to arbitrary expressions. For example, this is not OK:
"wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
"wrong2" forall f. f True = True
In "wrong1", the LHS is not an application; in "wrong2", the LHS has a pattern variable
in the head.
A rule does not need to be in the same module as (any of) the
variables it mentions, though of course they need to be in scope.
Rules are automatically exported from a module, just as instance declarations are.
Rules are regarded as left-to-right rewrite rules.
When GHC finds an expression that is a substitution instance of the LHS
of a rule, it replaces the expression by the (appropriately-substituted) RHS.
By "a substitution instance" we mean that the LHS can be made equal to the
expression by substituting for the pattern variables.
The LHS and RHS of a rule are typechecked, and must have the
same type.
GHC makes absolutely no attempt to verify that the LHS and RHS
of a rule have the same meaning. That is undecideable in general, and
infeasible in most interesting cases. The responsibility is entirely the programmer's!
GHC makes no attempt to make sure that the rules are confluent or
terminating. For example:
"loop" forall x,y. f x y = f y x
This rule will cause the compiler to go into an infinite loop.
If more than one rule matches a call, GHC will choose one arbitrarily to apply.
GHC currently uses a very simple, syntactic, matching algorithm
for matching a rule LHS with an expression. It seeks a substitution
which makes the LHS and expression syntactically equal modulo alpha
conversion. The pattern (rule), but not the expression, is eta-expanded if
necessary. (Eta-expanding the epression can lead to laziness bugs.)
But not beta conversion (that's called higher-order matching).
Matching is carried out on GHC's intermediate language, which includes
type abstractions and applications. So a rule only matches if the
types match too. See Section 7.7.4 below.
GHC keeps trying to apply the rules as it optimises the program.
For example, consider:
let s = map f
t = map g
in
s (t xs)
The expression s (t xs) does not match the rule "map/map", but GHC
will substitute for s and t, giving an expression which does match.
If s or t was (a) used more than once, and (b) large or a redex, then it would
not be substituted, and the rule would not fire.
In the earlier phases of compilation, GHC inlines nothing
that appears on the LHS of a rule, because once you have substituted
for something you can't match against it (given the simple minded
matching). So if you write the rule
"map/map" forall f,g. map f . map g = map (f.g)
this won't match the expression map f (map g xs).
It will only match something written with explicit use of ".".
Well, not quite. It will match the expression
wibble f g xs
where wibble is defined:
wibble f g = map f . map g
because wibble will be inlined (it's small).
Later on in compilation, GHC starts inlining even things on the
LHS of rules, but still leaves the rules enabled. This inlining
policy is controlled by the per-simplification-pass flag -finline-phasen.
All rules are implicitly exported from the module, and are therefore
in force in any module that imports the module that defined the rule, directly
or indirectly. (That is, if A imports B, which imports C, then C's rules are
in force when compiling A.) The situation is very similar to that for instance
declarations.
The RULES mechanism is used to implement fusion (deforestation) of common list functions.
If a "good consumer" consumes an intermediate list constructed by a "good producer", the
intermediate list should be eliminated entirely.
The following are good producers:
List comprehensions
Enumerations of Int and Char (e.g. ['a'..'z']).
Explicit lists (e.g. [True, False])
The cons constructor (e.g 3:4:[])
++
map
filter
iterate, repeat
zip, zipWith
The following are good consumers:
List comprehensions
array (on its second argument)
length
++ (on its first argument)
foldr
map
filter
concat
unzip, unzip2, unzip3, unzip4
zip, zipWith (but on one argument only; if both are good producers, zip
will fuse with one but not the other)
partition
head
and, or, any, all
sequence_
msum
sortBy
So, for example, the following should generate no intermediate lists:
array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
This list could readily be extended; if there are Prelude functions that you use
a lot which are not included, please tell us.
If you want to write your own good consumers or producers, look at the
Prelude definitions of the above functions to see how to do so.
This told GHC to use int8ToInt16 instead of fromIntegral whenever
the latter was called with type Int8 -> Int16. That is, rather than
specialising the original definition of fromIntegral the programmer is
promising that it is safe to use int8ToInt16 instead.
This feature is no longer in GHC. But rewrite rules let you do the
same thing:
This slightly odd-looking rule instructs GHC to replace fromIntegral
by int8ToInt16whenever the types match. Speaking more operationally,
GHC adds the type and dictionary applications to get the typed rule
What is more,
this rule does not need to be in the same file as fromIntegral,
unlike the SPECIALISE pragmas which currently do (so that they
have an original definition available to specialise).
Use -ddump-rules to see what transformation rules GHC is using.
Use -ddump-simpl-stats to see what rules are being fired.
If you add -dppr-debug you get a more detailed listing.
The defintion of (say) build in PrelBase.lhs looks llike this:
build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
{-# INLINE build #-}
build g = g (:) []
Notice the INLINE! That prevents (:) from being inlined when compiling
PrelBase, so that an importing module will “see” the (:), and can
match it on the LHS of a rule. INLINE prevents any inlining happening
in the RHS of the INLINE thing. I regret the delicacy of this.
In ghc/lib/std/PrelBase.lhs look at the rules for map to
see how to write rules that will do fusion and yet give an efficient
program even if fusion doesn't happen. More rules in PrelList.lhs.