With the -fglasgow-exts flag, GHC lets you declare
a data type with no constructors. For example:
data S -- S :: *
data T a -- T :: * -> *
Syntactically, the declaration lacks the "= constrs" part. The
type can be parameterised over types of any kind, but if the kind is
not * then an explicit kind annotation must be used
(see Section 7.3.3).
Such data types have only one value, namely bottom.
Nevertheless, they can be useful when defining "phantom types".
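For example, here is a minimal sketch of the phantom-type idiom built from such
empty declarations (the names IntT, BoolT, Expr, intLit and boolLit are
illustrative only, not part of GHC):
data IntT   -- empty: used purely as a type-level tag
data BoolT  -- likewise
newtype Expr a = Expr String   -- a is a phantom: no value of type a is stored
intLit :: Int -> Expr IntT
intLit n = Expr (show n)
boolLit :: Bool -> Expr BoolT
boolLit b = Expr (show b)
The tag prevents, say, an (illustrative) boolean operator of type
Expr BoolT -> Expr BoolT -> Expr BoolT from being applied to an Expr IntT.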
GHC allows type constructors to be operators, and to be written infix, very much
like expressions. More specifically:
A type constructor can be an operator, beginning with a colon; e.g. :*:.
The lexical syntax is the same as that for data constructors.
Types can be written infix. For example Int :*: Bool.
Back-quotes work
as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or
Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool.
Fixities may be declared for type constructors just as for data constructors. However,
one cannot distinguish between the two in a fixity declaration; a fixity declaration
sets the fixity for a data constructor and the corresponding type constructor. For example:
infixl 7 T, :*:
sets the fixity for both type constructor T and data constructor T,
and similarly for :*:.
Function arrow is infixr with fixity 0. (This might change; I'm not sure what it should be.)
Data type and type-synonym declarations can be written infix. E.g.
data a :*: b = Foo a b
type a :+: b = Either a b
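Such infix type constructors can then be used in ordinary type signatures.
Here is a small illustrative sketch (the names swap and first are not part of GHC):
swap :: a :*: b -> b :*: a
swap (Foo x y) = Foo y x
-- Because :*: binds more tightly than ->, this parses as (Int :*: Bool) -> Int:
first :: Int :*: Bool -> Int
first (Foo i _) = i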
The only thing that differs between operators in types and operators in expressions is that
ordinary non-constructor operators, such as + and *
are not allowed in types. Reason: the uniform thing to do would be to make them type
variables, but that's not very useful. A less uniform but more useful thing would be to
allow them to be type constructors. But that gives trouble in export
lists. So for now we just exclude them.
Haskell infers the kind of each type variable. Sometimes it is nice to be able
to give the kind explicitly as (machine-checked) documentation,
just as it is nice to give a type signature for a function. On some occasions,
it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
John Hughes had to define the data type:
data Set cxt a = Set [a]
| Unused (cxt a -> ())
The only use for the Unused constructor was to force the correct
kind for the type variable cxt.
GHC now instead allows you to specify the kind of a type variable directly, wherever
a type variable is explicitly bound. Namely:
data declarations:
data Set (cxt :: * -> *) a = Set [a]
type declarations:
type T (f :: * -> *) = f Int
class declarations:
class (Eq a) => C (f :: * -> *) a where ...
forall's in type signatures:
f :: forall (cxt :: * -> *). Set cxt Int
The parentheses are required. Some of the spaces are required too, to
separate the lexemes. If you write (f::*->*) you
will get a parse error, because "::*->*" is a
single lexeme in Haskell.
As part of the same extension, you can put kind annotations in types
as well. Thus:
f :: (Int :: *) -> Int
g :: forall a. a -> (a :: *)
This section documents GHC's implementation of multi-parameter type
classes. There's lots of background in the paper Type
classes: exploring the design space (Simon Peyton Jones, Mark
Jones, Erik Meijer).
I'd like to thank people who reported shortcomings in the GHC 3.02
implementation. Our default decisions were all conservative ones, and
the experience of these heroic pioneers has given useful concrete
examples to support several generalisations. (These appear below as
design choices not implemented in 3.02.)
I've discussed these notes with Mark Jones, and I believe that Hugs
will migrate towards the same design choices as I outline here.
Thanks to him, and to many others who have offered very useful
feedback.
There are the following restrictions on the form of a qualified
type:
forall tv1..tvn (c1, ...,cn) => type
(Here, I write the "foralls" explicitly, although the Haskell source
language omits them; in Haskell 1.4, all the free type variables of an
explicit source-language type signature are universally quantified,
except for the class type variables in a class declaration. However,
in GHC, you can give the foralls if you want. See Section 7.3.9).
Each universally quantified type variable
tvi must be mentioned (i.e. appear free) in type.
The reason for this is that a value with a type that does not obey
this restriction could not be used without introducing
ambiguity. Here, for example, is an illegal type:
forall a. Eq a => Int
When a value with this type was used, the constraint Eq tv
would be introduced where tv is a fresh type variable, and
(in the dictionary-translation implementation) the value would be
applied to a dictionary for Eq tv. The difficulty is that we
can never know which instance of Eq to use because we never
get any more information about tv.
Every constraint ci must mention at least one of the
universally quantified type variables tvi.
For example, this type is OK because C a b mentions the
universally quantified type variable a:
forall a. C a b => burble
The next type is illegal because the constraint Eq b does not
mention a:
forall a. Eq b => burble
The reason for this restriction is milder than the other one. The
excluded types are never useful or necessary (because the offending
context doesn't need to be witnessed at this point; it can be floated
out). Furthermore, floating them out increases sharing. Lastly,
excluding them is a conservative choice; it leaves a patch of
territory free in case we need it later.
These restrictions apply to all types, whether declared in a type signature
or inferred.
Unlike Haskell 1.4, constraints in types do not have to be of
the form (class type-variables). Thus, these type signatures
are perfectly OK
f :: Eq (m a) => [m a] -> [m a]
g :: Eq [a] => ...
This choice recovers principal types, a property that Haskell 1.4 does not have.
Multi-parameter type classes are permitted. For example:
class Collection c a where
union :: c a -> c a -> c a
...etc.
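As a hedged illustration (not from the manual), a list instance of this class
might look like the following; the Eq context is there only so that union can
avoid duplicating elements:
instance Eq a => Collection [] a where
  union xs ys = xs ++ [ y | y <- ys, y `notElem` xs ]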
The class hierarchy must be acyclic. However, the definition
of "acyclic" involves only the superclass relationships. For example,
this is OK:
class C a where {
op :: D b => a -> b -> b
}
class C a => D a where { ... }
Here, C is a superclass of D, but it's OK for a
class operation op of C to mention D. (It
would not be OK for D to be a superclass of C.)
There are no restrictions on the context in a class declaration
(which introduces superclasses), except that the class hierarchy must
be acyclic. So these class declarations are OK:
class Functor (m k) => FiniteMap m k where
...
class (Monad m, Monad (t m)) => Transform t m where
lift :: m a -> (t m) a
In the signature of a class operation, every constraint
must mention at least one type variable that is not a class type
variable.
Thus:
class Collection c a where
mapC :: Collection c b => (a->b) -> c a -> c b
is OK because the constraint (Collection c b) mentions
b, even though it also mentions the class variable
a. On the other hand:
class C a where
op :: Eq a => (a,b) -> (a,b)
is not OK because the constraint (Eq a) mentions only the class
type variable a, and not b. However, any such
example is easily fixed by moving the offending context up to the
superclass context:
class Eq a => C a where
op ::(a,b) -> (a,b)
A yet more relaxed rule would allow the context of a class-op signature
to mention only class type variables. However, that conflicts with
Rule 1(b) for types above.
The type of each class operation must mention all of
the class type variables. For example:
class Coll s a where
empty :: s
insert :: s -> a -> s
is not OK, because the type of empty doesn't mention
a. This rule is a consequence of Rule 1(a), above, for
types, and has the same motivation.
Sometimes, offending class declarations exhibit misunderstandings. For
example, Coll might be rewritten
class Coll s a where
empty :: s a
insert :: s a -> a -> s a
which makes the connection between the type of a collection of
a's (namely (s a)) and the element type a.
Occasionally this really doesn't work, in which case you can split the
class like this:
class CollE s where
empty :: s
class CollE s => Coll s a where
insert :: s -> a -> s
Instance declarations may not overlap. The two instance
declarations
instance context1 => C type1 where ...
instance context2 => C type2 where ...
"overlap" if type1 and type2 unify
However, if you give the command line option
-fallow-overlapping-instances then overlapping instance declarations are permitted.
However, GHC arranges never to commit to using an instance declaration
if another instance declaration also applies, either now or later.
EITHER type1 and type2 do not unify
OR type2 is a substitution instance of type1
(but not identical to type1), or vice versa.
Notice that these rules make it clear which instance declaration to use
(pick the most specific one that matches), and that they do not mention the
contexts context1 and context2. Reason: you can pick which instance
declaration "matches" based on the type.
However the rules are over-conservative. Two instance declarations can overlap,
but it can still be clear in particular situations which to use. For example:
instance C (Int,a) where ...
instance C (a,Bool) where ...
These are rejected by GHC's rules, but it is clear what to do when trying
to solve the constraint C (Int,Int) because the second instance
cannot apply. Yell if this restriction bites you.
GHC is also conservative about committing to an overlapping instance. For example:
class C a where { op :: a -> a }
instance C [Int] where ...
instance C a => C [a] where ...
f :: C b => [b] -> [b]
f x = op x
From the RHS of f we get the constraint C [b]. But
GHC does not commit to the second instance declaration, because in a particular
call of f, b might be instantiated to Int, so the first instance declaration
would be appropriate. So GHC rejects the program. If you add -fallow-incoherent-instances
GHC will instead silently pick the second instance, without complaining about
the problem of subsequent instantiations.
Regrettably, GHC doesn't guarantee to detect overlapping instance
declarations if they appear in different modules. GHC can "see" the
instance declarations in the transitive closure of all the modules
imported by the one being compiled, so it can "see" all instance decls
when it is compiling Main. However, it currently chooses not
to look at ones that can't possibly be of use in the module currently
being compiled, in the interests of efficiency. (Perhaps we should
change that decision, at least for Main.)
There are no restrictions on the types in an instance
head, except that at least one of them must not be a type variable.
The instance "head" is the bit after the "=>" in an instance decl. For
example, these are OK:
instance C Int a where ...
instance D (Int, Int) where ...
instance E [[a]] where ...
Note that instance heads may contain repeated type variables.
For example, this is OK:
instance Stateful (ST s) (MutVar s) where ...
The "at least one not a type variable" restriction is to ensure that
context reduction terminates: each reduction step removes one type
constructor. For example, the following would make the type checker
loop if it wasn't excluded:
instance C a => C a where ...
There are two situations in which the rule is a bit of a pain. First,
if one allows overlapping instance declarations then it's quite
convenient to have a "default instance" declaration that applies if
something more specific does not:
instance C a where
op = ... -- Default
Second, sometimes you might want to use the following to get the
effect of a "class synonym":
class (C1 a, C2 a, C3 a) => C a where { }
instance (C1 a, C2 a, C3 a) => C a where { }
This allows you to write shorter signatures:
f :: C a => ...
instead of
f :: (C1 a, C2 a, C3 a) => ...
I'm on the lookout for a simple rule that preserves decidability while
allowing these idioms. The experimental flag
-fallow-undecidable-instances lifts this restriction, allowing all the types in an
instance head to be type variables.
Unlike Haskell 1.4, instance heads may use type
synonyms. As always, using a type synonym is just shorthand for
writing the RHS of the type synonym definition. For example:
type Point = (Int,Int)
instance C Point where ...
instance C [Point] where ...
is legal. However, if you added
instance C (Int,Int) where ...
as well, then the compiler will complain about the overlapping
(actually, identical) instance declarations. As always, type synonyms
must be fully applied. You cannot, for example, write:
type P a = [[a]]
instance Monad P where ...
This design decision is independent of all the others, and easily
reversed, but it makes sense to me.
The types in an instance-declaration context must all
be type variables. Thus
instance C a b => Eq (a,b) where ...
is OK, but
instance C Int b => Foo b where ...
is not OK. Again, the intent here is to make sure that context
reduction terminates.
Voluminous correspondence on the Haskell mailing list has convinced me
that it's worth experimenting with a more liberal rule. If you use
the flag -fallow-undecidable-instances you can use arbitrary
types in an instance context. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with -fcontext-stackN.
Implicit parameters are implemented as described in
"Implicit parameters: dynamic scoping with static types",
J Lewis, MB Shields, E Meijer, J Launchbury,
27th ACM Symposium on Principles of Programming Languages (POPL'00),
Boston, Jan 2000.
(Most of the following, still rather incomplete, documentation is due to Jeff Lewis.)
A variable is called dynamically bound when it is bound by the calling
context of a function and statically bound when bound by the callee's
context. In Haskell, all variables are statically bound. Dynamic
binding of variables is a notion that goes back to Lisp, but was later
discarded in more modern incarnations, such as Scheme. Dynamic binding
can be very confusing in an untyped language, and unfortunately, typed
languages, in particular Hindley-Milner typed languages like Haskell,
only support static scoping of variables.
However, by a simple extension to the type class system of Haskell, we
can support dynamic binding. Basically, we express the use of a
dynamically bound variable as a constraint on the type. These
constraints lead to types of the form (?x::t') => t, which says "this
function uses a dynamically-bound variable ?x
of type t'". For
example, the following expresses the type of a sort function,
implicitly parameterized by a comparison function named cmp.
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
The dynamic binding constraints are just a new form of predicate in the type class system.
An implicit parameter is introduced by the special form ?x,
where x is
any valid identifier. Use of this construct also introduces new
dynamic binding constraints. For example, the following definition
shows how we can define an implicitly parameterized sort function in
terms of an explicitly parameterized sortBy function:
sortBy :: (a -> a -> Bool) -> [a] -> [a]
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sort = sortBy ?cmp
Dynamic binding constraints behave just like other type class
constraints in that they are automatically propagated. Thus, when a
function is used, its implicit parameters are inherited by the
function that called it. For example, our sort function might be used
to pick out the least value in a list:
least :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sort xs)
Without lifting a finger, the ?cmp parameter is
propagated to become a parameter of least as well. With explicit
parameters, the default is that parameters must always be explicitly
propagated. With implicit parameters, the default is to always
propagate them.
An implicit parameter differs from other type class constraints in the
following way: All uses of a particular implicit parameter must have
the same type. This means that the type of (?x, ?x)
is (?x::a) => (a,a), and not
(?x::a, ?x::b) => (a, b), as would be the case for type
class constraints.
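For instance, the following minimal sketch type-checks (the name pair is illustrative):
pair :: (?x :: a) => (a, a)
pair = (?x, ?x)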
An implicit parameter is bound using the standard
let binding form, where the bindings must be a
collection of simple bindings to implicit-style variables (no
function-style bindings, and no type signatures); these bindings are
neither polymorphic nor recursive. This form binds the implicit
parameters arising in the body, not the free variables as a
let or where would do. For
example, we define the min function by binding
cmp.
min :: Ord a => [a] -> a
min = let ?cmp = (<=) in least
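A local binding can equally supply a different comparison at a particular call
site. A hedged sketch (the name maxOf is illustrative; the Ord context is needed
because (>=) is overloaded):
maxOf :: Ord a => [a] -> a
maxOf xs = let ?cmp = (>=) in least xs   -- sorts descending, so the head is the maximum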
Note the following additional constraints:
You can't have an implicit parameter in the context of a class or instance
declaration. For example, both these declarations are illegal:
class (?x::Int) => C a where ...
instance (?x::a) => Foo [a] where ...
Reason: exactly which implicit parameter you pick up depends on exactly where
you invoke a function. But the "invocation" of instance declarations is done
behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
Easiest thing is to outlaw the offending types.
Linear implicit parameters are an idea developed by Koen Claessen,
Mark Shields, and Simon PJ. They address the long-standing
problem that monads seem over-kill for certain sorts of problem, notably:
distributing a supply of unique names
distributing a supply of random numbers
distributing an oracle (as in QuickCheck)
Linear implicit parameters are just like ordinary implicit parameters,
except that they are "linear" -- that is, they cannot be copied, and
must be explicitly "split" instead. Linear implicit parameters are
written '%x' instead of '?x'.
(The '/' in the '%' suggests the split!)
For example:
import GHC.Exts( Splittable )
data NameSupply = ...
splitNS :: NameSupply -> (NameSupply, NameSupply)
newName :: NameSupply -> Name
instance Splittable NameSupply where
split = splitNS
f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam x' (f env e)
where
x' = newName %ns
env' = extend env x x'
...more equations for f...
Notice that the implicit parameter %ns is consumed
once by the call to newName
once by the recursive call to f
So the translation done by the type checker makes
the parameter explicit:
f :: NameSupply -> Env -> Expr -> Expr
f ns env (Lam x e) = Lam x' (f ns1 env e)
where
(ns1,ns2) = splitNS ns
x' = newName ns2
env' = extend env x x'
Notice the call to 'split' introduced by the type checker.
How did it know to use 'splitNS'? Because what it really did
was to introduce a call to the overloaded function 'split',
defined by the class Splittable:
class Splittable a where
split :: a -> (a,a)
The instance for Splittable NameSupply tells GHC how to implement
split for name supplies. But we can simply write
g x = (x, %ns, %ns)
and GHC will infer
g :: (Splittable a, %ns :: a) => b -> (b,a,a)
The Splittable class is built into GHC. It's exported by module
GHC.Exts.
Other points:
'?x' and '%x'
are entirely distinct implicit parameters: you
can use them together and they won't interfere with each other.
You can bind linear implicit parameters in 'with' clauses.
You cannot have implicit parameters (whether linear or not)
in the context of a class or instance declaration.
The monomorphism restriction is even more important than usual.
Consider the example above:
f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam x' (f env e)
where
x' = newName %ns
env' = extend env x x'
If we replaced the two occurrences of x' by (newName %ns), which is
usually a harmless thing to do, we get:
f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam (newName %ns) (f env e)
where
env' = extend env x (newName %ns)
But now the name supply is consumed in three places
(the two calls to newName, and the recursive call to f), so
the result is utterly different. Urk! We don't even have
the beta rule.
Well, this is an experimental change. With implicit
parameters we have already lost beta reduction anyway, and
(as John Launchbury puts it) we can't sensibly reason about
Haskell programs without knowing their typing.
Linear implicit parameters can be particularly tricky when you have a recursive function.
Consider
foo :: %x::T => Int -> [Int]
foo 0 = []
foo n = %x : foo (n-1)
where T is some type in class Splittable.
Do you get a list of all the same T's or all different T's
(assuming that split gives two distinct T's back)?
If you supply the type signature, taking advantage of polymorphic
recursion, you get what you'd probably expect. Here's the
translated term, where the implicit param is made explicit:
foo x 0 = []
foo x n = let (x1,x2) = split x
in x1 : foo x2 (n-1)
But if you don't supply a type signature, GHC uses the Hindley
Milner trick of using a single monomorphic instance of the function
for the recursive calls. That is what makes Hindley Milner type inference
work. So the translation becomes
foo x = let
foom 0 = []
foom n = x : foom (n-1)
in
foom
Result: 'x' is not split, and you get a list of identical T's. So the
semantics of the program depends on whether or not foo has a type signature.
Yikes!
You may say that this is a good reason to dislike linear implicit parameters
and you'd be right. That is why they are an experimental feature.
Functional dependencies are implemented as described by Mark P. Jones
in “Type Classes with Functional Dependencies”,
Proceedings of the 9th European Symposium on Programming,
ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
There should be more documentation, but there isn't (yet). Yell if you need it.
Haskell type signatures are implicitly quantified. The new keyword forall
allows us to say exactly what this means. For example:
g :: b -> b
means this:
g :: forall b. (b -> b)
The two are treated identically.
However, GHC's type system supports arbitrary-rank
explicit universal quantification in
types.
For example, all the following types are legal:
f1 :: forall a b. a -> b -> a
g1 :: forall a b. (Ord a, Eq b) => a -> b -> a
f2 :: (forall a. a->a) -> Int -> Int
g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int
f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
Here, f1 and g1 are rank-1 types, and
can be written in standard Haskell (e.g. f1 :: a->b->a).
The forall makes explicit the universal quantification that
is implicitly added by Haskell.
The functions f2 and g2 have rank-2 types;
the forall is on the left of a function arrow. As g2
shows, the polymorphic type on the left of the function arrow can be overloaded.
The function f3 has a rank-3 type;
it has a rank-2 type on the left of a function arrow.
GHC allows types of arbitrary rank; you can nest foralls
arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but
that restriction has now been lifted.)
In particular, a forall-type (also called a "type scheme"),
including an operational type class context, is legal
as the argument of a constructor, or as the type of a field, in a data type declaration. For
example, any of the types f1, f2, f3, g1, g2 above would be valid
field type signatures.
There is one place you cannot put a forall:
you cannot instantiate a type variable with a forall-type. So you cannot
make a forall-type the argument of a type constructor. So these types are illegal:
x1 :: [forall a. a->a]
x2 :: (forall a. a->a, Int)
x3 :: Maybe (forall a. a->a)
Of course forall becomes a keyword; you can't use forall as
a type variable any more!
In a data or newtype declaration one can quantify
the types of the constructor arguments. Here are several examples:
data T a = T1 (forall b. b -> b -> b) a
data MonadT m = MkMonad { return :: forall a. a -> m a,
bind :: forall a b. m a -> (a -> m b) -> m b
}
newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
The constructors have rank-2 types:
T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
MkMonad :: forall m. (forall a. a -> m a)
-> (forall a b. m a -> (a -> m b) -> m b)
-> MonadT m
MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
Notice that you don't need to use a forall if there's an
explicit context. For example in the first argument of the
constructor MkSwizzle, an implicit "forall a." is
prefixed to the argument type. The implicit forall
quantifies all type variables that are not already in scope, and are
mentioned in the type quantified over.
As for type signatures, implicit quantification happens for non-overloaded
types too. So if you write this:
data T a = MkT (Either a b) (b -> b)
it's just as if you had written this:
data T a = MkT (forall b. Either a b) (forall b. b -> b)
That is, since the type variable b isn't in scope, it's
implicitly universally quantified. (Arguably, it would be better
to require explicit quantification on constructor arguments
where that is what is wanted. Feedback welcomed.)
You construct values of types T, MonadT, and Swizzle by applying
the constructor to suitable values, just as usual. For example,
a1 :: T Int
a1 = T1 (\x y -> x) 3
a2, a3 :: Swizzle
a2 = MkSwizzle sort
a3 = MkSwizzle reverse
a4 :: MonadT Maybe
a4 = let r x = Just x
b m k = case m of
Just y -> k y
Nothing -> Nothing
in
MkMonad r b
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]
The type of the argument can, as usual, be more general than the type
required, as (MkSwizzle reverse) shows. (reverse
does not need the Ord constraint.)
When you use pattern matching, the bound variables may now have
polymorphic types. For example:
f :: T a -> a -> (a, Char)
f (T1 w k) x = (w k x, w 'c' 'd')
g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs f = s (map f (s xs))
h :: MonadT m -> [m a] -> m [a]
h m [] = return m []
h m (x:xs) = bind m x $ \y ->
bind m (h m xs) $ \ys ->
return m (y:ys)
In the function h we use the record selectors return
and bind to extract the polymorphic bind and return functions
from the MonadT data structure, rather than using pattern
matching.
In general, type inference for arbitrary-rank types is undecidable.
GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
to get a decidable algorithm by requiring some help from the programmer.
We do not yet have a formal specification of "some help" but the rule is this:
For a lambda-bound or case-bound variable, x, either the programmer
provides an explicit polymorphic type for x, or GHC's type inference will assume
that x's type has no foralls in it.
What does it mean to "provide" an explicit type for x? You can do that by
giving a type signature for x directly, using a pattern type signature
(Section 7.3.13), thus:
\ (f :: forall a. a->a) -> (f True, f 'c')
Alternatively, you can give a type signature to the enclosing
context, which GHC can "push down" to find the type for the variable:
(\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
Here the type signature on the expression can be pushed inwards
to give a type signature for f. Similarly, and more commonly,
one can give a type signature for the function itself:
h :: (forall a. a->a) -> (Bool,Char)
h f = (f True, f 'c')
You don't need to give a type signature if the lambda bound variable
is a constructor argument. Here is an example we saw earlier:
f :: T a -> a -> (a, Char)
f (T1 w k) x = (w k x, w 'c' 'd')
Here we do not need to give a type signature to w, because
it is an argument of constructor T1 and that tells GHC all
it needs to know.
GHC performs implicit quantification as follows. At the top level (only) of
user-written types, if and only if there is no explicit forall,
GHC finds all the type variables mentioned in the type that are not already
in scope, and universally quantifies them. For example, the following pairs are
equivalent:
f :: a -> a
f :: forall a. a -> a
g (x::a) = let
h :: a -> b -> b
h x y = y
in ...
g (x::a) = let
h :: forall b. a -> b -> b
h x y = y
in ...
Notice that GHC does not find the innermost possible quantification
point. For example:
f :: (a -> a) -> Int
-- MEANS
f :: forall a. (a -> a) -> Int
-- NOT
f :: (forall a. a -> a) -> Int
g :: (Ord a => a -> a) -> Int
-- MEANS the illegal type
g :: forall a. (Ord a => a -> a) -> Int
-- NOT
g :: (forall a. Ord a => a -> a) -> Int
The latter produces an illegal type, which you might think is silly,
but at least the rule is simple. If you want the latter type, you
can write your for-alls explicitly. Indeed, doing so is strongly advised
for rank-2 types.
Type synonyms are like macros at the type level, and
GHC does validity checking on types only after expanding type synonyms.
That means that GHC can be very much more liberal about type synonyms than Haskell 98:
You can write a forall (including overloading)
in a type synonym, thus:
type Discard a = forall b. Show b => a -> b -> (a, String)
f :: Discard a
f x y = (x, show y)
g :: Discard Int -> (Int,String) -- A rank-2 type
g f = f 3 True
You can write an unboxed tuple in a type synonym:
type Pr = (# Int, Int #)
h :: Int -> Pr
h x = (# x, x #)
You can apply a type synonym to a forall type:
type Foo a = a -> a -> Bool
f :: Foo (forall b. b->b)
After expanding the synonym, f has the legal (in GHC) type:
f :: (forall b. b->b) -> (forall b. b->b) -> Bool
You can apply a type synonym to a partially applied type synonym:
type Generic i o = forall x. i x -> o x
type Id x = x
foo :: Generic Id []
After expanding the synonym, foo has the legal (in GHC) type:
foo :: forall x. x -> [x]
GHC currently does kind checking before expanding synonyms (though even that
could be changed.)
After expanding type synonyms, GHC does validity checking on types, looking for
the following mal-formedness which isn't detected simply by kind checking:
Type constructor applied to a type involving for-alls.
Unboxed tuple on left of an arrow.
Partially-applied type synonym.
So, for example,
this will be rejected:
type Pr = (# Int, Int #)
h :: Pr -> Int
h x = ...
because GHC does not allow unboxed tuples on the left of a function arrow.
It is often convenient to use generalised type synonyms at the right hand
end of an arrow, thus:
type Discard a = forall b. a -> b -> a
g :: Int -> Discard Int
g x y z = x+y
Simply expanding the type synonym would give
g :: Int -> (forall b. Int -> b -> Int)
but GHC "hoists" the forall to give the isomorphic type
g :: forall b. Int -> Int -> b -> Int
In general, the rule is this: to determine the type specified by any explicit
user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
performs the transformation:
type1 -> forall a1..an. context => type2
  ==>
forall a1..an. context => type1 -> type2
(In fact, GHC tries to retain as much synonym information as possible for use in
error messages, but that is a usability issue.) This rule applies, of course, whether
or not the forall comes from a synonym. For example, here is another
valid way to write g's type signature:
g :: Int -> Int -> forall b. b -> Int
When doing this hoisting operation, GHC eliminates duplicate constraints. For
example:
type Foo a = (?x::Int) => Bool -> a
g :: Foo (Foo Int)
After expansion and hoisting, g gets the type
g :: (?x::Int) => Bool -> Bool -> Int
with a single (?x::Int) constraint, rather than the duplicated constraint that
naive expansion would produce.
The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
correct me), and implemented in Hope+. It's been in Lennart
Augustsson's hbc Haskell compiler for several years, and
proved very useful. Here's the idea. Consider the declaration:
data Foo = forall a. MkFoo a (a -> Bool)
| Nil
The data type Foo has two constructors with types:
MkFoo :: forall a. a -> (a -> Bool) -> Foo
Nil :: Foo
Notice that the type variable a in the type of MkFoo
does not appear in the data type itself, which is plain Foo.
For example, the following expression is fine:
[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
Here, (MkFoo 3 even) packages an integer with a function
even that maps an integer to Bool; and MkFoo 'c'
isUpper packages a character with a compatible function. These
two things are each of type Foo and can be put in a list.
What can we do with a value of type Foo? In particular,
what happens when we pattern-match on MkFoo?
f (MkFoo val fn) = ???
Since all we know about val and fn is that they
are compatible, the only (useful) thing we can do with them is to
apply fn to val to get a boolean. For example:
f :: Foo -> Bool
f (MkFoo val fn) = fn val
What this allows us to do is to package heterogeneous values
together with a bunch of functions that manipulate them, and then treat
that collection of packages in a uniform manner. You can express
quite a bit of object-oriented-like programming this way.
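For instance, here is a hedged sketch of that style (the names Obj, MkObj and
describeAll are illustrative, not part of GHC):
data Obj = forall a. MkObj a (a -> String)
describeAll :: [Obj] -> [String]
describeAll []                 = []
describeAll (MkObj x f : rest) = f x : describeAll rest
objs :: [Obj]
objs = [MkObj (3 :: Int) show, MkObj 'c' (\c -> [c])]
Each MkObj package carries a value together with the only operation we are
allowed to apply to it.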
What has this to do with existential quantification?
Simply that MkFoo has the (nearly) isomorphic type
MkFoo :: (exists a . (a, a -> Bool)) -> Foo
But Haskell programmers can safely think of the ordinary
universally quantified type given above, thereby avoiding
adding a new existential quantification construct.
An easy extension (implemented in hbc) is to allow
arbitrary contexts before the constructor. For example:
data Baz = forall a. Eq a => Baz1 a a
| forall b. Show b => Baz2 b (b -> b)
The two constructors have the types you'd expect:
Baz1 :: forall a. Eq a => a -> a -> Baz
Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
But when pattern matching on Baz1 the matched values can be compared
for equality, and when pattern matching on Baz2 the first matched
value can be converted to a string (as well as applying the function to it).
So this program is legal:
f :: Baz -> String
f (Baz1 p q) | p == q = "Yes"
| otherwise = "No"
f (Baz2 v fn) = show (fn v)
Operationally, in a dictionary-passing implementation, the
constructors Baz1 and Baz2 must store the
dictionaries for Eq and Show respectively, and
extract it on pattern matching.
Notice the way that the syntax fits smoothly with that used for
universal quantification earlier.
There are several restrictions on the ways in which existentially-quantified
constructors can be used.
When pattern matching, each pattern match introduces a new,
distinct, type for each existential type variable. These types cannot
be unified with any other type, nor can they escape from the scope of
the pattern match. For example, these fragments are incorrect:
f1 (MkFoo a f) = a
Here, the type bound by MkFoo "escapes", because a
is the result of f1. One way to see why this is wrong is to
ask what type f1 has:
f1 :: Foo -> a -- Weird!
What is this "a" in the result type? Clearly we don't mean
this:
f1 :: forall a. Foo -> a -- Wrong!
The original program is just plain wrong. Here's another sort of error
f2 (Baz1 a b) (Baz1 p q) = a==q
It's ok to say a==b or p==q, but
a==q is wrong because it equates the two distinct types arising
from the two Baz1 constructors.
You can't pattern-match on an existentially quantified
constructor in a let or where group of
bindings. So this is illegal:
f3 x = a==b where { Baz1 a b = x }
Instead, use a case expression:
f3 x = case x of Baz1 a b -> a==b
In general, you can only pattern-match
on an existentially-quantified constructor in a case expression or
in the patterns of a function definition.
The reason for this restriction is really an implementation one.
Type-checking binding groups is already a nightmare without
existentials complicating the picture. Also an existential pattern
binding at the top level of a module doesn't make sense, because it's
not clear how to prevent the existentially-quantified type "escaping".
So for now, there's a simple-to-state restriction. We'll see how
annoying it is.
You can't use existential quantification for newtype
declarations. So this is illegal:
newtype T = forall a. Ord a => MkT a
Reason: a value of type T must be represented as a pair
of a dictionary for Ord t and a value of type t.
That contradicts the idea that newtype should have no
concrete representation. You can get just the same efficiency and effect
by using data instead of newtype. If there is no
overloading involved, then there is more of a case for allowing
an existentially-quantified newtype, because the data
version does carry an implementation cost,
but single-field existentially quantified constructors aren't much
use. So the simple restriction (no existential stuff on newtype)
stands, unless there are convincing reasons to change it.
You can't use deriving to define instances of a
data type with existentially quantified data constructors.
Reason: in most cases it would not make sense. For example:
data T = forall a. MkT [a] deriving( Eq )
To derive Eq in the standard way we would need to have equality
between the single component of two MkT constructors:
instance Eq T where
(MkT a) == (MkT b) = ???
But a and b have distinct types, and so can't be compared.
It's just about possible to imagine examples in which the derived instance
would make sense, but it seems altogether simpler simply to prohibit such
declarations. Define your own instances!
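For instance, a hand-written instance for an existential constructor that
carries its own Show context might look like this (a hedged sketch; the type U
is illustrative):
data U = forall a. Show a => MkU [a]
instance Show U where
  show (MkU xs) = "MkU " ++ show xs
The Show a dictionary stored by MkU is exactly what show xs needs.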
A pattern type signature can introduce a scoped type
variable. For example
f (xs::[a]) = ys ++ ys
where
ys :: [a]
ys = reverse xs
The pattern (xs::[a]) includes a type signature for xs.
This brings the type variable a into scope; it scopes over
all the patterns and right hand sides for this equation for f.
In particular, it is in scope at the type signature for ys.
Pattern type signatures are completely orthogonal to ordinary, separate
type signatures. The two can be used independently or together.
At ordinary type signatures, such as that for ys, any type variables
mentioned in the type signature that are not in scope are
implicitly universally quantified. (If there are no type variables in
scope, all type variables mentioned in the signature are universally
quantified, which is just as in Haskell 98.) In this case, since a
is in scope, it is not universally quantified, so the type of ys is
the same as that of xs. In Haskell 98 it is not possible to declare
a type for ys; a major benefit of scoped type variables is that
it becomes possible to do so.
Scoped type variables are implemented in both GHC and Hugs. Where the
implementations differ from the specification below, those differences
are noted.
A type variable brought into scope by a pattern type signature is simply
the name for a type. The restriction they express is that all occurrences
of the same name mean the same type. For example:
f :: [Int] -> Int -> Int
f (xs::[a]) (y::a) = (head xs + y) :: a
The pattern type signatures on the left hand side of
f express the fact that xs
must be a list of things of some type a; and that y
must have this same type. The type signature on the expression (head xs + y)
specifies that this expression must have the same type a.
There is no requirement that the type named by "a" is
in fact a type variable. Indeed, in this case, the type named by "a" is
Int. (This is a slight liberalisation from the original rather complex
rules, which specified that a pattern-bound type variable should be universally quantified.)
For example, all of these are legal:
t (x::a) (y::a) = x+y*2
f (x::a) (y::b) = [x,y] -- a unifies with b
g (x::a) = x + 1::Int -- a unifies with Int
h x = let k (y::a) = [x,y] -- a is free in the
in k x -- environment
k (x::a) True = ... -- a unifies with Int
k (x::Int) False = ...
w :: [b] -> [b]
w (x::a) = x -- a unifies with [b]
All the type variables mentioned in a pattern,
that are not already in scope,
are brought into scope by the pattern. We describe this set as
the type variables bound by the pattern.
For example:
f (x::a) = let g (y::(a,b)) = fst y
in
g (x,True)
The pattern (x::a) brings the type variable
a into scope, as well as the term
variable x. The pattern (y::(a,b))
contains an occurrence of the already-in-scope type variable a,
and brings into scope the type variable b.
The type variable(s) bound by the pattern have the same scope
as the term variable(s) bound by the pattern. For example:
let
f (x::a) = <...rhs of f...>
(p::b, q::b) = (1,2)
in <...body of let...>
Here, the type variable a scopes over the right hand side of f,
just like x does; while the type variable b scopes over the
body of the let, and all the other definitions in the let,
just like p and q do.
Indeed, the newly bound type variables also scope over any ordinary, separate
type signatures in the let group.
The type variables bound by the pattern may be
mentioned in ordinary type signatures or pattern
type signatures anywhere within their scope.
In ordinary type signatures, any type variable mentioned in the
signature that is in scope is not universally quantified.
Ordinary type signatures do not bring any new type variables
into scope (except in the type signature itself!). So this is illegal:
f :: a -> a
f x = x::a
It's illegal because a is not in scope in the body of f,
so the ordinary signature x::a is equivalent to x::forall a.a;
and that is an incorrect typing.
The pattern type signature is a monotype:
A pattern type signature cannot contain any explicit forall quantification.
The type variables bound by a pattern type signature can only be instantiated to monotypes,
not to type schemes.
There is no implicit universal quantification on pattern type signatures (in contrast to
ordinary type signatures).
The type variables in the head of a class or instance declaration
scope over the methods defined in the where part. For example:
class C a where
op :: [a] -> a
op xs = let ys::[a]
ys = reverse xs
in
head ys
A pattern type signature can occur in any pattern. For example:
A pattern type signature can be on an arbitrary sub-pattern, not
just on a variable:
f ((x,y)::(a,b)) = (y,x) :: (b,a)
Pattern type signatures, including the result part, can be used
in lambda abstractions:
(\ (x::a, y) :: a -> x)
Pattern type signatures, including the result part, can be used
in case expressions:
case e of { (x::a, y) :: a -> x }
To avoid ambiguity, the type after the “::” in a result
pattern signature on a lambda or case must be atomic (i.e. a single
token or a parenthesised type of some sort). To see why,
consider how one would parse this:
\ x :: a -> b -> x
Pattern type signatures can bind existential type variables.
For example:
data T = forall a. MkT [a]
f :: T -> T
f (MkT [t::a]) = MkT t3
where
t3::[a] = [t,t,t]
Pattern type signatures
can be used in pattern bindings:
f x = let (y, z::a) = x in ...
f1 x = let (y, z::Int) = x in ...
f2 (x::(Int,a)) = let (y, z::a) = x in ...
f3 :: (b->b) = \x -> x
In all such cases, the binding is not generalised over the pattern-bound
type variables. Thus f3 is monomorphic; f3
has type b -> b for some type b,
and not forall b. b -> b.
In contrast, the binding
f4 :: b->b
f4 = \x -> x
makes a polymorphic function, but b is not in scope anywhere
in f4's scope.