Google

NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.64 ">

7.3. Type system extensions

7.3.1. Data types with no constructors

With the -fglasgow-exts flag, GHC lets you declare a data type with no constructors. For example:

  data S      -- S :: *
  data T a    -- T :: * -> *

Syntactically, the declaration lacks the "= constrs" part. The type can be parameterised over types of any kind, but if the kind is not * then an explicit kind annotation must be used (see Section 7.3.3).

Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining "phantom types".

7.3.2. Infix type constructors

GHC allows type constructors to be operators, and to be written infix, very much like expressions. More specifically:

  • A type constructor can be an operator, beginning with a colon; e.g. :*:. The lexical syntax is the same as that for data constructors.

  • Types can be written infix. For example Int :*: Bool.

  • Back-quotes work as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool.

  • Fixities may be declared for type constructors just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:
      infixl 7 T, :*:
    sets the fixity for both type constructor T and data constructor T, and similarly for :*:. Int `a` Bool.

  • Function arrow is infixr with fixity 0. (This might change; I'm not sure what it should be.)

  • Data type and type-synonym declarations can be written infix. E.g.
      data a :*: b = Foo a b
      type a :+: b = Either a b

  • The only thing that differs between operators in types and operators in expressions is that ordinary non-constructor operators, such as + and * are not allowed in types. Reason: the uniform thing to do would be to make them type variables, but that's not very useful. A less uniform but more useful thing would be to allow them to be type constructors. But that gives trouble in export lists. So for now we just exclude them.

7.3.3. Explicitly-kinded quantification

Haskell infers the kind of each type variable. Sometimes it is nice to be able to give the kind explicitly as (machine-checked) documentation, just as it is nice to give a type signature for a function. On some occasions, it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) John Hughes had to define the data type:
     data Set cxt a = Set [a]
                    | Unused (cxt a -> ())
The only use for the Unused constructor was to force the correct kind for the type variable cxt.

GHC now instead allows you to specify the kind of a type variable directly, wherever a type variable is explicitly bound. Namely:

  • data declarations:
      data Set (cxt :: * -> *) a = Set [a]

  • type declarations:
      type T (f :: * -> *) = f Int

  • class declarations:
      class (Eq a) => C (f :: * -> *) a where ...

  • forall's in type signatures:
      f :: forall (cxt :: * -> *). Set cxt Int

The parentheses are required. Some of the spaces are required too, to separate the lexemes. If you write (f::*->*) you will get a parse error, because "::*->*" is a single lexeme in Haskell.

As part of the same extension, you can put kind annotations in types as well. Thus:
   f :: (Int :: *) -> Int
   g :: forall a. a -> (a :: *)
The syntax is
   atype ::= '(' ctype '::' kind ')
The parentheses are required.

7.3.4. Class method types

Haskell 98 prohibits class method types to mention constraints on the class type variable, thus:
  class Seq s a where
    fromList :: [a] -> s a
    elem     :: Eq a => a -> s a -> Bool
The type of elem is illegal in Haskell 98, because it contains the constraint Eq a, constrains only the class type variable (in this case a).

With the -fglasgow-exts GHC lifts this restriction.

7.3.5. Multi-parameter type classes

This section documents GHC's implementation of multi-parameter type classes. There's lots of background in the paper Type classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer).

I'd like to thank people who reported shorcomings in the GHC 3.02 implementation. Our default decisions were all conservative ones, and the experience of these heroic pioneers has given useful concrete examples to support several generalisations. (These appear below as design choices not implemented in 3.02.)

I've discussed these notes with Mark Jones, and I believe that Hugs will migrate towards the same design choices as I outline here. Thanks to him, and to many others who have offered very useful feedback.

7.3.5.1. Types

There are the following restrictions on the form of a qualified type:

  forall tv1..tvn (c1, ...,cn) => type

(Here, I write the "foralls" explicitly, although the Haskell source language omits them; in Haskell 1.4, all the free type variables of an explicit source-language type signature are universally quantified, except for the class type variables in a class declaration. However, in GHC, you can give the foralls if you want. See Section 7.3.9).

  1. Each universally quantified type variable tvi must be mentioned (i.e. appear free) in type. The reason for this is that a value with a type that does not obey this restriction could not be used without introducing ambiguity. Here, for example, is an illegal type:
      forall a. Eq a => Int
    When a value with this type was used, the constraint Eq tv would be introduced where tv is a fresh type variable, and (in the dictionary-translation implementation) the value would be applied to a dictionary for Eq tv. The difficulty is that we can never know which instance of Eq to use because we never get any more information about tv.

  2. Every constraint ci must mention at least one of the universally quantified type variables tvi. For example, this type is OK because C a b mentions the universally quantified type variable b:
      forall a. C a b => burble
    The next type is illegal because the constraint Eq b does not mention a:
      forall a. Eq b => burble
    The reason for this restriction is milder than the other one. The excluded types are never useful or necessary (because the offending context doesn't need to be witnessed at this point; it can be floated out). Furthermore, floating them out increases sharing. Lastly, excluding them is a conservative choice; it leaves a patch of territory free in case we need it later.

These restrictions apply to all types, whether declared in a type signature or inferred.

Unlike Haskell 1.4, constraints in types do not have to be of the form (class type-variables). Thus, these type signatures are perfectly OK

  f :: Eq (m a) => [m a] -> [m a]
  g :: Eq [a] => ...

This choice recovers principal types, a property that Haskell 1.4 does not have.

7.3.5.2. Class declarations

  1. Multi-parameter type classes are permitted. For example:
      class Collection c a where
        union :: c a -> c a -> c a
        ...etc.

  2. The class hierarchy must be acyclic. However, the definition of "acyclic" involves only the superclass relationships. For example, this is OK:
      class C a where {
        op :: D b => a -> b -> b
      }
    
      class C a => D a where { ... }
    Here, C is a superclass of D, but it's OK for a class operation op of C to mention D. (It would not be OK for D to be a superclass of C.)

  3. There are no restrictions on the context in a class declaration (which introduces superclasses), except that the class hierarchy must be acyclic. So these class declarations are OK:
      class Functor (m k) => FiniteMap m k where
        ...
    
      class (Monad m, Monad (t m)) => Transform t m where
        lift :: m a -> (t m) a

  4. In the signature of a class operation, every constraint must mention at least one type variable that is not a class type variable. Thus:
      class Collection c a where
        mapC :: Collection c b => (a->b) -> c a -> c b
    is OK because the constraint (Collection a b) mentions b, even though it also mentions the class variable a. On the other hand:
      class C a where
        op :: Eq a => (a,b) -> (a,b)
    is not OK because the constraint (Eq a) mentions on the class type variable a, but not b. However, any such example is easily fixed by moving the offending context up to the superclass context:
      class Eq a => C a where
        op ::(a,b) -> (a,b)
    A yet more relaxed rule would allow the context of a class-op signature to mention only class type variables. However, that conflicts with Rule 1(b) for types above.

  5. The type of each class operation must mention all of the class type variables. For example:
      class Coll s a where
        empty  :: s
        insert :: s -> a -> s
    is not OK, because the type of empty doesn't mention a. This rule is a consequence of Rule 1(a), above, for types, and has the same motivation. Sometimes, offending class declarations exhibit misunderstandings. For example, Coll might be rewritten
      class Coll s a where
        empty  :: s a
        insert :: s a -> a -> s a
    which makes the connection between the type of a collection of a's (namely (s a)) and the element type a. Occasionally this really doesn't work, in which case you can split the class like this:
      class CollE s where
        empty  :: s
    
      class CollE s => Coll s a where
        insert :: s -> a -> s

7.3.5.3. Instance declarations

  1. Instance declarations may not overlap. The two instance declarations
      instance context1 => C type1 where ...
      instance context2 => C type2 where ...
    "overlap" if type1 and type2 unify However, if you give the command line option -fallow-overlapping-instances then overlapping instance declarations are permitted. However, GHC arranges never to commit to using an instance declaration if another instance declaration also applies, either now or later.

    • EITHER type1 and type2 do not unify

    • OR type2 is a substitution instance of type1 (but not identical to type1), or vice versa.

    Notice that these rules

    • make it clear which instance decl to use (pick the most specific one that matches)

    • do not mention the contexts context1, context2 Reason: you can pick which instance decl "matches" based on the type.

    However the rules are over-conservative. Two instance declarations can overlap, but it can still be clear in particular situations which to use. For example:
      instance C (Int,a) where ...
      instance C (a,Bool) where ...
    These are rejected by GHC's rules, but it is clear what to do when trying to solve the constraint C (Int,Int) because the second instance cannot apply. Yell if this restriction bites you.

    GHC is also conservative about committing to an overlapping instance. For example:
      class C a where { op :: a -> a }
      instance C [Int] where ...
      instance C a => C [a] where ...
      
      f :: C b => [b] -> [b]
      f x = op x
    From the RHS of f we get the constraint C [b]. But GHC does not commit to the second instance declaration, because in a paricular call of f, b might be instantiate to Int, so the first instance declaration would be appropriate. So GHC rejects the program. If you add -fallow-incoherent-instances GHC will instead silently pick the second instance, without complaining about the problem of subsequent instantiations.

    Regrettably, GHC doesn't guarantee to detect overlapping instance declarations if they appear in different modules. GHC can "see" the instance declarations in the transitive closure of all the modules imported by the one being compiled, so it can "see" all instance decls when it is compiling Main. However, it currently chooses not to look at ones that can't possibly be of use in the module currently being compiled, in the interests of efficiency. (Perhaps we should change that decision, at least for Main.)

  2. There are no restrictions on the type in an instance head, except that at least one must not be a type variable. The instance "head" is the bit after the "=>" in an instance decl. For example, these are OK:
      instance C Int a where ...
    
      instance D (Int, Int) where ...
    
      instance E [[a]] where ...
    Note that instance heads may contain repeated type variables. For example, this is OK:
      instance Stateful (ST s) (MutVar s) where ...
    The "at least one not a type variable" restriction is to ensure that context reduction terminates: each reduction step removes one type constructor. For example, the following would make the type checker loop if it wasn't excluded:
      instance C a => C a where ...
    There are two situations in which the rule is a bit of a pain. First, if one allows overlapping instance declarations then it's quite convenient to have a "default instance" declaration that applies if something more specific does not:
      instance C a where
        op = ... -- Default
    Second, sometimes you might want to use the following to get the effect of a "class synonym":
      class (C1 a, C2 a, C3 a) => C a where { }
    
      instance (C1 a, C2 a, C3 a) => C a where { }
    This allows you to write shorter signatures:
      f :: C a => ...
    instead of
      f :: (C1 a, C2 a, C3 a) => ...
    I'm on the lookout for a simple rule that preserves decidability while allowing these idioms. The experimental flag -fallow-undecidable-instances lifts this restriction, allowing all the types in an instance head to be type variables.

  3. Unlike Haskell 1.4, instance heads may use type synonyms. As always, using a type synonym is just shorthand for writing the RHS of the type synonym definition. For example:
      type Point = (Int,Int)
      instance C Point   where ...
      instance C [Point] where ...
    is legal. However, if you added
      instance C (Int,Int) where ...
    as well, then the compiler will complain about the overlapping (actually, identical) instance declarations. As always, type synonyms must be fully applied. You cannot, for example, write:
      type P a = [[a]]
      instance Monad P where ...
    This design decision is independent of all the others, and easily reversed, but it makes sense to me.

  4. The types in an instance-declaration context must all be type variables. Thus
    instance C a b => Eq (a,b) where ...
    is OK, but
    instance C Int b => Foo b where ...
    is not OK. Again, the intent here is to make sure that context reduction terminates. Voluminous correspondence on the Haskell mailing list has convinced me that it's worth experimenting with a more liberal rule. If you use the flag -fallow-undecidable-instances can use arbitrary types in an instance context. Termination is ensured by having a fixed-depth recursion stack. If you exceed the stack depth you get a sort of backtrace, and the opportunity to increase the stack depth with -fcontext-stackN.

7.3.6. Implicit parameters

Implicit paramters are implemented as described in "Implicit parameters: dynamic scoping with static types", J Lewis, MB Shields, E Meijer, J Launchbury, 27th ACM Symposium on Principles of Programming Languages (POPL'00), Boston, Jan 2000.

(Most of the following, stil rather incomplete, documentation is due to Jeff Lewis.)

A variable is called dynamically bound when it is bound by the calling context of a function and statically bound when bound by the callee's context. In Haskell, all variables are statically bound. Dynamic binding of variables is a notion that goes back to Lisp, but was later discarded in more modern incarnations, such as Scheme. Dynamic binding can be very confusing in an untyped language, and unfortunately, typed languages, in particular Hindley-Milner typed languages like Haskell, only support static scoping of variables.

However, by a simple extension to the type class system of Haskell, we can support dynamic binding. Basically, we express the use of a dynamically bound variable as a constraint on the type. These constraints lead to types of the form (?x::t') => t, which says "this function uses a dynamically-bound variable ?x of type t'". For example, the following expresses the type of a sort function, implicitly parameterized by a comparison function named cmp.
  sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
The dynamic binding constraints are just a new form of predicate in the type class system.

An implicit parameter is introduced by the special form ?x, where x is any valid identifier. Use if this construct also introduces new dynamic binding constraints. For example, the following definition shows how we can define an implicitly parameterized sort function in terms of an explicitly parameterized sortBy function:
  sortBy :: (a -> a -> Bool) -> [a] -> [a]

  sort   :: (?cmp :: a -> a -> Bool) => [a] -> [a]
  sort    = sortBy ?cmp
Dynamic binding constraints behave just like other type class constraints in that they are automatically propagated. Thus, when a function is used, its implicit parameters are inherited by the function that called it. For example, our sort function might be used to pick out the least value in a list:
  least   :: (?cmp :: a -> a -> Bool) => [a] -> a
  least xs = fst (sort xs)
Without lifting a finger, the ?cmp parameter is propagated to become a parameter of least as well. With explicit parameters, the default is that parameters must always be explicit propagated. With implicit parameters, the default is to always propagate them.

An implicit parameter differs from other type class constraints in the following way: All uses of a particular implicit parameter must have the same type. This means that the type of (?x, ?x) is (?x::a) => (a,a), and not (?x::a, ?x::b) => (a, b), as would be the case for type class constraints.

An implicit parameter is bound using the standard let binding form, where the bindings must be a collection of simple bindings to implicit-style variables (no function-style bindings, and no type signatures); these bindings are neither polymorphic or recursive. This form binds the implicit parameters arising in the body, not the free variables as a let or where would do. For example, we define the min function by binding cmp.

  min :: [a] -> a
  min  = let ?cmp = (<=) in least

Note the following additional constraints:

  • You can't have an implicit parameter in the context of a class or instance declaration. For example, both these declarations are illegal:
      class (?x::Int) => C a where ...
      instance (?x::a) => Foo [a] where ...
    Reason: exactly which implicit parameter you pick up depends on exactly where you invoke a function. But the ``invocation'' of instance declarations is done behind the scenes by the compiler, so it's hard to figure out exactly where it is done. Easiest thing is to outlaw the offending types.

7.3.7. Linear implicit parameters

Linear implicit parameters are an idea developed by Koen Claessen, Mark Shields, and Simon PJ. They address the long-standing problem that monads seem over-kill for certain sorts of problem, notably:

  • distributing a supply of unique names

  • distributing a suppply of random numbers

  • distributing an oracle (as in QuickCheck)

Linear implicit parameters are just like ordinary implicit parameters, except that they are "linear" -- that is, they cannot be copied, and must be explicitly "split" instead. Linear implicit parameters are written '%x' instead of '?x'. (The '/' in the '%' suggests the split!)

For example:
    import GHC.Exts( Splittable )

    data NameSupply = ...
    
    splitNS :: NameSupply -> (NameSupply, NameSupply)
    newName :: NameSupply -> Name

    instance Splittable NameSupply where
	split = splitNS


    f :: (%ns :: NameSupply) => Env -> Expr -> Expr
    f env (Lam x e) = Lam x' (f env e)
		    where
		      x'   = newName %ns
		      env' = extend env x x'
    ...more equations for f...
Notice that the implicit parameter %ns is consumed

  • once by the call to newName

  • once by the recursive call to f

So the translation done by the type checker makes the parameter explicit:
    f :: NameSupply -> Env -> Expr -> Expr
    f ns env (Lam x e) = Lam x' (f ns1 env e)
		       where
	 		 (ns1,ns2) = splitNS ns
			 x' = newName ns2
			 env = extend env x x'
Notice the call to 'split' introduced by the type checker. How did it know to use 'splitNS'? Because what it really did was to introduce a call to the overloaded function 'split', defined by the class Splittable:
	class Splittable a where
	  split :: a -> (a,a)
The instance for Splittable NameSupply tells GHC how to implement split for name supplies. But we can simply write
	g x = (x, %ns, %ns)
and GHC will infer
	g :: (Splittable a, %ns :: a) => b -> (b,a,a)
The Splittable class is built into GHC. It's exported by module GHC.Exts.

Other points:

  • '?x' and '%x' are entirely distinct implicit parameters: you can use them together and they won't intefere with each other.

  • You can bind linear implicit parameters in 'with' clauses.

  • You cannot have implicit parameters (whether linear or not) in the context of a class or instance declaration.

7.3.7.1. Warnings

The monomorphism restriction is even more important than usual. Consider the example above:
    f :: (%ns :: NameSupply) => Env -> Expr -> Expr
    f env (Lam x e) = Lam x' (f env e)
		    where
		      x'   = newName %ns
		      env' = extend env x x'
If we replaced the two occurrences of x' by (newName %ns), which is usually a harmless thing to do, we get:
    f :: (%ns :: NameSupply) => Env -> Expr -> Expr
    f env (Lam x e) = Lam (newName %ns) (f env e)
		    where
		      env' = extend env x (newName %ns)
But now the name supply is consumed in three places (the two calls to newName,and the recursive call to f), so the result is utterly different. Urk! We don't even have the beta rule.

Well, this is an experimental change. With implicit parameters we have already lost beta reduction anyway, and (as John Launchbury puts it) we can't sensibly reason about Haskell programs without knowing their typing.

7.3.7.2. Recursive functions

Linear implicit parameters can be particularly tricky when you have a recursive function Consider
        foo :: %x::T => Int -> [Int]
        foo 0 = []
        foo n = %x : foo (n-1)
where T is some type in class Splittable.

Do you get a list of all the same T's or all different T's (assuming that split gives two distinct T's back)?

If you supply the type signature, taking advantage of polymorphic recursion, you get what you'd probably expect. Here's the translated term, where the implicit param is made explicit:
        foo x 0 = []
        foo x n = let (x1,x2) = split x
                  in x1 : foo x2 (n-1)
But if you don't supply a type signature, GHC uses the Hindley Milner trick of using a single monomorphic instance of the function for the recursive calls. That is what makes Hindley Milner type inference work. So the translation becomes
        foo x = let
                  foom 0 = []
                  foom n = x : foom (n-1)
                in
                foom
Result: 'x' is not split, and you get a list of identical T's. So the semantics of the program depends on whether or not foo has a type signature. Yikes!

You may say that this is a good reason to dislike linear implicit parameters and you'd be right. That is why they are an experimental feature.

7.3.8. Functional dependencies

Functional dependencies are implemented as described by Mark Jones in “Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, .

There should be more documentation, but there isn't (yet). Yell if you need it.

7.3.9. Arbitrary-rank polymorphism

Haskell type signatures are implicitly quantified. The new keyword forall allows us to say exactly what this means. For example:

        g :: b -> b
means this:
        g :: forall b. (b -> b)
The two are treated identically.

However, GHC's type system supports arbitrary-rank explicit universal quantification in types. For example, all the following types are legal:
    f1 :: forall a b. a -> b -> a
    g1 :: forall a b. (Ord a, Eq  b) => a -> b -> a

    f2 :: (forall a. a->a) -> Int -> Int
    g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int

    f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
Here, f1 and g1 are rank-1 types, and can be written in standard Haskell (e.g. f1 :: a->b->a). The forall makes explicit the universal quantification that is implicitly added by Haskell.

The functions f2 and g2 have rank-2 types; the forall is on the left of a function arrrow. As g2 shows, the polymorphic type on the left of the function arrow can be overloaded.

The functions f3 and g3 have rank-3 types; they have rank-2 types on the left of a function arrow.

GHC allows types of arbitrary rank; you can nest foralls arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but that restriction has now been lifted.) In particular, a forall-type (also called a "type scheme"), including an operational type class context, is legal:

  • On the left of a function arrow

  • On the right of a function arrow (see Section 7.3.11)

  • As the argument of a constructor, or type of a field, in a data type declaration. For example, any of the f1,f2,f3,g1,g2,g3 above would be valid field type signatures.

  • As the type of an implicit parameter

  • In a pattern type signature (see Section 7.3.13)

There is one place you cannot put a forall: you cannot instantiate a type variable with a forall-type. So you cannot make a forall-type the argument of a type constructor. So these types are illegal:
    x1 :: [forall a. a->a]
    x2 :: (forall a. a->a, Int)
    x3 :: Maybe (forall a. a->a)
Of course forall becomes a keyword; you can't use forall as a type variable any more!

7.3.9.1. Examples

In a data or newtype declaration one can quantify the types of the constructor arguments. Here are several examples:

data T a = T1 (forall b. b -> b -> b) a

data MonadT m = MkMonad { return :: forall a. a -> m a,
                          bind   :: forall a b. m a -> (a -> m b) -> m b
                        }

newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])

The constructors have rank-2 types:

T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
MkMonad :: forall m. (forall a. a -> m a)
                  -> (forall a b. m a -> (a -> m b) -> m b)
                  -> MonadT m
MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle

Notice that you don't need to use a forall if there's an explicit context. For example in the first argument of the constructor MkSwizzle, an implicit "forall a." is prefixed to the argument type. The implicit forall quantifies all type variables that are not already in scope, and are mentioned in the type quantified over.

As for type signatures, implicit quantification happens for non-overloaded types too. So if you write this:
  data T a = MkT (Either a b) (b -> b)
it's just as if you had written this:
  data T a = MkT (forall b. Either a b) (forall b. b -> b)
That is, since the type variable b isn't in scope, it's implicitly universally quantified. (Arguably, it would be better to require explicit quantification on constructor arguments where that is what is wanted. Feedback welcomed.)

You construct values of types T1, MonadT, Swizzle by applying the constructor to suitable values, just as usual. For example,

    a1 :: T Int
    a1 = T1 (\xy->x) 3
    
    a2, a3 :: Swizzle
    a2 = MkSwizzle sort
    a3 = MkSwizzle reverse
    
    a4 :: MonadT Maybe
    a4 = let r x = Just x
	     b m k = case m of
		       Just y -> k y
		       Nothing -> Nothing
         in
         MkMonad r b

    mkTs :: (forall b. b -> b -> b) -> a -> [T a]
    mkTs f x y = [T1 f x, T1 f y]

The type of the argument can, as usual, be more general than the type required, as (MkSwizzle reverse) shows. (reverse does not need the Ord constraint.)

When you use pattern matching, the bound variables may now have polymorphic types. For example:

    f :: T a -> a -> (a, Char)
    f (T1 w k) x = (w k x, w 'c' 'd')

    g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
    g (MkSwizzle s) xs f = s (map f (s xs))

    h :: MonadT m -> [m a] -> m [a]
    h m [] = return m []
    h m (x:xs) = bind m x          $ \y ->
                 bind m (h m xs)   $ \ys ->
                 return m (y:ys)

In the function h we use the record selectors return and bind to extract the polymorphic bind and return functions from the MonadT data structure, rather than using pattern matching.

7.3.9.2. Type inference

In general, type inference for arbitrary-rank types is undecideable. GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) to get a decidable algorithm by requiring some help from the programmer. We do not yet have a formal specification of "some help" but the rule is this:

For a lambda-bound or case-bound variable, x, either the programmer provides an explicit polymorphic type for x, or GHC's type inference will assume that x's type has no foralls in it.

What does it mean to "provide" an explicit type for x? You can do that by giving a type signature for x directly, using a pattern type signature (Section 7.3.13), thus:
     \ f :: (forall a. a->a) -> (f True, f 'c')
Alternatively, you can give a type signature to the enclosing context, which GHC can "push down" to find the type for the variable:
     (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
Here the type signature on the expression can be pushed inwards to give a type signature for f. Similarly, and more commonly, one can give a type signature for the function itself:
     h :: (forall a. a->a) -> (Bool,Char)
     h f = (f True, f 'c')
You don't need to give a type signature if the lambda bound variable is a constructor argument. Here is an example we saw earlier:
    f :: T a -> a -> (a, Char)
    f (T1 w k) x = (w k x, w 'c' 'd')
Here we do not need to give a type signature to w, because it is an argument of constructor T1 and that tells GHC all it needs to know.

7.3.9.3. Implicit quantification

GHC performs implicit quantification as follows. At the top level (only) of user-written types, if and only if there is no explicit forall, GHC finds all the type variables mentioned in the type that are not already in scope, and universally quantifies them. For example, the following pairs are equivalent:
  f :: a -> a
  f :: forall a. a -> a

  g (x::a) = let
                h :: a -> b -> b
                h x y = y
             in ...
  g (x::a) = let
                h :: forall b. a -> b -> b
                h x y = y
             in ...

Notice that GHC does not find the innermost possible quantification point. For example:
  f :: (a -> a) -> Int
           -- MEANS
  f :: forall a. (a -> a) -> Int
           -- NOT
  f :: (forall a. a -> a) -> Int


  g :: (Ord a => a -> a) -> Int
           -- MEANS the illegal type
  g :: forall a. (Ord a => a -> a) -> Int
           -- NOT
  g :: (forall a. Ord a => a -> a) -> Int
The latter produces an illegal type, which you might think is silly, but at least the rule is simple. If you want the latter type, you can write your for-alls explicitly. Indeed, doing so is strongly advised for rank-2 types.

7.3.10. Liberalised type synonyms

Type synonmys are like macros at the type level, and GHC does validity checking on types only after expanding type synonyms. That means that GHC can be very much more liberal about type synonyms than Haskell 98:

  • You can write a forall (including overloading) in a type synonym, thus:
      type Discard a = forall b. Show b => a -> b -> (a, String)
    
      f :: Discard a
      f x y = (x, show y)
    
      g :: Discard Int -> (Int,Bool)    -- A rank-2 type
      g f = f Int True

  • You can write an unboxed tuple in a type synonym:
      type Pr = (# Int, Int #)
    
      h :: Int -> Pr
      h x = (# x, x #)

  • You can apply a type synonym to a forall type:
      type Foo a = a -> a -> Bool
     
      f :: Foo (forall b. b->b)
    After expanding the synonym, f has the legal (in GHC) type:
      f :: (forall b. b->b) -> (forall b. b->b) -> Bool

  • You can apply a type synonym to a partially applied type synonym:
      type Generic i o = forall x. i x -> o x
      type Id x = x
      
      foo :: Generic Id []
    After epxanding the synonym, foo has the legal (in GHC) type:
      foo :: forall x. x -> [x]

GHC currently does kind checking before expanding synonyms (though even that could be changed.)

After expanding type synonyms, GHC does validity checking on types, looking for the following mal-formedness which isn't detected simply by kind checking:

  • Type constructor applied to a type involving for-alls.

  • Unboxed tuple on left of an arrow.

  • Partially-applied type synonym.

So, for example, this will be rejected:
  type Pr = (# Int, Int #)

  h :: Pr -> Int
  h x = ...
because GHC does not allow unboxed tuples on the left of a function arrow.

7.3.11. For-all hoisting

It is often convenient to use generalised type synonyms at the right hand end of an arrow, thus:
  type Discard a = forall b. a -> b -> a

  g :: Int -> Discard Int
  g x y z = x+y
Simply expanding the type synonym would give
  g :: Int -> (forall b. Int -> b -> Int)
but GHC "hoists" the forall to give the isomorphic type
  g :: forall b. Int -> Int -> b -> Int
In general, the rule is this: to determine the type specified by any explicit user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly performs the transformation:
  type1 -> forall a1..an. context2 => type2
==>
  forall a1..an. context2 => type1 -> type2
(In fact, GHC tries to retain as much synonym information as possible for use in error messages, but that is a usability issue.) This rule applies, of course, whether or not the forall comes from a synonym. For example, here is another valid way to write g's type signature:
  g :: Int -> Int -> forall b. b -> Int

When doing this hoisting operation, GHC eliminates duplicate constraints. For example:
  type Foo a = (?x::Int) => Bool -> a
  g :: Foo (Foo Int)
means
  g :: (?x::Int) => Bool -> Bool -> Int

7.3.12. Existentially quantified data constructors

The idea of using existential quantification in data type declarations was suggested by Laufer (I believe, thought doubtless someone will correct me), and implemented in Hope+. It's been in Lennart Augustsson's hbc Haskell compiler for several years, and proved very useful. Here's the idea. Consider the declaration:

  data Foo = forall a. MkFoo a (a -> Bool)
           | Nil

The data type Foo has two constructors with types:

  MkFoo :: forall a. a -> (a -> Bool) -> Foo
  Nil   :: Foo

Notice that the type variable a in the type of MkFoo does not appear in the data type itself, which is plain Foo. For example, the following expression is fine:

  [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, (MkFoo 3 even) packages an integer with a function even that maps an integer to Bool; and MkFoo 'c' isUpper packages a character with a compatible function. These two things are each of type Foo and can be put in a list.

What can we do with a value of type Foo?. In particular, what happens when we pattern-match on MkFoo?

  f (MkFoo val fn) = ???

Since all we know about val and fn is that they are compatible, the only (useful) thing we can do with them is to apply fn to val to get a boolean. For example:

  f :: Foo -> Bool
  f (MkFoo val fn) = fn val

What this allows us to do is to package heterogenous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way.

7.3.12.1. Why existential?

What has this to do with existential quantification? Simply that MkFoo has the (nearly) isomorphic type

  MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinary universally quantified type given above, thereby avoiding adding a new existential quantification construct.

7.3.12.2. Type classes

An easy extension (implemented in hbc) is to allow arbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you'd expect:

Baz1 :: forall a. Eq a => a -> a -> Baz
Baz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on Baz1 the matched values can be compared for equality, and when pattern matching on Baz2 the first matched value can be converted to a string (as well as applying the function to it). So this program is legal:

  f :: Baz -> String
  f (Baz1 p q) | p == q    = "Yes"
               | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)

Operationally, in a dictionary-passing implementation, the constructors Baz1 and Baz2 must store the dictionaries for Eq and Show respectively, and extract it on pattern matching.

Notice the way that the syntax fits smoothly with that used for universal quantification earlier.

7.3.12.3. Restrictions

There are several restrictions on the ways in which existentially-quantified constructors can be use.

  • When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:
    f1 (MkFoo a f) = a
    Here, the type bound by MkFoo "escapes", because a is the result of f1. One way to see why this is wrong is to ask what type f1 has:
      f1 :: Foo -> a             -- Weird!
    What is this "a" in the result type? Clearly we don't mean this:
      f1 :: forall a. Foo -> a   -- Wrong!
    The original program is just plain wrong. Here's another sort of error
      f2 (Baz1 a b) (Baz1 p q) = a==q
    It's ok to say a==b or p==q, but a==q is wrong because it equates the two distinct types arising from the two Baz1 constructors.

  • You can't pattern-match on an existentially quantified constructor in a let or where group of bindings. So this is illegal:
      f3 x = a==b where { Baz1 a b = x }
    Instead, use a case expression:
      f3 x = case x of Baz1 a b -> a==b
    In general, you can only pattern-match on an existentially-quantified constructor in a case expression or in the patterns of a function definition. The reason for this restriction is really an implementation one. Type-checking binding groups is already a nightmare without existentials complicating the picture. Also an existential pattern binding at the top level of a module doesn't make sense, because it's not clear how to prevent the existentially-quantified type "escaping". So for now, there's a simple-to-state restriction. We'll see how annoying it is.

  • You can't use existential quantification for newtype declarations. So this is illegal:
      newtype T = forall a. Ord a => MkT a
    Reason: a value of type T must be represented as a pair of a dictionary for Ord t and a value of type t. That contradicts the idea that newtype should have no concrete representation. You can get just the same efficiency and effect by using data instead of newtype. If there is no overloading involved, then there is more of a case for allowing an existentially-quantified newtype, because the data because the data version does carry an implementation cost, but single-field existentially quantified constructors aren't much use. So the simple restriction (no existential stuff on newtype) stands, unless there are convincing reasons to change it.

  • You can't use deriving to define instances of a data type with existentially quantified data constructors. Reason: in most cases it would not make sense. For example:#
    data T = forall a. MkT [a] deriving( Eq )
    To derive Eq in the standard way we would need to have equality between the single component of two MkT constructors:
    instance Eq T where
      (MkT a) == (MkT b) = ???
    But a and b have distinct types, and so can't be compared. It's just about possible to imagine examples in which the derived instance would make sense, but it seems altogether simpler simply to prohibit such declarations. Define your own instances!

7.3.13. Scoped type variables

A pattern type signature can introduce a scoped type variable. For example

f (xs::[a]) = ys ++ ys
           where
              ys :: [a]
              ys = reverse xs

The pattern (xs::[a]) includes a type signature for xs. This brings the type variable a into scope; it scopes over all the patterns and right hand sides for this equation for f. In particular, it is in scope at the type signature for y.

Pattern type signatures are completely orthogonal to ordinary, separate type signatures. The two can be used independently or together. At ordinary type signatures, such as that for ys, any type variables mentioned in the type signature that are not in scope are implicitly universally quantified. (If there are no type variables in scope, all type variables mentioned in the signature are universally quantified, which is just as in Haskell 98.) In this case, since a is in scope, it is not universally quantified, so the type of ys is the same as that of xs. In Haskell 98 it is not possible to declare a type for ys; a major benefit of scoped type variables is that it becomes possible to do so.

Scoped type variables are implemented in both GHC and Hugs. Where the implementations differ from the specification below, those differences are noted.

So much for the basic idea. Here are the details.

7.3.13.1. What a pattern type signature means

A type variable brought into scope by a pattern type signature is simply the name for a type. The restriction they express is that all occurrences of the same name mean the same type. For example:
  f :: [Int] -> Int -> Int
  f (xs::[a]) (y::a) = (head xs + y) :: a
The pattern type signatures on the left hand side of f express the fact that xs must be a list of things of some type a; and that y must have this same type. The type signature on the expression (head xs) specifies that this expression must have the same type a. There is no requirement that the type named by "a" is in fact a type variable. Indeed, in this case, the type named by "a" is Int. (This is a slight liberalisation from the original rather complex rules, which specified that a pattern-bound type variable should be universally quantified.) For example, all of these are legal:

  t (x::a) (y::a) = x+y*2

  f (x::a) (y::b) = [x,y]       -- a unifies with b

  g (x::a) = x + 1::Int         -- a unifies with Int

  h x = let k (y::a) = [x,y]    -- a is free in the
        in k x                  -- environment

  k (x::a) True    = ...        -- a unifies with Int
  k (x::Int) False = ...

  w :: [b] -> [b]
  w (x::a) = x                  -- a unifies with [b]

7.3.13.2. Scope and implicit quantification

  • All the type variables mentioned in a pattern, that are not already in scope, are brought into scope by the pattern. We describe this set as the type variables bound by the pattern. For example:
      f (x::a) = let g (y::(a,b)) = fst y
                 in
                 g (x,True)
    The pattern (x::a) brings the type variable a into scope, as well as the term variable x. The pattern (y::(a,b)) contains an occurrence of the already-in-scope type variable a, and brings into scope the type variable b.

  • The type variable(s) bound by the pattern have the same scope as the term variable(s) bound by the pattern. For example:
      let
        f (x::a) = <...rhs of f...>
        (p::b, q::b) = (1,2)
      in <...body of let...>
    Here, the type variable a scopes over the right hand side of f, just like x does; while the type variable b scopes over the body of the let, and all the other definitions in the let, just like p and q do. Indeed, the newly bound type variables also scope over any ordinary, separate type signatures in the let group.

  • The type variables bound by the pattern may be mentioned in ordinary type signatures or pattern type signatures anywhere within their scope.

  • In ordinary type signatures, any type variable mentioned in the signature that is in scope is not universally quantified.

  • Ordinary type signatures do not bring any new type variables into scope (except in the type signature itself!). So this is illegal:
      f :: a -> a
      f x = x::a
    It's illegal because a is not in scope in the body of f, so the ordinary signature x::a is equivalent to x::forall a.a; and that is an incorrect typing.

  • The pattern type signature is a monotype:

    • A pattern type signature cannot contain any explicit forall quantification.

    • The type variables bound by a pattern type signature can only be instantiated to monotypes, not to type schemes.

    • There is no implicit universal quantification on pattern type signatures (in contrast to ordinary type signatures).

  • The type variables in the head of a class or instance declaration scope over the methods defined in the where part. For example:
      class C a where
        op :: [a] -> a
    
        op xs = let ys::[a]
                    ys = reverse xs
                in
                head ys
    (Not implemented in Hugs yet, Dec 98).

7.3.13.3. Result type signatures

  • The result type of a function can be given a signature, thus:
      f (x::a) :: [a] = [x,x,x]
    The final :: [a] after all the patterns gives a signature to the result type. Sometimes this is the only way of naming the type variable you want:
      f :: Int -> [a] -> [a]
      f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
                            in \xs -> map g (reverse xs `zip` xs)

Result type signatures are not yet implemented in Hugs.

7.3.13.4. Where a pattern type signature can occur

A pattern type signature can occur in any pattern. For example:

  • A pattern type signature can be on an arbitrary sub-pattern, not ust on a variable:
      f ((x,y)::(a,b)) = (y,x) :: (b,a)

  • Pattern type signatures, including the result part, can be used in lambda abstractions:
      (\ (x::a, y) :: a -> x)

  • Pattern type signatures, including the result part, can be used in case expressions:
      case e of { (x::a, y) :: a -> x }

  • To avoid ambiguity, the type after the “::” in a result pattern signature on a lambda or case must be atomic (i.e. a single token or a parenthesised type of some sort). To see why, consider how one would parse this:
      \ x :: a -> b -> x

  • Pattern type signatures can bind existential type variables. For example:
      data T = forall a. MkT [a]
    
      f :: T -> T
      f (MkT [t::a]) = MkT t3
                     where
                       t3::[a] = [t,t,t]

  • Pattern type signatures can be used in pattern bindings:
      f x = let (y, z::a) = x in ...
      f1 x                = let (y, z::Int) = x in ...
      f2 (x::(Int,a))     = let (y, z::a)   = x in ...
      f3 :: (b->b)        = \x -> x
    In all such cases, the binding is not generalised over the pattern-bound type variables. Thus f3 is monomorphic; f3 has type b -> b for some type b, and not forall b. b -> b. In contrast, the binding
      f4 :: b->b
      f4 = \x -> x
    makes a polymorphic function, but b is not in scope anywhere in f4's scope.