4.18. Debugging the compiler

HACKER TERRITORY. HACKER TERRITORY. (You were warned.)

4.18.1. Dumping out compiler intermediate structures

-ddump-pass

Make a debugging dump after pass <pass> (may be common enough to need a short form…). You can get all of these at once (lots of output) by using -ddump-all, or most of them with -ddump-most. Some of the most useful ones are:

-ddump-parsed:

parser output

-ddump-rn:

renamer output

-ddump-tc:

typechecker output

-ddump-types:

Dump a type signature for each value defined at the top level of the module. The list is sorted alphabetically. Using -dppr-debug dumps a type signature for all the imported and system-defined things as well; useful for debugging the compiler.

-ddump-deriv:

derived instances

-ddump-ds:

desugarer output

-ddump-spec:

output of specialisation pass

-ddump-rules:

dumps all rewrite rules (including those generated by the specialisation pass)

-ddump-simpl:

simplifier output (Core-to-Core passes)

-ddump-inlinings:

inlining info from the simplifier

-ddump-usagesp:

UsageSP inference pre-inf and output

-ddump-cpranal:

CPR analyser output

-ddump-stranal:

strictness analyser output

-ddump-cse:

CSE pass output

-ddump-workwrap:

worker/wrapper split output

-ddump-occur-anal:

`occurrence analysis' output

-ddump-sat:

output of “saturate” pass

-ddump-stg:

output of STG-to-STG passes

-ddump-absC:

unflattened Abstract C

-ddump-flatC:

flattened Abstract C

-ddump-realC:

same as what goes to the C compiler

-ddump-stix:

native-code generator intermediate form

-ddump-asm:

assembly language from the native-code generator

-ddump-bcos:

byte code compiler output

-ddump-foreign:

dump foreign export stubs
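
A small module is enough to try these flags out. The sketch below assumes a GHC recent enough to accept the `OPTIONS_GHC` file-header pragma; where that is not the case, passing the flag on the command line does the same job. The function name `sumTo` is made up for illustration.

```haskell
{-# OPTIONS_GHC -ddump-simpl #-}
-- Assumption: this GHC accepts dump flags in an OPTIONS_GHC pragma.
-- If not, drop the pragma and pass -ddump-simpl on the command line.
module Main (main) where

-- A simple accumulating loop: small enough that its dump stays readable.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = print (sumTo 10)
```

The dump is produced while the module is compiled; running the resulting program only prints the sum. Swapping in another flag from the list above (for example -ddump-ds) shows the same function at a different stage.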

-dverbose-core2core, -dverbose-stg2stg

Show the output of the intermediate Core-to-Core and STG-to-STG passes, respectively. (Lots of output!) So: when we're really desperate:

% ghc -noC -O -ddump-simpl -dverbose-simpl -dcore-lint Foo.hs

-ddump-simpl-iterations:

Show the output of each iteration of the simplifier (each run of the simplifier has a maximum number of iterations, normally 4). Used when even -dverbose-simpl doesn't cut it.

-dppr-debug

Debugging output is in one of several “styles.” Take the printing of types, for example. In the “user” style (the default), the compiler's internal ideas about types are presented in Haskell source-level syntax, insofar as possible. In the “debug” style (which is the default for debugging output), the types are printed with explicit foralls, and variables have their unique-id attached (so you can check for things that look the same but aren't). This flag makes debugging output appear in the more verbose debug style.

-dppr-user-length

In error messages, expressions are printed to a certain “depth”, with subexpressions beyond the depth replaced by ellipses. This flag sets the depth.
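
The idea of printing to a fixed depth can be sketched on a toy expression type. This is only an illustration: the `Expr` type, the `pprDepth` function, and the way depth is counted here are assumptions, not GHC's own printer.

```haskell
module Main (main) where

-- A toy expression type; not GHC's own syntax tree.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr

-- Render an expression, replacing anything below the depth limit with "...".
pprDepth :: Int -> Expr -> String
pprDepth depth e
  | depth <= 0 = "..."
  | otherwise = case e of
      Var v   -> v
      App f x -> "(" ++ pprDepth (depth - 1) f ++ " " ++ pprDepth (depth - 1) x ++ ")"
      Lam v b -> "(\\" ++ v ++ " -> " ++ pprDepth (depth - 1) b ++ ")"

-- \x -> f (g x)
example :: Expr
example = Lam "x" (App (Var "f") (App (Var "g") (Var "x")))

main :: IO ()
main = do
  putStrLn (pprDepth 2 example)  -- prints (\x -> (... ...))
  putStrLn (pprDepth 4 example)  -- prints (\x -> (f (g x)))
```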

-ddump-simpl-stats

Dump statistics about how many of each kind of transformation took place. If you add -dppr-debug you get more detailed information.

-ddump-rn-trace

Make the renamer be *real* chatty about what it is up to.

-ddump-rn-stats

Print out a summary of what kind of information the renamer had to bring in.

-dshow-unused-imports

Have the renamer report which imports do not contribute.

4.18.2. Checking for consistency

-dcore-lint

Turn on heavyweight intra-pass sanity-checking within GHC, at Core level. (It checks GHC's sanity, not yours.)

-dstg-lint:

Ditto for STG level. (NOTE: currently doesn't work).

-dusagesp-lint:

Turn on checks around UsageSP inference (-fusagesp). This verifies various simple properties of the results of the inference, and also warns if any identifier with a used-once annotation before the inference has a used-many annotation afterwards; this could indicate a non-worksafe transformation is being applied.
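
To give a feel for what a Core-level sanity check involves, here is a minimal sketch of a "lint" for a toy typed lambda calculus. It recomputes the type of every term and complains about the first inconsistency. The `Ty`, `Term`, and `lint` names are hypothetical, and real Core has far more constructs; this is not how GHC implements -dcore-lint.

```haskell
module Main (main) where

-- Toy types and terms; far simpler than GHC's Core.
data Ty = TInt | TFun Ty Ty
  deriving (Eq, Show)

data Term
  = Var String
  | Lit Int
  | Lam String Ty Term
  | App Term Term

-- Variables in scope, with their types (most recent binding first).
type Env = [(String, Ty)]

-- Recompute the type of a term, failing on the first inconsistency.
lint :: Env -> Term -> Either String Ty
lint env (Var v) =
  maybe (Left ("out-of-scope variable: " ++ v)) Right (lookup v env)
lint _ (Lit _) = Right TInt
lint env (Lam v ty body) = TFun ty <$> lint ((v, ty) : env) body
lint env (App f x) = do
  fTy <- lint env f
  xTy <- lint env x
  case fTy of
    TFun arg res
      | arg == xTy -> Right res
      | otherwise  -> Left ("argument type mismatch: expected "
                            ++ show arg ++ ", got " ++ show xTy)
    _ -> Left "application of a non-function"

-- \(x :: Int) -> x
identity :: Term
identity = Lam "x" TInt (Var "x")

main :: IO ()
main = do
  print (lint [] (App identity (Lit 1)))  -- consistent term
  print (lint [] (App (Lit 1) (Lit 2)))   -- what a buggy pass might produce
```

A check like this is run on the output of a transformation pass, so a failure points at the compiler rather than at the user's program.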

4.18.3. How to read Core syntax (from some -ddump flags)

Let's do this by commenting an example. It's from doing -ddump-ds on this code:

skip2 m = m : skip2 (m+2)

Before we jump in, a word about names of things. Within GHC, variables, type constructors, etc., are identified by their “Uniques.” These are of the form `letter' plus `number' (both loosely interpreted). The `letter' gives some idea of where the Unique came from; e.g., _ means “built-in type variable”; t means “from the typechecker”; s means “from the simplifier”; and so on. The `number' is printed fairly compactly in a `base-62' format, which everyone hates except me (WDP).
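
For readers unfamiliar with base-62, the number part of a Unique can be pictured with the sketch below. The digit order (0-9, then a-z, then A-Z) is an assumption made for illustration; the text above does not spell out the alphabet, and `toBase62` is a hypothetical name rather than the compiler's own function.

```haskell
module Main (main) where

-- Assumed digit order: 0-9, then a-z, then A-Z. Treat it as illustrative.
digits62 :: String
digits62 = ['0' .. '9'] ++ ['a' .. 'z'] ++ ['A' .. 'Z']

-- Encode a non-negative number in base 62, most significant digit first.
toBase62 :: Int -> String
toBase62 n
  | n < 0     = error "toBase62: negative input"
  | n < 62    = [digits62 !! n]
  | otherwise = toBase62 (n `div` 62) ++ [digits62 !! (n `mod` 62)]

main :: IO ()
main = mapM_ (putStrLn . toBase62) [0, 61, 62, 3843, 3844]
-- prints 0, Z, 10, ZZ, 100 (one per line)
```

Three base-62 digits already cover more than 200,000 values, which is why the printed Uniques stay short.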

Remember, everything has a “Unique” and it is usually printed out when debugging, in some form or another. So here we go…

Desugared:
Main.skip2{-r1L6-} :: _forall_ a$_4 =>{{Num a$_4}} -> a$_4 -> [a$_4]

--# `r1L6' is the Unique for Main.skip2;
--# `_4' is the Unique for the type-variable (template) `a'
--# `{{Num a$_4}}' is a dictionary argument

_NI_

--# `_NI_' means "no (pragmatic) information" yet; it will later
--# evolve into the GHC_PRAGMA info that goes into interface files.

Main.skip2{-r1L6-} =
    /\ _4 -> \ d.Num.t4Gt ->
        let {
          {- CoRec -}
          +.t4Hg :: _4 -> _4 -> _4
          _NI_
          +.t4Hg = (+{-r3JH-} _4) d.Num.t4Gt

          fromInt.t4GS :: Int{-2i-} -> _4
          _NI_
          fromInt.t4GS = (fromInt{-r3JX-} _4) d.Num.t4Gt

--# The `+' class method (Unique: r3JH) selects the addition code
--# from a `Num' dictionary (now an explicit lambda'd argument).
--# Because Core is 2nd-order lambda-calculus, type applications
--# and lambdas (/\) are explicit.  So `+' is first applied to a
--# type (`_4'), then to a dictionary, yielding the actual addition
--# function that we will use subsequently...

--# We play the exact same game with the (non-standard) class method
--# `fromInt'.  Unsurprisingly, the type `Int' is wired into the
--# compiler.

          lit.t4Hb :: _4
          _NI_
          lit.t4Hb =
              let {
                ds.d4Qz :: Int{-2i-}
                _NI_
                ds.d4Qz = I#! 2#
              } in  fromInt.t4GS ds.d4Qz

--# `I# 2#' is just the literal Int `2'; it reflects the fact that
--# GHC defines `data Int = I# Int#', where Int# is the primitive
--# unboxed type.  (see relevant info about unboxed types elsewhere...)

--# The `!' after `I#' indicates that this is a *saturated*
--# application of the `I#' data constructor (i.e., not partially
--# applied).

          skip2.t3Ja :: _4 -> [_4]
          _NI_
          skip2.t3Ja =
              \ m.r1H4 ->
                  let { ds.d4QQ :: [_4]
                        _NI_
                        ds.d4QQ =
                    let {
                      ds.d4QY :: _4
                      _NI_
                      ds.d4QY = +.t4Hg m.r1H4 lit.t4Hb
                    } in  skip2.t3Ja ds.d4QY
                  } in
                  :! _4 m.r1H4 ds.d4QQ

          {- end CoRec -}
        } in  skip2.t3Ja

(“It's just a simple functional language” is an unregisterised trademark of Peyton Jones Enterprises, plc.)
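
The dictionary-passing structure of the dump can be imitated in ordinary Haskell. The sketch below is a hand translation, not compiler output: the `NumDict` record, its field names, and `intDict` are hypothetical stand-ins for the real `Num` dictionary, and only the two methods used by skip2 are included.

```haskell
module Main (main) where

-- A hand-written stand-in for the `Num` dictionary, holding only the two
-- methods the example uses. Record and field names are hypothetical.
data NumDict a = NumDict
  { dPlus    :: a -> a -> a
  , dFromInt :: Int -> a
  }

-- The dictionary for Int.
intDict :: NumDict Int
intDict = NumDict { dPlus = (+), dFromInt = id }

-- skip2 with the dictionary as an explicit argument, mirroring the dump:
-- select the methods first, build the literal, then run the recursive loop.
skip2 :: NumDict a -> a -> [a]
skip2 dict = go
  where
    plus = dPlus dict           -- plays the role of +.t4Hg
    lit  = dFromInt dict 2      -- plays the role of lit.t4Hb
    go m = m : go (plus m lit)  -- plays the role of skip2.t3Ja

main :: IO ()
main = print (take 5 (skip2 intDict 1))  -- prints [1,3,5,7,9]
```

Type abstraction and application (the /\ and the `_4' arguments in the dump) have no source-level counterpart here; the type checker fills them in from the signature of `skip2`.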

4.18.4. Unregisterised compilation

The term "unregisterised" really means "compile via vanilla C", disabling some of the platform-specific tricks that GHC normally uses to make programs go faster. When compiling unregisterised, GHC simply generates a C file which is compiled via gcc.

Unregisterised compilation can be useful when porting GHC to a new machine, since it reduces the prerequisite tools to gcc, as, and ld and nothing more. Furthermore, the amount of platform-specific code that needs to be written in order to get unregisterised compilation going is usually fairly small.

-unreg:

Compile via vanilla ANSI C only, turning off platform-specific optimisations. NOTE: in order to use -unreg, you need to have a set of libraries (including the RTS) built for unregisterised compilation. This amounts to building GHC with way "u" enabled.