6.8. Filenames and separate compilation
This section describes what files GHC expects to find, what files it creates, where these files are stored, and what options affect this behaviour.
Note that this section is written with hierarchical modules in mind (see Hierarchical Modules); hierarchical modules are an extension to Haskell 98 which extends the lexical syntax of module names to include a dot “.”. Non-hierarchical modules are thus a special case in which none of the module names contain dots.
Pathname conventions vary from system to system. In particular, the directory separator is “/” on Unix systems and “\” on Windows systems. In the sections that follow, we shall consistently use “/” as the directory separator; substitute the appropriate character for your system.
6.8.1. Haskell source files
Each Haskell source module should be placed in a file on its own. Usually, the file should be named after the module name, replacing dots in the module name by directory separators. For example, on a Unix system, the module A.B.C should be placed in the file A/B/C.hs, relative to some base directory. If the module is not going to be imported by another module (Main, for example), then you are free to use any filename for it.
GHC assumes that source files are ASCII or UTF-8 only; other encodings are not recognised. However, invalid UTF-8 sequences will be ignored in comments, so it is possible to use other encodings such as Latin-1, as long as the non-comment source code is ASCII only.
6.8.2. Output files
When asked to compile a source file, GHC normally generates two files: an object file, and an interface file.
The object file, which normally ends in a .o suffix, contains the compiled code for the module.
The interface file, which normally ends in a .hi suffix, contains the information that GHC needs in order to compile further modules that depend on this module. It contains things like the types of exported functions, definitions of data types, and so on. It is stored in a binary format, so don’t try to read one; use the --show-iface option instead (see Other options related to interface files).
You should think of the object file and the interface file as a pair, since the interface file is in a sense a compiler-readable description of the contents of the object file. If the interface file and object file get out of sync for any reason, then the compiler may end up making assumptions about the object file that aren’t true; trouble will almost certainly follow. For this reason, we recommend keeping object files and interface files in the same place (GHC does this by default, but it is possible to override the defaults as we’ll explain shortly).
Every module has a module name defined in its source code (module A.B.C where ...).
The name of the object file generated by GHC is derived according to the following rules, where ⟨osuf⟩ is the object-file suffix (this can be changed with the -osuf option).
- If there is no -odir option (the default), then the object filename is derived from the source filename (ignoring the module name) by replacing the suffix with ⟨osuf⟩.
- If -odir ⟨dir⟩ has been specified, then the object filename is ⟨dir⟩/⟨mod⟩.⟨osuf⟩, where ⟨mod⟩ is the module name with dots replaced by slashes. GHC will silently create the necessary directory structure underneath ⟨dir⟩, if it does not already exist.
The name of the interface file is derived using the same rules, except that the suffix is ⟨hisuf⟩ (.hi by default) instead of ⟨osuf⟩, and the relevant options are -hidir and -hisuf instead of -odir and -osuf respectively.
For example, if GHC compiles the module A.B.C in the file src/A/B/C.hs, with no -odir or -hidir flags, the interface file will be put in src/A/B/C.hi and the object file in src/A/B/C.o.
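The naming rules above can be sketched as a small pure function. This is only an illustrative model, not GHC's own code: the names modulePath, dropSuffix and outputFile are hypothetical, and “/” is assumed as the directory separator.

```haskell
-- Sketch only: models the naming rules described above; not GHC's own code.
-- All function names here are hypothetical.

-- | Replace dots in a module name by directory separators ("/" assumed).
modulePath :: String -> FilePath
modulePath = map (\c -> if c == '.' then '/' else c)

-- | Drop the final suffix of a file name: "src/A/B/C.hs" becomes "src/A/B/C".
dropSuffix :: FilePath -> FilePath
dropSuffix path =
  let (revName, revDir) = break (== '/') (reverse path)
  in case break (== '.') revName of
       (_, '.' : revStem) -> reverse revDir ++ reverse revStem
       _                  -> path

-- | Name of the object (or interface) file for a module.
--   First argument: the directory given with -odir or -hidir, if any.
--   Second argument: the suffix without its dot, e.g. "o" or "hi".
--   Third argument: the module name; fourth: the source file name.
outputFile :: Maybe FilePath -> String -> String -> FilePath -> FilePath
outputFile Nothing    suf _   src = dropSuffix src ++ "." ++ suf
outputFile (Just dir) suf mdl _   = dir ++ "/" ++ modulePath mdl ++ "." ++ suf
```

With no directory option the module name is ignored and only the source filename matters; with a directory option it is the other way round, which is why a Main module in foo.hs still yields foo.o by default.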
For any module that is imported, GHC requires that the name of the module in the import statement exactly matches the name of the module in the interface file (or source file) found using the strategy specified in The search path. This means that for most modules, the source file name should match the module name.
However, note that it is reasonable to have a module Main in a file named foo.hs, but this only works because GHC never needs to search for the interface for module Main (because it is never imported). It is therefore possible to have several Main modules in separate source files in the same directory, and GHC will not get confused.
In batch compilation mode, the name of the object file can also be overridden using the -o option, and the name of the interface file can be specified directly using the -ohi option.
6.8.3. The search path
In your program, you import a module Foo by saying import Foo. In --make mode or GHCi, GHC will look for a source file for Foo and arrange to compile it first. Without --make, GHC will look for the interface file for Foo, which should have been created by an earlier compilation of Foo. GHC uses the same strategy in each of these cases for finding the appropriate file.
This strategy is as follows: GHC keeps a list of directories called the search path. For each of these directories, it tries appending ⟨basename⟩.⟨extension⟩ to the directory, and checks whether the file exists. The value of ⟨basename⟩ is the module name with dots replaced by the directory separator (“/” or “\”, depending on the system), and ⟨extension⟩ is a source extension (hs, lhs) if we are in --make mode or GHCi, or ⟨hisuf⟩ otherwise.
For example, suppose the search path contains directories d1, d2, and d3, and we are in --make mode looking for the source file for a module A.B.C. GHC will look in d1/A/B/C.hs, d1/A/B/C.lhs, d2/A/B/C.hs, and so on.
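The lookup order in this example can be written down as a list comprehension. This is a sketch under the assumption that “/” is the separator; the function name candidates is hypothetical and GHC's real finder does more (packages, caching).

```haskell
-- Sketch only: enumerates the files tried for a module, in search order.
-- The name `candidates` is hypothetical; "/" is assumed as the separator.

-- | First argument: the search path (a list of directories).
--   Second argument: the extensions to try, e.g. ["hs", "lhs"] or ["hi"].
--   Third argument: the module name, e.g. "A.B.C".
candidates :: [FilePath] -> [String] -> String -> [FilePath]
candidates searchPath extensions moduleName =
  [ dir ++ "/" ++ base ++ "." ++ ext
  | dir <- searchPath      -- directories are tried in order
  , ext <- extensions      -- and each extension within a directory
  ]
  where
    base = map (\c -> if c == '.' then '/' else c) moduleName
```

For the example above, candidates ["d1", "d2", "d3"] ["hs", "lhs"] "A.B.C" starts with d1/A/B/C.hs, d1/A/B/C.lhs, d2/A/B/C.hs; the first candidate that exists on disk would be the one used.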
The search path by default contains a single directory: “.” (i.e. the current directory). The following options can be used to add to or change the contents of the search path:
-i⟨dir⟩[:⟨dir⟩]*
- This flag appends a colon-separated list of dirs to the search path.
-i
- resets the search path back to nothing.
This isn’t the whole story: GHC also looks for modules in pre-compiled libraries, known as packages. See the section on packages (Packages) for details.
6.8.4. Redirecting the compilation output(s)
-o ⟨file⟩
GHC’s compiled output normally goes into a .hc, .o, etc., file, depending on the last-run compilation phase. The option -o ⟨file⟩ re-directs the output of that last-run phase to ⟨file⟩.
Note
This “feature” can be counterintuitive: ghc -C -o foo.o foo.hs will put the intermediate C code in the file foo.o, name notwithstanding!
This option is most often used when creating an executable file, to set the filename of the executable. For example:
ghc -o prog --make Main
will compile the program starting with module Main and put the executable in the file prog.
Note: on Windows, if the result is an executable file, the extension “.exe” is added if the specified filename does not already have an extension. Thus
ghc -o foo Main.hs
will compile and link the module Main.hs, and put the resulting executable in foo.exe (not foo).
If you use ghc --make and you don’t use the -o option, the name GHC will choose for the executable will be based on the name of the file containing the module Main. Note that with GHC the Main module doesn’t have to be put in file Main.hs. Thus both
ghc --make Prog
and
ghc --make Prog.hs
will produce Prog (or Prog.exe if you are on Windows).
-odir ⟨dir⟩
Redirects object files to directory ⟨dir⟩. For example:
$ ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `uname -m`
The object files, Foo.o, Bar.o, and Bumble.o would be put into a subdirectory named after the architecture of the executing machine (x86, mips, etc).
Note that the -odir option does not affect where the interface files are put; use the -hidir option for that. In the above example, they would still be put in parse/Foo.hi, parse/Bar.hi, and gurgle/Bumble.hi.
-ohi ⟨file⟩
The interface output may be directed to another file bar2/Wurble.iface with the option -ohi bar2/Wurble.iface (not recommended).
Warning
If you redirect the interface file somewhere that GHC can’t find it, then the recompilation checker may get confused (at the least, you won’t get any recompilation avoidance). We recommend using a combination of -hidir and -hisuf options instead, if possible.
To avoid generating an interface at all, you could use this option to redirect the interface into the bit bucket: -ohi /dev/null, for example.
-hidir ⟨dir⟩
Redirects all generated interface files into ⟨dir⟩, instead of the default.
-stubdir ⟨dir⟩
Redirects all generated FFI stub files into ⟨dir⟩. Stub files are generated when the Haskell source contains a foreign export or foreign import "&wrapper" declaration (see Using foreign export and foreign import ccall “wrapper” with GHC). The -stubdir option behaves in exactly the same way as -odir and -hidir with respect to hierarchical modules.
-dumpdir ⟨dir⟩
Redirects all dump files into ⟨dir⟩. Dump files are generated when -ddump-to-file is used with other -ddump-* flags.
-outputdir ⟨dir⟩
The -outputdir option is shorthand for the combination of -odir, -hidir, -stubdir and -dumpdir.
-osuf ⟨suffix⟩; -hisuf ⟨suffix⟩; -hcsuf ⟨suffix⟩
The -osuf ⟨suffix⟩ will change the .o file suffix for object files to whatever you specify. We use this when compiling libraries, so that objects for the profiling versions of the libraries don’t clobber the normal ones.
Similarly, the -hisuf ⟨suffix⟩ will change the .hi file suffix for non-system interface files (see Other options related to interface files).
Finally, the option -hcsuf ⟨suffix⟩ will change the .hc file suffix for compiler-generated intermediate C files.
The -hisuf/-osuf game is particularly useful if you want to compile a program both with and without profiling, in the same directory. You can say:
ghc ...
to get the ordinary version, and
ghc ... -osuf prof.o -hisuf prof.hi -prof -auto-all
to get the profiled version.
6.8.5. Keeping Intermediate Files
The following options are useful for keeping certain intermediate files around, when normally GHC would throw these away after compilation:
-keep-hc-file
Keep intermediate .hc files when doing .hs-to-.o compilations via C (Note: .hc files are only generated by unregisterised compilers).
-keep-llvm-file
Keep intermediate .ll files when doing .hs-to-.o compilations via LLVM (Note: .ll files aren’t generated when using the native code generator; you may need to use -fllvm to force them to be produced).
-keep-s-file
Keep intermediate .s files.
-keep-tmp-files
Instructs the GHC driver not to delete any of its temporary files, which it normally keeps in /tmp (or possibly elsewhere; see Redirecting temporary files). Running GHC with -v will show you what temporary files were generated along the way.
6.8.6. Redirecting temporary files
-tmpdir
If you have trouble because of running out of space in /tmp (or wherever your installation thinks temporary files should go), you may use the -tmpdir <dir> option to specify an alternate directory. For example, -tmpdir . says to put temporary files in the current working directory.
Alternatively, use your TMPDIR environment variable. Set it to the name of the directory where temporary files should be put. GCC and other programs will honour the TMPDIR variable as well.
Even better idea: Set the DEFAULT_TMPDIR make variable when building GHC, and never worry about TMPDIR again (see the build documentation).
6.8.8. The recompilation checker
-fforce-recomp
Turn off recompilation checking (which is on by default). Recompilation checking normally stops compilation early, leaving an existing .o file in place, if it can be determined that the module does not need to be recompiled.
In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn’t change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.
This doesn’t work any more. Suppose module C imports module B, and B imports module A. So changes to module A might require module C to be recompiled, and hence when A.hi changes we should check whether C should be recompiled. However, the dependencies of C will only list B.hi, not A.hi, and some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now…
GHC calculates a fingerprint (in fact an MD5 hash) of each interface file, and of each declaration within the interface file. It also keeps in every interface file a list of the fingerprints of everything it used when it last compiled the file. If the source file’s modification date is earlier than the .o file’s date (i.e. the source hasn’t changed since the file was last compiled), and the recompilation checking is on, GHC will be clever. It compares the fingerprints on the things it needs this time with the fingerprints on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling early in the process saying “Compilation IS NOT required”. What a beautiful sight!
You can read about how all this works in the GHC commentary.
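As a rough mental model of the check just described, consider the following toy decision function. It is a deliberately simplified sketch and not how GHC is implemented: fingerprints are plain strings here, entity names such as "B.f" are made up, and the function name needsRecompile is hypothetical.

```haskell
-- Toy model only: not GHC's implementation. Fingerprints are plain strings
-- and entities are identified by made-up names such as "B.f".

type Fingerprint = String

-- | Decide whether a module must be recompiled.
needsRecompile
  :: Bool                     -- is the source newer than the object file?
  -> [(String, Fingerprint)]  -- fingerprints recorded at the last compilation
  -> [(String, Fingerprint)]  -- fingerprints of the available entities now
  -> Bool
needsRecompile sourceChanged recorded current
  | sourceChanged = True                 -- a changed source always recompiles
  | otherwise     = any changed recorded -- else compare what was used before
  where
    -- An entity counts as changed if it vanished or its fingerprint differs.
    changed (name, oldFp) = lookup name current /= Just oldFp
```

In this model, entities that the module never used do not trigger recompilation, which mirrors the point that only the fingerprints of the things actually needed are compared.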
6.8.9. How to compile mutually recursive modules
GHC supports the compilation of mutually recursive modules. This section explains how.
Every cycle in the module import graph must be broken by a hs-boot file. Suppose that modules A.hs and B.hs are Haskell source files, thus:
module A where
import B( TB(..) )
newtype TA = MkTA Int
f :: TB -> TA
f (MkTB x) = MkTA x
module B where
import {-# SOURCE #-} A( TA(..) )
data TB = MkTB !Int
g :: TA -> TB
g (MkTA x) = MkTB x
Here A imports B, but B imports A with a {-# SOURCE #-} pragma, which breaks the circular dependency. Every loop in the module import graph must be broken by a {-# SOURCE #-} import; or, equivalently, the module import graph must be acyclic if {-# SOURCE #-} imports are ignored.
For every module A.hs that is {-# SOURCE #-}-imported in this way there must exist a source file A.hs-boot. This file contains an abbreviated version of A.hs, thus:
module A where
newtype TA = MkTA Int
To compile these three files, issue the following commands:
ghc -c A.hs-boot    # Produces A.hi-boot, A.o-boot
ghc -c B.hs         # Consumes A.hi-boot, produces B.hi, B.o
ghc -c A.hs         # Consumes B.hi, produces A.hi, A.o
ghc -o foo A.o B.o  # Linking the program
There are several points to note here:
- The file A.hs-boot is a programmer-written source file. It must live in the same directory as its parent source file A.hs. Currently, if you use a literate source file A.lhs you must also use a literate boot file, A.lhs-boot; and vice versa.
- A hs-boot file is compiled by GHC, just like a hs file:
  ghc -c A.hs-boot
- When a hs-boot file A.hs-boot is compiled, it is checked for scope and type errors. When its parent module A.hs is compiled, the two are compared, and an error is reported if the two are inconsistent.
- Just as compiling A.hs produces an interface file A.hi, and an object file A.o, so compiling A.hs-boot produces an interface file A.hi-boot, and a pseudo-object file A.o-boot:
  - The pseudo-object file A.o-boot is empty (don’t link it!), but it is very useful when using a Makefile, to record when the A.hi-boot was last brought up to date (see Using make).
  - The hi-boot generated by compiling a hs-boot file is in the same machine-generated binary format as any other GHC-generated interface file (e.g. B.hi). You can display its contents with ghc --show-iface. If you specify a directory for interface files, with the -hidir flag, then that affects hi-boot files too.
- If hs-boot files are considered distinct from their parent source files, and if a {-# SOURCE #-} import is considered to refer to the hs-boot file, then the module import graph must have no cycles. The command ghc -M will report an error if a cycle is found.
- A module M that is {-# SOURCE #-}-imported in a program will usually also be ordinarily imported elsewhere. If not, ghc --make automatically adds M to the set of modules it tries to compile and link, to ensure that M’s implementation is included in the final program.
A hs-boot file need only contain the bare minimum of information needed to get the bootstrapping process started. For example, it doesn’t need to contain declarations for everything that module A exports, only the things required by the module(s) that import A recursively.
A hs-boot file is written in a subset of Haskell:
- The module header (including the export list), and import statements, are exactly as in Haskell, and so are the scoping rules. Hence, to mention a non-Prelude type or class, you must import it.
- There must be no value declarations, but there can be type signatures for values. For example:
  double :: Int -> Int
- Fixity declarations are exactly as in Haskell.
- Vanilla type synonym declarations are exactly as in Haskell.
- Open type and data family declarations are exactly as in Haskell.
- A closed type family may optionally omit its equations, as in the following example:
  type family ClosedFam a where ..
  The .. is meant literally – you should write two dots in your file. Note that the where clause is still necessary to distinguish closed families from open ones. If you give any equations of a closed family, you must give all of them, in the same order as they appear in the accompanying Haskell file.
- A data type declaration can either be given in full, exactly as in Haskell, or it can be given abstractly, by omitting the ‘=’ sign and everything that follows. For example:
  data T a b
  In a source program this would declare T to have no constructors (a GHC extension: see Data types with no constructors), but in an hs-boot file it means “I don’t know or care what the constructors are”. This is the most common form of data type declaration, because it’s easy to get right. You can also write out the constructors but, if you do so, you must write them out precisely as in the real definition.
  If you do not write out the constructors, you may need to give a kind annotation (Explicitly-kinded quantification), to tell GHC the kind of the type variable, if it is not “*”. (In source files, this is worked out from the way the type variable is used in the constructors.) For example:
  data R (x :: * -> *) y
  You cannot use deriving on a data type declaration; write an instance declaration instead.
- Class declarations are exactly as in Haskell, except that you may not put default method declarations. You can also omit all the superclasses and class methods entirely; but you must either omit them all or put them all in.
- You can include instance declarations just as in Haskell; but omit the “where” part.
- The default role for abstract datatype parameters is now representational. (An abstract datatype is one with no constructors listed.) To get another role, use a role annotation. (See Roles.)
6.8.10. Module signatures
GHC supports the specification of module signatures, which both implementations and users can typecheck against separately. This functionality should be considered experimental for now; some details, especially for type classes and type families, may change. This system was originally described in Backpack: Retrofitting Haskell with Interfaces. Signature files are somewhat similar to hs-boot files, but have the hsig extension and behave slightly differently.
Suppose that I have modules Text.hs and A.hs, thus:
module Text where
data Text = Text String
empty :: Text
empty = Text ""
toString :: Text -> String
toString (Text s) = s
module A where
import Text
z = toString empty
Presently, module A depends explicitly on a concrete implementation of Text. What if we wanted to write a signature for Text, so we could vary the implementation with other possibilities (e.g. packed UTF-8 encoded bytestrings)? To do this, we can write a signature TextSig.hsig, and modify A to import the signature instead:
module TextSig where
data Text
empty :: Text
toString :: Text -> String
module A where
import TextSig
z = toString empty
To compile these two files, we need to specify what module we would like to use to implement the signature. This can be done by compiling the implementation, and then using the -sig-of flag to specify the implementation backing a signature:
ghc -c Text.hs
ghc -c TextSig.hsig -sig-of "TextSig is main:Text"
ghc -c A.hs
To specify multiple signatures, use a comma-separated list. The -sig-of parameter is required to specify the backing implementations of all home modules, even in one-shot compilation mode. At the moment, you must specify the full module name (unit ID, colon, and then module name), although in the future we may support more user-friendly syntax.
To just type-check a signature file, no -sig-of is necessary; instead, just pass the options -fno-code -fwrite-interface. hsig files will generate normal interface files which other files can also use to type-check against. However, at the moment, we always assume that an entity defined in a signature is a unique identifier (even though we may happen to know it is type equal with another identifier). In the future, we will support passing shaping information to the compiler in order to let it know about these type equalities.
Just like hs-boot files, when an hsig file is compiled it is checked for type consistency against the backing implementation. Signature files are also written in a subset of Haskell essentially identical to that of hs-boot files.
There is one important gotcha with the current implementation: instances from backing implementations will “leak” into code that uses signatures, and explicit instance declarations in signatures are forbidden. This behavior is subject to change.
6.8.11. Using make
It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules. Thus:
HC      = ghc
HC_OPTS = -cpp $(EXTRA_HC_OPTS)

SRCS = Main.lhs Foo.lhs Bar.lhs
OBJS = Main.o   Foo.o   Bar.o

.SUFFIXES : .o .hs .hi .lhs .hc .s

cool_pgm : $(OBJS)
	rm -f $@
	$(HC) -o $@ $(HC_OPTS) $(OBJS)

# Standard suffix rules
.o.hi:
	@:

.lhs.o:
	$(HC) -c $< $(HC_OPTS)

.hs.o:
	$(HC) -c $< $(HC_OPTS)

.o-boot.hi-boot:
	@:

.lhs-boot.o-boot:
	$(HC) -c $< $(HC_OPTS)

.hs-boot.o-boot:
	$(HC) -c $< $(HC_OPTS)

# Inter-module dependencies
Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz
Note
Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake’s pattern rules let you write the more comprehensible:
%.o : %.lhs
	$(HC) -c $< $(HC_OPTS)
What we’ve shown should work with any make.
Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing…nothing. Which is true.
Note that the suffix rules are all given twice, once for normal Haskell source files, and once for hs-boot files (see How to compile mutually recursive modules).
Note also the inter-module dependencies at the end of the Makefile, which take the form
Foo.o Foo.hc Foo.s : Baz.hi # Foo imports Baz
They tell make that if any of Foo.o, Foo.hc or Foo.s have an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely. These dependencies can be generated automatically by ghc; see Dependency generation.
6.8.12. Dependency generation
Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. Don’t worry, GHC has support for automatically generating the required dependencies. Add the following to your Makefile:
depend :
	ghc -M $(HC_OPTS) $(SRCS)
Now, before you start compiling, and any time you change the imports in your program, do make depend before you do make cool_pgm. The command ghc -M will append the needed dependencies to your Makefile.
In general, ghc -M Foo does the following. For each module M in the set Foo plus all its imports (transitively), it adds to the Makefile:
- A line recording the dependence of the object file on the source file.
  M.o : M.hs
  (or M.lhs if that is the filename you used).
- For each import declaration import X in M, a line recording the dependence of M on X:
  M.o : X.hi
- For each import declaration import {-# SOURCE #-} X in M, a line recording the dependence of M on X:
  M.o : X.hi-boot
  (See How to compile mutually recursive modules for details of hi-boot style interface files.)
If M imports multiple modules, then there will be multiple lines with M.o as the target.
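The three kinds of line listed above can be modelled with a small function. This sketch only covers non-hierarchical module names in the current directory with the default suffixes; the Import type and the name dependencyLines are hypothetical, and the real output of ghc -M may differ in layout.

```haskell
-- Sketch only: models the lines described above for flat module names and
-- default suffixes. The type and function names are hypothetical.

-- | How a module is imported.
data Import
  = Ordinary String   -- import X
  | Source String     -- import {-# SOURCE #-} X

-- | Dependency lines for one module, given its name, its source file and
--   its imports. One line is produced per import.
dependencyLines :: String -> FilePath -> [Import] -> [String]
dependencyLines modName srcFile imports =
  (obj ++ " : " ++ srcFile) : map importLine imports
  where
    obj = modName ++ ".o"
    importLine (Ordinary x) = obj ++ " : " ++ x ++ ".hi"
    importLine (Source x)   = obj ++ " : " ++ x ++ ".hi-boot"
```

For the mutually recursive example earlier, dependencyLines "B" "B.hs" [Source "A"] gives the two lines B.o : B.hs and B.o : A.hi-boot.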
There is no need to list all of the source files as arguments to the ghc -M command; ghc traces the dependencies, just like ghc --make (a new feature in GHC 6.4).
Note that ghc -M needs to find a source file for each module in the dependency graph, so that it can parse the import declarations and follow dependencies. Any pre-compiled modules without source files must therefore belong to a package [1].
By default, ghc -M generates all the dependencies, and then concatenates them onto the end of makefile (or Makefile if makefile doesn’t exist) bracketed by the lines “# DO NOT DELETE: Beginning of Haskell dependencies” and “# DO NOT DELETE: End of Haskell dependencies”. If these lines already exist in the makefile, then the old dependencies are deleted first.
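The delete-then-append behaviour can be pictured as a function over the lines of the makefile. This is a sketch of the behaviour as described here, not GHC's code; the names beginMarker, endMarker and spliceDeps are hypothetical, and a begin marker without a matching end marker is handled naively (everything after it is dropped).

```haskell
-- Sketch only: models "delete the old bracketed section, then append the
-- new one". Names are hypothetical; an unterminated section is handled
-- naively by dropping everything after the begin marker.

beginMarker, endMarker :: String
beginMarker = "# DO NOT DELETE: Beginning of Haskell dependencies"
endMarker   = "# DO NOT DELETE: End of Haskell dependencies"

-- | Given fresh dependency lines and the current makefile lines, return
--   the updated makefile lines.
spliceDeps :: [String] -> [String] -> [String]
spliceDeps deps makefileLines =
  before ++ after ++ [beginMarker] ++ deps ++ [endMarker]
  where
    -- Lines up to (not including) the begin marker, if there is one.
    (before, rest) = break (== beginMarker) makefileLines
    -- Lines following the end marker of the old section.
    after          = drop 1 (dropWhile (/= endMarker) rest)
```

Running it twice with the same dependencies leaves the makefile unchanged the second time, which is the point of bracketing the generated section.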
Don’t forget to use the same -package options on the ghc -M command line as you would when compiling; this enables the dependency generator to locate any imported modules that come from packages. The package modules won’t be included in the dependencies generated, though (but see the -include-pkg-deps option below).
The dependency generation phase of GHC can take some additional options, which you may find useful. The options which affect dependency generation are:
-ddump-mod-cycles
- Display a list of the cycles in the module graph. This is useful when trying to eliminate such cycles.
-v2
- Print a full list of the module dependencies to stdout. (This is the standard verbosity flag, so the list will also be displayed with -v3 and -v4; see Verbosity options.)
-dep-makefile ⟨file⟩
- Use ⟨file⟩ as the makefile, rather than makefile or Makefile. If ⟨file⟩ doesn’t exist, ghc -M creates it. We often use -dep-makefile .depend to put the dependencies in .depend and then include the file .depend into Makefile.
-dep-suffix <suf>
- Make extra dependencies that declare that files with suffix .<suf>_<osuf> depend on interface files with suffix .<suf>_hi, or (for {-# SOURCE #-} imports) on .hi-boot. Multiple -dep-suffix flags are permitted. For example, with the default object suffix, -dep-suffix a -dep-suffix b will make dependencies for .o on .hi, .a_o on .a_hi, and .b_o on .b_hi. (Useful in conjunction with NoFib “ways”.)
--exclude-module=<file>
- Regard <file> as “stable”; i.e., exclude it from having dependencies on it.
-include-pkg-deps
- Regard modules imported from packages as unstable, i.e., generate dependencies on any imported package modules (including Prelude, and all other standard Haskell libraries). Dependencies are not traced recursively into packages; dependencies are only generated for home-package modules on external-package modules directly imported by the home package module. This option is normally only used by the various system libraries.
6.8.13. Orphan modules and instance declarations
Haskell specifies that when compiling module M, any instance declaration in any module “below” M is visible. (Module A is “below” M if A is imported directly by M, or if A is below a module that M imports directly.) In principle, GHC must therefore read the interface files of every module below M, just in case they contain an instance declaration that matters to M. This would be a disaster in practice, so GHC tries to be clever.
In particular, if an instance declaration is in the same module as the definition of any type or class mentioned in the head of the instance declaration (the part after the “=>”; see Relaxed rules for instance contexts), then GHC has to visit that interface file anyway. Example:
module A where
instance C a => D (T a) where ...
data T a = ...
The instance declaration is only relevant if the type T is in use, and if so, GHC will have visited A’s interface file to find T’s definition.
The only problem comes when a module contains an instance declaration and GHC has no other reason for visiting the module. Example:
module Orphan where
instance C a => D (T a) where ...
class C a where ...
Here, neither D nor T is declared in module Orphan. We call such modules “orphan modules”. GHC identifies orphan modules, and visits the interface file of every orphan module below the module being compiled. This is usually wasted work, but there is no avoiding it. You should therefore do your best to have as few orphan modules as possible.
Functional dependencies complicate matters. Suppose we have:
module B where
instance E T Int where ...
data T = ...
Is this an orphan module? Apparently not, because T is declared in the same module. But suppose class E had a functional dependency:
module Lib where
class E x y | y -> x where ...
Then in some importing module M, the constraint (E a Int) should be “improved” by setting a = T, even though there is no explicit mention of T in M.
These considerations lead to the following definition of an orphan module:
An orphan module contains at least one orphan instance or at least one orphan rule.
An instance declaration in a module M is an orphan instance if
- The class of the instance declaration is not declared in M, and
- Either the class has no functional dependencies, and none of the type constructors in the instance head is declared in M; or there is a functional dependency for which none of the type constructors mentioned in the non-determined part of the instance head is defined in M.
Only the instance head counts. In the example above, it is not good enough for C’s declaration to be in module Orphan; it must be the declaration of D or T.
A rewrite rule in a module M is an orphan rule if none of the variables, type constructors, or classes that are free in the left hand side of the rule are declared in M.
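For a class without functional dependencies, the definition of an orphan instance boils down to a simple predicate. The following is a sketch of that special case only, with instance heads reduced to plain names; the Instance type and the name isOrphanInstance are hypothetical and this is not how GHC represents instances.

```haskell
-- Sketch only: the no-functional-dependency case of the definition above.
-- Instance heads are reduced to names; all names here are hypothetical.

-- | The head of an instance declaration: for "instance C a => D (T a)"
--   the class is "D" and the type constructors are ["T"]. The context
--   (here C) is deliberately not recorded, because only the head counts.
data Instance = Instance
  { instClass  :: String
  , instTyCons :: [String]
  }

-- | Is the instance an orphan in a module that declares the given names?
isOrphanInstance :: [String] -> Instance -> Bool
isOrphanInstance declaredHere inst =
  instClass inst `notElem` declaredHere
    && all (`notElem` declaredHere) (instTyCons inst)
```

With the two examples from this section, the instance D (T a) is not an orphan in module A, which declares T, but it is an orphan in module Orphan, which declares only C.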
If you use the flag -fwarn-orphans, GHC will warn you if you are creating an orphan module. Like any warning, you can switch the warning off with -fno-warn-orphans, and -Werror will make the compilation fail if the warning is issued.
You can identify an orphan module by looking in its interface file, M.hi, using the --show-iface mode. If there is a [orphan module] on the first line, GHC considers it an orphan module.
[1] This is a change in behaviour relative to 6.2 and earlier.