Running compiled programs
Program flags
A Haskell program compiled with hbc automatically decodes a number of flags:
- -C
Produce a ``core'' file when a signal occurs.
- -f
Print a description of the flags; the program is not executed.
- -Hsize
Set maximum heap size. Default is 8M.
- -hsize
Set minimum heap size. Default is 500k.
- -Asize
Set pointer stack size. Default is 100k.
- -Vsize
SPARC only! Set return stack size. Default is 50000.
- -T
Enter the runtime tracer. This only works if some of the files
were compiled with the ``-T'' flag. If this flag is used twice
a trace is produced without any user interaction.
- -gc-gen
Use a generational garbage collector.
- -gc-slide
Use an in-place compacting garbage collector.
- -X
Debug mode (gives some additional messages).
- -
Marks the end of decoded arguments.
If the runtime system was compiled with dumping enabled there are some
additional flags:
- -d
Print a stack dump on error.
- -G
Print a stack dump before and after each garbage collection.
- -Kaddr
When GC reaches stack location addr the routine debstop is called.
- -Mn
The maximum number of dumped nodes is set to n.
- -tn
The depth of dump is set to n.
If the runtime system was compiled with GC statistics enabled there
are some additional flags:
- -B
Sound the bell at the start of garbage collection.
- -Sfile
Produce a garbage collection statistics file. If no file name is given
it is written to ``STAT.program''. If the file name is ``stderr'' it is written
to standard error.
The file ``STAT.program'' (when produced)
will contain various (self-explanatory?) statistics.
Heap profiling (Authors: Colin Runciman and David Wakeling)
If the program has been compiled with heap profiling turned on (the -ph flag
to the compiler) it decodes the following flags:
- -ifloat
Set the sampling interval of the heap profiler to float seconds.
Normally the profiler runs with an exponentially increasing profiling
interval.
- -g[{g,...}]
Give profiling information by module group.
When the graph is sampled the space occupied by each node (in bytes) is
charged to the particular group of modules that produced the node. In
some cases, only certain groups of modules may be of interest, and
these groups can be named in an optional restriction set following the
-g flag.
- -m[{m,...}]
Give profiling information by module. Similar to the -g flag.
In this case though, the space occupied by each node is charged to the
module that produced the node. Once again, only certain modules may be
of interest, and those can be named in a restriction set.
- -p[{p,...}]
Give profiling information by producer.
In this case, the space occupied by each node is charged to the
function that produced the node.
- -c[{c,...}]
Give profiling information by construction.
As for the -p flag. In this case the space occupied by each node is charged to the
construction that it represents, with the function component being used
for closures.
- -t[{t,...}]
Give profiling information by type.
In this case, the space occupied by each node is charged to the
type of the node.
Two or more of the -g, -m, -p, and -c
flags may be used together. In this case, the first flag specifies what
kind of profile is to be produced, and the remainder are used to specify
restrictions.
During reduction the graph is periodically sampled and the samples
are written to a file whose name is that of the program, extended
with a ``.hp'' suffix. This file can be converted to a PostScript
file by the hp2ps program.
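As a hypothetical example of a program worth heap-profiling (the names here are invented):

```haskell
-- The lazy foldl builds a long chain of suspended (+) applications
-- before the result is demanded; a producer profile would charge that
-- space to sumList.
sumList :: [Int] -> Int
sumList = foldl (+) 0

main :: IO ()
main = print (sumList [1 .. 100000])
```

Compiling this with the -ph flag and running ``a.out -p'' would write samples to ``a.out.hp'', ready for conversion by hp2ps.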
A notable feature is that there is absolutely no concept of scope.
Profiles do not distinguish between nested functions with the same
name, or between functions in different modules with the same name.
The only way to make such distinctions is to copy and rename function
bodies. Some names get lost during compilation, and obscure
identifiers appear in their place. This happens most often in programs
that make heavy use of higher-order functions; my apologies.
Examples:
Running the program ``a.out -p'' gives a producer profile.
``a.out -p -c{(.)}'' gives a producer profile, in which the only producers of interest are those
of ``(.)'' nodes.
``a.out -c -m{lex,parse,typecheck}'' gives a construction profile,
but only for constructions produced by the modules ``lex'', ``parse'' and ``typecheck''.
``a.out -c -m{lex,parse,typecheck} -p{tokeniser,syntax,tcheck}''
gives a construction profile, but only for constructions produced by the
modules ``lex'', ``parse'' and ``typecheck'', and only for the functions
``tokeniser'', ``syntax'' and ``tcheck'' within these modules.
Tracing
There is no debugger available that can handle programs produced by lmlc/hbc,
but if programs are compiled with the ``-T'' flag there is a simple interactive
tracer that can be used. The tracer is invoked by giving the ``-T'' flag to a
compiled program. Unfortunately the tracer cannot be used together with
the interactive system (yet).
The tracer has an interactive interface where the user can turn tracing on and
off, run until a certain point, etc.
The following commands are available:
- help
Print a help message describing the commands.
- quit
Quit the tracer and the program.
- leave
Leave a recursive invocation of the tracer.
- next
Trace (i.e. print messages) until the next function is entered.
- cont
Run (i.e. do not print messages) the program to completion.
- rcont
Trace the program to completion.
- exit
Trace until the current function exits.
- rexit
Run until the current function exits.
- stop re
Set breakpoints on all functions matching re.
- nostop re
Remove breakpoints from all functions matching re.
- arg n
Evaluate (to WHNF) and print argument number n.
- farg n
Fully evaluate and print argument number n.
- on re
Turn on tracing for functions matching re.
The pattern re may contain ``*'', which matches any number of characters.
- off re
Turn off tracing for functions matching re.
The pattern re may contain ``*'', which matches any number of characters.
- where
Show call stack.
- depth n
Set print depth to n. Default value is 3.
- file y/n
Turn on/off file (module) name printing.
Identifiers may be prefixed by their file (module) name followed by a dot
to make them unique. An empty command repeats the previous command.
Any kind of error or call to fail will cause the tracer to be entered.
The tracer prints the following messages:
- Enter expr
A function is just about to be entered. The expr shows the function
with its arguments.
- Return expr
A function is just about to return with an evaluated expression.
- Return variable
A function is just about to return with a variable that might evaluate
to a function.
- Jump (unknown)
A function is just about to tail call another function that was not known at compile time
(or had a number of arguments that did not correspond to its arity).
- Jump
A function is just about to tail call a function known at compile time.
Each message is indented with a depth corresponding to the call stack depth.
In the case of a tail call, the function which is called can be seen from
the following ``Enter'' message, provided that function has been compiled with the
trace flag on. The enter and exit messages always come in pairs, even if
tracing is turned off for a particular function in between.
The tracer is not able to determine the type for all constructed values.
If it cannot, it uses the name CONn for the nth constructor of a type.
Traced and non-traced modules can be mixed, but only calls to traced code can be observed.
Each module in a program can be compiled for tracing, but then the trace flag
can be omitted when linking the program. In this case the program will run with
a moderate slowdown (it will take about 25% longer). If it is linked for tracing,
but run without the ``-T'' flag it may run as much as 5 times slower.
A problem with understanding the trace messages is that they refer to the program
after the extensive transformations performed by the compiler. An aid to understanding
is to look at the program after all transformations; given the ``-ftransformed'' flag
the compiler will show this.
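A minimal session might look like this (the file and function names are invented for illustration; the flags and commands are those described above):

```haskell
-- fact.hs: a small program to run under the tracer.
fact :: Int -> Int
fact 0 = 1
fact n = n * fact (n - 1)

main :: IO ()
main = print (fact 5)

-- Compile with tracing and invoke the tracer at run time:
--   hbc -T fact.hs -o fact
--   ./fact -T
-- At the tracer prompt, ``stop fact*'' sets a breakpoint on fact,
-- ``cont'' runs until it is hit, and ``where'' shows the call stack.
```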
Memory allocation
The current strategy for memory allocation is as follows:
On startup a heap is allocated; its size never changes during
execution. The size is the heap size
given as an argument, or a default of 8 megabytes.
Only part of this memory is used during execution, to lower the
working set of the program. How large a part is determined after each
garbage collection: the amount used (i.e., available for allocation)
is four times the amount that was copied when the collection occurred.
In this way the working set is
adapted to the amount of heap that is actually in use.
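The heuristic can be sketched as follows (an illustration only, not the actual runtime code; capping the result at the total heap size is an assumption added here, since the heap never grows after startup):

```haskell
-- After a garbage collection that copied ``copied'' bytes, the area
-- made available for allocation is four times that amount, but never
-- more than the fixed heap allocated at startup (default 8 megabytes).
allocationArea :: Int -> Int -> Int
allocationArea heapSize copied = min heapSize (4 * copied)
```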
Tips to get efficient programs
NOTHING MUCH WRITTEN YET (maybe it's impossible?)!!!
Haskell overloading
Overloaded Haskell functions are nice but slow. The compiler tries to
remove overloading where the types are known, so type signatures help
to improve efficiency. It is a good idea, both from efficiency and
also from a programming point of view, to include type signatures for
all top level functions in a module. With optimization turned on the
compiler tries to make specialized versions of a function for all
types at which it is used. This is currently impossible across module boundaries, so
for exported functions the SPECIALIZE pragma should be used.
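As a sketch (the functions and types are invented; the pragma form is the standard one):

```haskell
-- A top-level signature resolves the overloading of sum, (/) and
-- fromIntegral at compile time.
average :: [Double] -> Double
average xs = sum xs / fromIntegral (length xs)

-- For an exported overloaded function, ask for specialized versions
-- at the types callers in other modules are expected to use.
sumSquares :: Num a => [a] -> a
sumSquares = sum . map (\x -> x * x)
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
{-# SPECIALIZE sumSquares :: [Double] -> Double #-}
```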
It is generally not a good idea to worry too much about what the
compiler does, but here are a few general tips.
Elementary functions, as well as certain idioms, on basic data
types turn into a few machine instructions.
The number of machine instructions cited below does not include those
that compute the value, check if it is computed, load it into a
register, etc. These instructions usually take much longer than the
operation itself, but the table gives an indication of which operations
you can expect reasonable efficiency from.
- Bool
- ==,/=,<,<=,>,>=
turn into a single machine instruction.
- Char
- chr,ord
usually turn into nothing at all.
- ==,/=,<,<=,>,>=
turn into a single machine instruction.
- Int
- ==,/=,<,<=,>,>=,+,-,*,negate,quot,rem
turn into a single machine instruction.
- max,min,abs,signum,div,mod,even,odd
turn into a few machine instructions.
- Float,Double
- ==,/=,<,<=,>,>=,+,-,*,negate,/,fromRational.toRational,fromInt
turn into single machine instructions. The composition
fromRational.toRational turns into a single instruction if the
argument/result type is Float/Double or Double/Float.
- truncate
turns into a single machine instruction if the result type is Int.
- max,min,abs,signum
turn into a few machine instructions.
- exp,log,sqrt,sin,cos,tan,asin,acos,atan,sinh,cosh,tanh
turn into calls to C. For type Float the argument is first
converted to Double, then the C function is called, and the
result is converted back again.
- Integer
All functions involve a call to C routines, so they are likely to be
slow for small values, but for large numbers they are quite efficient
as the actual computations tend to dominate the running time.
- Complex
Complex numbers based on Float and Double have specialized
instances everywhere in the Prelude and are fairly efficient. Part of
the efficiency comes from the strict data constructor for Complex.
- Rational
Has specialized instances, but is not as efficient as you could make
it by calling C functions to do the arithmetic.
There are a few things that should be currently avoided because they
are very slow:
- show,read
on numeric types in general, and floating
types in particular.
- atan2
it is a real beast.
- fromRational
is very slow for floating types. An exception is fromRational
applied to a constant with a result type of Float or
Double, which the compiler handles specially.
- toRational,fromInteger
can be slow. Again, fromInteger on constants is handled
specially.
- gcd
uses Euclid's algorithm.
- Array
All operations on arrays are worse than you would hope for.
Some operations on arrays with Int as index are handled more efficiently.
Most Prelude functions involving numeric types have specialized
instances for all the numeric types in the Prelude, e.g.
sum can sum lists of any Prelude numeric type efficiently.
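For instance (hypothetical code), giving sum a monomorphic context lets the specialized Int instance be picked:

```haskell
-- total uses the specialized Int instance of sum, so each addition is
-- a machine instruction rather than a dictionary-passing call.
total :: [Int] -> Int
total = sum
```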
Last modified: Mon Jul 22 01:15:26 MET DST 1996