Running compiled programs
Program flags
A Haskell program compiled with hbc automatically decodes a number of flags:
- -C
Produce a ``core'' file when a signal occurs.
- -f
Print a description of the flags; the program is not executed.
- -Hsize
Set maximum heap size. Default is 8M.
- -hsize
Set minimum heap size. Default is 500k.
- -Asize
Set pointer stack size. Default is 100k.
- -Vsize
SPARC only! Set return stack size. Default is 50000.
- -T
Enter the runtime tracer. This only works if some of the files
were compiled with the ``-T'' flag. If this flag is used twice
a trace is produced without any user interaction.
- -gc-gen
Use a generational garbage collector.
- -gc-slide
Use an in-place compacting garbage collector.
- -X
Debug mode (gives some additional messages).
- -
Marks the end of decoded arguments.
If the runtime system was compiled with dumping enabled there are some
additional flags:
- -d
Print a stack dump on error.
- -G
Print a stack dump before and after each garbage collection.
- -Kaddr
When GC reaches stack location addr the routine debstop is called.
- -Mn
The maximum number of dumped nodes is set to n.
- -tn
The depth of dump is set to n.
If the runtime system was compiled with GC statistics enabled there
are some additional flags:
- -B
Sound the bell at the start of garbage collection.
- -Sfile
Produce a garbage collection statistics file. If no file name is given
it is written to ``STAT.program''. If the file name is ``stderr'' it is written
to standard error.
The file ``STAT.program'' (when produced)
will contain various (self-explanatory?) statistics.
Heap profiling (Authors: Colin Runciman and David Wakeling)
If the program has been compiled with heap profiling turned on (the -ph flag
to the compiler) it decodes the following flags:
- -ifloat
Set the sampling interval of the heap profiler to float seconds.
Normally the profiler runs with an exponentially increasing profiling
interval.
- -g[{g,...}]
Give profiling information by module group.
When the graph is sampled the space occupied by each node (in bytes) is
charged to the particular group of modules that produced the node. In
some cases, only certain groups of modules may be of interest, and
these groups can be named in an optional restriction set following the
-g flag.
- -m[{m,...}]
Give profiling information by module. Similar to the -g flag.
In this case though, the space occupied by each node is charged to the
module that produced the node. Once again, only certain modules may be
of interest, and those can be named in a restriction set.
- -p[{p,...}]
Give profiling information by producer.
In this case, the space occupied by each node is charged to the
function that produced the node.
- -c[{c,...}]
Give profiling information by construction.
As for the -p flag. In this case the space occupied by each node is charged to the
construction that it represents, with the function component being used
for closures.
- -t[{t,...}]
Give profiling information by type.
In this case, the space occupied by each node is charged to the
type of the node.
Two or more of the -g, -m, -p, and -c
flags may be used together. In this case, the first flag specifies what
kind of profile is to be produced, and the remainder are used to specify
restrictions.
During reduction the graph is periodically sampled and the samples
are written to a file whose name is that of the program, extended
with a ``.hp'' suffix. This file can be converted to a PostScript
file by the hp2ps program.
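As a hypothetical example of a program worth heap-profiling (the names here are invented):

```haskell
-- The lazy foldl builds a long chain of suspended (+) applications
-- before the result is demanded; a producer profile would charge that
-- space to sumList.
sumList :: [Int] -> Int
sumList = foldl (+) 0

main :: IO ()
main = print (sumList [1 .. 100000])
```

Compiling this with the -ph flag and running ``a.out -p'' would write samples to ``a.out.hp'', ready for conversion by hp2ps.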
A notable feature is that there is absolutely no concept of scope.
Profiles do not distinguish between nested functions with the same
name, or between functions in different modules with the same name.
The only way to make such distinctions is to copy and rename function
bodies. Some names get lost during compilation, and obscure
identifiers appear in their place. This happens most often in programs
that make heavy use of higher-order functions; my apologies.
Examples:
Running the program ``a.out -p'' gives a producer profile.
``a.out -p -c{(.)}'' gives a producer profile, in which the only producers of interest are those
of ``(.)'' nodes.
``a.out -c -m{lex,parse,typecheck}'' gives a construction profile,
but only for constructions produced by the modules ``lex'', ``parse'' and ``typecheck''.
``a.out -c -m{lex,parse,typecheck} -p{tokeniser,syntax,tcheck}''
gives a construction profile, but only for constructions produced by the
modules ``lex'', ``parse'' and ``typecheck'', and only for the functions
``tokeniser'', ``syntax'' and ``tcheck'' within these modules.
Tracing
There is no debugger available that can handle programs produced by lmlc/hbc,
but if programs are compiled with the ``-T'' flag there is a simple interactive
tracer that can be used. The tracer is invoked by giving the ``-T'' flag to a
compiled program. Unfortunately the tracer cannot be used together with
the interactive system (yet).
The tracer has an interactive interface where the user can turn tracing on and
off, run until a certain point, etc.
The following commands are available:
- help
Print a help message describing the commands.
- quit
Quit the tracer and the program.
- leave
Leave a recursive invocation of the tracer.
- next
Trace (i.e. print messages) until the next function is entered.
- cont
Run (i.e. do not print messages) the program to completion.
- rcont
Trace the program to completion.
- exit
Trace until the current function exits.
- rexit
Run until the current function exits.
- stop re
Set breakpoints on all functions matching re.
- nostop re
Remove breakpoints from all functions matching re.
- arg n
Evaluate (to WHNF) and print argument number n.
- farg n
Fully evaluate and print argument number n.
- on re
Turn on tracing for functions matching re.
The pattern re may contain ``*'', which matches any number of characters.
- off re
Turn off tracing for functions matching re.
The pattern re may contain ``*'', which matches any number of characters.
- where
Show call stack.
- depth n
Set print depth to n. Default value is 3.
- file y/n
Turn on/off file (module) name printing.
Identifiers may be prefixed by their file (module) name followed by a dot
to make them unique. An empty command repeats the previous command.
Any kind of error or call to fail will cause the tracer to be entered.
The tracer prints the following messages:
- Enter expr
A function is just about to be entered. The expr shows the function
with its arguments.
- Return expr
A function is just about to return with an evaluated expression.
- Return variable
A function is just about to return with a variable that might evaluate
to a function.
- Jump (unknown)
A function is just about to tail call another function that was not known at compile time
(or had a number of arguments that did not correspond to its arity).
- Jump
A function is just about to tail call a function known at compile time.
Each message is indented with a depth corresponding to the call stack depth.
In the case of a tail call, the function which is called can be seen from
the following ``Enter'' message, provided that function has been compiled with the
trace flag on. The enter and exit messages always come in pairs, even if
tracing is turned off for a particular function in between.
The tracer is not able to determine the type for all constructed values.
If it cannot, it uses the name CONn for the nth constructor of a type.
Traced and non-traced modules can be mixed, but only calls to traced code can be observed.
Each module in a program can be compiled for tracing, but then the trace flag
can be omitted when linking the program. In this case the program will run with
a moderate slowdown (it will take about 25% longer). If it is linked for tracing,
but run without the ``-T'' flag it may run as much as 5 times slower.
A problem with understanding the trace messages is that they refer to the program
after the extensive transformations performed by the compiler. An aid to understanding
is to look at the program after all transformations; given the ``-ftransformed'' flag
the compiler will show this.
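A minimal session might look like this (the file and function names are invented for illustration; the flags and commands are those described above):

```haskell
-- fact.hs: a small program to run under the tracer.
fact :: Int -> Int
fact 0 = 1
fact n = n * fact (n - 1)

main :: IO ()
main = print (fact 5)

-- Compile with tracing and invoke the tracer at run time:
--   hbc -T fact.hs -o fact
--   ./fact -T
-- At the tracer prompt, ``stop fact*'' sets a breakpoint on fact,
-- ``cont'' runs until it is hit, and ``where'' shows the call stack.
```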
Memory allocation
The current strategy for memory allocation is as follows:
On startup a heap is allocated; its size never changes during
execution. The size is the heap size
given as an argument, or a default of 8 megabytes.
Only part of this memory is used during execution, to lower the
working set of the program. How large a part is determined after each
garbage collection: the amount used (i.e., available for allocation)
is four times the amount that was copied when the collection occurred.
In this way the working set is
adapted to the amount of heap that is actually in use.
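The heuristic can be sketched as follows (an illustration only, not the actual runtime code; capping the result at the total heap size is an assumption added here, since the heap never grows after startup):

```haskell
-- After a garbage collection that copied ``copied'' bytes, the area
-- made available for allocation is four times that amount, but never
-- more than the fixed heap allocated at startup (default 8 megabytes).
allocationArea :: Int -> Int -> Int
allocationArea heapSize copied = min heapSize (4 * copied)
```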
Tips to get efficient programs
NOTHING MUCH WRITTEN YET (maybe it's impossible?)!!!
Haskell overloading
Overloaded Haskell functions are nice but slow. The compiler tries to
remove overloading where the types are known, so type signatures help
to improve efficiency. It is a good idea, both from efficiency and
also from a programming point of view, to include type signatures for
all top level functions in a module. With optimization turned on the
compiler tries to make specialized versions of a function for all
types at which it is used. This is currently impossible across module boundaries, so
for exported functions the SPECIALIZE pragma should be used.
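As a sketch (the functions and types are invented; the pragma form is the standard one):

```haskell
-- A top-level signature resolves the overloading of sum, (/) and
-- fromIntegral at compile time.
average :: [Double] -> Double
average xs = sum xs / fromIntegral (length xs)

-- For an exported overloaded function, ask for specialized versions
-- at the types callers in other modules are expected to use.
sumSquares :: Num a => [a] -> a
sumSquares = sum . map (\x -> x * x)
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
{-# SPECIALIZE sumSquares :: [Double] -> Double #-}
```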
It is generally not a good idea to worry too much about what the
compiler does, but here are a few general tips.
Elementary functions, as well as certain idioms, on basic data
types turn into a few machine instructions.
The number of machine instructions cited below does not include those
that compute the value, check if it is computed, load it into a
register, etc. These instructions usually take much longer than the
operation itself, but the table gives an indication of which operations
you can expect reasonable efficiency from.
- Bool
- ==,/=,<,<=,>,>=
turn into a single machine instruction.
- Char
- chr,ord
usually turn into nothing at all.
- ==,/=,<,<=,>,>=
turn into a single machine instruction.
- Int
- ==,/=,<,<=,>,>=,+,-,*,negate,quot,rem
turn into a single machine instruction.
- max,min,abs,signum,div,mod,even,odd
turn into a few machine instructions.
- Float,Double
- ==,/=,<,<=,>,>=,+,-,*,negate,/,fromRational.toRational,fromInt
turn into single machine instructions. The composition
fromRational.toRational turns into a single instruction if the
argument/result type is Float/Double or Double/Float.
- truncate
turns into a single machine instruction if the result type is Int.
- max,min,abs,signum
turn into a few machine instructions.
- exp,log,sqrt,sin,cos,tan,asin,acos,atan,sinh,cosh,tanh
turn into calls to C. For type Float the argument is first
converted to Double, then the C function is called, and the
result is converted back again.
- Integer
All functions involve a call to C routines, so they are likely to be
slow for small values, but for large numbers they are quite efficient
as the actual computations tend to dominate the running time.
- Complex
Complex numbers based on Float and Double have specialized
instances everywhere in the Prelude and are fairly efficient. Part of
the efficiency comes from the strict data constructor for Complex.
- Rational
Has specialized instances, but is not as efficient as you could make
it by calling C functions to do the arithmetic.
There are a few things that should be currently avoided because they
are very slow:
- show,read
on numeric types in general, and floating
types in particular.
- atan2
it is a real beast.
- fromRational
is very slow for floating types. An exception is fromRational
applied to a constant with a result type of Float or
Double, which the compiler handles specially.
- toRational,fromInteger
can be slow. Again, fromInteger on constants is handled
specially.
- gcd
uses Euclid's algorithm.
- Array
All operations on arrays are worse than you would hope for.
Some operations on arrays with Int as index are handled more efficiently.
Most Prelude functions involving numeric types have specialized
instances for all the numeric types in the Prelude, e.g.
sum can sum lists of any Prelude numeric type efficiently.
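For instance (hypothetical code), giving sum a monomorphic context lets the specialized Int instance be picked:

```haskell
-- total uses the specialized Int instance of sum, so each addition is
-- a machine instruction rather than a dictionary-passing call.
total :: [Int] -> Int
total = sum
```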
Last modified: Mon Jul 22 01:15:26 MET DST 1996