=====================
Source Tracing Design
=====================

Source tracing requires the system to extract source information while
parsing a file for compilation, the compiler to generate code that retains
this information, and tracer tools that use it to display the source.


A) Modifications to Parsing and Processing
==========================================

The new compiler uses lib(source_processor) to parse and process source
files. lib(source_processor) was modified to use read_annotated/2 to read
in the source code, so that positional information on the terms being
compiled is retained. The unannotated term is then extracted from the
annotated term, as this is needed for source processing. The source_term
structure used by lib(source_processor) is expanded to include the
annotated form of the source term:

:- export struct(source_term(
        term,
        annotated_term,
        vars
   )).

The call to readvar in source_processor.ecl is replaced by read_annotated,
followed by a call to expand_macros/2 to perform the term macro expansions.
The named variables, which readvar returned directly, now also need to be
extracted.
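
As an illustration of extracting the unannotated term, the following is a
minimal sketch only. It assumes a simplified, hypothetical annotated term of
the form ann(Term, Type, From, To), where the arguments of a compound Term
are themselves ann/4 terms. The real structure produced by read_annotated/2
has its own field layout, and strip_annotations/2 is a name invented for
this sketch, not part of lib(source_processor).

```prolog
% ann(Term, Type, From, To) is a hypothetical, simplified annotated term.
% The arguments of a compound Term are themselves ann/4 terms.

% strip_annotations(+Annotated, -Plain)
% Recursively removes the ann/4 wrappers to recover the plain term.
strip_annotations(ann(Inner, _Type, _From, _To), Plain) :-
    ( compound(Inner) ->
        Inner =.. [Functor|AnnArgs],
        strip_annotations_list(AnnArgs, Args),
        Plain =.. [Functor|Args]
    ;
        Plain = Inner           % variables and atomic terms are kept as-is
    ).

strip_annotations_list([], []).
strip_annotations_list([Ann|Anns], [Plain|Plains]) :-
    strip_annotations(Ann, Plain),
    strip_annotations_list(Anns, Plains).
```

Variables are shared between the annotated and the plain term, so bindings
made on one are visible through the other.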


Source code that is read in undergoes transformation/expansion before it is
compiled. This is done in three stages:

1) Macro transformations are applied to terms that are read in, then
2) Clause transformations are applied to clauses, then
3) Goal transformations are applied to the body goals in the clauses.

The final (possibly transformed) term has to retain the annotated
information for the compiler. The first attempt to implement this was to
add the annotated information back to the final term, with any transformed
(sub)terms retaining the annotations from the closest matching original
(sub)terms.  However, this approach needs to duplicate some of the work
done by the transformation and needs to recognise which terms have been
transformed.  Both of these can lead to errors. It was decided that a better
approach would be to modify the transformation predicates themselves so that
they can deal with annotated terms. The user-visible predicates affected
are:

expand_macros/2 
expand_goal/2
expand_clause/2

Two alternatives were considered to handle this:


1) Transform unannotated term

The transformation is performed on the unannotated term. An additional
`input' annotated term is passed in, which may be:

a)  uninstantiated (if no annotated information is available), or
b)  an annotated term that matches the unannotated input term.

The transformation predicates are modified to take both input terms and to
produce an annotated/unannotated pair of transformed terms.

The annotated term can be supplied with the call:

  expand_macro_annotated_(+Term, +AnnotatedTerm, -TransformedTerm,
                       -AnnotatedTransformedTerm, +Module)

  The annotated term supplies the position information, which is carried
  over to AnnotatedTransformedTerm.


or it could be uninstantiated:

  expand_macro_annotated_(+Term, -AnnotatedTerm, -TransformedTerm,
                       -AnnotatedTransformedTerm, +Module)

  for compatibility with existing code. No position information would be
  extracted. The resulting form of AnnotatedTransformedTerm is undefined
  and should be ignored.


2) Transform annotated terms:

Modify the transformation predicates to take annotated terms only.
Compatibility with existing code is maintained by adding annotations (with
dummy positional information) to the input term before calling the new
transformation predicates, e.g.


expand_macro_annotated_(+AnnotatedTerm, -TransformedTerm,
                        -AnnotatedTransformedTerm, +Module)

The drawback is that this would slow down existing calls to the
transformation predicates, because the term needs to be traversed first to
add the annotations.
  
Alternative 1) was implemented.
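
The calling convention of alternative 1) can be sketched as follows. This is
an illustration under assumptions, not the actual implementation: ann(Term,
Type, From, To) is a simplified, hypothetical stand-in for the real annotated
term, and expand_twice/4 with its rewrite rule (twice(G) becomes (G, G)) is
invented for the example.

```prolog
% expand_twice(+Term, ?AnnTerm, -NewTerm, -NewAnnTerm)
% Hypothetical rule: rewrite twice(G) into (G, G).
% AnnTerm may be uninstantiated, in which case NewAnnTerm is left unbound
% and should be ignored by the caller.
expand_twice(Term, Ann, (G, G), NewAnn) :-
    nonvar(Term), Term = twice(G), !,
    ( nonvar(Ann), Ann = ann(twice(AnnG), _Type, From, To) ->
        % Reuse the position of the original term for the new term
        NewAnn = ann((AnnG, AnnG), transformed, From, To)
    ;
        true                    % no usable source information
    ).
expand_twice(Term, Ann, Term, Ann).     % no rule applies: pass through
```

A caller without source information simply passes a fresh variable as the
second argument, which keeps the cost for existing code low.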

-----

The annotated term produced by a transformation is a modified form of the
existing annotated term: its from/to fields give the position of the
closest matching term in the source. An additional type `transformed' is
used to indicate a transformed term or subterm, e.g.

:- local struct(named_structure(field1,field2)).

  named_structure with [field1:foo]

is transformed to 

   named_structure(foo,_)

The corresponding transformed annotated term has the `transformed' type for
the named_structure functor and for its two arguments (foo and _), all with
the position information of the `with' term.

Thus, no real type information is given for transformed terms. This avoids
duplicating the type-determining code of read_annotated/2, which is
implemented in C.
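
The marking of a transformed term can be sketched as below. As before,
ann(Term, Type, From, To) is a hypothetical, simplified annotated term, and
mark_transformed/4 is a name made up for this sketch. It gives every subterm
the `transformed' type and the from/to position of the original source term.

```prolog
% mark_transformed(+Term, +From, +To, -Ann)
% Wraps Term and all its subterms in ann/4 with type `transformed',
% all carrying the same From/To position.
mark_transformed(Term, From, To, ann(AnnTerm, transformed, From, To)) :-
    ( compound(Term) ->
        Term =.. [Functor|Args],
        mark_transformed_args(Args, From, To, AnnArgs),
        AnnTerm =.. [Functor|AnnArgs]
    ;
        AnnTerm = Term          % variable or atomic term
    ).

mark_transformed_args([], _From, _To, []).
mark_transformed_args([Arg|Args], From, To, [Ann|Anns]) :-
    mark_transformed(Arg, From, To, Ann),
    mark_transformed_args(Args, From, To, Anns).
```

For the example above, calling mark_transformed/4 on named_structure(foo,_)
with the position of the `with' term yields an annotated functor and two
annotated arguments, all of type `transformed'.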


As noted above, lib(source_processor) is modified to generate a source_term
that includes the annotated term:

:- export struct(source_term(
        term,
        annotated_term,
        vars
   )).


B) Modifications to Abstract Machine
====================================

New abstract machine instruction:

debug_scall(proc(P),port(Port),atom(Path),i(From),i(To))

This instruction is generated by the compiler for debuggable code before a
call to a procedure/built-in.

It is intended as a replacement for the existing debug_call, but it is
given a different name (s for `source') to distinguish it from the old
instruction, which is still needed for existing code.

The new debug_scall provides Path (complete pathname of source file) and
position information From and To (supplied by the annotated term). Path is
an atom (instead of a string) to avoid duplication of the same pathname in
different instances of the debug_scall instruction.

If no source information is available, Path is the empty atom (''), and
From/To should be ignored (they are set to 0).
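
The choice of operands, including the fallback when no source information is
available, can be sketched as follows. This is only an illustration: the
instruction term follows the format given above, but make_debug_scall/5, the
simplified ann(Term, Type, From, To) annotated term and the way Pred and
Port are represented are assumptions of this sketch, not the compiler's
actual code.

```prolog
% make_debug_scall(+Pred, +Port, +Path, ?Ann, -Instr)
% Builds a debug_scall instruction term. If Ann carries no usable
% position, the empty atom and 0/0 are emitted as dummy operands.
make_debug_scall(Pred, Port, Path, Ann, Instr) :-
    ( nonvar(Ann),
      Ann = ann(_Term, _Type, From, To),
      integer(From), integer(To)
    ->
        Instr = debug_scall(proc(Pred), port(Port), atom(Path),
                            i(From), i(To))
    ;
        Instr = debug_scall(proc(Pred), port(Port), atom(''),
                            i(0), i(0))
    ).
```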

The use of a full path means that the source file will only be found on the
same file system as the one where the file was compiled. With compilation
into memory, this is not a problem, but it may affect where .eco files can
be source traced.

The information is passed to the debugging tools in the trace frame
(defined in tracer.pl and emu_export.h) that is pushed onto the global
stack. A trace frame is created in three situations:

1) An event is generated by the debug_scall instruction.
2) A failure occurs while tracing. This triggers the creation of `fake'
   fail trace frames.
3) A suspension is created while tracing. This triggers the creation of a
   delay trace frame.


The various *_Dbg_Frame macros in emu_export.h were modified to take the
extra arguments. For compatibility, the old debug_call instruction also
supplies dummy arguments ('' for path, 0 for from/to) for the frames. Also,
currently no source information is available from the suspension, so this
also passes dummy arguments.

The trace frame is passed to the tracer (tracer.pl, tracer_tty.pl,
tracer_tcl.pl), which generates the trace line (either as a tty line or in
the GUI). The GUI side needs to be modified for the new trace frame, and can
use the information to show the source line.

Three new debugger registers for the path, from and to information were
added to the abstract machine. These are needed to preserve the debugging
information when the event is raised.


C) Modifications to New Compiler
================================

The normalised form used by the compiler is modified to include the source
information:

:- export struct(goal(
        kind,
        callpos,
        functor,
        args,
        envmap,
        envsize,
        state,
        from,
        to,
        path,
        lookup_module,
        definition_module
    )).

The compiler predicates are modified to accept the annotated term as well
as the unannotated term. The annotated term can be uninstantiated if no
source information is available. The annotated and unannotated terms are
traversed together by the compiler as the term is converted to its
normalised form. If any mismatch occurs between the two terms, the
annotated term is not traversed any further.
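
The parallel traversal described above could look roughly like the sketch
below, restricted to conjunctions. It is not the compiler's code:
collect_goals/3, the g(Goal, From, To) records and the simplified
ann(Term, Type, From, To) annotated term are all hypothetical. Once the
annotated term is uninstantiated or no longer matches the plain term, the
remaining goals get the dummy position 0/0.

```prolog
% collect_goals(+Body, ?AnnBody, -Goals)
% Goals is a list of g(Goal, From, To), one per goal in the conjunction.
collect_goals(Body, Ann, Goals) :-
    collect_goals(Body, Ann, Goals, []).

collect_goals(Body, Ann, Goals, Tail) :-
    nonvar(Body), Body = (A, B), !,
    % Descend into the annotated term only while it mirrors the plain one;
    % otherwise AnnA/AnnB stay unbound, meaning "no source information".
    ( nonvar(Ann), Ann = ann(T, _, _, _), nonvar(T), T = (AnnA, AnnB) ->
        true
    ;
        true
    ),
    collect_goals(A, AnnA, Goals, Goals1),
    collect_goals(B, AnnB, Goals1, Tail).
collect_goals(Goal, Ann, [g(Goal, From, To)|Tail], Tail) :-
    ( nonvar(Ann), Ann = ann(_, _, From0, To0) ->
        From = From0, To = To0
    ;
        From = 0, To = 0
    ).
```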

D) Modifications to the ECLiPSe Level
=====================================

The trace frame contains the three extra arguments, which the ECLiPSe level
tracer code can make use of. For the GUI tracer, the extra arguments are
sent to the external (GUI) side when a new trace line is constructed
(e.g. trace_line_handler_tcl/2 in tracer_tcl.pl). Various predicates which
return information about a trace frame are modified to return the source
information as well -- this will allow source information to be displayed
for the ancestor goals in the call stack.

E) Modifications to the GUI (TkECLiPSe)
=======================================

TkECLiPSe obtains the source information for each port from the
debug_traceline. The filepath is then passed back to ECLiPSe to obtain the
source. The source is not obtained directly on the Tk side because the
TkTools can be remotely connected to ECLiPSe, and be on a completely
different file system. 

Once the listing of the file is displayed, the from/to position information
is used to indicate where the current goal is. Currently this assumes that
from/to positions map to character displacements from the start of the
file.
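
Under the assumption just stated (positions are character offsets counted
from 0 at the start of the file), a from position can be converted to a line
number as in the sketch below. offset_to_line/3 is a hypothetical helper
written for illustration, operating on the file contents as a list of
character codes. It is not part of TkECLiPSe.

```prolog
% offset_to_line(+Codes, +Offset, -Line)
% Line is the 1-based number of the line containing the character at
% Offset, found by counting the newlines (code 10) before that offset.
offset_to_line(Codes, Offset, Line) :-
    line_at(Codes, Offset, 1, Line).

line_at(_, 0, Line, Line) :- !.
line_at([], _, Line, Line).             % offset beyond the end of the text
line_at([Code|Codes], N, Line0, Line) :-
    N > 0,
    ( Code =:= 10 -> Line1 is Line0 + 1 ; Line1 = Line0 ),
    N1 is N - 1,
    line_at(Codes, N1, Line1, Line).
```

For example, with the codes [97,98,10,99,100] (the text "ab", a newline,
then "cd"), offset 3 refers to the character c, and
offset_to_line([97,98,10,99,100], 3, L) binds L to 2.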

===========================================================

Some possible extensions to the current source tracing implementation are
listed below. Items 1) and 2) should definitely be implemented:

1) Tracing of transformed terms

Currently, transformed terms retain source information for the whole
source term that is transformed. If this source term contains more than one
goal, then those goals are not traced individually. One important example
of this is the do loops, which are transformed into recursion. Currently the
goals inside a do loop are not traced individually -- they map to the whole
do loop.

To allow goals inside transformations to be traced, the (user defined) macro
transformations have to be able to indicate how the original subterms are
mapped to the transformed term. This will then allow the source information
to be transferred to the transformed annotated term. A reasonable API for
the user to do this needs to be provided.

2) Tracing of built-ins

Source information is currently not provided for calls to built-ins. This
is because the calling mechanism used is different from the standard calls,
and no infrastructure exists yet to create a trace frame for these calls.

The normalised form used by the compiler already contains the source
information for these calls, so if the infrastructure for creating trace
frames for built-ins is provided, they could be traced.

3) Tracing of suspensions

Currently, when a suspension is created, a delay trace frame is created,
but there is no source information on the goal being suspended. The
annotation carried by the normalised form for the compiler should already
have this source information, so it should be possible to pass this
information to the creation of the delay trace frame.

4) Showing variable bindings for clause

The trace line shows the variable bindings for the current call only, and
variables may occur multiple times and be difficult to locate
precisely. The new compiler provides a map of variables for the clause, and
this can be used to identify all the variables of the clause (and whether
they are currently valid). A display can then be used to show the variables
(and give access to the inspector for detailed examination).
 
5) Tracing of head unification/matching

Source information is provided for the clause head as well as the clause
body, so it should be possible to show the source of the clause head that is
being unified during head unification. This is normally not shown in the
standard port-based debugging model of Prolog, but it may be useful to
show this, perhaps as additional ports.
