lib(regex)


    This library implements an ECLiPSe API for POSIX 1003.2 regular
    expressions.  On Unix systems it calls the regular expression
    functions of the standard library; on Windows it uses Henry
    Spencer's regex library, version 3.8.

    Regular Expressions
    This is just a very brief summary of the essentials.  For details
    of regular expressions see any POSIX regex(7) man page.
    Two types of regular expressions are supported:
    
    Extended Regular Expressions (the default)
	These are described below and correspond essentially to those
	understood by the UNIX egrep command.
    Basic Regular Expressions
	These correspond essentially to those in the UNIX ed editor
	or the grep command, and are mostly obsolete.
    
    Note that our choice of default differs from the POSIX 1003.2 C API.
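
    As an illustration of the difference between the two types, the
    following sketch assumes the library's match/3 predicate (pattern,
    string, option list), which is not described in this section.  In a
    basic regular expression unescaped parentheses are ordinary
    characters, whereas in an extended one they group:

	?- lib(regex).                          % load the library
	?- match("^(a)$", "(a)", [basic]).      % parentheses are literal: succeeds
	?- match("^(a)$", "(a)", []).           % parentheses group: fails
	?- match("^(a)$", "a", []).             % succeeds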

    Characters
    Every character stands for itself, except for the characters
    ^.[$()|*+?{\ which must be escaped with a \ to prevent
    them from having special meaning.  Note that, since the ECLiPSe
    parser already interprets backslashes, you will have to escape the
    backslash with another backslash in your ECLiPSe source string.
    
    .
	Matches any character
    [aeiou]
	Matches any of the characters between the brackets
    [^aeiou]
	Matches any character except those listed
    [a-z0-9]
	Matches any character in the given ranges
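
    A small sketch of escaping and character classes, assuming the
    library's match/2 predicate (pattern, string), which succeeds if
    some substring of the string matches.  Note the doubled backslash
    needed to get a literal dot through the ECLiPSe parser:

	?- lib(regex).                                  % load the library
	?- match("[0-9]\\.[0-9]", "version 3.8").       % literal dot: succeeds
	?- match("[0-9]\\.[0-9]", "version 3x8").       % no literal dot: fails
	?- match("[0-9].[0-9]", "version 3x8").         % . matches any character: succeeds
	?- match("[^aeiou]", "aeiou").                  % only listed characters: fails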
    

    Anchors
    
    ^
	Matches at the beginning of the string (or of a line, if the
	newline option is given)
    $
	Matches at the end of the string (or of a line, if the
	newline option is given)
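
    Without anchors a pattern may match anywhere in the string.  The
    following sketch, again assuming the library's match/2 predicate,
    shows how the anchors restrict this:

	?- lib(regex).                  % load the library
	?- match("bc", "abcd").         % matches in the middle: succeeds
	?- match("^ab", "abcd").        % succeeds
	?- match("^bc", "abcd").        % "bc" is not at the beginning: fails
	?- match("cd$", "abcd").        % succeeds
	?- match("bc$", "abcd").        % "bc" is not at the end: fails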
    

    Repetition
    
    ?
	Matches the preceding element 0 or 1 times
    *
	Matches the preceding element 0 or more times
    +
	Matches the preceding element 1 or more times
    {3}
	Matches the preceding element 3 times
    {1,3}
	Matches the preceding element 1 to 3 times
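
    The repetition operators can be tried out as follows.  This is a
    sketch assuming the library's match/2 predicate; the patterns are
    anchored so that the whole string has to match:

	?- lib(regex).                          % load the library
	?- match("^colou?r$", "color").         % u occurs 0 times: succeeds
	?- match("^colou?r$", "colour").        % u occurs once: succeeds
	?- match("^ab*$", "a").                 % b occurs 0 times: succeeds
	?- match("^ab+$", "a").                 % at least one b required: fails
	?- match("^[0-9]{1,3}$", "42").         % two digits: succeeds
	?- match("^[0-9]{1,3}$", "1234").       % four digits: fails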
    

    Grouping
    
    (subexpr)
	Matches the parenthesized expression. This grouping is used
	in connection with the repetition operators, or for indicating
	subexpressions whose matches are to be captured and returned.
    (one|two|three)
	Matches any of the alternative expressions
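
    The two uses of parentheses can be sketched as follows.  The
    example assumes the library's match/2 and matchsub/4 predicates
    (the latter taking pattern, string, option list and a list of
    captured substrings), neither of which is described in this
    section:

	?- lib(regex).                          % load the library
	?- match("^(ab)+$", "ababab").          % group repeated as a unit: succeeds
	?- match("^(one|two|three)$", "two").   % succeeds
	?- match("^(one|two|three)$", "four").  % fails
	?- matchsub("([0-9]+):([0-9]+)", "The time is 10:25", [], [H,M]).

    The last query should bind H and M to the strings matched by the
    first and second parenthesized subexpression, i.e. "10" and "25".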
    

    Options
    Most of the predicates in this library accept a list of options.
    The accepted options are:
    
    basic
	Interpret the pattern as a Basic Regular Expression, rather
	than the default Extended Regular Expression.
    extended
	Interpret the pattern as an Extended Regular Expression
	(this flag is redundant since this is the default).
    icase
	Ignore case when matching.
    newline
	Treat newlines specially, i.e. don't treat them as normal
	characters and make ^ match after a newline and $ before a newline.
	By default, newlines are treated as ordinary characters.
    notbol
	Don't interpret the beginning of the string as the beginning
	of a line, i.e. don't let ^ match there.
    noteol
	Don't interpret the end of the string as the end of a line,
	i.e. don't let $ match there.
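
    The effect of some of these options can be sketched as follows,
    assuming the library's match/3 predicate (pattern, string, option
    list), which is not described in this section:

	?- lib(regex).                                  % load the library
	?- match("hello", "Hello World", []).           % case differs: fails
	?- match("hello", "Hello World", [icase]).      % succeeds
	?- match("^b", "a\nb", []).                     % newline is ordinary: fails
	?- match("^b", "a\nb", [newline]).              % ^ matches after newline: succeeds
	?- match("^a", "abc", [notbol]).                % ^ cannot match at start: fails
	?- match("c$", "abc", [noteol]).                % $ cannot match at end: fails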
	
    
    Shortcomings
    
    
    Due to limitations of the underlying implementation, the predicates
    in this library do not handle embedded NUL characters in strings correctly
    (they are interpreted as the end of the string).
    
    POSIX regular expressions have no notion of "noncapturing
    parentheses", i.e. parentheses that are used only for grouping, not for
    indicating that one wants to capture the matching substring.
    
    In an environment like ECLiPSe, one would like to be able to do things like
    the following (ideal_match/3 is hypothetical and not part of this library):
    
	?- ideal_match("(/[^/]*)+", "/usr/local/eclipse", L).
	L = ["/usr", "/local", "/eclipse"]
	Yes
    
    i.e. capture every instance of a matching subexpression.  There seems
    to be no way to do that with a POSIX regexp implementation.
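
    In the simple case where the repeated subexpression makes up the
    whole pattern, a similar result can be obtained by matching the
    subexpression repeatedly instead.  A sketch, assuming the library's
    matchall/4 predicate (pattern, string, option list, list of all
    matching substrings), which is not described in this section:

	?- lib(regex).                          % load the library
	?- matchall("/[^/]*", "/usr/local/eclipse", [], L).

    This should collect the three path components as separate strings
    in L.  It does not help when the repeated subexpression is only
    part of a larger pattern.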
    


