% $Id$
% $Log$
%\documentstyle[makeidx,11pt,fleqn]{article}
\documentclass[11pt,fleqn]{article}
\usepackage{latexsym}
\usepackage{makeidx}
\usepackage{amssymb}
\setlength{\textwidth}{16.5cm}
\setlength{\textheight}{23cm}
\renewcommand{\baselinestretch}{1.1}
\setlength{\topmargin}{-1cm}
\setlength{\oddsidemargin}{0cm}
\setlength{\evensidemargin}{0cm}
\setlength{\marginparwidth}{0.0cm}
\setlength{\marginparsep}{0.0cm}
\newlength{\figurewidth}
\setlength{\figurewidth}{\textwidth}
\addtolength{\figurewidth}{-0.4cm}
% environment for typing program texts:
\makeatletter
\newenvironment{prog}{\par\vspace{1.5ex}
\setlength{\parindent}{1.0cm}
\setlength{\parskip}{-0.1ex}
\obeylines\@vobeyspaces\tt}{\vspace{1.5ex}\noindent
}
\makeatother
\newcommand{\startprog}{\begin{prog}}
\newcommand{\stopprog}{\end{prog}\noindent}
\newcommand{\pr}[1]{\mbox{\tt #1}} % program text in normal text
\newcommand{\Ac}{{\cal{A}}}
\newcommand{\Cc}{{\cal{C}}}
\newcommand{\Dc}{{\cal{D}}}
\newcommand{\Fc}{{\cal{F}}}
\newcommand{\Tc}{{\cal{T}}}
\newcommand{\Xc}{{\cal{X}}}
\newcommand{\var}{{\cal V}ar}
\newcommand{\Dom}{{\cal D}om}
\newcommand{\VRan}{{\cal VR}an}
\renewcommand{\emptyset}{\varnothing}
\newcommand{\ans}{\mathbin{\framebox[1mm]{\rule{0cm}{1.2mm}}}}
%\newcommand{\ans}{\;}
\newcommand{\todo}[1]{\fbox{\sc To do: #1}}
\newcommand{\dexp}{D} % a disjunctive expression
\newcommand{\cconj}{\ensuremath \mathop{\pr{\&}}} % concurrent conj. in math
\newcommand{\ttbs}{\mbox{\tt\char92}} % backslash in tt font
\newcommand{\ttus}{\mbox{\tt\char95}} % underscore in tt font
\newcommand{\sem}[1]{\ensuremath [\![#1]\!]} % double square brackets
\newcommand{\eval}[1]{{\cal E}\!val\sem{#1}}
\newcommand{\To}{\Rightarrow}
\newcommand{\infrule}[2]{\begin{array}{@{}c@{}} #1 \\ \hline #2 \end{array}}
\newcommand{\pindex}[1]{\index{#1@{\tt #1}}} % program elements in index
\newcommand{\comment}[1]{} % ignore the argument
\catcode`\_=\active
\let_=\sb
\catcode`\_=12
% commands for the syntax:
\newcommand{\term}[1]{{\tt #1}}
\newcommand{\opt}[1]{\rm [\it #1{\rm ]}}
\newcommand{\offside}[3]{\term{\{}\seq{#1}{#2}{#3} \term{\}}}
\newcommand{\seq}[3]{{\it #1$_1$ \term{#2} $\ldots$ \term{#2} #1$_#3$}}
\newcommand{\head}[1]{\makebox[1.3in][r]{\it #1}}
\newcommand{\por}{\\\head{}\makebox[3em][c]{$|$}}
\newcommand{\next}{\\\head{}\makebox[3em]{}}
\newcommand{\marg}[1]{\hspace{\fill}$(#1)$}
\newcommand{\lexicon}[1]{\production #1 {\rm see lexicon} end}
\def\production #1 #2 end {\head{#1}\makebox[3em]{\rm ::=}\it #2}
\makeindex
\begin{document}
\sloppy
\begin{titlepage}
\begin{center}
\fbox{
\begin{minipage}[t]{\figurewidth}
\begin{center}\vspace{10ex}
{\Huge\bf Curry}\\[4ex]
{\LARGE\bf An Integrated Functional Logic Language}\\[5ex]
{\large\bf Version 0.5.1}\\[1ex]
{\large \today}\\[8ex]
\Large
Michael Hanus$^1$ [editor] \\[3ex]
{\large Additional Contributors:}\\[2ex]
Sergio Antoy$^2$ \\
Herbert Kuchen$^3$ \\
Francisco J. L\'opez-Fraguas$^4$ \\
Juan Jos\'e Moreno Navarro$^5$ \\
Frank Steiner$^6$ \\[20ex]
\normalsize
(1) RWTH Aachen, Germany, {\tt hanus@informatik.rwth-aachen.de} \\
(2) Portland State University, USA, {\tt antoy@cs.pdx.edu} \\
(3) University of M\"unster, Germany, {\tt kuchen@uni-muenster.de} \\
(4) Universidad Complutense de Madrid, Spain, {\tt fraguas@dia.ucm.es} \\
(5) Universidad Polit\'ecnica de Madrid, Spain, {\tt jjmoreno@fi.upm.es} \\
(6) RWTH Aachen, Germany, {\tt steiner@i2.informatik.rwth-aachen.de} \\[5ex]~
\end{center}
\end{minipage}}
\end{center}
\end{titlepage}
\tableofcontents
\newpage
\section{Introduction}
Curry is a universal programming language aiming at the amalgamation
of the most important declarative programming paradigms,
namely functional programming and logic programming.
Curry combines in a seamless way features from functional programming
(nested expressions, lazy evaluation, higher-order functions),
logic programming (logical variables, partial data structures,
built-in search), and concurrent programming (concurrent evaluation
of constraints with synchronization on logical variables).
Moreover, Curry provides additional features in
comparison to the pure languages (compared to functional programming:
search, computing with partial information; compared to logic
programming: more efficient evaluation due to the deterministic
evaluation of functions).
Furthermore, it amalgamates the most
important operational principles developed in the area of integrated
functional logic languages: ``residuation''\index{residuation}
and ``narrowing''\index{narrowing} (see
\cite{Hanus94JLP} for a survey on functional logic programming).
The development of Curry is an international initiative intended to
provide a common platform for the research, teaching\footnote{%
Actually, Curry has been successfully applied to teach functional and
logic programming techniques in a single course without switching
between different programming languages. More details about
this aspect can be found in \cite{Hanus97DPLE}.}
and application
of integrated functional logic languages.
This document describes the features of Curry, its syntax and
operational semantics.
\section{Programs}
A Curry program specifies the semantics of expressions; goals, as
known from logic programming, are particular expressions
(of type \pr{Constraint}).
Executing a Curry program means simplifying an expression until
a value (or solution) is computed.
To distinguish between values and reducible expressions,
Curry has a strict distinction between (data)
\emph{constructors}\index{constructor} and
\emph{operations}\index{operation} or
\emph{defined functions}\index{defined function}\index{function!defined}
on these data.
Hence, a Curry program consists of a set of type and function declarations.
The type declarations define the computational domains (constructors)
and the function declarations the operations on these domains.
Predicates in the logic programming sense can be considered
as constraints, i.e., functions with result type \pr{Constraint}.
Modern functional languages (e.g., Haskell \cite{HudakPeytonJonesWadler92},
SML \cite{MilnerTofteHarper90}) allow the detection of many programming
errors at compile time by the use of polymorphic type systems.
Similar type systems are also used in modern logic languages
(e.g., G\"odel \cite{HillLloyd94}, $\lambda$Prolog \cite{NadathurMiller88}).
Therefore, Curry is a strongly typed language with a
Hindley/Milner-like polymorphic type system \cite{DamasMilner82}.\footnote{%
The extension of this type system to Haskell's type classes
is not included in the kernel language but will be considered
in a future version.}
Each object in a program has a unique type, where
the types of variables and operations can be omitted and are
reconstructed by a type inference mechanism.
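For instance, the type declaration of a negation function defined by
\startprog
not True  = False
not False = True
\stopprog
can be omitted, since the type \pr{not :: Bool -> Bool} is
automatically deduced from the defining equations.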
\subsection{Datatype Declarations}
\label{sec-datatypes}
A datatype declaration\index{type declaration}\index{datatype!declaration}
\index{declaration!datatype} has the form
\startprog
data $T$ $\alpha_1$ \ldots $\alpha_n$ = $C_1$ $\tau_{11}$ \ldots $\tau_{1n_1}$ | $\cdots$ | $C_k$ $\tau_{k1}$ \ldots $\tau_{kn_k}$
\stopprog
and introduces a new $n$-ary \emph{type constructor}\index{type constructor}
$T$ and $k$
new (data) \emph{constructors} $C_1,\ldots,C_k$, where each $C_i$ has the type
\startprog
$\tau_{i1}$ -> $\cdots$ -> $\tau_{in_i}$ -> $T~\alpha_1\ldots\alpha_n$
\stopprog
($i=1,\ldots,k$). Each $\tau_{ij}$ is a
\emph{type expression}\index{type expression} built from the
\emph{type variables}\index{type variable} $\alpha_1,\ldots,\alpha_n$
and some type constructors. Curry has a number of built-in type constructors,
like \pr{Bool}, \pr{Int}, \pr{->} (function space), as well as lists and tuples,
which are described in Section~\ref{sec-builtin-types}.
Since Curry is a higher-order language,
the types of functions (i.e., constructors or operations)
are written in their curried form
\pr{$\tau_1$ -> $\tau_2$ -> $\cdots$ -> $\tau_n$ -> $\tau$}
where $\tau$ is not a functional type.
In this case, $n$ is called the \emph{arity}\index{arity}\index{function!arity}
of the function.
For instance, the datatype declarations
\startprog
data Bool = True | False
data List a = [] | a : List a
data Tree a = Leaf a | Node (Tree a) a (Tree a)
\stopprog
introduce the datatype \pr{Bool} with the 0-ary constructors
(\emph{constants})\index{constant} \pr{True} and \pr{False},
and the polymorphic types \pr{List a} and \pr{Tree a} of lists
and binary trees. Here, ``\pr{:}'' is an infix operator, i.e.,
``\pr{a:List a}'' is another notation for ``\pr{(:) a (List a)}''.
Lists are predefined in Curry, where the notation ``\pr{[a]}''
is used to denote list types (instead of ``\pr{List a}'').
The usual convenient notations for lists are supported,
i.e., \pr{[0,1,2]} is an abbreviation for
\pr{0:(1:(2:[]))} (see also Section~\ref{sec-builtin-types}).
A \emph{data term}\index{data term} is a variable $x$ or a
constructor application
$c~t_1\ldots t_n$ where $c$ is an $n$-ary constructor and
$t_1,\ldots,t_n$ are data terms. An \emph{expression}\index{expression}
is a variable or a (partial) application $\varphi~e_1 \ldots e_m$
where $\varphi$ is an $n$-ary function or
constructor, $m\leq n$, and $e_1,\ldots,e_m$ are expressions.
A data term or expression is called \emph{ground}\index{ground expression}
\index{expression!ground}\index{ground term}\index{data term!ground}
if it does not contain any variable.
Ground data terms correspond to values in the intended domain,
and expressions containing operations
should be evaluated to data terms.
Note that traditional functional languages compute on ground expressions,
whereas logic languages also allow non-ground expressions.
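For instance, w.r.t.\ the datatype declarations above,
\pr{Node (Leaf 0) 1 (Leaf 2)} is a ground data term of type
\pr{Tree Int}, \pr{Leaf x} is a data term containing the variable
\pr{x}, and
\startprog
not (Leaf 0 == Leaf 1)
\stopprog
is a ground expression containing the operations \pr{not} and
\pr{==} (cf.\ Section~\ref{sec-builtin-types}), which evaluates
to the data term \pr{True}.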
\subsection{Function Declarations}
\label{sec-funcdecl}
A function is defined by a type declaration (which can be omitted)
followed by a list of defining equations.
A \emph{type declaration}\index{type declaration}\index{function declaration}
\index{declaration!function} for an $n$-ary function has the form
\startprog
$f$ :: $\tau_1$ -> $\tau_2$ -> $\cdots$ -> $\tau_n$ -> $\tau$
\stopprog
where $\tau_1,\ldots,\tau_n,\tau$ are type expressions
and $\tau$ is not a functional type.
The simplest form of a
\emph{defining equation}\index{equation}\index{defining equation}
(or \emph{rule})\index{rule}\index{function rule}\pindex{=}
for an $n$-ary function $f$ ($n \geq 0$) is
\startprog
$f~t_1\ldots t_n$ = $e$
\stopprog
where $t_1,\ldots,t_n$ are data terms and the
\emph{right-hand side}\index{right-hand side} $e$ is an expression.
The \emph{left-hand side}\index{left-hand side}
$f~t_1\ldots t_n$ must not contain multiple occurrences of a variable.
Functions can also be defined by
\emph{conditional equations}\index{conditional equation}%
\index{equation!conditional}\index{conditional rule}\index{rule!conditional}
which have the form
\startprog
$f~t_1\ldots t_n$ | $c$ = $e$
\stopprog
where the \emph{condition}\index{condition} $c$ is a constraint
(cf.\ Section~\ref{sec-equality}). In order to apply a conditional equation,
its condition must be solved. Conditional equations with
identical left-hand sides can also be written as
\startprog
$f~t_1\ldots t_n$ | $c_1$ = $e_1$
\,$\vdots$
\,| $c_k$ = $e_k$
\stopprog
which is an abbreviation for the $k$ conditional equations
\startprog
$f~t_1\ldots t_n$ | $c_1$ = $e_1$
$\vdots$
$f~t_1\ldots t_n$ | $c_k$ = $e_k$
\stopprog
%
Note that one can define functions with a non-deterministic
behavior by providing several rules with overlapping left-hand sides
or \emph{free variables}\index{free variable}\index{variable!free}
(i.e., variables which do not occur in the left-hand
side) in the conditions or right-hand sides of rules.
For instance, the following non-deterministic function inserts an element
at an arbitrary position in a list:
\startprog
insert x [] = [x]
insert x (y:ys) = x : y : ys
insert x (y:ys) = y : insert x ys
\stopprog
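For instance, \pr{insert 0 [1,2]} non-deterministically evaluates to
one of the results \pr{[0,1,2]}, \pr{[1,0,2]}, or \pr{[1,2,0]}.
Based on \pr{insert}, a function computing an arbitrary permutation
of a list can be defined as follows:
\startprog
permute []     = []
permute (x:xs) = insert x (permute xs)
\stopprog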
Such \emph{non-deterministic functions}\index{non-deterministic function}
\index{function!non-deterministic}
can be given a perfect declarative semantics \cite{GonzalesEtAl96ESOP}
and their implementation causes no overhead in Curry
since techniques for handling non-determinism are already contained in
the logic programming part (see also \cite{Antoy97ALP}).
However, deterministic functions are advantageous since
they provide for more efficient implementations (like the
\emph{dynamic cut} \cite{LoogenWinkler95}).
If one is interested only in defining deterministic functions,
this can be ensured by the following restrictions:
\begin{enumerate}
\item
Each variable occurring in the right-hand side of a rule
must also occur in the corresponding left-hand side.
\item
If \pr{~$f~t_1\ldots t_n$ | $c$ = $e$~} and
\pr{~$f~t'_1\ldots t'_n$ | $c'$ = $e'$~} are rules defining $f$ and
$\sigma$ is a \emph{substitution}\index{substitution}\footnote{%
A \emph{substitution} $\sigma$ is a mapping from variables into expressions
which is extended to a homomorphism on expressions by defining
$\sigma(f~t_1\ldots t_n) = f~\sigma(t_1)\ldots \sigma(t_n)$.
$\{x_1 \mapsto e_1,\ldots,x_k \mapsto e_k\}$ denotes the substitution
$\sigma$ with $\sigma(x_i)=e_i$ ($i=1,\ldots,k$) and
$\sigma(x)=x$ for all variables $x\not\in\{x_1,\ldots,x_k\}$.}
with $\sigma(t_1\ldots t_n)=\sigma(t'_1\ldots t'_n)$, then at least
one of the following conditions holds:
\begin{enumerate}
\item $\sigma(e)=\sigma(e')$ (\emph{compatibility of right-hand sides}).
\item $\sigma(c)$ and $\sigma(c')$ are not simultaneously satisfiable
(\emph{incompatibility of conditions}). A decidable approximation
of this condition can be found in \cite{KuchenEtAl96NGC}.
\end{enumerate}
\end{enumerate}
These conditions ensure the confluence\index{confluence}
of the rules if they are considered as a conditional term rewriting
system \cite{SuzukiMiddeldorpIda95RTA}.
Implementations of Curry may check these conditions and warn the
user if they are not satisfied. There are also more general
conditions to ensure confluence \cite{SuzukiMiddeldorpIda95RTA}
which can be checked instead of the above conditions.
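For instance, the following definition of a disjunction has
overlapping left-hand sides but is deterministic in the above sense,
since all overlapping instances have identical right-hand sides
(compatibility of right-hand sides):
\startprog
or True  x     = True
or x     True  = True
or False False = False
\stopprog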
% Note that we do not require
% the termination of the rewrite system so that the computation with
% infinite data structures is supported as in lazy functional languages.
% In contrast to functional languages,
% \emph{extra variables}\index{extra variables} in the conditions
% (i.e., variables which do not occur in the left-hand
% side) are allowed. These extra variables provide the power of
% logic programming since a search for appropriate values is necessary
% in order to apply a conditional rule with extra variables.
Note that defining \emph{equations of higher type},
\index{equation!higher-order}\index{higher-order equation}
e.g., \pr{f = g} if \pr{f} and \pr{g} are of type \pr{Bool -> Bool},
are formally excluded in order to define a simple operational semantics based
on first-order rewriting.\footnote{Note that this is also necessary
in an extension of Curry which allows higher-order rewrite rules,
since rules with lambda-abstractions in left-hand sides which are
not of base type may cause a gap between
the standard notion of higher-order rewriting and the corresponding equational
theory \cite{Nipkow91LICS}.}
For convenience, a defining equation \pr{f = g}
between functions is allowed but will be interpreted in Curry
as syntactic sugar for the corresponding defining equation \pr{f x = g x}
on base types.
\subsubsection{Functions vs.\ Variables}
\label{sec-variable-sharing}
In lazy functional languages, different occurrences of the same
variable are shared to avoid multiple evaluations of
identical expressions. For instance, if we apply the rule
\startprog
double x = x+x
\stopprog
to an expression \pr{double$\,t$}, we obtain the new expression
\pr{$t$+$t$} but both occurrences of $t$ denote the identical
expression, i.e., these subterms will be simultaneously evaluated.
Thus, \emph{several occurrences of the same variable are
always shared.} This is necessary not only for efficiency reasons
but it has also an influence on the soundness of the operational semantics
in the presence of non-deterministic functions
(see also \cite{GonzalesEtAl96ESOP}). For instance,
consider the non-deterministic function \pr{coin} defined by the rules
\startprog
coin = 0
coin = 1
\stopprog
Thus, the expression \pr{coin} evaluates to \pr{0} or \pr{1}.
However, the result values of the expression \pr{(double\,\,coin)}
depend on the sharing of the two occurrences of \pr{coin}
after applying the rule for \pr{double}: if both occurrences
are shared (as in Curry), the results are \pr{0} or \pr{2},
otherwise (without sharing) the results are \pr{0}, \pr{1}, or \pr{2}.
The sharing of argument variables corresponds to the so-called
``call time choice'' in the declarative semantics
\cite{GonzalesEtAl96ESOP}: if a rule is applied to a function call,
a unique value must be assigned to each argument
(in the example above: either \pr{0} or \pr{1} must be assigned
to the occurrence of \pr{coin} when the expression
\pr{(double\,\,coin)} is evaluated). This does not mean that
functions have to be evaluated in a strict manner; rather, this
behavior is easily obtained by sharing the different occurrences
of a variable.
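The possible evaluations of \pr{(double\,\,coin)} can thus be
sketched as follows, where both occurrences of the shared argument
are replaced simultaneously:
\startprog
double coin $\leadsto$ coin+coin $\leadsto$ 0+0 $\leadsto$ 0
double coin $\leadsto$ coin+coin $\leadsto$ 1+1 $\leadsto$ 2
\stopprog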
Since different occurrences of the same variable are always shared
but different occurrences of (textually identical) function
calls are not shared, it is important to distinguish
between variables and functions. Usually, all symbols
occurring at the top-level in the left-hand side of some rule
are considered as functions and the non-constructor symbols
in the arguments of the left-hand sides are considered
as variables. But note that there is a small exception
to this general rule in local declarations
(see Section~\ref{sec-localdecls}).
\subsubsection{Conditional Equations with Boolean Guards}
\label{sec-bool-guards}
For convenience and compatibility with Haskell,
one can also write conditional equations with multiple guards
\startprog
$f~t_1\ldots t_n$ | $b_1$ = $e_1$
\,$\vdots$
\,| $b_k$ = $e_k$
\stopprog
where $b_1,\ldots,b_k$ ($k>0$) are expressions of type \pr{Bool}
(instead of \pr{Constraint}).
Such a rule is interpreted as in Haskell: the guards are successively
evaluated and the right-hand side of the first guard that evaluates
to \pr{True} is the result of applying this equation. Thus, this equation
can be considered as an abbreviation for the rule
\startprog
$f~t_1\ldots t_n$ = if $b_1$ then $e_1$ else
\,$\vdots$
\,if $b_k$ then $e_k$ else \mbox{\it{}undefined}
\stopprog
(where {\it undefined} is some non-reducible function).
To write rules with several Boolean guards more nicely,
there is a Boolean function \pr{otherwise}\pindex{otherwise}
which is predefined as \pr{True}. For instance, the factorial
function can be declared as follows:
\startprog
fac n | n==0      = 1
      | otherwise = fac(n-1)*n
\stopprog
Since all guards have type \pr{Bool}, this definition is equivalent to
(after a simple optimization)
\startprog
fac n = if n==0 then 1 else fac(n-1)*n
\stopprog
%
To avoid confusion, it is not allowed to mix the Haskell-like
notation with Boolean guards and the standard notation
with constraints: in a rule with multiple guards,
all guards must be expressions of either type \pr{Constraint}
or type \pr{Bool} but not a simultaneous mixture of both.
The default type of a guard is \pr{Constraint}, i.e., in a rule like
\startprog
f c x | c = x
\stopprog
the type of the variable \pr{c} is \pr{Constraint}
(provided that it is not restricted to \pr{Bool} by the use of \pr{f}).
\subsection{Local Declarations}
\label{sec-localdecls}
\index{local declaration}\index{declaration!local}
Since not all auxiliary functions should be globally
visible, it is possible to restrict the scope of declared entities.
Note that the scope of parameters in function definitions
is already restricted since the variables occurring in parameters
of the left-hand side are only visible in the corresponding
conditions and right-hand sides. The visibility of other
entities can be restricted using \pr{let} in expressions
or \pr{where} in defining equations.
An expression of the form \pr{let {\it{}decls} in {\it{}exp}}
\index{local declaration}\index{declaration!local}\pindex{let}
introduces a set of local names. The list of local declarations
\emph{decls} can contain function definitions as well as
definitions of constants by pattern matching. The names
introduced in these declarations are visible in the expression
\emph{exp} and the right-hand sides of the declarations in \emph{decls},
i.e., the local declarations can be mutually recursive.
For instance, the expression
\startprog
let a = 3*b
    b = 6
in  4*a
\stopprog
reduces to the value \pr{72}.
Auxiliary functions which are only introduced to define another
function should often not be visible outside. Therefore,
such functions can be declared in a \pr{where}-clause
\index{local declaration}\index{declaration!local}\pindex{where}
added to the right-hand side of the corresponding function definition.
This is done in the following definition of a fast exponentiation
where the auxiliary functions \pr{even} and \pr{square}
are only visible in the right-hand side of the rule for \pr{exp}:
\startprog
exp b n = if n==0
          then 1
          else if even n then square (exp b (n `div` 2))
                         else b * (exp b (n-1))
 where even n   = n `mod` 2 == 0
       square n = n*n
\stopprog
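For instance, the evaluation of \pr{exp 2 10} can be sketched as
follows and requires only a logarithmic number of multiplications:
\startprog
exp 2 10 $\leadsto$ square (exp 2 5) $\leadsto$ square (2 * exp 2 4) $\leadsto$ $\cdots$ $\leadsto$ 1024
\stopprog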
Similarly to \pr{let}, \pr{where}-clauses can contain
mutually recursive function definitions as well as
definitions of constants by pattern matching.
The names declared in the \pr{where}-clauses are only visible
in the corresponding conditions and right-hand sides.
As a further example, the following Curry program implements
the quicksort algorithm with a function \pr{split} which
splits a list into two lists containing the smaller and larger elements:
\startprog
split e [] = ([],[])
split e (x:xs) | e>=x = (x:l,r)
               | e<x  = (l,x:r)
               where (l,r) = split e xs

qsort []     = []
qsort (x:xs) = let (l,r) = split x xs
               in  qsort l ++ (x:qsort r)
\stopprog

\subsection{Higher-Order Features}
Curry is a higher-order language, i.e., functions can be passed as
arguments to other functions and returned as results. For instance,
the function \pr{map}, which applies a function to all elements of
a list, is defined by
\startprog
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
\stopprog
However, there is an important difference w.r.t.{} functional programming.
Since Curry is also a logic language, it allows logical variables
also for functional values, i.e., it is possible to evaluate the
equation \pr{map f [1,2] =:= [2,3]} which has, for instance, a solution
\pr{\{f=inc\}} if \pr{inc} is the increment function on natural numbers.
In principle, such solutions can be computed by extending (first-order)
unification to \emph{higher-order unification}
\cite{HanusPrehofer96RTA,NadathurMiller88,Prehofer95Diss}.
Since higher-order unification is a computationally expensive
operation, Curry delays the application of unknown functions
until the function becomes known \cite{Ait-KaciLincolnNasr87,Smolka95}.
Thus, the application operation can be considered as a function
(``\pr{@}'' is a left-associative infix operator)
\startprog
(@) :: (a -> b) -> a -> b
f@x = f x
\stopprog
which is ``rigid'' in its first argument (cf.\ Section~\ref{sec-opsem}).
In cases where a function is only used a single time, it is tedious
to assign a name to it. For such cases, anonymous functions
(\emph{$\lambda$-abstractions}\index{lambda-abstraction@$\lambda$-abstraction}),
denoted by
\startprog
\ttbs{}$x_1\ldots{}x_n$->$e$
\stopprog
are provided.
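For instance, an anonymous doubling function can be applied to all
elements of a list:
\startprog
map (\ttbs{}x->x+x) [1,2,3] $\leadsto$ [2,4,6]
\stopprog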
\section{Operational Semantics}
\label{sec-opsem}
Curry's operational semantics is based on the lazy evaluation
of expressions combined with a possible instantiation
of free variables occurring in the expression.
If the expression is ground, the operational model
is similar to lazy functional languages,
otherwise (possibly non-deterministic) variable instantiations
are performed as in logic programming.
If an expression contains free variables, it
may be reduced to different values by binding the
free variables to different expressions. In functional programming,
one is interested in the computed \emph{value}, whereas logic programming
emphasizes the different bindings (\emph{answers}).
Thus, we define for the integrated functional logic language Curry an
\emph{answer expression}\index{answer expression}\index{expression!answer}
as a pair ``$\sigma \ans e$''
consisting of a substitution $\sigma$ (the answer computed so far)
and an expression $e$.
An answer expression $\sigma \ans e$ is \emph{solved} if $e$
is a data term.
Usually, the identity substitution in answer expressions is omitted,
i.e., we write $e$ instead of $\{\} \ans e$ if it is clear from
the context.
Since more than one answer may exist for expressions containing
free variables, in general, initial expressions are reduced to disjunctions
of answer expressions. Thus, a
\emph{disjunctive expression}\index{disjunctive expression}\index{expression!disjunctive}
is a (multi-)set of answer expressions
$\{\sigma_1 \ans e_1,\ldots, \sigma_n \ans e_n\}$.
For the sake of readability, we write concrete disjunctive expressions
like
\[
\{\{\pr{x}=\pr{0},\pr{y}=\pr{2}\} \ans \pr{2}~,~\{\pr{x}=\pr{1},\pr{y}=\pr{2}\} \ans \pr{3}\}
\]
in the form \pr{\{x=0,y=2\}\,2\,|\,\{x=1,y=2\}\,3}.
Thus, substitutions are represented by lists of equations enclosed in
curly brackets, and disjunctions are separated by vertical bars.
A single \emph{computation step}\index{computation step}
performs a reduction in exactly one unsolved
expression of a disjunction (e.g., in the leftmost unsolved
answer expression in Prolog-like implementations).
If the computation step is
deterministic, the expression is reduced to a new one.
If the computation step is non-deterministic,
the expression is replaced by a disjunction of new expressions.
The precise behavior depends on the function calls occurring
in the expression.
For instance, consider the following rules:
\startprog
f 0 = 2
f 1 = 3
\stopprog
The result of evaluating the expression \pr{f 1} is \pr{3}, whereas
the expression \pr{f x} should be evaluated to the disjunctive expression
\startprog
\{x=0\} 2 | \{x=1\} 3 .
\stopprog
To avoid superfluous computation steps and to apply
programming techniques of modern functional languages,
nested expressions are evaluated lazily, i.e., the leftmost outermost
function call is primarily selected in a computation step.
Due to the presence of free variables in expressions, this
function call may have a free variable at an argument position
where a value is demanded by the left-hand sides of the function's rules
(a value is \emph{demanded} by an argument position of the left-hand side
of some rule, if the left-hand side has a constructor at this
position, i.e., in order to apply the rule, the actual value at
this position must be the constructor).
In this situation there are two possibilities to proceed:
\begin{enumerate}
\item
Delay the evaluation of this function call until the corresponding
free variable is bound (this corresponds to the
\emph{residuation}\index{residuation} principle
which is the basis of languages like Escher \cite{Lloyd94ILPS,Lloyd95},
Le Fun \cite{Ait-KaciLincolnNasr87}, Life \cite{Ait-Kaci90},
NUE-Prolog \cite{Naish91}, or Oz \cite{Smolka95}).
In this case, the function is called \emph{rigid}\index{rigid}.
\item
(Non-deterministically) instantiate the free variable to the different values
demanded by the left-hand sides and apply reduction steps using
the different rules (this corresponds to \emph{narrowing}\index{narrowing}
principle which is the basis of languages like
ALF \cite{Hanus90PLILP}, Babel \cite{MorenoRodriguez92},
K-Leaf \cite{GiovannettiLeviMoisoPalamidessi91}, LPG \cite{BertEchahed86},
or SLOG \cite{Fribourg85}).
In this case, the function is called \emph{flexible}\index{flexible}.
\end{enumerate}
Since Curry is an attempt to provide a common platform for different
declarative programming styles and the decision about the
``right'' strategy depends on the definition and
intended meaning of the functions, Curry supports both strategies.
The precise strategy is specified by \emph{evaluation annotations}
\index{evaluation annotation}
for each function.\footnote{%
Evaluation annotations are similar to coroutining
declarations \cite{Naish87Diss} in Prolog where the programmer
specifies conditions under which a literal is ready for
a resolution step.}
The precise operational meaning of evaluation annotations
is defined in Appendix~\ref{app-opsem}.
A function can be annotated as
\emph{rigid}\index{rigid}\index{annotation!rigid}\pindex{rigid} or
\emph{flexible}\index{flexible}\index{annotation!flexible}\pindex{flex}.
If an explicit annotation is not provided by the user, a default strategy
is used: functions with the result type \pr{Constraint}
are flexible and all other functions are rigid.
Functions with a polymorphic result type (like the identity)
are considered as rigid, although they can be used as constraint
functions in a particular context.
This default strategy can be changed by providing explicit
evaluation annotations for each function or by pragmas
(see Section~\ref{sec-pragmas}).
For instance, consider the function \pr{f} as defined above.
If \pr{f} has the evaluation annotation
\startprog
f eval flex
\stopprog
(i.e., \pr{f} is flexible),
then the expression \pr{f x} is evaluated by instantiating \pr{x} to
\pr{0} or \pr{1} and applying a reduction step in both cases.
This yields the disjunctive expression
\startprog
\{x=0\} 2 | \{x=1\} 3 .
\stopprog
However, if \pr{f} has the evaluation annotation
\startprog
f eval rigid
\stopprog
(i.e., \pr{f} is rigid),
then the expression \pr{f x} cannot be evaluated since the argument
is a free variable. In order to proceed, we need a ``generator''
for values for \pr{x}, as in the following constraint:
\startprog
f x =:= y \& x =:= 1
\stopprog
Here, the first constraint \pr{f x =:= y}
cannot be evaluated and thus suspends,
but the second constraint \pr{x=:=1} is evaluated by binding \pr{x} to \pr{1}.
After this binding, the first constraint can be evaluated and the entire
constraint is solved. Thus, the constraint is solved by the following
steps:
\startprog
f x =:= y \& x =:= 1
$\leadsto$ \{x=1\} f 1 =:= y
$\leadsto$ \{x=1\} 3 =:= y
$\leadsto$ \{x=1,y=3\}
\stopprog
(The empty constraint is omitted in the final answer.)
\section{Types}
\subsection{Built-in Types}
\label{sec-builtin-types}
The following types are predefined in Curry:
\begin{description}
\item[Boolean values:]
\pindex{Bool}\pindex{True}\pindex{False}
They are predefined by the datatype declaration
\startprog
data Bool = True | False
\stopprog
The (sequential) conjunction\index{conjunction}\index{conjunction!sequential}
is predefined as the left-associative infix operator \pr{\&\&}\pindex{\&\&}:
\startprog
(\&\&) :: Bool -> Bool -> Bool
True \&\& x = x
False \&\& x = False
\stopprog
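Note that these rules demand only the first argument, i.e., the
second argument is not evaluated if the first one reduces to
\pr{False}:
\startprog
False \&\& (1==0) $\leadsto$ False
\stopprog
Here the test \pr{1==0} is never evaluated.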
Similarly, the (sequential) disjunction\index{disjunction}
\pr{||}\pindex{"|"|} and the negation\index{negation} \pr{not}\pindex{not}
are defined as usual (see Appendix~\ref{app-prelude}).
Furthermore, the function \pr{otherwise}\pindex{otherwise}
is predefined as \pr{True}
to write rules with multiple guards more nicely.
Boolean values are mainly used in conditionals, i.e., the
conditional function \pr{if_then_else} is predefined as
\startprog
if_then_else :: Bool -> a -> a -> a
if_then_else True  x y = x
if_then_else False x y = y
\stopprog
where \pr{if b then x else y} is syntactic sugar for the application
\pr{if_then_else b x y}.
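For instance, the conditional expression \pr{if 1<2 then 0 else 1}
is evaluated as follows:
\startprog
if 1<2 then 0 else 1 $\leadsto$ if True then 0 else 1 $\leadsto$ 0
\stopprog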
A function with result type \pr{Bool} is often called
a \emph{predicate}\index{predicate} (in contrast to constraints
which have result type \pr{Constraint}, see below).\footnote{Predicates
in the logic programming sense should be considered as constraints
since they are only checked for satisfiability and usually
not reduced to \pr{True} or \pr{False} in contrast to Boolean functions.}
There are a number of built-in predicates for comparing objects,
like the predicate ``\pr{<}'' to compare numbers (e.g., \pr{1<2} evaluates
to \pr{True} and \pr{4<3} evaluates to \pr{False}).
There is also a standard predicate ``\pr{==}''\pindex{==}\index{equality!test}
\index{strict equality}\index{equality!strict}
to test the convertibility of expressions to identical data terms
(``strict equality'', cf.\ Section~\ref{sec-equality}):
$e_1 \pr{==} e_2$ reduces to \pr{True} if
$e_1$ and $e_2$ are reducible to identical ground data terms,
and it reduces to \pr{False} if $e_1$ and $e_2$ are reducible to
different ground data terms. The evaluation of $e_1 \pr{==} e_2$
suspends if one of the arguments is a free variable
(i.e., \pr{==} is a ``rigid'' function, cf.\ Section~\ref{sec-opsem}).
If neither $e_1$ nor $e_2$ is a free variable, $e_1 \pr{==} e_2$
is reduced according to the following rules:
\startprog
C == C \,\,= True \hfill\%{\rm for all $0$-ary constructors} C
C $x_1\ldots{}x_n$ == C $y_1\ldots{}y_n$ \,= $x_1\pr{==}y_1$ \&\&\ldots\&\& $x_n\pr{==}y_n$\hfill\%{\rm for all $n$-ary constructors} C
C $x_1\ldots{}x_n$ == D $y_1\ldots{}y_m$ = False \hfill\%{\rm for all different constructors} C {\rm{}and} D
\stopprog
This implements a test for
\emph{strict equality}\index{strict equality}\index{equality!strict}
(cf.\ Section~\ref{sec-equality}).
For instance, the test ``\pr{[0]==[0]}'' reduces to \pr{True},
whereas the test ``\pr{1==0}'' reduces to \pr{False}.
An equality test with a free variable on one side is delayed
in order to avoid an infinite set of solutions for insufficiently
instantiated tests like \pr{x==y}.
Usually, strict equality is only defined on data terms, i.e.,
\pr{==} is not really polymorphic but an overloaded
function symbol. This could be more precisely expressed using
type classes which will be considered in a future version.
Note that \pr{$e_1$==$e_2$} only \emph{tests} the identity of $e_1$ and $e_2$
but never binds one side to the other if it is a free variable.
This is in contrast to solving the equational constraint
\pr{$e_1$=:=$e_2$} which is checked for \emph{satisfiability} and propagates
new variable bindings in order to solve this constraint.
Thus, in terms of concurrent constraint programming languages
\cite{Saraswat93}, \pr{==} and \pr{=:=} correspond to
\emph{ask} and \emph{tell} equality constraints, respectively.
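For instance, if \pr{x} is a free variable, the two operations behave
differently (a small sketch):
\startprog
x == [0] \hfill\%{\rm ask: suspends as long as \pr{x} is free}
x =:= [0] \hfill\%{\rm tell: binds \pr{x} to \pr{[0]} and succeeds}
\stopprog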
\item[Constraints:]
\pindex{Constraint}\index{constraint}
The type \pr{Constraint} is the type of a constraint which
is used in conditions of defining rules.
A function with result type \pr{Constraint} is often called
a \emph{constraint}\index{constraint}.
Constraints are different from Boolean values: a Boolean expression
reduces to \pr{True} or \pr{False} whereas a constraint is checked
for satisfiability. A constraint is applied (i.e., solved)
in a condition of a conditional equation. The equational constraint
\pr{$e_1$=:=$e_2$} is an elementary constraint which is solvable if
the expressions $e_1$ and $e_2$ are evaluable to unifiable data terms.
Constraints can be combined into a
conjunction\index{conjunction}\index{conjunction!concurrent}
of the form
$c_1 \cconj c_2 \cconj\ldots\cconj c_n$
by applying the concurrent conjunction operator \pr{\&}\pindex{\&}.
In this case, all constraints in the conjunction are
evaluated concurrently.
Constraints can also be evaluated in a sequential order
by the sequential
conjunction operator\index{conjunction}\index{conjunction!sequential}
\pr{\&>}\pindex{\&>}, i.e., the combined constraint
\pr{$c_1$ \&> $c_2$} will be evaluated by first completely evaluating
$c_1$ and then $c_2$.
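For instance, if \pr{l} and \pr{y} are free variables, the combined constraint
\startprog
l =:= [y] \&> y =:= 0
\stopprog
is solved by first binding \pr{l} to \pr{[y]} and only then solving
\pr{y =:= 0}, so that the computed answer is \pr{\{l=[0], y=0\}}.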
\pr{success}\pindex{success} denotes an always solvable constraint.
Constraints can be passed as parameters or results of functions
like any other data object. For instance, the following function
takes a list of constraints as input and produces a single constraint,
the conjunction of all constraints of the input list:
\startprog
conj :: [Constraint] -> Constraint
conj [] = success
conj (c:cs) = c \& conj cs
\stopprog
The trivial constraint \pr{success} is usually not shown in answers
to a constraint expression. For instance, the constraint
\pr{x*x=:=y \& x=:=2} is evaluated to the answer
\startprog
\{x=2, y=4\}
\stopprog
\item[Functions:] The type \pr{t1 -> t2}\pindex{->}
is the type of a function
which produces a value of type \pr{t2} for each argument of type \pr{t1}.
A function $f$ is applied to an argument $x$ by writing ``$f~x$''.
The type expression
\startprog
$t_1$ -> $t_2$ -> $\cdots$ -> $t_{n+1}$
\stopprog
is an abbreviation for the type
\startprog
$t_1$ -> ($t_2$ -> ($\cdots$ -> $t_{n+1}$))
\stopprog
and denotes the type of a (curried) $n$-ary function,
i.e., \pr{->} associates to the right.
Similarly, the expression
\startprog
$f$ $e_1$ $e_2$ $\ldots$ $e_n$
\stopprog
is an abbreviation for the expression
\startprog
(\ldots(($f$ $e_1$) $e_2$) $\ldots$ $e_n$)
\stopprog
and denotes the application of a function $f$ to $n$ arguments,
i.e., the application associates to the left.
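Due to this left-associative reading of application, functions can be
partially applied, i.e., applied to fewer arguments than their arity.
The following sketch defines an increment function by a partial application
of a two-argument addition:
\startprog
add :: Int -> Int -> Int
add x y = x + y
~
inc :: Int -> Int
inc = add 1
\stopprog
Here \pr{add 1} has type \pr{Int -> Int} so that \pr{inc 41} evaluates
to \pr{42}.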
\item[Integers:] \pindex{Int}
The common integer values, like ``42'' or ``-15'',
are considered as constructors (constants) of type \pr{Int}.
The usual operators on integers, like \pr{+} or \pr{*},
are predefined functions with ``rigid'' arguments,
i.e., they are evaluated only if both arguments are integer values;
otherwise such function calls are delayed.
Thus, these functions can be used as ``passive constraints''
which become active after binding their arguments.
For instance, if the constraint \pr{digit} is defined by the
equations
\startprog
digit 0 = success
\ldots
digit 9 = success
\stopprog
then the constraint \pr{ x*x=:=y \& x+x=:=y \& digit x }
is solved by delaying
the two equations which will be activated after binding the variable \pr{x}
to a digit by the constraint \pr{digit x}. Thus, the corresponding
computed solution is
\startprog
\{x=0,y=0\} | \{x=2,y=4\}
\stopprog
\item[Floating point numbers:] \pindex{Float}
Similarly to integers, values like ``3.14159'' or ``5.0e-4'' are
considered as constructors of type \pr{Float}.
Since overloading is not included in the kernel version of Curry,
the names of arithmetic functions on floats are different
from the corresponding functions on integers.
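For instance, some implementations attach a dot to the integer operator
names for the float versions; under this (implementation-specific)
assumption, a function computing the area of a circle can be sketched as
\startprog
circleArea :: Float -> Float
circleArea r = 3.14159 *. r *. r
\stopprog
where \pr{*.} denotes the multiplication on floats.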
% SA I don't think it would be too hard to fix without classes.
% SA E.g., SML does it well
\item[Lists:] \pindex{[a]}\pindex{[]}\pindex{:}\index{lists}
The type \pr{[t]} denotes all lists whose elements are values of
type \pr{t}. The type of lists can be considered as predefined
by the declaration
\startprog
data [a] = [] | a : [a]
\stopprog
where \pr{[]} denotes the empty list and \pr{x:xs} is the non-empty list
consisting of the first element \pr{x} and the remaining list \pr{xs}.
Since it is common to denote lists with square brackets,
the following convenient notation is supported:
\startprog
[$e_1$,$e_2$,\ldots,$e_n$]
\stopprog
denotes the list \pr{$e_1$:$e_2$:$\cdots$:$e_n$:[]} (which is equivalent
to \pr{$e_1$:($e_2$:($\cdots$:($e_n$:[])\ldots))} since ``\pr{:}''
is a right-associative infix operator).
%The notation
%\startprog
%[$e_1$,$e_2$,\ldots,$e_n$|as]
%\stopprog
%is equivalent to \pr{$e_1$:$e_2$:$\cdots$:$e_n$:as}, i.e., ``\pr{|}''
%introduces the tail of a list.
Note that there is an overloading in the notation \pr{[t]}: if \pr{t}
is a type, \pr{[t]} denotes the type of lists containing elements
of type \pr{t}, whereas \pr{[t]} denotes a single-element list
(with element \pr{t}) if \pr{t} is an expression.
Since there is a strong distinction between occurrences of
types and expressions, this overloading can always be resolved.
For instance, the following predefined functions define
the concatenation of two lists and the application of a function to
all elements in a list:\pindex{++}\pindex{map}
\startprog
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : xs ++ ys
~
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
\stopprog
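As a small usage sketch, the following function applies the prelude
function \pr{length} (cf.\ Appendix~\ref{app-prelude}) to all elements
of a list of lists:
\startprog
lengths :: [[a]] -> [Int]
lengths xss = map length xss
\stopprog
For instance, \pr{lengths [[1,2],[],[3]]} evaluates to \pr{[2,0,1]}.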
\item[Characters:] \pindex{Char}
Values like \pr{'a'} or \pr{'0'} denote constants of type \pr{Char}.
There are two conversion functions between characters and
their corresponding ASCII values:\pindex{ord}\pindex{chr}
\startprog
ord :: Char -> Int
chr :: Int -> Char
\stopprog
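For instance, assuming the usual ASCII encoding, the successor of a
character can be computed by combining both conversion functions:
\startprog
nextChar :: Char -> Char
nextChar c = chr (ord c + 1)
\stopprog
Thus, \pr{nextChar 'a'} evaluates to \pr{'b'}.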
\item[Strings:] \pindex{String}
The type \pr{String} is an abbreviation for \pr{[Char]},
i.e., strings are considered as lists of characters.
String constants are enclosed in double quotes.
Thus, the string constant \pr{"Hello"} is identical
to the character list \pr{['H','e','l','l','o']}.
A term can be converted into a string by the function\pindex{show}
\startprog
show :: a -> String
\stopprog
For instance, the result of \pr{show(42)} is the
character list \pr{['4','2']}.
\item[Tuples:] \index{tuple}\pindex{(\ldots)}
If $t_1, t_2,\ldots,t_n$ are types and $n \geq 2$, then
\pr{($t_1$,$t_2$,\ldots,$t_n$)}
denotes the type of all $n$-tuples.
The elements of type \pr{($t_1$,$t_2$,\ldots,$t_n$)}
are \pr{($x_1$,$x_2$,\ldots,$x_n$)} where $x_i$ is an element
of type $t_i$ ($i=1,\ldots,n$). Thus, for each $n$,
the tuple-building operation
\pr{($\dots$)} can be considered as an $n$-ary constructor
introduced by the pseudo-declaration
\startprog
data ($a_1\pr{,}a_2\pr{,}\ldots\pr{,}a_n$) = ($\cdot$) $a_1$ $a_2$\ldots$a_n$
\stopprog
where \pr{($x_1$,$x_2$,\ldots,$x_n$)} is equivalent to the
constructor application \pr{($\cdot$) $x_1$ $x_2$\ldots$x_n$}.
The unit type\index{unit type}\pindex{()}
\pr{()} has only a single element
\pr{()} and can be considered as defined by
\startprog
data () = ()
\stopprog
Thus, the unit type can also be interpreted as the type of 0-tuples.
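For instance, pattern matching on the pair constructor allows the
definition of a function which swaps the components of a pair:
\startprog
swap :: (a,b) -> (b,a)
swap (x,y) = (y,x)
\stopprog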
\end{description}
\subsection{Type System}
Curry is a strongly typed language with a
Hindley/Milner-like polymorphic type system \cite{DamasMilner82}.\footnote{%
The extension of the type system to Haskell's type classes
is not included in the kernel language but may be considered
in a future version.}
Each variable, constructor and operation has a unique type,
where only the types of constructors have to be declared by
datatype declarations (see Section~\ref{sec-datatypes}).
The types of functions can be declared (see Section~\ref{sec-funcdecl})
but can also be omitted. In the latter case they will be reconstructed
by a type inference mechanism.
Note that Curry is an explicitly typed language, i.e., each function
has a type. The type can only be omitted if the type inferencer
is able to reconstruct it and to insert the missing type declaration.
In particular, the type inferencer can reconstruct only those
types which are visible in the module (cf.\ Section~\ref{sec-modules}).
Each type inferencer of a Curry implementation must be able
to insert the types of the parameters and the free variables
(cf.\ Section~\ref{sec-freevars}) for each rule.
The automatic inference of the types of the defined functions
might require further restrictions depending on the type inference method.
Therefore, the following definition of a well-typed Curry program
assumes that the types of all defined functions are given
(either by the programmer or by the type inferencer).
A Curry implementation must accept a well-typed program
if all types are explicitly provided but should also
support the inference of function types according to \cite{DamasMilner82}.
A \emph{type expression}\index{type expression}\index{type}
is a well-formed expression containing type variables,
basic types like \pr{Bool}, \pr{Constraint}, \pr{Int}, \pr{Float},
\pr{Char}, and type constructors like \pr{[$\cdot$]},
\pr{($\cdot$,$\cdots$,$\cdot$)}, \pr{$\cdot$->$\cdot$}, or introduced by
datatype declarations (cf.\ Section~\ref{sec-datatypes}).
For instance, \pr{[(Int,a)]->a} is a type expression
containing the type variable \pr{a}. A \emph{type scheme}\index{type scheme}
is a type expression with a universal quantification for some
type variables, i.e., it has the form $\forall \alpha_1,\ldots,\alpha_n.\tau$
($n \geq 0$; in case of $n=0$, the type scheme is equivalent to a
type expression). A function type declaration \pr{$f$::$\tau$}
is considered as an assignment of the type scheme
$\forall \alpha_1,\ldots,\alpha_n.\tau$ to $f$, where
$\alpha_1,\ldots,\alpha_n$ are all type variables occurring in $\tau$.
The type $\tau$ is called a
\emph{generic instance}\index{generic instance}\index{type instance}
of the type scheme $\forall \alpha_1,\ldots,\alpha_n.\tau'$
if there is a substitution
$\sigma = \{\alpha_1 \mapsto \tau_1,\ldots,\alpha_n \mapsto \tau_n\}$
on the types with $\sigma(\tau') = \tau$.
The types of all defined functions are collected in a
\emph{type environment}\index{type environment}
$\Ac$ which is a mapping from identifiers to type schemes.
It contains at least the type schemes of the defined functions
and an assignment of types for some local variables.
An expression $e$ is \emph{well-typed}\index{well-typed}
and has type $\tau$ w.r.t.\ a type environment $\Ac$
if $\Ac \vdash e::\tau$ is derivable according to the
following inference rules:
\[
\begin{array}{@{}ll}
\mbox{Axiom:} &
\infrule{}
{\Ac \vdash x::\tau}
\mbox{~~if $\tau$ is a generic instance of $\Ac(x)$}\\[3ex]
\mbox{Application:} &
\infrule{\Ac \vdash e_1::\tau_1 \to \tau_2 \quad \Ac \vdash e_2::\tau_1}
{\Ac \vdash e_1\,e_2::\tau_2}\\[3ex]
\mbox{Abstraction:} &
\infrule{\Ac[x/\tau] \vdash e::\tau'}
{\Ac \vdash \ttbs{}x\pr{->}e::\tau \to \tau'}
\mbox{~~if $\tau$ is a type expression}\\[3ex]
\mbox{Existential:} &
\infrule{\Ac[x/\tau] \vdash e::\pr{Constraint}}
{\Ac \vdash \pr{let\,}x\pr{\,free\,in\,}e::\pr{Constraint}}
\mbox{~~if $\tau$ is a type expression}\\[3ex]
\mbox{Conditional:} &
\infrule{\Ac \vdash e_1::\pr{Bool} \quad \Ac \vdash e_2::\tau \quad \Ac \vdash e_3::\tau}
{\Ac \vdash \pr{if\,}e_1\pr{\,then\,}e_2\pr{\,else\,}e_3::\tau}\\[3ex]
\end{array}
\]
A defining equation \pr{~$f~t_1\ldots t_n$ = $e$ $[$where $x$ free$]$~}
is well-typed
w.r.t.\ a type environment $\Ac$ if
$\Ac(f) = \forall \alpha_1,\ldots,\alpha_m.\tau$
[and $\Ac(x)$ is a type expression] and
$\Ac \vdash \ttbs{}t_1\pr{->}\cdots \ttbs{}t_n\pr{->}e :: \tau$
is derivable according to the above inference rules.
A conditional equation \pr{~$l$ | $c$ = $r$}
is considered (for the purpose of typing) as syntactic sugar
for \pr{~$l$ = cond $c$ $r$~}
where \pr{cond} is a new function with type scheme
$\Ac(\pr{cond}) = \forall \alpha.\pr{Constraint}\to \alpha \to \alpha$.
A \emph{program is well-typed}\index{well-typed}
if all its rules are well-typed with a unique assignment of
type schemes to defined functions.
Note that the following recursive definition is a well-typed
Curry program according to the definition above (and the type definitions
given in the prelude, cf.\ Appendix~\ref{app-prelude}):
\startprog
f :: [a] -> [a]
f x = if length x == 0 then fst (g x x) else x
~
g :: [a] -> [b] -> ([a],[b])
g x y = (f x, f y)
~
h :: ([Int],[Bool])
h = g [3,4] [True,False]
\stopprog
However, if the type declaration for \pr{g} is omitted,
the usual type inference algorithms are not able to
infer this type.
\section{Modules}
\label{sec-modules}
%The design of a module system for Curry is not influenced
%by the functional logic features of Curry. Therefore, the current
%version of Curry uses a standard module system similar
%to ALF's \cite{Hanus90PLILP} or G\"odel's \cite{HillLloyd94}.
%The extension to a more sophisticated module system
%such as SML's \cite{MilnerTofteHarper90} is a topic for
%future extensions.
A \emph{module}\index{module} defines a collection of datatypes, constructors
and functions which we call \emph{entities}\index{entity} in the following.
A module exports some of its entities which can be imported
and used by other modules. An entity which is not exported
is not accessible from other modules.
A Curry \emph{program}\index{program} is a collection of modules.
There is one main module which is loaded into a Curry system.
The modules imported by the main module are implicitly loaded
but not visible to the user. After loading the main module,
the user can evaluate expressions which contain entities
exported by the main module.
% SA in practice there is the important notion of compilation unit
% Until there are modules, this should be clarified
There is one distinguished module, named \pr{prelude}\index{prelude},
which is implicitly imported into all programs
(see also Appendix~\ref{app-prelude}).
Thus, the entities defined in the prelude (basic functions
for arithmetic, list processing, etc.) can always be used.
A module always starts with the head which contains at least
the name of the module followed by the keyword
\pr{where}\pindex{module}\pindex{where}, like
\startprog
module stack where \ldots
\stopprog
If a program does not contain a module head, the \emph{standard module head}
``\pr{module main where}'' is implicitly inserted.
\emph{Module names}\index{module name}\index{name!of a module}
can be given a hierarchical structure by inserting dots
which is useful when larger applications are structured
into different subprojects, e.g.,
\startprog
company.project1.subproject2.mod4
\stopprog
is a valid module name.
The dots may reflect the hierarchical file structure
where modules are stored. For instance, the module
\pr{compiler.symboltable} could be stored in the file
\pr{symboltable.cur} in the directory \pr{compiler}.
Without any further restrictions in the module head,
all entities which are directly accessible (see below) in the module
(roughly speaking, all entities defined or imported in the module)
are exported.
In order to restrict the set of exported entities of a module,
an \emph{export list}\index{entity!exported}
\index{export declaration}\index{declaration!export}
can be added to the module head.
For instance, a module with the head
\startprog
module stack(stackType, push, pop, newStack) where \ldots
\stopprog
exports the entities \pr{stackType}, \pr{push}, \pr{pop},
and \pr{newStack}. An export list can contain the following entries:
\begin{enumerate}
\item
\emph{Names of datatypes}: This exports only the datatype, whose name
must be directly accessible in this module,
\emph{but not} the constructors of the datatype.
The export of a datatype
without its constructors allows the definition of
\emph{abstract datatypes}\index{abstract datatype}\index{datatype!abstract}.
\item
\emph{Datatypes with constructors}: If the export list contains
the entry \pr{t(c$_1$,\ldots,c$_n$)}, then \pr{t} must be a datatype
directly accessible in the module and \pr{c$_1$},\ldots,\pr{c$_n$}
are directly accessible constructors of this datatype. In this case,
the datatype \pr{t} and the constructors \pr{c$_1$},\ldots,\pr{c$_n$}
are exported by this module.
\item
\emph{Datatypes with all constructors}: If the export list contains
the entry \pr{t(..)}, then \pr{t} must be a datatype
whose name is directly accessible in the module. In this case,
the datatype \pr{t} and all constructors of this datatype,
which must be also directly accessible in this module, are exported.
\item
\emph{Names of functions}: This exports the corresponding functions
whose names must be directly accessible
in this module. The types occurring in the
argument and result types of these functions are implicitly exported;
otherwise, the functions may not be applicable outside this module.
\item
\emph{Modules}: The set of all directly accessible
entities imported from a module $m$
into the current module (see below)
can be exported by a single entry ``\pr{module $m$}''
in the export list. For instance, if the head of the module \pr{stack}
is defined as above, the module head
\startprog
module queue(module stack, enqueue, dequeue) where \ldots
\stopprog
specifies that the module \pr{queue} exports the entities
\pr{stackType}, \pr{push}, \pr{pop}, \pr{newStack}, \pr{enqueue},
and \pr{dequeue}.
If the exported entities from imported modules should be further
restricted, one can also add an export list to the exported module.
This list can contain names of datatypes
and functions imported from this module. If a datatype which is imported
from another module is exported, the datatype is exported in the same way
(i.e., with or without constructors) as it is imported
into the current module. Thus, a further specification
for the exported constructors is not necessary.
For instance, the module head
\startprog
module queue(module stack(stackType,newStack), enqueue, dequeue) \ldots
\stopprog
specifies that the module \pr{queue} exports the entities
\pr{stackType} and \pr{newStack}, which are imported from \pr{stack},
and \pr{enqueue} and \pr{dequeue}, which are defined in \pr{queue}.
\end{enumerate}
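The different kinds of entries can be combined in a single export list.
For instance, a (hypothetical) module head of the form
\startprog
module stack(stackType(..), push, pop) where \ldots
\stopprog
exports the datatype \pr{stackType} together with all its constructors
as well as the functions \pr{push} and \pr{pop}.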
%
All entities defined by top-level declarations in a module are
\emph{directly accessible}\index{directly accessible}\index{accessible!directly}
in this module. Additionally, the
entities exported by another module can be also made directly accessible
in the module by an \pr{import}%
\pindex{import}\index{entity!imported}%
\index{import declaration}\index{declaration!import}
declaration. An import declaration
consists of the name of the imported module and (optionally)
a list of entities imported from that module. If the list
of imported entities is omitted, all entities exported by
that module are imported. For instance,
the import declaration
\startprog
import stack
\stopprog
imports all entities exported by the module \pr{stack},
whereas the declaration
\startprog
import family(father, grandfather)
\stopprog
imports only the entities \pr{father} and \pr{grandfather}
from the module \pr{family}, provided that they are exported
by \pr{family}.
The names of all imported entities are directly accessible in the current
module, i.e., they are equivalent to top-level declarations,
provided that their names are not in conflict with other names.
For instance, if a function \pr{f} is imported from module \pr{m}
but the current module contains a top-level declaration for \pr{f}
(which is thus directly accessible in the current module),
the imported function is not directly accessible.
Similarly, if two identical names are imported from different modules,
none of these entities is directly accessible.
It is possible to access imported but not directly accessible
names by prefixing them with the module identifier.
For instance, consider the module \pr{m1} defined by
\startprog
module m1 where
f :: Int -> Int
\ldots
\stopprog
and the module \pr{m2} defined by
\startprog
module m2 where
f :: Int -> Int
\ldots
\stopprog
together with the main module
\startprog
module main where
import m1
import m2
\ldots
\stopprog
Then the names of the imported functions \pr{f} are not directly accessible
in the main module but one can refer by the qualified identifiers
\pr{m1.f} or \pr{m2.f}
to the corresponding imported entities. Note that export
declarations only allow unqualified names to be exported
(which is the reason for the direct accessibility condition
for the exports). This prevents hierarchical qualifications like
\pr{mod1.mod2.f} and supports the view of a module
as a single collection of related entities.
Another method to resolve name conflicts between imported
entities is the renaming of imported entities.
For instance, the name conflict between \pr{m1.f} and \pr{m2.f}
can be resolved by
the following imports:
\startprog
module main where
import m1
import m2 renaming f to m2_f
\ldots
\stopprog
Thus, the entity \pr{f} exported by module \pr{m2} is imported
with the name \pr{m2_f}.
In the subsequent body of this module, the (directly accessible)
name \pr{f} refers to the entity exported by module \pr{m1}
and the (directly accessible) name
\pr{m2_f} refers to the entity \pr{f} exported by module \pr{m2}.
Only imported entities can be renamed, i.e., the import declaration
\startprog
import m(f) renaming g to mg
\stopprog
will always cause an error.
Although each name refers to exactly one entity, it is possible
that the same entity is referred to by different names.
For instance, consider the module \pr{m} defined by
\startprog
module m(f) where
f :: Int -> Int
\ldots
\stopprog
and the module \pr{use_m} defined by
\startprog
module use_m(f) where
import m
\stopprog
together with the main module
\startprog
module main where
import m renaming f to mf
import use_m
\ldots
\stopprog
Then the names \pr{mf} and \pr{f} both refer in the main module
to the same function defined in module \pr{m}.
The import dependencies between modules must be \emph{non-circular},
i.e., it is not allowed that module $m_1$ imports module $m_2$
and module $m_2$ also imports (directly or indirectly)
module $m_1$.
\section{Input/Output}
Curry provides a declarative model of I/O by considering
I/O operations as transformations on the outside world.
In order to avoid dealing with different versions of the
outside world, it must be ensured that at each point of
a computation only one version of the world is accessible.
This is guaranteed by using the monadic I/O approach
\cite{PeytonJonesWadler93POPL} of Haskell and by
requiring that I/O operations are not allowed in program
parts where non-deterministic search is possible.
In the monadic I/O\index{monadic I/O} approach,
the outside ``world'' is not directly
accessible but only through actions which change the world.
Thus, the world is encapsulated in an abstract datatype
which provides actions to change the world.
The type of such actions is \pr{IO t}\pindex{IO}
which is an abbreviation for
\startprog
World -> (t,World)
\stopprog
where \pr{World} denotes the type of all states of the outside world.
If an action of type \pr{IO t} is applied to a particular world,
it yields a value of type \pr{t} and a new (changed) world.
For instance, \pr{getChar} is an action which reads a character
from the standard input when it is executed, i.e., applied to a world.
Therefore, \pr{getChar} has the type \pr{IO Char}.
The important point is that values of type \pr{World} are
not accessible for the programmer --- she/he can only create
and compose actions on the world. Thus, a program intended to perform I/O
operations has a sequence of actions as the result. These actions
are computed and executed when the program is connected to the world
by executing it. For instance,
\pindex{getChar}\pindex{getLine}
\startprog
getChar :: IO Char
getLine :: IO String
\stopprog
are actions which read the next character or the next line from
the standard input. The functions
\startprog
putChar :: Char -> IO ()
putStr :: String -> IO ()
putStrLn :: String -> IO ()
\stopprog
\pindex{putChar}\pindex{putStr}\pindex{putStrLn}
take a character (string) and produce an action which, when
applied to a world, writes this character (string) to the standard output
(followed by a line feed in the case of \pr{putStrLn}).
Since an interactive program consists of a sequence of I/O operations,
where the order in the sequence is important, there are two
operations to compose actions in a particular order.
The function \pindex{>>}
\startprog
(>>) :: IO a -> IO b -> IO b
\stopprog
takes two actions as input and yields an action as output.
The output action consists of performing the first action followed
by the second action, where the value produced by the first action is
ignored. If the value of the first action should be taken into account
before performing the second action, the actions can be composed by
the function
\pindex{>>=}
\startprog
(>>=) :: IO a -> (a -> IO b) -> IO b
\stopprog
where the second argument is a function that takes the value produced
by the first action as input and performs another action.
For instance, the action
\startprog
getLine >>= putStrLn
\stopprog
is of type \pr{IO ()} and copies, when executed, a line
from standard input to standard output.
The \pr{return}\pindex{return} function
\startprog
return :: a -> IO a
\stopprog
is sometimes useful to terminate a sequence of actions and return
a result of an I/O operation. Thus, \pr{return v} is an action which
does not change the world and returns the value \pr{v}.
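The combinators \pr{>>=} and \pr{return} are typically used together.
For instance, the following sketch defines an action which reads two
characters and returns them as a pair:
\startprog
getTwoChars :: IO (Char,Char)
getTwoChars = getChar >>= \ttbs{}c1 -> getChar >>= \ttbs{}c2 -> return (c1,c2)
\stopprog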
To execute an action, it must be the main expression
in a program, i.e., interactive programs have type \pr{IO ()}.
Since the world cannot be copied (note that the world contains at least
the complete file system), non-determinism in relation with
I/O operations must be avoided. Thus, the applied action must always
be known, i.e., \pr{>>} and \pr{>>=} are rigid in their arguments.
Moreover, it is a runtime error if a disjunctive expression
(cf.\ Section~\ref{sec-opsem})
\pr{$\sigma_1 \ans e_1$ |$\cdots$| $\sigma_n \ans e_n$},
where the $e_i$'s are of type \pr{IO ()}, occurs as the top-level
expression of a program, since it is unclear in this case
which of the disjunctive actions should be applied to the current
world. Thus, all possible search must be encapsulated between I/O operations
(see Section~\ref{sec-encapsearch}).
Using the evaluation annotations, a compiler is able to
detect functions where search is definitely avoided
(e.g., if all evaluated positions are declared as \pr{rigid}).
Thus, the compiler may warn the user about non-deterministic
computations which may occur in I/O actions so that the programmer
can encapsulate them.
\section{Encapsulated Search}
\label{sec-encapsearch}
Global search, possibly implemented by backtracking,
must be avoided in some situations (user-control of
efficiency, concurrent computations, non-backtrackable I/O).
Hence it is sometimes necessary to encapsulate search,
i.e., to embed non-deterministic computations in parts of larger programs.
Non-deterministic computations might occur in Curry
whenever a function must be evaluated with a free variable at
a flexible argument position. In this case, the computation must follow
different branches with different bindings applied to
the current expression which has the effect that the entire
expression is split into (at least) two independent disjunctions.
To give the programmer control
on the actions taken in this situation, Curry provides
a primitive search operator which is the basis to implement
sophisticated search strategies.
This section sketches the idea of encapsulating search in Curry
and describes some predefined search strategies.
\subsection{Search Goals and Search Operators}
Since search is used to find solutions to some constraint,
search is always initiated by a constraint containing a
\emph{search variable}\index{variable!search}\index{search variable}
for which a solution should be
computed.\footnote{The generalization to more than
one search variable is straightforward by the use of tuples.}
Since the search variable may be bound to different solutions
in different disjunctive branches, it must be abstracted.
Therefore, a \emph{search goal}\index{search goal} has the type
\pr{a->Constraint} where \pr{a} is the type of the values
which we are searching for. In particular, if $c$ is a constraint
containing a variable $x$ and we are interested in solutions
for $x$, i.e., values for $x$ such that $c$ is satisfied,
then the corresponding search goal has the form \pr{\ttbs$x$->$c$}.
However, any other expression of the same type can also be used
as a search goal.
To control the search performed to find solutions to search goals,
Curry has a predefined operator\pindex{try}
\startprog
try :: (a->Constraint) -> [a->Constraint]
\stopprog
which takes a search goal and produces a list of search goals.
The search operator \pr{try} attempts to evaluate
the search goal until the computation finishes or
performs a non-deterministic splitting. In the latter case,
the computation is stopped and the different search goals
caused by this splitting are returned as the result list.
Thus, an expression of the form \pr{try \ttbs$x$->$c$}
can have three possible results:
\begin{enumerate}
\item
An empty list. This indicates that the search goal \pr{\ttbs$x$->$c$}
has no solution. For instance, the expression
\startprog
try \ttbs{}x -> 1=:=2
\stopprog
reduces to \pr{[]}.
\item
A list containing a single element. In this case, the search goal
\pr{\ttbs$x$->$c$} has a single solution represented by the element
of this list. For instance, the expression
\startprog
try \ttbs{}x->[x]=:=[0]
\stopprog
reduces to \pr{[\ttbs{}x->x=:=0]}.
Note that a solution, i.e., a binding for the search variable
like the substitution $\{\pr{x}\mapsto\pr{0}\}$,
can always be represented by an equational constraint
like \pr{x=:=0}.
Generally, a one-element list as a result of \pr{try}
always has the form \pr{[\ttbs$x$->$x$=:=$e$]}
(plus some local variables, see next subsection)
where $e$ is fully evaluated, i.e., $e$ does not contain defined functions.
Otherwise, this goal might not be solvable due to the definition
of equational constraints.
\item
A list containing more than one element. In this case, the evaluation
of the search goal \pr{\ttbs$x$->$c$} requires a non-deterministic
computation step. The different alternatives immediately after
this non-deterministic step are represented as elements of this list.
For instance, if the function \pr{f} is defined by
\startprog
f eval flex
f a = c
f b = d
\stopprog
then the expression
\startprog
try \ttbs{}x -> f x =:= d
\stopprog
reduces to the list
\pr{[\ttbs{}x->x=:=a \& f a =:= d, \ttbs{}x->x=:=b \& f b =:= d]}.
This example also shows why the search variable must be abstracted:
the alternative bindings cannot be actually performed
(since a free variable is bound to at most one value
in each computation thread) but are represented as
equational constraints in the search goal.
\end{enumerate}
Note that the search goals of the list in the last example
are not further evaluated.
This makes it possible to control the behavior
of non-deterministic computations. For instance,
the following function defines a depth-first search operator
which collects all solutions of a search goal in a list:
\startprog
all :: (a->Constraint) -> [a->Constraint]
all g = collect (try g)
  where collect [] = []
        collect [g] = [g]
        collect (g1:g2:gs) = concat (map all (g1:g2:gs))
\stopprog
(\pr{concat} concatenates a list of lists to a single list).
For instance, if \pr{append} is the list concatenation
(and defined as flexible), then the expression
\startprog
all \ttbs{}l -> append l [1] =:= [0,1]
\stopprog
reduces to \pr{[\ttbs{}l->l=:=[0]]}.
The value computed for the search variable in a search goal
can be easily accessed by applying it to a free variable.
For instance, the evaluation of the applicative expression
\startprog
(all \ttbs{}l->append l [1] =:= [0,1]) =:= [g] \& g x
\stopprog
binds the variable \pr{g} to the search goal
\pr{[\ttbs{}l->l=:=[0]]} and the variable \pr{x} to the value \pr{[0]}
(due to solving the constraint \pr{x=:=[0]}).
Based on this idea, there is a predefined function
\startprog
findall :: (a->Constraint) -> [a]
\stopprog
which takes a search goal and collects all solutions
(computed by a depth-first search like \pr{all}) for the search variable into
a list.
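For instance, with the flexible \pr{append} from above, the expression
\startprog
findall \ttbs{}l -> append l [1] =:= [0,1]
\stopprog
reduces to \pr{[[0]]}, i.e., the value computed for the search variable
in the solved search goal \pr{\ttbs{}l->l=:=[0]} is unpacked into the
result list.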
Due to the laziness of the language, search goals with infinitely
many solutions cause no problems if one is interested only in finitely
many solutions. For instance, a function which computes only the
first solution w.r.t.\ a depth-first search strategy can be defined by
\startprog
first g = head (findall g)
\stopprog
Note that \pr{first} is a partial function, i.e., it is undefined
if \pr{g} has no solution.
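For instance, the expression
\startprog
first \ttbs{}l -> append l [1] =:= [0,1]
\stopprog
reduces to \pr{[0]}, the value computed for the search variable
in the first (and here only) solution found by the depth-first search.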
\subsection{Local Variables}
\label{sec-localvars}
Some care is necessary if free variables occur in the search goal,
like in the goal
\startprog
\ttbs{}l2 -> append l1 l2 =:= [0,1]
\stopprog
Here, only the variable \pr{l2} is abstracted in the search goal
but \pr{l1} is free. Since non-deterministic bindings
cannot be performed during encapsulated search,
\emph{free variables are never bound inside encapsulated search}.
Thus, if it is necessary to bind a free variable in order to
proceed with an encapsulated search, the computation suspends.
For instance, the expression
\startprog
first \ttbs{}l2 -> append l1 l2 =:= [0,1]
\stopprog
cannot be evaluated and will be suspended until the variable \pr{l1}
is bound by another part of the computation. Thus, the constraint
\startprog
s =:= (first \ttbs{}l2->append l1 l2 =:= [0,1]) \& l1 =:= [0]
\stopprog
can be evaluated since the free variable \pr{l1} in the search goal
is bound to \pr{[0]}, i.e., this constraint reduces to the answer
\startprog
\{l1=[0], s=[1]\}
\stopprog
In some cases it is reasonable to have unbound variables
in search goals, but these variables should be treated as local,
i.e., they might have different bindings in different branches
of the search. For instance, if we want to compute
the last element of the list \pr{[3,4,5]}
based on \pr{append}, we could try to solve
the search goal
\startprog
\ttbs{}e -> append l [e] =:= [3,4,5]
\stopprog
However, the evaluation of this goal suspends due to the necessary
binding of the free variable \pr{l}.
This can be avoided by declaring the variable \pr{l}
as \emph{local} to the constraint by the use of \pr{let}
(see Section~\ref{sec-freevars}), like in the following expression:
\startprog
first \ttbs{}e -> let l free in append l [e] =:= [3,4,5]
\stopprog
Due to the local declaration of the variable \pr{l}
(which corresponds logically to an
existential quantification), the variable \pr{l} is only visible
inside the constraint and, therefore, it can be bound to
different values in different branches. Hence this expression
evaluates to the result \pr{5}.
In order to ensure that an encapsulated search will not be
suspended due to necessary bindings of free variables,
the search goal should be a closed expression when a search operator is
applied to it, i.e., the search variable is bound by the
lambda abstraction and all other free variables are existentially
quantified by local declarations.
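For instance, the suspended search goal above can be turned into a
closed expression by a local declaration for \pr{l1}:
\startprog
first \ttbs{}l2 -> let l1 free in append l1 l2 =:= [0,1]
\stopprog
Since the first solution w.r.t.\ a depth-first search binds the local
variable \pr{l1} to \pr{[]}, this expression reduces to \pr{[0,1]}.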
\subsection{Predefined Search Operators}
\label{sec-predef-search}
There are a number of search operators which are predefined
in the prelude. All these operators are based on the primitive
\pr{try} as described above.
It is also possible to define other search strategies
in a similar way. Thus, the \pr{try} operator is
a powerful primitive for defining appropriate search strategies.
In the following, we list the predefined search operators.
\begin{description}
\item[\pr{all :: (a->Constraint) -> [a->Constraint]}]~\\
Compute all solutions for a search goal via a depth-first search
strategy. If there is no solution and the search space is finite,
the result is the empty list, otherwise the list contains solved
search goals (i.e., without defined operations).
\item[\pr{once :: (a->Constraint) -> (a->Constraint)}]~\\
Compute the first solution for a search goal via a depth-first search
strategy. Note that \pr{once} is partially defined, i.e.,
if there is no solution and the search space is finite,
the result is undefined.
\item[\pr{findall :: (a->Constraint) -> [a]}]~\\
Compute all solutions for a search goal via a depth-first search
strategy and unpack the values computed for the search variable
into a list.
\item[\pr{best :: (a->Constraint) -> (a->a->Bool) -> [a->Constraint]}]~\\
Compute the best solution via a depth-first search strategy, according to
a specified relation (the second argument)
that can always decide which of two solutions is better
(the relation should deliver \pr{True} if the first argument
is a better solution than the second argument).
The relation is only used as a test, i.e.,
it should be a rigid function.
As a trivial example, consider the relation \pr{shorter} defined by
\startprog
shorter l1 l2 = length l1 <= length l2
\stopprog
Then the expression
\startprog
best (\ttbs{}x -> let y free in append x y =:= [1,2,3]) shorter
\stopprog
computes the shortest list which can be obtained by splitting the
list \pr{[1,2,3]} into this list and some other list, i.e.,
it reduces to \pr{[\ttbs{}x->x=:=[]]}. Similarly, the expression
\startprog
best (\ttbs{}x -> let y free in append x y =:= [1,2,3])
     (\ttbs{}l1 l2 -> length l1 > length l2)
\stopprog
reduces to \pr{[\ttbs{}x->x=:=[1,2,3]]}.
\item[\pr{one :: (a->Constraint) -> [a->Constraint]}]~\\
Compute one solution via a fair strategy.
If there is no solution and the search space is finite,
the result is the empty list, otherwise the list contains one element
(the first solution represented as a search goal).
\item[\pr{condSearch :: (a->Constraint) -> (a->Bool) -> [a->Constraint]}]~\\
Compute the first solution (via a fair strategy)
that meets a specified condition.
The condition (second argument) must be a unary Boolean function.
\item[\pr{browse :: (a->Constraint) -> IO ()}]~\\
Show the solution of a \emph{solved} constraint on the standard output,
i.e., a call \pr{browse g}, where \pr{g} is a solved search goal,
is evaluated to an I/O action which prints the solution.
If \pr{g} is not an abstraction of a solved constraint,
a call \pr{browse g} produces a runtime error.
\item[\pr{browseList :: [a->Constraint] -> IO ()}]~\\
Similar to \pr{browse} but shows a list of solutions on the standard output.
The \pr{browse} operators are useful for testing search
strategies. For instance, the evaluation of the expression
\startprog
browseList (all \ttbs{}x -> let y free in append x y =:= [0,1,2])
\stopprog
produces the following result on the standard output:
\startprog
[]
[0]
[0,1]
[0,1,2]
\stopprog
Due to the laziness of the evaluation strategy, one can also
browse the solutions of a goal with infinitely many solutions
which are printed on the standard output until the process is stopped.
\end{description}
\subsection{Choice}
In order to provide a ``don't care'' selection of an element
out of different alternatives, which is necessary in concurrent
computation structures to support many-to-one communication,
Curry provides the special evaluation annotation\index{evaluation annotation}
\pr{choice}\pindex{choice}\index{committed choice}:
\startprog
$f$ eval choice
\stopprog
Intuitively, a function $f$ declared as a \pr{choice} function
behaves as follows. A call to $f$ is evaluated as usual
(with a fair evaluation of disjunctive alternatives, which is
important here!) but with the following differences:
\begin{enumerate}
\item No free variable of this function call is bound
(i.e., this function call and all its subcalls are considered as rigid).
\item If one rule for $f$ matches and its guard is entailed
(i.e., satisfiable without binding goal variables), all other
alternatives for evaluating this call are ignored.
\end{enumerate}
This means that the first applicable rule for this function call
(after pattern matching and guard checking) is taken and all
other alternatives are discarded.
As an example, consider the non-deterministic function \pr{merge}\pindex{merge}
for the fair merge of two lists:
\startprog
merge :: [a] -> [a] -> [a]
merge eval choice
merge [] l2 = l2
merge l1 [] = l1
merge (e:r) l2 = e : merge r l2
merge l1 (e:r) = e : merge l1 r
\stopprog
Due to the \pr{choice} annotation, a call to \pr{merge} is reducible
if one of the input lists is known to be empty or non-empty
(but it is not reducible if both inputs are unknown).
Thus, \pr{merge} can also merge partially unknown lists
which are incrementally instantiated during computation.
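As an illustration, consider a call where only the first argument list
is known:
\startprog
merge [1,2] l
\stopprog
If \pr{l} is unbound, only the rules matching a non-empty first argument
are applicable without binding \pr{l}. Hence, this call reduces to
\pr{1:merge [2] l} and further to \pr{1:2:l}, although the second
input is completely unknown.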
Another application of the \pr{choice} annotation is a fair
evaluation of disjunctive search goals. For instance,
the following function takes a search goal and computes
one result (if it exists) for the search variable in a fair manner,
i.e., possibly infinite computation branches in a disjunction do not
inhibit the computation of a solution in another branch:\pindex{tryone}
\startprog
tryone :: (a->Constraint) -> a
tryone eval choice
tryone g | g x = x where x free
\stopprog
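For instance, with the flexible function \pr{f} defined above
(\pr{f a = c}, \pr{f b = d}), the expression
\startprog
tryone \ttbs{}x -> f x =:= d
\stopprog
reduces to \pr{b}, since \pr{x=:=b} is the only solution of this
search goal.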
%
Note that functions containing \pr{choice} expressions may be
indeterministic in contrast to other user-defined functions,
i.e., identical calls may lead to different answers at different times.
This is similar to non-deterministic functions which can
be given a declarative semantics \cite{GonzalesEtAl96ESOP}.
The difference to non-deterministic functions is that
only one result is computed for functions declared with \pr{choice}
instead of all results. As a consequence, the
completeness result for Curry's operational semantics
\cite{Hanus97POPL} does not hold in the presence of the evaluation
annotation \pr{choice}. Since indeterministic functions
are mainly used in the coordination level of concurrent
applications (as in parallel functional computation models
\cite{BreitingerLoogenOrtega95,ChakravartyEtAl98Goffin}),
this is only a minor problem. A programming environment
for Curry could classify the indeterministic functions in a program.
\section{Interface to External Functions and Constraint Solvers}
Since Curry has only a small number of builtins,
it provides a simple interface to connect external functions
(functions programmed in another language) and external constraint
solvers. External functions must be free of side-effects
in order to be compatible with Curry's computation model.
An external constraint solver consists of a constraint store
which can be accessed and manipulated via a few primitive operations.
\subsection{External Functions}
External functions\index{external function}\index{function!external}
are considered as an implementation of
a potentially infinite set of equations (describing the graph of
the functions). In particular, they have no side effects, i.e.,
identical function calls at different points of time
yield identical results. Since the implementation of an external
function is unknown, a call to an external function is suspended
until all arguments are fully known, i.e., until they are
evaluated to ground data terms.
This view is compatible with the residuation principle
covered by Curry's computation model and very similar to
the connection of external functions to logic programs
\cite{BonnierMaluszynski88,Boye91}.
An external function is declared by a type declaration
followed by \pr{external(FName,OF,Lang)},\pindex{external}
where \pr{FName} is the name of the external function, \pr{OF} is the
object file containing its compiled code, and \pr{Lang} is the
implementation language of the external function.
For instance, if the addition and multiplication on integers
are defined as C functions named \pr{add} and \pr{mult}
where the compiled code is contained in the file ``\pr{arith.o}'',
we can connect these functions to a Curry program by providing
the declarations
\startprog
add :: Int -> Int -> Int external("add","arith.o",C)
mult :: Int -> Int -> Int external("mult","arith.o",C)
\stopprog
Implementations of Curry might have restrictions on the interface.
A reasonable requirement is that the implementation language is
constrained to C and the argument and result types are only simple
types like \pr{Bool}, \pr{Int}, \pr{Float}, \pr{Char}, or
\pr{String}.
\subsection{External Constraint Solvers}
A constraint solver can be viewed as an abstract datatype
consisting of a constraint store together with a few operations
to check the entailment of a constraint or to add a new constraint.
In order to connect a constraint solver to Curry, like a solver for
arithmetic or Boolean constraints, the external solver must
provide the following operations ($cs$ denotes the type of the
constraint store and $c$ the type of constraints):
\[
\begin{array}{l@{~:~}l@{~\to~}ll}
new & & cs & \mbox{({\it create and initialize a new constraint store})} \\
ask & cs \times c & \{True,False,Unknown\} & \mbox{({\it check entailment of a constraint})} \\
tell & cs \times c & cs & \mbox{({\it add a new constraint})} \\
clone & cs & cs \times cs & \mbox{({\it copy the constraint store})}
\end{array}
\]
Using these operations, it is relatively easy to extend
Curry's computation model to include constraints by adding
the constraint store as a new component to the substitution part
in answer expressions (cf.\ Section~\ref{sec-opsem}).\footnote{%
This section will be modified or extended in a later version.}
\section{Pragmas}
\label{sec-pragmas}
Curry is a multi-paradigm language providing different
programming styles, like functional programming, logic
programming, and functional logic programming based on
narrowing or residuation. As described in Section~\ref{sec-opsem},
the difference between narrowing (non-deterministic evaluation
of functions) and residuation (delayed evaluation of functions)
is specified by evaluation annotations for each function.
Since the specification of the evaluation annotations
is tedious, they are automatically generated using
a default strategy (rigid branches for non-constraint functions, flexible
branches for constraints). Moreover, pattern matching is performed from
left to right (see Appendix~\ref{app-gentrees}).
This default strategy can be overridden
by providing explicit evaluation annotations for each function
or by a pragma which changes the default strategy
for the entire module where it occurs (but not for the imported
modules):\pindex{pragma}
\startprog
pragma $p$
\stopprog
If $p=\pr{flex}$,\pindex{flex}
then all functions are flexible if not specified otherwise.
If $p=\pr{rigid}$,\pindex{rigid}
then all functions (including the functions with result type
\pr{Constraint}) are rigid if not specified otherwise.
If $p=\pr{optmatch}$,\pindex{optmatch} then the compiler tries to find
pattern matching strategies with a better termination behavior
than the strict left-to-right pattern matching (i.e., the compiler
generates evaluation annotations without \pr{or} annotations for inductively
sequential programs, see \cite{Antoy92ALP,Hanus97POPL}
and Appendix~\ref{app-gentrees} for more details).
Since the pragma \pr{optmatch} has no influence on the flexible
or rigid status of functions, it can also be mixed with the pragmas
\pr{flex} or \pr{rigid}, i.e., a module could start with
\startprog
module m where
pragma optmatch
pragma rigid
\ldots
\stopprog
\section{Interactive Programming Environment}
In order to support different implementations with a comparable
user interface, the following commands should be supported by
each interactive programming environment for Curry (these commands
can also be abbreviated to \pr{:$c$} where $c$ is the
first character of the command):
\begin{description}
\item[\fbox{\pr{:load file}}] Load the program stored in \pr{file.curry}.
\item[\fbox{\pr{:reload}}] Repeat the last load command.
\item[\fbox{\pr{expr}}] Evaluate the expression \pr{expr} and
show all computed results. Since an expression could be evaluated
to a disjunctive expression (cf.\ Section~\ref{sec-opsem}),
the expression could be automatically wrapped in some search
operator like depth-first search or a fair breadth-first search,
depending on the implementation.
\item[\fbox{\pr{:debug expr}}] Debug the evaluation of the expression \pr{expr},
i.e., show the single computation steps and ask the user what to do
after each single step (like proceed, abort, etc.).
\item[\fbox{\pr{:type expr}}] Show the type of the expression \pr{expr}.
\item[\fbox{\pr{:eval f}}] Show the definitional tree
(see Section~\ref{app-def-trees}) of function \pr{f}.
\item[\fbox{\pr{:quit}}] Exit the system.
\end{description}
\newpage
\appendix
\section{Example Programs}
This section contains a few example programs together with some initial
expressions and their computed results.
\subsection{Operations on Lists}
Here are simple operations on lists.
Note that, in contrast to the list concatenation \pr{++}
defined in the prelude (see Appendix~\ref{app-prelude}),
\pr{append} has a ``flexible'' evaluation annotation so that
it can also be applied to solve equations containing \pr{append},
i.e., \pr{append} can be used to split lists. We exploit this property
to define the last element of a list in a very simple way.
\startprog
~
-- Concatenation of two lists:
append :: [t] -> [t] -> [t]
append eval flex
append [] ys = ys
append (x:xs) ys = x:append xs ys
~
-- Naive reverse of all list elements:
rev :: [t] -> [t]
rev [] = []
rev (x:xs) = append (rev xs) [x]
~
-- Last element of a list:
last :: [t] -> t
last xs | append _ [x] =:= xs = x where x free
~
\stopprog
Expressions and their evaluated results:
\startprog
~
append [0,1] [2,3] $~~\leadsto~~$ [0,1,2,3]
~
append L M =:= [0,1]
$~~\leadsto~~$ \{L=[],M=[0,1]\} | \{L=[0],M=[1]\} | \{L=[0,1],M=[]\}
~
rev [0,1,2,3] $~~\leadsto~~$ [3,2,1,0]
~
last (append [1,2] [3,4]) $~~\leadsto~~$ 4
\stopprog
\newpage
\subsection{Higher-Order Functions}
Here are some ``traditional'' higher-order functions
to show that the familiar functional programming techniques
can be applied in Curry. Note that the functions \pr{map}, \pr{foldr},
and \pr{filter} are predefined in Curry.
\startprog
~
-- Map a function on a list (predefined):
map :: (t1->t2) -> [t1] -> [t2]
map f [] = []
map f (x:xs) = f x:map f xs
~
-- Fold a list (predefined):
foldr :: (t1->t2->t2) -> t2 -> [t1] -> t2
foldr f a [] = a
foldr f a (x:xs) = f x (foldr f a xs)
~
-- Filter elements in a list (predefined):
filter :: (t -> Bool) -> [t] -> [t]
filter p [] = []
filter p (x:xs) = if p x then x:filter p xs
                         else filter p xs
~
-- Quicksort function:
quicksort :: [Int] -> [Int]
quicksort [] = []
quicksort (x:xs) = quicksort (filter (<= x) xs)
                   ++ [x]
                   ++ quicksort (filter (> x) xs)
~
\stopprog
\newpage
\subsection{Relational Programming}
Here is a traditional example from logic programming:
a simple deductive database with family relationships.
We use a relational programming style, i.e., all
relationships are represented as constraints
(i.e., functions with result type \pr{Constraint})
in order to provide goal solving for these relationships.
\startprog
-- Declaration of an enumeration type for persons:
-- (as an alternative, one could consider persons as strings)
~
data Person = Christine | Maria | Monica | Alice | Susan |
              Antony | Bill | John | Frank | Peter | Andrew
~
-- Two basic relationships:
~
married :: Person -> Person -> Constraint
married Christine Antony = success
married Maria Bill = success
married Monica John = success
married Alice Frank = success
~
mother :: Person -> Person -> Constraint
mother Christine John = success
mother Christine Alice = success
mother Maria Frank = success
mother Monica Susan = success
mother Monica Peter = success
mother Alice Andrew = success
~
-- and here are the deduced relationships:
father :: Person -> Person -> Constraint
father f c = let m free in married m f \& mother m c
~
grandfather :: Person -> Person -> Constraint
grandfather g c | father g _f \& father _f c = success
                | father g _m \& mother _m c = success
~
\stopprog
Expressions and their evaluated results:\vspace{1ex}
\startprog
father John child $~~\leadsto~~$ \{child=Susan\} | \{child=Peter\}
~
grandfather g c $~~\leadsto~~$
\{g=Antony,c=Susan\} | \{g=Antony,c=Peter\} |
\{g=Bill,c=Andrew\} | \{g=Antony,c=Andrew\}
\stopprog
\newpage
\subsection{Functional Logic Programming}
This is the same example as in the previous section.
However, we use here a functional logic programming style
which is more readable but provides the same goal solving
capabilities. The basic functions are \pr{husband}
and \pr{mother} which express the functional dependencies
between the different persons. Note that the derived
function \pr{grandfather} is a non-deterministic function
which yields all grandfathers for a given person.
The pragma \pr{flex} specifies that all functions
are flexible and can be used to compute bindings for unknown
arguments.
\startprog
pragma flex
~
data Person = Christine | Maria | Monica | Alice | Susan |
              Antony | Bill | John | Frank | Peter | Andrew
~
-- Two basic functional dependencies:
~
husband :: Person -> Person
husband Christine = Antony
husband Maria = Bill
husband Monica = John
husband Alice = Frank
~
mother :: Person -> Person
mother John = Christine
mother Alice = Christine
mother Frank = Maria
mother Susan = Monica
mother Peter = Monica
mother Andrew = Alice
~
-- and here are the deduced functions and relationships:
father :: Person -> Person
father c = husband (mother c)
~
grandfather :: Person -> Person
grandfather g = father (father g)
grandfather g = father (mother g)
~
\stopprog
Expressions and their evaluated results:\vspace{1ex}
\startprog
father child =:= John $~~\leadsto~~$ \{child=Susan\} | \{child=Peter\}
~
grandfather c $~~\leadsto~~$
\{c=Susan\} Antony | \{c=Peter\} Antony | \{c=Andrew\} Bill | \{c=Andrew\} Antony
\stopprog
\newpage
\subsection{Constraint Solving and Concurrent Programming}
In this example we demonstrate how Curry's concurrent features
can be used to solve constraint problems with finite domains
more efficiently than a simple generate-and-test solver.
We want to solve the classical map coloring problem.
Consider the following simple map:
\begin{center}
\begin{tabular}{|c|@{}c@{}|c|} \hline
\mbox{Country 1} & \begin{tabular}{c} ~\\~~\mbox{Country 2}~~ \\~\\
\hline ~\\ \mbox{Country 3}\\~\\ \end{tabular} & \mbox{Country 4} \\
\hline
\end{tabular}
\end{center}
The problem is to assign to each of the four countries
a color (red, green, yellow, or blue) such that countries
with a common border have different colors.
To solve this problem, we have to specify
that some countries have different colors. For this purpose,
we define the following constraint which checks that two
elements are different:
\startprog
diff :: a -> a -> Constraint
diff x y = (x==y)=:=False
\stopprog
For instance, \pr{diff $t_1\,t_2$} is satisfied if $t_1$
and $t_2$ are different constants. If one of the arguments is a variable,
the evaluation of the function is delayed by the definition of \pr{==}.
Now there is a straightforward solution of the map coloring problem.
We define a constraint \pr{coloring} specifying the valid colors for
each country and a constraint \pr{correct} specifying which countries
must have different colors:
\startprog
~
data Color = Red | Green | Yellow | Blue
~
isColor :: Color -> Constraint
isColor Red = success
isColor Yellow = success
isColor Green = success
isColor Blue = success
~
coloring :: Color -> Color -> Color -> Color -> Constraint
coloring l1 l2 l3 l4 = isColor l1 \& isColor l2 \& isColor l3 \& isColor l4
~
correct :: Color -> Color -> Color -> Color -> Constraint
correct l1 l2 l3 l4 =
   diff l1 l2 \& diff l1 l3 \& diff l2 l3 \& diff l2 l4 \& diff l3 l4
~
\stopprog
In classical logic programming, we can compute the solutions to
the map coloring problem by enumerating all potential solutions
followed by a check whether a potential solution is a correct one
(``generate and test''). This can be expressed by solving
the following goal:\footnote{We assume an implementation
that processes the concurrent conjunction $c_1 \cconj c_2$
in a ``sequential-first'' manner: first $c_1$ is solved and, if
this is not possible due to a suspended evaluation, $c_2$ is solved.}
\startprog
coloring l1 l2 l3 l4 \& correct l1 l2 l3 l4
\stopprog
However, a much faster solution can be obtained by reversing the order
of the tester and the generator:
\startprog
correct l1 l2 l3 l4 \& coloring l1 l2 l3 l4
\stopprog
The latter constraint is evaluated in a concurrent way. In the first steps,
the subexpression \pr{correct l1 l2 l3 l4} is reduced to the constraint
\startprog
diff l1 l2 \& diff l1 l3 \& diff l2 l3 \& diff l2 l4 \& diff l3 l4
\stopprog
which is then reduced to
\startprog
(l1==l2)=:=False \& (l1==l3)=:=False \& (l2==l3)=:=False \& (l2==l4)=:=False
 \& (l3==l4)=:=False
\stopprog
This constraint cannot be further evaluated since the arguments to \pr{==} are
free variables. Therefore, it is suspended, and the generator constraint
\pr{coloring l1 l2 l3 l4} is evaluated, which binds the variables
to the potential colors. Since the variable binding is performed
by consecutive computation steps, the equalities are evaluated
as soon as their arguments are bound. For instance, if the variables
\pr{l1} and \pr{l2} are bound to the color \pr{Red}, the first constraint
\pr{(l1==l2)=:=False} cannot be solved (due to the unsolvability of
the equational constraint \pr{True=:=False}) which causes the failure
of the entire goal. As a consequence, not all potential colorings
are computed (as in the generate-and-test approach) but only those
colorings for which the constraint \pr{correct l1 l2 l3 l4} is satisfiable.
Therefore, the (dis)equality goals act as ``passive constraints''
aiming to reduce the search space.
\newpage
\subsection{Concurrent Object-Oriented Programming}
The following example shows a simple method to program
in a concurrent object-oriented style in Curry.
For this purpose, an object is a process waiting for incoming
messages. The local state is a parameter of this process.
Thus, a process is a function of type
\startprog
[Message] -> State -> Constraint
\stopprog
In the subsequent example, we implement a bank account
as an object waiting for messages of the form
\pr{Deposit i}, \pr{Withdraw i}, or \pr{Balance i}.
Thus, the bank account can be defined as follows:
\startprog
~
data Message = Deposit Int | Withdraw Int | Balance Int
~
account :: [Message] -> Int -> Constraint
account eval rigid -- account is a consumer
account [] _ = success
account (Deposit a : ms) n = account ms (n+a)
account (Withdraw a : ms) n = account ms (n-a)
account (Balance b : ms) n = b=:=n \& account ms n
~
-- Install an initial account with message stream s:
make_account s = {account s 0}
~
\stopprog
A new account object is created by the constraint \pr{make_account s}
where \pr{s} is a free variable. When \pr{s} is instantiated with
messages, the account process starts processing these messages.
The following concurrent conjunction of constraints creates
an account and sends messages:
\startprog
make_account s \& s=:=[Deposit 200, Deposit 50, Balance b]
\stopprog
After this goal is solved, the free variable \pr{b} has been bound to
\pr{250} representing the balance after the two deposits.
To show a client-server interaction, we define a client of
a bank account who stops if his balance is \pr{50},
buys things for \pr{30} if his balance is greater than \pr{50},
and works for an income of \pr{70} if his balance is less than \pr{50}.
In order to keep the client program independent of the account
implementation, the client communicates with the account only by
sending messages. Therefore, the client is implemented as follows:
\startprog
-- Send a message to an object identified by its message stream obj:
sendMsg :: [msg] -> msg -> [msg]
sendMsg obj msg | obj =:= msg:obj1 = obj1 where obj1 free
~
-- Client process of a bank account
client s | s1 =:= sendMsg s (Balance b) =
   if b==50 then s1=:=[] -- stop
   else if b>50 then client (sendMsg s1 (Withdraw 30)) -- buy
        else client (sendMsg s1 (Deposit 70)) -- work
 where s1,b free
\stopprog
We start the account and client process with an initial amount of \pr{100}:
\startprog
make_account s \& client (sendMsg s (Deposit 100))
\stopprog
A Curry system evaluates this goal by a concurrent evaluation
of both processes and computes the final answer
\startprog
~
\{s=[Deposit 100, Balance 100, Withdraw 30, Balance 70, Withdraw 30,
    Balance 40, Deposit 70, Balance 110, Withdraw 30, Balance 80,
    Withdraw 30, Balance 50]\}
~
\stopprog
which shows the different messages sent to the account process.
\newpage
\section{Standard Prelude}
\label{app-prelude}
\index{prelude}
This section contains the standard prelude for Curry programs.\footnote{%
This definition will be extended in a later version.}
This prelude is implicitly imported into every Curry program,
i.e., the names declared in the prelude cannot be redefined.
\startprog
~
-- Infix operator declarations
~
infixl 9 !!
infixr 9 .
infixl 7 *, /, `div`, `mod`
infixl 6 +, -
infixr 5 ++, :
infix 4 =:=, ==, <, >, <=, >=
infixr 3 \&\&
infixr 2 ||
infixr 1 >>, >>=
infixr 0 \&, \&>
~
~
-- Some standard combinators:
~
-- Function composition
(.) :: (b -> c) -> (a -> b) -> (a -> c) \pindex{.}
f . g = \ttbs{}x -> f (g x)
~
-- Identity
id :: a -> a \pindex{id}
id x = x
~
curry :: ((a,b) -> c) -> a -> b -> c \pindex{curry}
curry f a b = f (a,b)
~
uncurry :: (a -> b -> c) -> (a,b) -> c \pindex{uncurry}
uncurry f (a,b) = f a b
~
flip :: (a -> b -> c) -> b -> a -> c \pindex{flip}
flip f x y = f y x
~
~
-- Boolean values
~
data Bool = True | False \pindex{Bool}\pindex{True}\pindex{False}
~
-- Sequential conjunction
(\&\&) :: Bool -> Bool -> Bool \pindex{\&\&}
True \&\& x = x
False \&\& _ = False
~
-- Sequential disjunction
(||) :: Bool -> Bool -> Bool \pindex{"|"|}
True || _ = True
False || x = x
~
-- Negation
not :: Bool -> Bool \pindex{not}
not True = False
not False = True
~
otherwise :: Bool \pindex{otherwise}
otherwise = True
~
~
-- Pairs
~
data (a,b) = (a,b) \pindex{(\ldots)}\index{pair}\index{tuple}
~
fst :: (a,b) -> a \pindex{fst}
fst (x,_) = x
~
snd :: (a,b) -> b \pindex{snd}
snd (_,y) = y
~
~
-- Unit type
~
data () = () \pindex{()}
~
~
-- Lists
~
data [a] = [] | a : [a] \pindex{[]}\pindex{:}\pindex{[a]}\index{lists}
~
head :: [a] -> a \pindex{head}
head (x:_) = x
~
tail :: [a] -> [a] \pindex{tail}
tail (_:xs) = xs
~
-- Concatenation
(++) :: [a] -> [a] -> [a] \pindex{++}
[] ++ ys = ys
(x:xs) ++ ys = x : xs++ys
~
-- List length
length :: [a] -> Int \pindex{length}
length [] = 0
length (_:xs) = 1 + length xs
~
-- List index (subscript) operator, head has index 0
(!!) :: [a] -> Int -> a \pindex{"!"!}
(x:_) !! 0 = x
(_:xs) !! n | n>0 = xs !! (n-1)
~
-- Map a function on a list
map :: (a->b) -> [a] -> [b] \pindex{map}
map _ [] = []
map f (x:xs) = f x : map f xs
~
-- Accumulate all list elements
foldr :: (a->b->b) -> b -> [a] -> b \pindex{foldr}
foldr _ z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
~
-- Filter elements in a list
filter :: (a -> Bool) -> [a] -> [a] \pindex{filter}
filter _ [] = []
filter p (x:xs) = if p x then x : filter p xs
else filter p xs
~
-- Join two lists to one list of pairs
zip :: [a] -> [b] -> [(a,b)] \pindex{zip}
zip [] [] = []
zip (x:xs) (y:ys) = (x,y) : zip xs ys
~
-- Concatenate a list of lists into one list
concat :: [[a]] -> [a] \pindex{concat}
concat l = foldr (++) [] l
~
-- return prefix of length n
take :: Int -> [a] -> [a] \pindex{take}
take n l = if n==0 then [] else takep n l
where takep _ [] = []
takep n (x:xs) = x : take (n-1) xs
~
-- return suffix without first n elements
drop :: Int -> [a] -> [a] \pindex{drop}
drop n l = if n==0 then l else dropp n l
where dropp _ [] = []
dropp n (_:xs) = drop (n-1) xs
~
-- return longest prefix with elements satisfying a predicate
takeWhile :: (a -> Bool) -> [a] -> [a] \pindex{takeWhile}
takeWhile _ [] = []
takeWhile p (x:xs) = if p x then x : takeWhile p xs else []
~
-- return suffix without takeWhile prefix
dropWhile :: (a -> Bool) -> [a] -> [a] \pindex{dropWhile}
dropWhile _ [] = []
dropWhile p (x:xs) = if p x then dropWhile p xs else x:xs
~
~
-- Conversion functions between characters and their ASCII values
~
ord :: Char -> Int \pindex{ord}
chr :: Int -> Char \pindex{chr}
~
~
-- Convert a term into a printable representation
~
show :: a -> String \pindex{show}
~
~
-- Types of primitive arithmetic functions and predicates
~
(+) :: Int -> Int -> Int \pindex{+}
(-) :: Int -> Int -> Int \pindex{-}
(*) :: Int -> Int -> Int \pindex{*}
div :: Int -> Int -> Int \pindex{div}
mod :: Int -> Int -> Int \pindex{mod}
(<) :: Int -> Int -> Bool \pindex{<}
(>) :: Int -> Int -> Bool \pindex{>}
(<=) :: Int -> Int -> Bool \pindex{<=}
(>=) :: Int -> Int -> Bool \pindex{>=}
~
~
-- Constraints
~
-- Equational constraint
(=:=) :: a -> a -> Constraint \pindex{=:=}
~
-- Always solvable constraint
success :: Constraint \pindex{success}
~
-- Concurrent conjunction of constraints
(\&) :: Constraint -> Constraint -> Constraint \pindex{\&}\index{conjunction}\index{conjunction!concurrent}
~
-- Sequential conjunction of constraints
(\&>) :: Constraint -> Constraint -> Constraint \pindex{\&>}\index{conjunction}\index{conjunction!sequential}
c1 \&> c2 | c1 = c2
~
~
-- Monadic IO
~
data IO a -- conceptually: \mbox{\it{}World} -> (a,\mbox{\it{}World}) \pindex{IO}
~
(>>) :: IO a -> IO b -> IO b \pindex{>>}
(>>=) :: IO a -> (a -> IO b) -> IO b \pindex{>>=}
putChar :: Char -> IO () \pindex{putChar}
getChar :: IO Char \pindex{getChar}
done :: IO () \pindex{done}
return :: a -> IO a \pindex{return}
readFile :: String -> IO String \pindex{readFile}
writeFile :: String -> String -> IO () \pindex{writeFile}
~
putStr :: String -> IO () \pindex{putStr}
putStr [] = done
putStr (c:cs) = putChar c >> putStr cs
~
putStrLn :: String -> IO () \pindex{putStrLn}
putStrLn cs = putStr cs >> putChar '\ttbs{}n'
~
getLine :: IO String \pindex{getLine}
getLine = getChar >>= \ttbs{}c ->
if c=='\ttbs{}n' then return []
else getLine >>= \ttbs{}cs -> return (c:cs)
~
~
-- Encapsulated search
~
-- primitive operator to control non-determinism
try :: (a->Constraint) -> [a->Constraint] \pindex{try}
~
-- compute all solutions via depth-first search
all :: (a->Constraint) -> [a->Constraint] \pindex{all}
all g = collect (try g)
where collect [] = []
collect [g] = [g]
collect (g1:g2:gs) = concat (map all (g1:g2:gs))
~
-- compute first solution via depth-first search
once :: (a->Constraint) -> (a->Constraint) \pindex{once}
once g = head (all g)
~
-- compute all values of solutions via depth-first search
findall :: (a->Constraint) -> [a] \pindex{findall}
findall g = map unpack (all g)
~
-- compute best solution via branch-and-bound with depth-first search
best :: (a->Constraint) -> (a->a->Bool) -> [a->Constraint] \pindex{best}
~
-- try to find a solution to a search goal via a fair search
-- (and fail if there is no solution)
tryone :: (a->Constraint) -> a \pindex{tryone}
tryone eval choice
tryone g | g x = x where x free
~
-- compute one solution with a fair search
one :: (a->Constraint) -> [a->Constraint] \pindex{one}
one g = dfs \ttbs{}x -> x =:= tryone g
~
-- compute one solution with a fair search satisfying a condition
condSearch :: (a->Constraint) -> (a->Bool) -> [a->Constraint] \pindex{condSearch}
condSearch g cond = one \ttbs{}x -> g x \& cond x =:= True
~
-- show the solution of a solved constraint
browse :: (a->Constraint) -> IO () \pindex{browse}
~
-- show the solutions of a list of solved constraints
browseList :: [a->Constraint] -> IO () \pindex{browseList}
browseList [] = done
browseList (g:gs) = browse g >> putChar '\ttbs{}n' >> browseList gs
~
~
-- unpack a solution's value from a (solved) search goal
unpack :: (a -> Constraint) -> a \pindex{unpack}
unpack g | g x = x where x free
\stopprog
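Many of the list and combinator definitions above coincide with Haskell's standard functions of the same name. Under that assumption, their behaviour can be checked directly in Haskell; the following sketch transliterates a few of them (primed names are ours, chosen to avoid clashes with Haskell's own \pr{Prelude}):

```haskell
-- A few of the prelude definitions above, transliterated to Haskell,
-- where they coincide with the standard functions of the same name.
-- (Primed names avoid clashes with Haskell's own Prelude.)
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose f g = \x -> f (g x)

curry' :: ((a, b) -> c) -> a -> b -> c
curry' f a b = f (a, b)

flip' :: (a -> b -> c) -> b -> a -> c
flip' f x y = f y x

-- e.g. map (compose (+1) (*2)) [1,2,3]  evaluates to [3,5,7]
-- and  flip' (-) 2 10                   evaluates to 8
```

The purely functional parts of the prelude (\pr{map}, \pr{foldr}, \pr{filter}, \pr{take}, \pr{zip}, \ldots) behave exactly as their Haskell counterparts; only the constraint and search primitives (\pr{=:=}, \pr{\&}, \pr{try}, \ldots) have no Haskell analogue.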
\newpage
\section{Syntax of Curry}
The syntax is still preliminary and not intended to be used by
automatic tools. Nevertheless, it provides a precise definition
of the concrete syntax which can be further discussed.
The syntax is close to Haskell's, but the following differences
should be noted:
\begin{itemize}
\item{} As-patterns are missing
\item{} Case expressions are missing
\item{} Currently, there are no infix constructors except for ``\term{:}'',
the predefined list \emph{cons}tructor. They will be added later,
although they are already used in concrete example programs.
\end{itemize}
\subsection{Lexicon}
The case of identifiers matters, i.e., ``abc'' differs from ``Abc''.
There are four \emph{case modes} selectable at compilation time:
\begin{itemize}
\item{}
Prolog mode: variables start with an upper case letter,
all other identifier symbols start with a lower case letter.
\item{}
G\"odel mode: like Prolog mode with the cases swapped.
\item{}
Haskell mode: see section 1.3 of the Haskell report.
\item{}
free: no constraints on the case of identifiers.
\end{itemize}
The default case mode is \emph{free}.
If a case mode is selected
but not obeyed, the compiler issues a warning.
The syntax does not define the following non-terminal symbols,
which denote classes of identifiers:
\emph{ModuleID},
\emph{TypeConstrID},
\emph{DataConstrID},
\emph{TypeVarID},
\emph{InfixOpID},
\emph{FunctionID},
\emph{VariableID}.
All, except \emph{InfixOpID}, consist of an initial letter,
whose upper or lower case depends on the case mode,
followed by any number of letters, digits, and/or underscores.
Additionally, \emph{ModuleID} can contain dots at inner positions.
\emph{InfixOpID} is any string of characters from the string
``\verb ~!@#$%^&*+-=<>?./|\: ''\comment{$} % for correct font-locking in emacs
or another identifier
(like \emph{VariableID}) enclosed in \pr{`$\ldots$`} like \pr{`mod`}.
The following symbols are \emph{keywords}\index{keywords}
and cannot be used as identifiers:
\startprog
choice data else external flex free if
import in infix infixl infixr let module
optmatch pragma renaming rigid then to where
\stopprog
%
The syntax also leaves \emph{Literal}, the non-terminal for literals
of primitive types, undefined.
These are literal constants, such as ``\term{1}'',
``\term{3.14}'', or the character ``\term{'a'}''. They are as in Java.
Java, in contrast to Haskell, adopts the Unicode standard to
represent characters and character strings.
Comments\index{comment}\pindex{--}\pindex{\{-\ldots-\}}
begin either with ``\term{--}'' and terminate at the end of the line,
or with ``\verb|{-|'' and terminate with a matching ``\verb|-}|'', i.e.,
the delimiters ``\verb|{-|'' and ``\verb|-}|''
act as parentheses and can be nested.
\subsection{Layout}
Similarly to Haskell, a Curry programmer can use layout information
to define the structure of blocks. For this purpose, we define
the indentation of a symbol as the column number indicating
the start of this symbol. The indentation of a line is the
indentation of its first symbol.\footnote{In order to determine
the exact column number, we assume a fixed-width font with
tab stops at each 8th column.}
The layout (or ``off-side'') rule applies to lists of syntactic
entities after the keywords \term{let} or \term{where}.
In the subsequent context-free syntax (Section~\ref{app-syntax}),
these lists are enclosed with
curly brackets (\term{\{ \}}) and the single entities are
separated by semicolons (\term{;}). Instead of using the curly
brackets and semicolons of the context-free syntax,
a Curry programmer should specify these lists
by indentation: the indentation of a list of syntactic
entities after \term{let} or \term{where} is the
indentation of the next symbol following the
\term{let} or \term{where}. Every item of this list
starts with the same indentation as the list. Lines containing only whitespace
or with an indentation greater than the indentation of the list continue
the item in the previous line. Lines with an indentation less than
the indentation of the list terminate the entire list.
Moreover, a list started by \term{let} is terminated by the
keyword \pr{in}. Thus, the sentence
\startprog
f x = h x where \{g y = y+1 ; h z = (g z) * 2 \}
\stopprog
which is valid w.r.t.\ the context-free syntax, can be written
with the layout rules as
\startprog
f x = h x
where g y = y+1
h z = (g z) * 2
\stopprog
or also as
\startprog
f x = h x where
g y = y+1
h z = (g z)
* 2
\stopprog
To avoid an indentation of top-level declarations,
the keyword \term{module} and the end-of-file token are
assumed to start in column 0.
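The grouping performed by the layout rule can be sketched as follows. This is our own toy model, not part of the report: it operates on lines that are already split into an indentation and a text, ignores tab expansion and the \term{in} terminator, and only shows how a reference column groups lines into items:

```haskell
-- items col ls: group the lines ls into the items of a layout list
-- whose reference column is col (a toy model of the offside rule).
--   * a line at column col starts a new item,
--   * deeper lines continue the current item,
--   * a line left of col terminates the entire list.
items :: Int -> [(Int, String)] -> [[String]]
items _   [] = []
items col ((i, s) : ls)
  | i == col  = (s : map snd cont) : items col rest
  | otherwise = []                      -- offside: list is closed
  where (cont, rest) = span (\(j, _) -> j > col) ls
```

Applied to the last example above with reference column 3, this yields the two items \pr{g y = y+1} and \pr{h z = (g z) * 2}, the latter assembled from its continuation line.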
\subsection{Context Free Syntax}
\label{app-syntax}
{\parindent=0pt
\production Module
\term{module} ModuleID \opt{Exports} \term{where} Block
end
\lexicon{ModuleID}
\production Exports
\term{(} \seq{Export}{,}{n} \term{)}
end
\production Export
FunctionName
\por TypeConstrID
\por TypeConstrID \term{(} \seq{DataConstrID}{,}{n} \term{)}
\por TypeConstrID \term{(..)}
\por \term{module} ModuleID \opt{\term{(} \seq{FuncOrType}{,}{n} \term{)}}
end
\production FuncOrType
FunctionName $|$ TypeConstrID
end
\production Block
\term{\{}
\opt{\seq{Import}{;}{n} \term{;}}
\marg{n \geqslant 0}
\next \phantom{\term{\{}}
\opt{\seq{Pragma}{;}{n} \term{;}}
\marg{n \geqslant 0}
\next \phantom{\term{\{}}
\opt{\seq{FixityDeclaration}{;}{n} \term{;}}
\marg{n \geqslant 0}
\next \phantom{\term{\{}}
\seq{BlockDeclaration}{;}{m}
\term{\}}
\marg{m \geqslant 0}
end
\production Import
\term{import} ModuleID \opt{\term{(} \seq{FuncOrType}{,}{n} \term{)}} \opt{Renamings}
end
\production Renamings
\seq{Renaming}{}{n}
end
\production Renaming
\term{renaming} FunctionName \term{to} FunctionName
\por \term{renaming} TypeConstrID \term{to} TypeConstrID
end
\production Pragma
\term{pragma} PragmaOption
end
\production PragmaOption
\term{flex} $|$ \term{rigid} $|$ \term{optmatch}
end
\production BlockDeclaration
DataDeclaration \por FunctionDeclaration
end
\production DataDeclaration
\term{data} TypeDeclaration
% \offside{TypeDeclaration}{;}{n}
% \marg{n > 0}
end
\production TypeDeclaration
TypeConstrID
\seq{TypeVarID}{}{n} \marg{n \geqslant 0}
\next
\term{=} \seq{ConstrDeclaration}{|}{m} \marg{m > 0}
end
\lexicon{TypeConstrID}
\production ConstrDeclaration
DataConstrID
\seq{SimpleTypeExpr}{}{n} \marg{n \geqslant 0}
end
\lexicon{DataConstrID}
\production TypeExpr
TypeAppl \opt{\term{->} TypeExpr}
end
\production TypeAppl
QTypeConstrID \seq{SimpleTypeExpr}{}{n} \marg{n \geqslant 0}
end
\production SimpleTypeExpr
QTypeConstrID
\por TypeVarID $|$ \term{\ttus}
\por \term{()}
\por \term{(} TypeExpr \term{)}
\por \term{(} \seq{TypeExpr}{,}{n} \term{)} \marg{n>1}
\por \term{[} TypeExpr \term{]}
end
\lexicon{TypeVarID}
\production FixityDeclaration
FixityKeyword Natural \seq{InfixOpID}{,}{n} \marg{n>0}
% \offside{FixityArgument}{;}{n} \marg{n>0}
end
\production FixityKeyword
\term{infixl} $|$ \term{infixr} $|$ \term{infix}
end
%\production FixityArgument
% Natural
% \seq{InfixOpID}{,}{n} \marg{n>0}
%end
\production Natural Digit $|$ Digit Natural end
\production Digit
\term{0} $|$ \term{1} $|$ \term{2} $|$ \term{3} $|$ \term{4} $|$
\term{5} $|$ \term{6} $|$ \term{7} $|$ \term{8} $|$ \term{9}
end
\lexicon{InfixOpID}
\production FunctionDeclaration
Signature $|$ EvalAnnot $|$ Equat
end
\production Signature FunctionNames \term{::} TypeExpr end
\production FunctionNames
\seq{FunctionName}{,}{n} \marg{n>0}
end
\production FunctionName
\term{(} InfixOpID \term{)}
\por FunctionID
end
\lexicon{FunctionID}
\production EvalAnnot FunctionNames \term{eval} Annotation end
\production Annotation
\term{flex} $|$ \term{rigid} $|$ \term{choice}
end
\production Equat
FunLHS \term{=} Expr \opt{\term{where} LocalDefs}
\por FunLHS CondExprs \opt{\term{where} LocalDefs}
end
\production FunLHS
FunctionName \seq{SimplePattern}{}{n} \marg{n \geqslant 0}
\por SimplePattern InfixOpID SimplePattern
end
\production Pattern
QDataConstrID \seq{Pattern}{}{n} \opt{\term{:} Pattern} \marg{n > 0}
\por SimplePattern \opt{\term{:} Pattern}
end
\production SimplePattern
VariableID $|$ \term{\ttus}
\por QDataConstrID
\por Literal
\por \term{()}
\por \term{(} \seq{Pattern}{,}{n} \term{)} \marg{n > 1}
% \por \term{(} QDataConstrID \seq{Pattern}{}{n} \term{)} \marg{n > 0}
\por \term{(} Pattern \term{)}
\por \term{[} \seq{Pattern}{,}{n} \term{]} \marg{n \geqslant 0}
end
\lexicon{VariableID}
\production LocalDefs
\offside{ValueDeclaration}{;}{n} \marg{n > 0}
end
\production ValueDeclaration
FunctionDeclaration
\por PatternDeclaration
\por \seq{VariableID}{,}{n} \term{free} \marg{n \geqslant 0}
end
\production PatternDeclaration
Pattern \term{=} Expr \opt{\term{where} LocalDefs}
end
\production CondExprs
\term{|} Expr \term{=} Expr \opt{CondExprs}
end
\production Expr
\term{\ttbs} \seq{SimplePattern}{}{n} \term{->} Expr \marg{n > 0}
\por \term{let} LocalDefs \term{in} Expr
\por \term{if} Expr \term{then} Expr \term{else} Expr
\por Expr InfixOpID Expr
\por \term{-} Expr
\por FunctExpr
end
\production FunctExpr
\opt{FunctExpr} BasicExpr \marg{\mbox{\it function application}}
end
\production BasicExpr
QVariableID \marg{\mbox{\it variable}}
\por QDataConstrID \marg{\mbox{\it data constructor}}
\por QFunctionID \marg{\mbox{\it defined function}}
\por \term{(} InfixOpID \term{)} \marg{\mbox{\it operator function}}
\por Literal
\por \term{()} \marg{\mbox{\it empty tuple}}
\por \term{(} Expr \term{)} \marg{\mbox{\it parenthesized expression}}
\por \term{(} \seq{Expr}{,}{n} \term{)} \marg{\mbox{\it tuple, $n > 1$}}
\por \term{[} \seq{Expr}{,}{n} \term{]}\marg{\mbox{\it list, $n\geqslant 0$}}
\por \term{(} Expr InfixOpID \term{)} \marg{\mbox{\it left section}}
\por \term{(} InfixOpID Expr \term{)} \marg{\mbox{\it right section}}
end
\production QTypeConstrID
ModuleID \term{.} TypeConstrID $~|~$ TypeConstrID
end
\production QDataConstrID
ModuleID \term{.} DataConstrID $~|~$ DataConstrID
end
\production QFunctionID
ModuleID \term{.} FunctionID $~|~$ FunctionID
end
\production QVariableID
ModuleID \term{.} VariableID $~|~$ VariableID
end
\production Literal
Int $|$ Char $|$ String $|$ Float
end
\lexicon{Int}
\lexicon{Char}
\lexicon{String}
\lexicon{Float}
} % end parindent definition
\vspace{3ex}\noindent
If the alternative \emph{FunctionDeclaration} is used in
a \emph{ValueDeclaration}, then the left-hand side (\emph{FunLHS})
must have at least one pattern after the \emph{FunctionName}
(instead of zero patterns which is possible in top-level
function declarations).
In \emph{CondExprs}, the first expression
must have type \pr{Constraint} or \pr{Bool}. In the latter case,
Boolean expressions in conditions are considered
as an abbreviation for a (nested) \pr{if-then-else}
(see Section~\ref{sec-bool-guards}).
\subsection{Infix Operators}
In the grammar above, the use of infix operators in the rule
for \emph{Expr} is ambiguous. These ambiguities are resolved
by assigning an associativity and precedence to each operator
(\emph{InfixOpID}) by a
\emph{fixity declaration}.\index{fixity declaration}\index{declaration!fixity}
There are three kinds of associativities,
non-, left- and right-associativity
(\pr{infix}\pindex{infix}, \pr{infixl}\pindex{infixl},
\pr{infixr}\pindex{infixr}) and ten levels of precedence, 0 to 9,
where level 0 binds least tightly and level 9 binds most tightly.
All fixity declarations must appear at the beginning of a module.
Any operator without an explicit fixity declaration is assumed
to have the declaration \pr{infixl 9}.
For instance, the expression ``\pr{1+2*3+4==x \&\& b}'' is equivalent
to ``\pr{((((1+(2*3))+4)==x) \&\& b)}'' w.r.t.\ the fixity declarations
provided in the prelude.
Note that the correct use of precedences and associativities
excludes some expressions which are valid w.r.t.\ the context-free
grammar. In general, the arguments of an infix operator must
have an infix operator at the top with a higher precedence
than the current infix operator (or no infix operator at the top).
Additionally, the left or right argument of a left- or
right-associative operator can have a top infix operator
with the same precedence. The unary minus operator is
interpreted with precedence 6 which means that its argument
must have a precedence of at least 7. The expression ``\pr{(- $t$)}''
is not considered as a right section but as the negation of $t$.
As a consequence,
the expressions ``\pr{1<2==True}'' and ``\pr{1 + - 3}''
are not allowed and must be bracketed as
``\pr{(1<2)==True}'' and ``\pr{1 + (- 3)}''.
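The resolution of the example above can be checked in Haskell, whose prelude assigns the same fixities to these operators (\pr{infixl 6 +}, \pr{infixl 7 *}, \pr{infix 4 ==}, \pr{infixr 3 \&\&}); the function names below are ours:

```haskell
-- With the prelude fixities, the unparenthesized expression parses
-- exactly as the fully parenthesized one, for any x and b.
unparenthesized, parenthesized :: Int -> Bool -> Bool
unparenthesized x b = 1 + 2 * 3 + 4 == x && b
parenthesized   x b = ((((1 + (2 * 3)) + 4) == x) && b)
```

Both functions agree on all arguments; in particular both yield \pr{True} exactly for \pr{x = 11} and \pr{b = True}.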
\newpage
\stepcounter{footnote}
\section{Operational Semantics of Curry}
\label{app-opsem}
The precise specification of the operational semantics of Curry
is based on the evaluation annotation and the patterns on the
rules' left-hand sides for each function.
Therefore, we describe the computation model by providing
a precise definition of pattern matching with evaluation annotations
(Section~\ref{app-def-trees}) which is the basis to
define the computation steps on expressions
(Section~\ref{app-compsteps}). The extension of this
basic computation model to solving equational constraints and higher-order
functions is described in Sections~\ref{sec-equationsolving}
and~\ref{app-higher-order}, respectively. Section~\ref{app-gentrees}
describes the automatic generation of the definitional trees
to guide the pattern matching strategy.
Finally, Section~\ref{sec-opsemtry} specifies the operational semantics
of the primitive operator \pr{try} for encapsulating search.
\subsection{Definitional Trees}
\label{app-def-trees}
We start by considering only the unconditional first-order part of Curry,
i.e., rules do not contain conditions, and $\lambda$-abstractions
and partial function applications are not allowed.
We assume some familiarity with basic notions of term rewriting
\cite{DershowitzJouannaud90} and functional logic programming
\cite{Hanus94JLP}.
We denote by $\Cc$ the set of \emph{constructors}\index{constructor}
(with elements $c,d$),
by $\Fc$ the set of
\emph{defined functions}\index{defined function}\index{function!defined}
or \emph{operations}\index{operation}
(with elements $f,g$), and by $\Xc$ the set of \emph{variables}\index{variable}
(with elements $x,y$), where $\Cc$, $\Fc$ and $\Xc$ are disjoint.
An \emph{expression}\index{expression} (\emph{data term}\index{data term})
is a variable $x \in \Xc$ or an application $\varphi(e_1,\ldots,e_n)$
where $\varphi \in \Cc \cup \Fc$ ($\varphi \in \Cc$) has arity $n$ and
$e_1,\ldots,e_n$ are expressions (data terms).\footnote{Since we do not
consider partial applications in this part, we write the full application
as $\varphi(e_1,\ldots,e_n)$
which is equivalent to Curry's notation $\varphi~e_1\ldots e_n$.}
The set of all expressions and data terms are denoted
by $\Tc(\Cc \cup \Fc,\Xc)$ and $\Tc(\Cc,\Xc)$, respectively.
A \emph{pattern}\index{pattern} is an expression of the form
$f(t_1,\ldots,t_n)$ where each variable occurs only once,
$f \in \Fc$ is an $n$-ary function, and $t_1,\ldots,t_n \in \Tc(\Cc,\Xc)$.
$root(e)$ denotes the symbol at the
\emph{root} of the expression\index{root (of an expression)} $e$.
A \emph{position}\index{position} $p$ in an expression $e$
is represented by a sequence of
natural numbers, $e|_p$ denotes the \emph{subterm}\index{subterm}
or \emph{subexpression}\index{subexpression}
of $e$ at position $p$, and $e[e']_p$ denotes the result of
\emph{replacing the subterm} $e|_p$ by the expression $e'$
(see \cite{DershowitzJouannaud90} for details).
A \emph{substitution}\index{substitution}
is a mapping $\sigma \colon \Xc \to \Tc(\Cc \cup \Fc,\Xc)$ with
$\sigma(x) \neq x$ only for finitely many variables $x$.
Thus, we denote a substitution by
$\sigma = \{x_1 \mapsto t_1,\ldots, x_n \mapsto t_n\}$,
where $\sigma(x_i) \neq x_i$ for $i=1,\ldots,n$,
and $id$ denotes the identity substitution ($n=0$).
$\Dom(\sigma) = \{x_1,\ldots,x_n\}$ is the
\emph{domain}\index{domain}\index{$\Dom$}
of $\sigma$ and
\[
\VRan(\sigma) = \{ x \mid x \mbox{ is a variable occurring in some } t_i,
i \in \{1,\ldots,n\} \}
\]
is its \emph{variable range}.\index{variable range}\index{range}\index{$\VRan$}
Substitutions are extended to morphisms on expressions by
$\sigma(f(e_1,\ldots,e_n)) = f(\sigma(e_1),\ldots,\sigma(e_n))$
for every expression $f(e_1,\ldots,e_n)$.
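A substitution with finite domain, together with its homomorphic extension to expressions, can be sketched as a finite map. The \pr{Term} encoding below is our own and is not part of the report:

```haskell
import qualified Data.Map as M

-- First-order expressions: variables and applications (our encoding).
data Term = Var String | App String [Term] deriving (Eq, Show)

-- A substitution maps finitely many variables to terms and is the
-- identity elsewhere.
type Subst = M.Map String Term

-- apply sigma e: the extension of sigma to a morphism on expressions,
-- i.e. sigma(f(e1,...,en)) = f(sigma(e1),...,sigma(en)).
apply :: Subst -> Term -> Term
apply s (Var x)    = M.findWithDefault (Var x) x s
apply s (App f ts) = App f (map (apply s) ts)

-- Dom(sigma): the variables actually moved by sigma.
dom :: Subst -> [String]
dom s = [x | (x, t) <- M.toList s, t /= Var x]
```

For instance, applying $\{x \mapsto \pr{z}\}$ to $\pr{leq}(x,y)$ yields $\pr{leq}(\pr{z},y)$, leaving $y$ untouched.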
A Curry program is a set of rules \pr{$l$\,=\,$r$} satisfying
some restrictions (see Section~\ref{sec-funcdecl}).
In particular, the left-hand side $l$ must be a pattern.
A rewrite rule is called a \emph{variant}\index{variant} of another rule
if it is obtained by a unique replacement of variables
by other variables.
An \emph{answer expression}\index{answer expression}\index{expression!answer}
is a pair $\sigma \ans e$
consisting of a substitution $\sigma$ (the answer computed so far)
and an expression $e$.
An answer expression $\sigma \ans e$ is \emph{solved}\index{solved expression}
if $e$ is a data term.
We sometimes omit the identity substitution in answer expressions,
i.e., we write $e$ instead of $id \ans e$ if it is clear from
the context.
A \emph{disjunctive expression}\index{disjunctive expression}\index{expression!disjunctive} is a (multi-)set
of answer expressions $\{\sigma_1 \ans e_1,\ldots, \sigma_n \ans e_n\}$.
The set of all disjunctive expressions is denoted by $\Dc$.
A single \emph{computation step} performs a reduction in exactly one unsolved
expression of a disjunction.
To provide a precise definition of the operational semantics,
we use definitional trees.\footnote{Our notion is influenced
by Antoy's work \cite{Antoy92ALP}, but here we use an extended form
of definitional trees.}
A definitional tree is a hierarchical structure containing
all rules of a defined function.
$\Tc$ is a \emph{definitional tree with pattern}\index{definitional tree} $\pi$
iff the depth of $\Tc$ is finite and
one of the following cases holds:
\begin{description}
\item [$\Tc= rule(l \pr{\,=\,} r),$] where $l \pr{\,=\,} r$ is a variant
of a rule such that $l = \pi$.
\item [$\Tc= branch(\pi,o,r,\Tc_1,\ldots,\Tc_k),$] where
$o$ is an occurrence of a variable in $\pi$,
$r \in \{rigid,flex\}$,
$c_1,\ldots,c_k$ are different constructors of the sort
of $\pi|_o$, for some $k > 0$,
and, for all $i = 1,\ldots,k$, $\Tc_i$ is a definitional tree with
pattern $\pi[c_i(x_1,\ldots,x_n)]_o$,
where $n$ is the arity of $c_i$ and
$x_1,\ldots,x_n$ are new variables.
\item [$\Tc=or(\Tc_1,\Tc_2),$] where $\Tc_1$ and $\Tc_2$ are
definitional trees with pattern $\pi$.\footnote{For the sake of simplicity,
we consider only binary $or$ nodes. The extension
to such nodes with more than two subtrees is straightforward.}
\end{description}
A \emph{definitional tree of an $n$-ary function} $f$
is a definitional tree $\Tc$ with pattern $f(x_1,\ldots,x_n)$,
where $x_1,\ldots,x_n$ are distinct variables, such that for each rule
$l \pr{\,=\,} r$ with $l = f(t_1,\ldots,t_n)$ there is a node
$rule(l' \pr{\,=\,} r')$
in $\Tc$ with $l$ variant of $l'$.
In the following, we write $pat(\Tc)$ for the pattern
of a definitional tree $\Tc$.
It is always possible to construct a definitional tree
for each function (concrete algorithms are described in
\cite{Antoy92ALP,LoogenLopezFraguasRodriguezArtalejo93PLILP}
and in Section~\ref{app-gentrees}).
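The three kinds of nodes can be rendered directly as a Haskell data type. The names below are ours, following the text; \pr{Term} stands for patterns and right-hand sides:

```haskell
-- First-order terms (our encoding of patterns and right-hand sides).
data Term = Var String | App String [Term] deriving (Eq, Show)

type Pos  = [Int]                 -- occurrence of a variable in the pattern
data Mode = Rigid | Flex deriving (Eq, Show)

-- Definitional trees, one constructor per kind of node in the text.
data DTree
  = Rule   Term Term              -- rule(l = r), with l a variant of pi
  | Branch Term Pos Mode [DTree]  -- branch(pi, o, r, T1,...,Tk)
  | Or     DTree DTree            -- or(T1,T2), both with pattern pi
  deriving (Eq, Show)

-- pat(T): the pattern of a definitional tree.
pat :: DTree -> Term
pat (Rule l _)       = l
pat (Branch p _ _ _) = p
pat (Or t1 _)        = pat t1
```

A definitional tree of an $n$-ary function $f$ is then a \pr{DTree} whose \pr{pat} is $f(x_1,\ldots,x_n)$ for distinct variables $x_i$.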
For instance, consider the following definition of the less-or-equal predicate
on natural numbers represented by data terms built from \pr{z} (zero)
and \pr{s} (successor):\label{example-leq}
\startprog
data Nat = z | s Nat
~
leq :: Nat -> Nat -> Bool
leq z n = True
leq (s m) z = False
leq (s m) (s n) = leq m n
\stopprog
Consider a function call like \pr{(leq $e_1$ $e_2$)}.
In order to apply some reduction rule, the first argument $e_1$
must always be evaluated to \emph{head normal form}\index{head normal form}
(i.e., to an expression
without a defined function symbol at the top).
However, the second argument must be evaluated only if
the first argument has the form \pr{(s $e$)}.\footnote{%
Naive lazy narrowing strategies may also evaluate
the second argument in any case. However, as shown in
\cite{AntoyEchahedHanus94POPL}, the consideration of
dependencies between arguments is essential to obtain
optimal evaluation strategies.}
This dependency between the first and the second argument
is expressed by the definitional tree
\[
\begin{array}{@{}l@{}l@{}l}
branch(\pr{leq}(x_1,x_2),1,flex,
& rule(\pr{leq}(\pr{z},x_2) \pr{\,=\,True}), \\
& branch(\pr{leq}(\pr{s}(x),x_2),2,flex, &
rule(\pr{leq}(\pr{s}(x),\pr{z}) \pr{\,=\,False}),\\
& & rule(\pr{leq}(\pr{s}(x),\pr{s}(y)) \pr{\,=\,} \pr{leq}(x,y))))
\end{array}
\]
%which corresponds to the explicit evaluation annotation
%\startprog
%leq eval 1:flex(0 => rule, s => 2:flex(0 => rule, s => rule))
%\stopprog
This definitional tree %or annotation
specifies that the first argument is initially evaluated
and the second argument is only evaluated
if the first argument has the constructor \pr{s} at the top.
The precise operational meaning induced by definitional trees
is described in the following section.
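The demand behaviour encoded in this tree can be observed directly in Haskell, whose left-to-right pattern matching realizes the same strategy for this function. The transliteration below renames the constructors \pr{z} and \pr{s} to \pr{Z} and \pr{S} (Haskell requires upper case constructors):

```haskell
-- The leq example, with Haskell constructor names (Z for z, S for s).
data Nat = Z | S Nat

leq :: Nat -> Nat -> Bool
leq Z     _     = True
leq (S _) Z     = False
leq (S m) (S n) = leq m n

-- Under lazy evaluation the second argument is demanded only when the
-- first is rooted by S; in particular, leq Z undefined yields True.
```

That \pr{leq Z undefined} succeeds shows operationally that the second argument is not evaluated when the first one reduces to \pr{Z}, exactly as the first \pr{branch} node of the definitional tree prescribes.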
\subsection{Computation Steps}
\label{app-compsteps}
The operational semantics is defined by the derivability relation
$\dexp_1 \To \dexp_2$ on disjunctive expressions
specified in Figure~\ref{figure-opsem}. An inference rule
$\infrule{\alpha}{\beta}$ should be read as
``if $\alpha$ is derivable, then $\beta$ is also derivable''.
We say that the computation of an expression $e$
\emph{suspends}\index{suspended evaluation} if there is no $\dexp$
with $\eval{e} \To \dexp$.
A constraint is solvable if it can be reduced to \pr{success}.
The exact method to solve constraints depends on the constraint solver.
A method to solve equational constraints on data terms
is specified in Section~\ref{sec-equationsolving}.
For the purpose of this definition, we consider \pr{success} as
the neutral element of the operation \pr{\&}, i.e.,
$\pr{success} \cconj c$ and $c \cconj \pr{success}$ are considered
as equivalent to $c$.
\begin{figure*}[t]
\begin{center}
\fbox{
$
\begin{array}{l@{}}
\mbox{\bf Computation step for a single expression:}\\[1ex]
\infrule{\eval{e_i} \To \dexp}
{\eval{e_1 \cconj e_2} \To replace(e_1 \cconj e_2,i,\dexp)}
\mbox{~~$i \in \{1,2\}$}\\[3ex]
\infrule{\eval{e_i} \To \dexp}
{\eval{c(e_1,\ldots,e_n)} \To replace(c(e_1,\ldots,e_n),i,\dexp)}
\mbox{~~$i \in \{1,\ldots,n\}$}\\[3ex]
\infrule{\eval{f(e_1,\ldots,e_n);\Tc} \To \dexp}
{\eval{f(e_1,\ldots,e_n)} \To \dexp}
\mbox{~~if $\Tc$ is a definitional tree for $f$ with fresh variables} \\[4ex]
\mbox{\bf Computation step for an operation-rooted expression $e$:}\\[-1ex]
\infrule{}{\eval{e;rule(l \pr{\,=\,} r)} \To \{ id \ans \sigma(r)\}}
\mbox{~~~if $\sigma$ is a substitution with $\sigma(l)=e$}\\[3ex]
\infrule{\eval{e;\Tc_1} \To \dexp_1 \quad \eval{e;\Tc_2} \To \dexp_2}{
\eval{e;or(\Tc_1,\Tc_2)} \To \dexp_1 \cup \dexp_2} \\[3ex]
\begin{array}{@{}l@{}}
\eval{e;branch(\pi,p,r,\Tc_1,\ldots,\Tc_k)} \To \\[0.5ex]
\hspace{2ex} \cases{\dexp & if $e|_p = c(e_1,\ldots,e_n)$, $pat(\Tc_i)|_p = c(x_1,\ldots,x_n)$, and $\eval{e;\Tc_i} \To \dexp$ \cr
\emptyset & if $e|_p = c(\cdots)$ and $pat(\Tc_i)|_p \neq c(\cdots), i=1,\ldots,k$ \cr
\bigcup_{i=1}^k \{\sigma_i \ans \sigma_i(e)\} &
if $e|_p = x$, $r = flex$, and
$\sigma_i = \{x \mapsto pat(\Tc_i)|_p\}$ \cr
replace(e,p,\dexp) & if $e|_p = f(e_1,\ldots,e_n)$ and $\eval{e|_p} \To \dexp$ }
\end{array}\\[10ex]
\mbox{\bf Derivation step for a disjunctive expression:}\\
\infrule{\eval{e} \To \{\sigma_1 \ans e_1,\ldots, \sigma_n \ans e_n\}}
{\{\sigma \ans e\} \cup \dexp ~\To~
\{\sigma_1\circ\sigma \ans e_1,\ldots, \sigma_n\circ\sigma \ans e_n\}\cup \dexp}\\[3ex]
\end{array}
$
} % fbox
\end{center}\vspace{-3ex}
\caption{Operational semantics of Curry\label{figure-opsem}}
\end{figure*}
As can be seen by the rules for evaluating constraints
in Figure~\ref{figure-opsem}, the concurrent conjunction
of two constraints is evaluated by applying computation
steps to one of the two constraints. Since both constraints
must be finally solved (i.e., reduced to \pr{success}), one can consider
the evaluation of a concurrent conjunction as two computation
threads evaluating these constraints. They are synchronized
by accessing common variables (which may suspend a thread, see below).
In a simple sequential implementation, the evaluation of $e_1 \cconj e_2$
could be started by an attempt to solve $e_1$.
If the evaluation of $e_1$ suspends,
an evaluation step is applied to $e_2$. If a variable
responsible for the suspension of $e_1$ is instantiated,
the left expression $e_1$ will be evaluated in the subsequent step.
Thus, we obtain a concurrent behavior with an interleaving semantics.
However, a sophisticated implementation should provide a
fair selection of threads, e.g., as done in a Java-based
implementation of Curry \cite{HanusSadre97}.
A computation step for an expression
$e$ attempts to apply a reduction step to an outermost operation-rooted
subterm in $e$ by evaluating this subterm with the
definitional tree for the function symbol at the root of this subterm.
Although it is unspecified which outermost subterm is evaluated
(compare second inference rule in Figure~\ref{figure-opsem}
for constructor-rooted expressions),
we assume that a single (e.g., leftmost) outermost subterm
is always selected.
The evaluation of an operation-rooted term
is defined by a case distinction on the definitional tree.
If it is a $rule$ node, this rule is applied.
An $or$ node produces a disjunction where the different
alternatives are combined. Note that this definition requires that
the entire disjunction suspends if one disjunct suspends.
This can be relaxed in concrete implementations by continuing
the evaluation in one $or$ branch even if the other branch
is suspended. However, to ensure completeness, it is not allowed
to omit a suspended $or$ branch and continue with the other
non-suspended $or$ branch \cite{Hanus97POPL}.
For a similar reason, we cannot commit to a disjunct which does
not bind variables but we have to consider both alternatives
(see \cite{AntoyEchahedHanus97ICLP} for a counter-example).
The most interesting case is a $branch$ node. Here we have to
branch on the value of the top-level symbol at the selected position.
If the symbol is a constructor, we proceed with the appropriate
definitional subtree, if possible. If it is a function symbol, we proceed by
evaluating this subexpression. If it is a variable and the branch
is flexible, we instantiate the variable
to the different constructors, otherwise (if the branch is $rigid$)
we cannot proceed and suspend. The auxiliary function
$replace$ puts a possibly disjunctive expression
into a subterm:
\[
replace(e,p,\{\sigma_1 \ans e_1,\ldots, \sigma_n \ans e_n\}) =
\{\sigma_1 \ans \sigma_1(e)[e_1]_p,\ldots, \sigma_n \ans \sigma_n(e)[e_n]_p\}
\qquad (n \geq 0)
\]
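The effect of $replace$ can be sketched in Haskell (whose syntax is close to Curry's); the term representation, the encoding of answer substitutions as association lists, and all names are hypothetical simplifications, not part of the Curry definition:

```haskell
-- Minimal term model (hypothetical): variables and symbol applications.
data Term = Var String | App String [Term] deriving (Eq, Show)

type Subst = [(String, Term)]  -- answer substitution as an association list
type Pos   = [Int]             -- position: path of 1-based argument indices

-- Apply a substitution to a term.
subst :: Subst -> Term -> Term
subst s (Var x)    = maybe (Var x) id (lookup x s)
subst s (App f ts) = App f (map (subst s) ts)

-- e[e']_p: replace the subterm of e at position p by e'.
replaceAt :: Term -> Pos -> Term -> Term
replaceAt _ [] e' = e'
replaceAt (App f ts) (i : p) e' =
  App f [ if j == i then replaceAt t p e' else t | (j, t) <- zip [1 ..] ts ]
replaceAt t _ _ = t  -- nonexistent position: leave the term unchanged

-- replace(e, p, {sigma_1 |> e_1, ..., sigma_n |> e_n})
--   = {sigma_1 |> sigma_1(e)[e_1]_p, ..., sigma_n |> sigma_n(e)[e_n]_p}
replaceD :: Term -> Pos -> [(Subst, Term)] -> [(Subst, Term)]
replaceD e p d = [ (s, replaceAt (subst s e) p ei) | (s, ei) <- d ]
```

Note how each answer substitution $\sigma_i$ is first applied to the surrounding expression $e$ before the computed disjunct $e_i$ is plugged in at position $p$.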
%
The overall computation strategy on disjunctive expressions
takes a disjunct $\sigma \ans e$ not in solved form
and evaluates $e$. If the evaluation does not suspend and yields
a disjunctive expression, this expression, with each of its answer
substitutions composed with the old answer substitution $\sigma$,
is substituted for $\sigma \ans e$.
Soundness and completeness results for the operational semantics
can be found in \cite{Hanus97POPL}.
Note that the evaluation of expressions is completely deterministic
if the concurrent conjunction of constraints does not occur.
\paragraph{Conditional rules.}
Note that the operational semantics specified in Figure~\ref{figure-opsem}
handles only unconditional rules. Conditional rules are
treated by the rules in Figure~\ref{figure-condrules}.
If a conditional rule is applied, the condition and the right-hand side
are joined into a \emph{guarded expression}, i.e., a pair of the form
$c\pr{|}r$. Due to the additional evaluation rules,
a guarded expression is evaluated by an attempt to solve its condition.
If the condition is solvable (i.e., reducible to \pr{success}),
the guarded expression is replaced by its right-hand side.
If the condition is not solvable
and does not suspend, the disjunct containing this guarded
expression will be deleted by the definition of computation steps, i.e.,
the application of the conditional rule fails and does not contribute
to the final result.
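The treatment of guarded expressions can be summarized by a tiny Haskell sketch (all names are hypothetical; the outcome of solving the condition is given as an abstract datum rather than computed):

```haskell
-- Possible outcomes of solving the condition c of a guarded expression c|r
-- (hypothetical abstraction of the evaluation rules).
data CondResult = Success | Failure | Suspend deriving (Eq, Show)

-- Result of one evaluation step on the guarded expression.
data Step r = ReplacedBy r       -- condition solved: continue with r
            | DisjunctDeleted    -- condition failed: the disjunct is removed
            | Suspended          -- condition suspends: so does the guard
  deriving (Eq, Show)

step :: CondResult -> r -> Step r
step Success r = ReplacedBy r
step Failure _ = DisjunctDeleted
step Suspend _ = Suspended
```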
\begin{figure*}[t]
\begin{center}
\fbox{
$
\begin{array}{l@{}}
\infrule{}{\eval{e;rule(l \pr{\,|\,$c$\,=\,} r)} \To \{ id \ans \sigma(c \pr{|} r)\}}
\mbox{~~~if $\sigma$ is a substitution with $\sigma(l)=e$}\\[3ex]
\infrule{\eval{c} \To \dexp}
{\eval{c \pr{|} e} \To replace(c \pr{|} e,1,\dexp)} \qquad\qquad
\infrule{}{\eval{\pr{success|} e} \To \{ id \ans e\}}\\[3ex]
\end{array}
$
} % fbox
\end{center}\vspace{-3ex}
\caption{Evaluation of conditional rules\label{figure-condrules}}
\end{figure*}
\paragraph{Sharing and graph rewriting.}
For the sake of simplicity,
this description of the operational semantics is based on term rewriting
and does not take into account that common subterms are shared
(see Section~\ref{sec-variable-sharing}). We only note here that
\emph{several occurrences of the same variable are always shared},
i.e., subterms obtained by the instantiation
of the same variable are always simultaneously evaluated.
This is necessary not only for efficiency reasons
but also for the soundness of the operational semantics
in the presence of non-deterministic functions, as discussed
in Section~\ref{sec-variable-sharing}. The sharing of variables
can be described with the more complicated framework
of graph rewriting.
Formal descriptions of graph rewriting and narrowing can be
found in \cite{EchahedJanodet98,GonzalesEtAl96ESOP}.
\subsection{Committed Choice}
To define the semantics of calls to functions with
the evaluation annotation \pr{choice}, we assume that all
such calls are enclosed by $choice(\ldots)$. Now we extend the rules
in Figures~\ref{figure-opsem} and~\ref{figure-condrules}
to cover such choices.
An expression $choice(f(e_1,\ldots,e_n))$ is reduced as follows.
If
\[
\eval{f(e_1,\ldots,e_n)} ~\To^*~ \{\ldots,\sigma \ans \pr{success|}\varphi(r),\ldots\}
\]
according to the operational semantics described in
Figures~\ref{figure-opsem} and~\ref{figure-condrules},
where $l \pr{~|\,} c \pr{~=\,} r$
is a rule for $f$,\footnote{An unconditional rule $l \pr{\,=\,} r$
is considered here as an abbreviation for
$l \pr{\,|\,success\,=\,} r$.} $\sigma$ does not bind any free variable
occurring in $f(e_1,\ldots,e_n)$, and this is the first
step in the derivation with this property, then
$\eval{choice(f(e_1,\ldots,e_n))} \To \{ id \ans \varphi(r)\}$.
Thus, if disjunctions occur due to the evaluation of a call
to a choice function, these disjunctions are not distributed
to the top-level but are kept inside the $choice$ expression.
If one rule for the choice function becomes applicable (without
binding of free variables in this call), all
other alternatives in the disjunction are discarded.
Hence, a call to a choice function corresponds to the committed choice (with
deep guards) of concurrent logic languages.
This can be implemented by evaluating this call (with a fair
strategy for alternatives) as usual but with the restriction that only
local variables can be bound.
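A rough Haskell model of this committed choice (all names hypothetical; each alternative of the disjunction is given together with a flag telling whether its guard bound goal-external variables, and the alternatives are inspected in order rather than fairly):

```haskell
-- An alternative of the disjunction: either its guard is solved
-- (with a flag recording whether global variables were bound), or it failed.
data Alt r = Solved Bool r | FailedAlt deriving (Eq, Show)

-- Commit to the first alternative solved without binding global variables;
-- all other alternatives are discarded.
commit :: [Alt r] -> Maybe r
commit alts = case [ r | Solved False r <- alts ] of
  (r : _) -> Just r
  []      -> Nothing  -- no applicable alternative: fail or suspend
```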
\subsection{Equational Constraints}
\label{sec-equationsolving}
An equational constraint\index{constraint}\index{constraint!equational}\index{equational constraint}\pindex{=:=}
\pr{$e_1$=:=$e_2$} is solvable if both
sides are reducible to a unifiable data term
(\emph{strict equality}\index{strict equality}\index{equality!strict}).
An equational constraint can be solved in an incremental
way by an interleaved lazy evaluation of the expressions
and binding of variables to constructor terms.
The evaluation steps implementing this method
are shown in Figure~\ref{figure-eqsolve}, where the
symbol \pr{=:=} is considered as a binary infix function symbol.
The last two rules implement an occur check,
where the \emph{critical variables} ($cv$) are only those variables
occurring outside a function call:
\begin{eqnarray*}
cv(x) & = & \{x\} \\
cv(c(e_1,\ldots,e_n)) & = & cv(e_1) \cup \ldots \cup cv(e_n) \\
cv(f(e_1,\ldots,e_n)) & = & \emptyset
\end{eqnarray*}
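The definition of $cv$ translates directly into Haskell; the term representation and the convention that constructor symbols start with an upper-case letter are hypothetical simplifications:

```haskell
import Data.Char (isUpper)

data Term = Var String | App String [Term] deriving (Eq, Show)

-- Hypothetical convention: symbols starting with an upper-case letter are
-- data constructors, all other symbols are defined functions.
isConstructor :: String -> Bool
isConstructor (c : _) = isUpper c
isConstructor _       = False

-- Critical variables: variables occurring outside any function call.
cv :: Term -> [String]
cv (Var x) = [x]
cv (App f ts)
  | isConstructor f = concatMap cv ts  -- descend through constructor terms
  | otherwise       = []               -- stop at function calls
```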
However, Curry implementations can also provide other methods
to solve equational constraints, or other types of constraints
with appropriate constraint solvers. The only property
important for the operational semantics is that
a solved constraint has the form \pr{success}, i.e.,
a solvable constraint $c$ is reducible to the answer expression
$\sigma \ans \pr{success}$, where $\sigma$ is a solution of $c$.
\begin{figure*}[t]
\begin{center}
\fbox{
$
\begin{array}{l@{}}
\infrule{\eval{e_i} \To \dexp}
{\eval{e_1\pr{=:=}e_2} \To replace(e_1\pr{=:=}e_2,i,\dexp)}
\mbox{~~if $e_i = f(t_1,\ldots,t_n)$, $i \in \{1,2\}$}\\[3ex]
\infrule{}{\eval{c(e_1,\ldots,e_n)\pr{=:=}c(e'_1,\ldots,e'_n)} \To
\{id \ans e_1\pr{=:=}e'_1 \cconj \ldots \cconj e_n\pr{=:=}e'_n\}}\\[3ex]
\infrule{}{\eval{c(e_1,\ldots,e_n)\pr{=:=}d(e'_1,\ldots,e'_m)} \To \emptyset}
\mbox{~~if $c \neq d$ or $n \neq m$}\\[3ex]
\infrule{\eval{x\pr{=:=}e} \To \dexp}{\eval{e\pr{=:=}x} \To \dexp}
\mbox{~~if $e$ is not a variable}\\[3ex]
\infrule{}{\eval{x\pr{=:=}y} \To \{\{x \mapsto y\}\ans \pr{success}\}}\\[3ex]
\infrule{}{\eval{x\pr{=:=}c(e_1,\ldots,e_n)} \To
\{\sigma \ans y_1\pr{=:=}\sigma(e_1) \cconj\ldots\cconj y_n\pr{=:=}\sigma(e_n)\}}
~~\parbox{4.2cm}{if $x \not\in cv(c(e_1,\ldots,e_n))$,\\
$\sigma = \{x \mapsto c(y_1,\ldots,y_n)\}$,
$y_1,\ldots,y_n$ fresh variables}\\[3ex]
\infrule{}{\eval{x\pr{=:=}c(e_1,\ldots,e_n)} \To \emptyset}
\mbox{~~if $x \in cv(c(e_1,\ldots,e_n))$}\\[3ex]
\end{array}
$
} % fbox
\end{center}\vspace{-3ex}
\caption{Solving equational constraints\label{figure-eqsolve}}
\end{figure*}
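To make the rules of Figure~\ref{figure-eqsolve} concrete, here is a Haskell sketch restricted to equations between data terms (a hypothetical simplification: the lazy evaluation of function calls, the interleaving with other threads, and suspension are omitted, and only a top-level occur check per binding step is performed; constructor symbols are assumed to start with an upper-case letter):

```haskell
import Data.Char (isUpper)

data Term = Var String | App String [Term] deriving (Eq, Show)
type Subst = [(String, Term)]

isCons :: String -> Bool
isCons (c : _) = isUpper c
isCons _       = False

-- Critical variables: variables occurring outside any function call.
cv :: Term -> [String]
cv (Var x) = [x]
cv (App f ts) | isCons f  = concatMap cv ts
              | otherwise = []

-- Solve a strict equation between data terms following Figure eqsolve:
-- decompose equal constructors, fail on clashes, and bind variables with
-- an occur check.  Nothing models the empty disjunction (failure).
solve :: Term -> Term -> Maybe Subst
solve t0 u0 = go [(t0, u0)] []
 where
  go [] s = Just s
  go ((a, b) : eqs) s = case (walk s a, walk s b) of
    (Var x, Var y) | x == y -> go eqs s
    (Var x, u) | x `notElem` cv u -> go eqs ((x, u) : s)  -- bind x
               | otherwise        -> Nothing              -- occur check
    (t, v@(Var _)) -> go ((v, t) : eqs) s                 -- swap sides
    (App c as, App d bs)
      | c == d && length as == length bs -> go (zip as bs ++ eqs) s
      | otherwise                        -> Nothing       -- clash
  walk s (Var x) = maybe (Var x) (walk s) (lookup x s)
  walk _ t       = t
```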
\subsection{Higher-Order Features}
\label{app-higher-order}
Warren \cite{Warren82} has shown for the case of logic programming
that the higher-order features
of functional programming can be implemented by providing a (first-order)
definition of the application function. Since Curry supports
the higher-order features of current functional languages but
excludes the guessing of higher-order objects by
higher-order unification
(as, e.g., in \cite{HanusPrehofer96RTA,NadathurMiller88,Prehofer95Diss}),
the operational semantics specified in Figure~\ref{figure-opsem}
can be simply extended to cover Curry's higher-order features
by adding the following axiom (here we denote by
the infix operator
``\pr{@}''\index{application operator}\index{function application}
the application of a function to an expression):
\[
\eval{\varphi(e_1,\ldots,e_m)\pr{@}e} \To \varphi(e_1,\ldots,e_m,e)
\hspace{3ex}\mbox{if $\varphi$ has arity $n$ and $m<n$}
\]
Lambda abstractions occurring in expressions can be eliminated by
lambda lifting, i.e., by replacing them with (partial) applications
of new top-level functions. For instance, the lambda abstraction
\startprog
\ttbs{}x y -> f x y + 2
\stopprog
can be replaced by the (partial) application \pr{(lambda f)},
where \pr{lambda} is a new function name defined by the rule
\startprog
lambda f x y = f x y + 2
\stopprog
Note that this transformation (as well as the extension of $\eval{\cdot}$
above)
is only used to explain the operational meaning of Curry's higher-order
features. A concrete Curry system is free to choose other implementation
techniques.
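The first-order treatment of application in the style of Warren can be illustrated by a small Haskell sketch (entirely hypothetical: only a single binary operation is encoded, and a partial application is represented as a data term collecting the arguments supplied so far):

```haskell
-- A partial application of a (here: single, binary) function, as data.
data FunApp = Add [Int] deriving (Eq, Show)

arity :: FunApp -> Int
arity (Add _) = 2

-- Full application: all arguments are present.
call :: FunApp -> Int
call (Add [x, y]) = x + y
call _            = error "call: not fully applied"

-- phi(e1,...,em) @ e  =>  phi(e1,...,em,e): either still a partial
-- application (m+1 < n) or a fully applied call that can be evaluated.
apply :: FunApp -> Int -> Either FunApp Int
apply f@(Add args) e
  | length args + 1 < arity f = Left (Add (args ++ [e]))
  | otherwise                 = Right (call (Add (args ++ [e])))
```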
\subsection{Generating Definitional Trees}
\label{app-gentrees}
Curry's computation model is based on the specification of a definitional
tree for each operation. Although definitional trees are a
high-level formalism to specify evaluation strategies,
it is tedious to annotate each operation with its
definitional tree. Therefore, the user need not write them explicitly
in the program since they are automatically inserted by the Curry system.
In the following, we present
Curry's algorithm to generate definitional trees from the left-hand sides
of the functions' rules.
The generation of definitional trees for each function
is not straightforward, since there may exist many non-isomorphic
definitional trees for a single function representing different
evaluation strategies. This calls for a default strategy
to generate definitional trees. Curry uses the following default strategy:
\begin{enumerate}
\item Pattern-matching is performed from left to right.
\item $or$ nodes are generated in case of a conflict
between constructors and variables, i.e., if two rules
have a variable and a constructor at the same position on the left-hand side.
\end{enumerate}
This default strategy is reasonable since it does not generate any
$or$ node for most typical functional programs and is also used
in current functional languages \cite{Wadler87}.
However, there are programs where this default does not generate
the best possible results (see below).
In the following, we assume that all rules are unconditional
(the extension to conditional rules is straightforward since only the
left-hand sides of the rules are relevant for the definitional trees).
To specify the construction algorithm, we define by
\[
DP(\pi,R) = \{ o \mbox{ position of a variable in } \pi \mid
root(l|_o) \in \Cc \mbox{ for some } l \pr{\,=\,} r \in R \}
\]
the set of \emph{demanded positions} of a pattern $\pi$ w.r.t.\ a set of
rules $R$. For instance, the demanded positions of the pattern
\pr{leq(x,y)} w.r.t.\ the rules for the predicate \pr{leq}
(see Section~\ref{example-leq}, page~\pageref{example-leq})
are $\{1,2\}$ referring to the pattern variables \pr{x} and \pr{y}.
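The set of demanded positions can be computed as follows in Haskell (hypothetical encoding: positions are paths of 1-based argument indices, and constructor symbols start with an upper-case letter):

```haskell
import Data.Char (isUpper)
import Data.List (nub)

data Term = Var String | App String [Term] deriving (Eq, Show)

isCons :: String -> Bool
isCons (c : _) = isUpper c
isCons _       = False

-- All variable positions of a pattern (paths of 1-based argument indices).
varPositions :: Term -> [[Int]]
varPositions (Var _)    = [[]]
varPositions (App _ ts) =
  [ i : p | (i, t) <- zip [1 ..] ts, p <- varPositions t ]

-- Subterm at a position (positions are assumed to exist).
at :: Term -> [Int] -> Term
at t []               = t
at (App _ ts) (i : p) = at (ts !! (i - 1)) p
at t _                = t

-- DP(pi, R): variable positions of the pattern where some left-hand side
-- in R has a constructor at the root.
dp :: Term -> [Term] -> [[Int]]
dp pat lhss =
  nub [ o | o <- varPositions pat
          , l <- lhss
          , App c _ <- [at l o], isCons c ]
```

For instance, applied to the pattern \pr{leq(x,y)} and the left-hand sides of the \pr{leq} rules, \pr{dp} yields the positions of the pattern variables \pr{x} and \pr{y}.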
The \emph{g}eneration of a definitional \emph{t}ree for a pattern $\pi$
and a set of rules $R$ (where $l$ is an instance of $\pi$ for each
$l \pr{\,=\,} r \in R$) is described by the function $gt(\pi,m,R)$
($m \in \{flex,rigid\}$ determines the mode annotation in the generated
$branch$ nodes).
We distinguish the following cases for $gt$:
\begin{enumerate}
\item
If the position $o$ is leftmost in $DP(\pi,R)$,
$\{ root(l|_o) \mid l \pr{\,=\,} r \in R \} = \{c_1,\ldots,c_k \}$
where $c_1,\ldots,c_k$ are different constructors with arities
$n_1,\ldots,n_k$,
and $R_i = \{ l \pr{\,=\,} r \in R \mid root(l|_o) = c_i \}$, then
\[
\begin{array}{l@{\,}l}
gt(\pi,m,R) = branch(\pi,o,m,
& gt(\pi[c_1(x_{11},\ldots,x_{1n_1})]_o,m,R_1),\\
& \ldots,\\
& gt(\pi[c_k(x_{k1},\ldots,x_{kn_k})]_o,m,R_k))
\end{array}
\]
where $x_{ij}$ are fresh variables.
I.e., if all rules have a constructor at the leftmost demanded position,
we generate a $branch$ node.
\item
If the position $o$ is leftmost in $DP(\pi,R)$ and
$R' = \{ l \pr{\,=\,} r \in R \mid root(l|_o) \in \Cc \} \neq R$, then
\[
gt(\pi,m,R) = or(gt(\pi,m,R'),gt(\pi,m,R-R'))
\]
I.e., we generate an $or$ node if the leftmost demanded position
of the pattern is not demanded by the left-hand sides of all rules.
\item
If $DP(\pi,R) = \emptyset$ and $l \pr{\,=\,} r$ is a variant of some rule in $R$
with $l = \pi$, then
\[
gt(\pi,m,R) = rule(l \pr{\,=\,} r)
\]
Note that all rules in $R$ are variants of each other if there is no demanded
position (this follows from the weak orthogonality of the rewrite system).
For non-weakly orthogonal rewrite systems, which may occur in the presence
of non-deterministic functions \cite{GonzalesEtAl96ESOP} or conditional rules,
the rules in $R$ may not be variants. In this case we simply connect
the different rules by $or$ nodes.
\end{enumerate}
If $R$ is the set of all rules defining the $n$-ary function $f$,
then a definitional tree for $f$ is generated by computing
$gt(f(x_1,\ldots,x_n),m,R)$, where $m=flex$ if the type of $f$ has the
form \pr{$\cdots$->Constraint}, i.e., if $f$ denotes a constraint,
and $m=rigid$ otherwise (this default can be changed by the pragmas
``\pr{flex}''\pindex{flex} and ``\pr{rigid}''\pindex{rigid},
see Section~\ref{sec-pragmas}).
It is easy to see that this algorithm
computes a definitional tree for each
function since the number of rules is reduced in each recursive call
and it keeps the invariant that the left-hand sides of the current set
of rules are always instances of the current pattern.
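The three cases of $gt$ can be turned into an executable Haskell sketch (hypothetical simplifications: rules are pairs of terms, constructor symbols start with an upper-case letter, the rule set is assumed non-empty, and fresh variable names are derived from positions):

```haskell
import Data.Char (isUpper)
import Data.List (nub, partition)

data Term  = Var String | App String [Term] deriving (Eq, Show)
data Mode  = Flex | Rigid deriving (Eq, Show)
data DTree = Rule Term Term
           | Or DTree DTree
           | Branch Term [Int] Mode [DTree]
  deriving Show

isCons :: String -> Bool
isCons (c : _) = isUpper c
isCons _       = False

consRooted :: Term -> Bool
consRooted (App c _) = isCons c
consRooted _         = False

at :: Term -> [Int] -> Term                 -- subterm at a position
at t []               = t
at (App _ ts) (i : p) = at (ts !! (i - 1)) p
at t _                = t

put :: Term -> [Int] -> Term -> Term        -- pi[t]_o
put _ [] t' = t'
put (App f ts) (i : p) t' =
  App f [ if j == i then put u p t' else u | (j, u) <- zip [1 ..] ts ]
put t _ _ = t

varPositions :: Term -> [[Int]]             -- leftmost-first variable positions
varPositions (Var _)    = [[]]
varPositions (App _ ts) = [ i : p | (i, t) <- zip [1 ..] ts, p <- varPositions t ]

dp :: Term -> [(Term, Term)] -> [[Int]]     -- demanded positions of the pattern
dp pat rules = [ o | o <- varPositions pat, any (consRooted . (`at` o) . fst) rules ]

gt :: Term -> Mode -> [(Term, Term)] -> DTree
gt pat m rules = case dp pat rules of
  []      -> foldr1 Or [ Rule l r | (l, r) <- rules ]          -- case 3
  (o : _) ->
    let (consRules, varRules) = partition (consRooted . (`at` o) . fst) rules
        alts = nub [ (c, length as) | (l, _) <- consRules, App c as <- [at l o] ]
        fresh n = [ Var ("_" ++ concatMap show o ++ "_" ++ show j) | j <- [1 .. n] ]
        subtree (c, n) =
          gt (put pat o (App c (fresh n))) m
             [ (l, r) | (l, r) <- consRules, App c' _ <- [at l o], c' == c ]
    in if null varRules
         then Branch pat o m (map subtree alts)                -- case 1
         else Or (gt pat m consRules) (gt pat m varRules)      -- case 2
```

Applied to the two rules for \pr{f} discussed below, this sketch reproduces the $or$ tree with two nested $branch$ nodes shown there.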
The algorithm $gt$ conforms to the evaluation strategy
of functional languages like Haskell or Miranda, since it generates
for the rules
\startprog
f 0 0 = 0
f x 1 = 0
\stopprog
the definitional tree
\[
\begin{array}{l@{}l}
or( & branch(\pr{f(x1,x2)},1,rigid,branch(\pr{f(0,x2)},2,rigid,rule(\pr{f(0,0) = 0}))), \\
& branch(\pr{f(x1,x2)},2,rigid,rule(\pr{f(x1,1) = 0})))
\end{array}
\]
This tree is not optimal
since it has a non-deterministic $or$ node and always requires
the evaluation of the first argument (in the first alternative).
We can avoid the generation of $or$ nodes for this function
if we relax the strict left-to-right evaluation
strategy and select in case~1 of the definition of $gt$
the leftmost position among all those demanded positions
where \emph{all} rules have a constructor in the left-hand side
at this position. This algorithm, which is used if the pragma
``\pr{optmatch}''\pindex{optmatch} is chosen (see Section~\ref{sec-pragmas}),
generates the following definitional tree:
\[
\begin{array}{l@{\,}l}
branch(\pr{f(x1,x2)},2,rigid, & branch(\pr{f(x1,0)},1,rigid,rule(\pr{f(0,0) = 0})), \\
& rule(\pr{f(x1,1) = 0}))
\end{array}
\]
In general, the ``\pr{optmatch}'' algorithm generates more powerful
evaluation strategies than $gt$ since it is ensured that,
for ground expressions, normal forms are always computed (if they exist)
provided that the definitional trees contain only
$branch$ and $rule$ nodes \cite{Antoy92ALP}.
If the function definitions are \emph{uniform} in the sense of
\cite{Wadler87}, then $gt$ and \pr{optmatch} generate identical
definitional trees.
\subsection{Encapsulated Search}
\label{sec-opsemtry}
\begin{figure}[t]
\begin{center}
\fbox{
$
\begin{array}{l}
\eval{\pr{try($g$)}} \To \\[0.5ex]
\hspace{2ex} \cases{
\pr{[]} & if $\eval{c} \To \emptyset$ \vspace{1ex} \cr
\pr{[$g'$]} & if $\eval{c} \To \{\sigma\ans \pr{success}\}$
(i.e., $\sigma$ is a mgu for all equations in $c$) with \cr
& $\Dom(\sigma) \subseteq \{x,x_1,\ldots,x_n\}$ (or $c=\pr{success}$
and $\sigma=id$) and \cr
& $g'= \pr{\ttbs$x$->let $x_1,\ldots,x_n$ free in $\sigma\circ\varphi\ans
\pr{success}$}$
\vspace{1ex}\cr
%
\dexp & if $\eval{c} \To \{\sigma\ans c'\}$
with $\Dom(\sigma) \subseteq \{x,x_1,\ldots,x_n\}$, \cr
& $g'= \pr{\ttbs$x$->let $x_1,\ldots,x_n,y_1,\ldots,y_m$ free in
$\sigma\circ\varphi\ans c'$}$ \cr
& where $\{y_1,\ldots,y_m\} = \VRan(\sigma)\setminus(\{x,x_1,\ldots,x_n\}\cup free(g))$, \cr
& and $\eval{\pr{try($g'$)}} \To \dexp$
\vspace{1ex} \cr
%
\pr{[$g_1$,\ldots,$g_k$]}
& if $\eval{c} \To \{\sigma_1\ans c_1,\ldots,
\sigma_k\ans c_k\}$, $k>1$, and, for $i=1,\ldots,k$, \cr
& $\Dom(\sigma_i) \subseteq \{x,x_1,\ldots,x_n\}$ and \cr
& $g_i = \pr{\ttbs$x$->let $x_1,\ldots,x_n,y_1,\ldots,y_{m_i}$ free in
$\sigma_i\circ\varphi\ans c_i$}$ \cr
& where $\{y_1,\ldots,y_{m_i}\} = \VRan(\sigma_i)\setminus(\{x,x_1,\ldots,x_n\}\cup
free(g))$
}
\end{array}
$
} % fbox
\end{center}\vspace{-3ex}
\caption{Operational semantics of the \pr{try} operator for
$g = \pr{\ttbs$x$->let $x_1,\ldots,x_n$ free in $\varphi\protect\ans c$}$}
\label{fig:try-opsem}
\end{figure}
The exact behavior of the \pr{try} operator is specified in
Figure~\ref{fig:try-opsem}. Note that a substitution $\varphi$ of the form
$\{x_1\mapsto t_1,\ldots,x_n\mapsto t_n\}$, computed by evaluating the
body of a search goal, must be encoded as a constraint in the new search goal.
Therefore, $\varphi$ is transformed into its \emph{equational representation}
$\overline{\varphi}$, where $\overline{\varphi}$ is a solved constraint
of the form \pr{$x_1$=:=$t_1$\,\&\ldots\&\,$x_n$=:=$t_n$}.
Hence, a search goal can always be written as
\pr{try \ttbs$x$->$\overline{\varphi}\cconj c$}.
Since the solved constraint $\overline{\varphi}$
does not have to be evaluated again, we use the sugared form
\pr{try \ttbs$x$->$\varphi \ans c$} where only $c$ is further evaluated.
Thus, an initial search goal \pr{try \ttbs$x$->$c$}
is equivalent to \pr{try \ttbs$x$->$id \ans c$}.
A search goal is solved (second case) if the constraint is solvable
without bindings of global variables. In a deterministic step (third
case), we apply the \pr{try} operator again after adding the newly
introduced variables to the list of local variables. Note that the
\emph{free variables} occurring in $g$, denoted by $free(g)$, must not
be declared locally because they can also appear outside of $g$, and
therefore they have to be removed from $\VRan(\sigma)$. In a
non-deterministic step (fourth case), we return the different
alternatives as the result. Note that the evaluation of the \pr{try}
operator suspends if a computation step on the constraint attempts to
bind a free variable. In order to ensure that an encapsulated search
will not be suspended due to necessary bindings of free variables, the
search goal should be a closed expression when a search operator is
applied to it, i.e., the search variable is bound by the lambda
abstraction and all other free variables are existentially quantified
by local declarations.
If one wants to use free variables in a search goal, e.g., for synchronization
in systems where concurrency is intensively used, the goal should perform only
deterministic steps until all free variables have been bound by another part of
the computation. Otherwise, a non-deterministic step could unnecessarily split
the search goal. For instance, consider the expression
\startprog try \ttbs{}x -> y=:=[x] \& append x [0,1] =:= [0,1]
\stopprog
The computation of \pr{y=:=[x]}
suspends because \pr{y} is free. Therefore, a concurrent computation of the
conjunction will split the goal by reducing the \pr{append} expression.
However, if \pr{y} is first bound to \pr{[[]]} by another thread of the
computation, the search goal could proceed deterministically.
To avoid such unnecessary splittings, we can restrict the
applicability of the split case and adopt a solution known as
\emph{stability} from AKL \cite{JansonHaridi91} and Oz
\cite{SchulteSmolka94}. For this purpose, we can slightly change the
definition of \pr{try} such that non-deterministic steps lead to a
suspension as long as a deterministic step might be enabled by another
part of the computation, i.e., the search goal contains free global
variables. To do so, we have to replace only the fourth rule of the
\pr{try} operator:
\[
\begin{array}{ll}
\eval{\pr{try($g$)}} \To
\pr{[$g_1$,\ldots,$g_k$]}
& \mbox{if }\eval{c} \To \{\sigma_1\ans c_1,\ldots,
\sigma_k\ans c_k\},\ k>1, \cr
&\eval{c} \not\To \{\sigma\ans c'\}, free(c)\subseteq
\{x,x_1,\ldots,x_n\}\cr
&\mbox{and, for }i=1,\ldots,k, \cr
& g_i = \pr{\ttbs$x$->let $x_1,\ldots,x_n,y_1,\ldots,y_{m_i}$ free in
$\sigma_i\circ\varphi\ans c_i$} \cr
& \mbox{where }y_1,\ldots,y_{m_i} = \VRan(\sigma_i)\setminus(\{x,x_1,\ldots,x_n\}\cup
free(g))
\end{array}
\]
%
Thus, a non-deterministic step is only performed if no deterministic
step is possible and the unsolved constraint $c$ contains no free
variables except those locally declared in the search goal. Global
variables appearing in the data terms $t_i$ of the substitution
$\varphi=\{x_1\mapsto t_1,\ldots,x_n\mapsto t_n\}$ but not in $c$ are
not considered by these conditions because they cannot influence the
further evaluation of $c$. Note that the former condition
$\Dom(\sigma_i) \subseteq \{x,x_1,\ldots,x_n\}$ is covered by the new
condition $free(c) \subseteq \{x,x_1,\ldots,x_n\}$ because
$\Dom(\sigma_i)$ is always a subset of $free(c)$.
Note that the first definition of \pr{try} in
Figure~\ref{fig:try-opsem} computes non-deterministic splittings
where the latter definition suspends. On the other hand,
the latter definition can avoid some unnecessary splittings.
Thus, different implementations of Curry can support
one or both of these definitions.
\newpage
\addcontentsline{toc}{section}{Bibliography}
\bibliography{mh}
\bibliographystyle{mhnumber}
\newpage
\addcontentsline{toc}{section}{Index}
\printindex
\end{document}