% $Id$
% $Log$

%\documentstyle[makeidx,11pt,fleqn]{article}
\documentclass[11pt,fleqn]{article}

\usepackage{latexsym}
\usepackage{makeidx}
\usepackage{amssymb}

\setlength{\textwidth}{16.5cm}
\setlength{\textheight}{23cm}
\renewcommand{\baselinestretch}{1.1}
\setlength{\topmargin}{-1cm}
\setlength{\oddsidemargin}{0cm}
\setlength{\evensidemargin}{0cm}
\setlength{\marginparwidth}{0.0cm}
\setlength{\marginparsep}{0.0cm}

\newlength{\figurewidth}
\setlength{\figurewidth}{\textwidth}
\addtolength{\figurewidth}{-0.4cm}

% environment for typing program texts:
\makeatletter
\newenvironment{prog}{\par\vspace{1.5ex}
\setlength{\parindent}{1.0cm}
\setlength{\parskip}{-0.1ex}
\obeylines\@vobeyspaces\tt}{\vspace{1.5ex}\noindent
}
\makeatother
\newcommand{\startprog}{\begin{prog}}
\newcommand{\stopprog}{\end{prog}\noindent}
\newcommand{\pr}[1]{\mbox{\tt #1}}   % program text in normal text

\newcommand{\Ac}{{\cal{A}}}
\newcommand{\Cc}{{\cal{C}}}
\newcommand{\Dc}{{\cal{D}}}
\newcommand{\Fc}{{\cal{F}}}
\newcommand{\Tc}{{\cal{T}}}
\newcommand{\Xc}{{\cal{X}}}
\newcommand{\var}{{\cal V}ar}
\newcommand{\Dom}{{\cal D}om}
\newcommand{\VRan}{{\cal VR}an}
\renewcommand{\emptyset}{\varnothing}

\newcommand{\ans}{\mathbin{\framebox[1mm]{\rule{0cm}{1.2mm}}}}
%\newcommand{\ans}{\;}
\newcommand{\todo}[1]{\fbox{\sc To do: #1}}
\newcommand{\dexp}{D}  % a disjunctive expression
\newcommand{\cconj}{\ensuremath \mathop{\pr{\&}}} % concurrent conj. in math
\newcommand{\ttbs}{\mbox{\tt\char92}} % backslash in tt font
\newcommand{\ttus}{\mbox{\tt\char95}} % underscore in tt font
\newcommand{\sem}[1]{\ensuremath [\![#1]\!]} % double square brackets
\newcommand{\eval}[1]{{\cal E}\!val\sem{#1}}
\newcommand{\To}{\Rightarrow}
\newcommand{\infrule}[2]{\begin{array}{@{}c@{}} #1 \\ \hline #2 \end{array}}

\newcommand{\pindex}[1]{\index{#1@{\tt #1}}}  % program elements in index
\newcommand{\comment}[1]{} % ignore the argument
\catcode`\_=\active
\let_=\sb
\catcode`\_=12

% commands for the syntax:
\newcommand{\term}[1]{{\tt #1}}
\newcommand{\opt}[1]{\rm [\it #1{\rm ]}}
\newcommand{\offside}[3]{\term{\{}\seq{#1}{#2}{#3} \term{\}}}
\newcommand{\seq}[3]{{\it #1$_1$ \term{#2} $\ldots$ \term{#2} #1$_#3$}}
\newcommand{\head}[1]{\makebox[1.3in][r]{\it #1}}
\newcommand{\por}{\\\head{}\makebox[3em][c]{$|$}}
\newcommand{\next}{\\\head{}\makebox[3em]{}}
\newcommand{\marg}[1]{\hspace{\fill}$(#1)$}
\newcommand{\lexicon}[1]{\production #1 {\rm see lexicon} end}
\def\production #1 #2 end {\head{#1}\makebox[3em]{\rm ::=}\it #2}

\makeindex

\begin{document}
\sloppy

\begin{titlepage}
\begin{center}
\fbox{
\begin{minipage}[t]{\figurewidth}
\begin{center}\vspace{10ex}
{\Huge\bf Curry}\\[4ex]
{\LARGE\bf An Integrated Functional Logic Language}\\[5ex]
{\large\bf Version 0.5.1}\\[1ex]
{\large \today}\\[8ex]
\Large
Michael Hanus$^1$ [editor] \\[3ex]
{\large Additional Contributors:}\\[2ex]
Sergio Antoy$^2$ \\
Herbert Kuchen$^3$ \\
Francisco J. L\'opez-Fraguas$^4$ \\
Juan Jos\'e Moreno Navarro$^5$ \\
Frank Steiner$^6$ \\[20ex]
\normalsize
(1) RWTH Aachen, Germany, {\tt hanus@informatik.rwth-aachen.de} \\
(2) Portland State University, USA, {\tt antoy@cs.pdx.edu} \\
(3) University of M\"unster, Germany, {\tt kuchen@uni-muenster.de} \\
(4) Universidad Complutense de Madrid, Spain, {\tt fraguas@dia.ucm.es} \\
(5) Universidad Polit\'ecnica de Madrid, Spain, {\tt jjmoreno@fi.upm.es} \\
(6) RWTH Aachen, Germany, {\tt steiner@i2.informatik.rwth-aachen.de} \\[5ex]~
\end{center}
\end{minipage}}
\end{center}
\end{titlepage}
\tableofcontents
\newpage


\section{Introduction}

Curry is a universal programming language aiming at the amalgamation
of the most important declarative programming paradigms,
namely functional programming and logic programming.  
Curry combines in a seamless way features from functional programming
(nested expressions, lazy evaluation, higher-order functions),
logic programming (logical variables, partial data structures,
built-in search), and concurrent programming (concurrent evaluation
of constraints with synchronization on logical variables).
Moreover, Curry provides additional features in
comparison to the pure languages (compared to functional programming:
search, computing with partial information; compared to logic
programming: more efficient evaluation due to the deterministic
evaluation of functions).
Furthermore, it amalgamates the most
important operational principles developed in the area of integrated
functional logic languages: ``residuation''\index{residuation}
and ``narrowing''\index{narrowing} (see
\cite{Hanus94JLP} for a survey on functional logic programming).

The development of Curry is an international initiative intended to
provide a common platform for the research, teaching\footnote{%
Actually, Curry has been successfully applied to teach functional and
logic programming techniques in a single course without switching
between different programming languages. More details about
this aspect can be found in \cite{Hanus97DPLE}.}
and application
of integrated functional logic languages.
This document describes the features of Curry, its syntax and
operational semantics.


\section{Programs}

A Curry program specifies the semantics of expressions, where
goals, which occur in logic programming, are particular expressions
(of type \pr{Constraint}).
Executing a Curry program means simplifying an expression until
a value (or solution) is computed.
To distinguish between values and reducible expressions,
Curry has a strict distinction between (data)
\emph{constructors}\index{constructor} and
\emph{operations}\index{operation} or
\emph{defined functions}\index{defined function}\index{function!defined}
on these data.
Hence, a Curry program consists of a set of type and function declarations.
The type declarations define the computational domains (constructors)
and the function declarations the operations on these domains.
Predicates in the logic programming sense can be considered
as constraints, i.e., functions with result type \pr{Constraint}.

Modern functional languages (e.g., Haskell \cite{HudakPeytonJonesWadler92},
SML \cite{MilnerTofteHarper90}) allow the detection of many programming
errors at compile time by the use of polymorphic type systems.
Similar type systems are also used in modern logic languages
(e.g., G\"odel \cite{HillLloyd94}, $\lambda$Prolog \cite{NadathurMiller88}).
Therefore, Curry is a strongly typed language with a
Hindley/Milner-like polymorphic type system \cite{DamasMilner82}.\footnote{%
The extension of this type system to Haskell's type classes
is not included in the kernel language but will be considered
in a future version.}
Each object in a program has a unique type, where
the types of variables and operations can be omitted and are
reconstructed by a type inference mechanism.
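For instance, if the type declaration of a function like
\startprog
first (x,y) = x
\stopprog
is omitted, its most general polymorphic type \pr{(a,b) -> a} is
automatically reconstructed by the type inference mechanism
(the name \pr{first} is only used for illustration here).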


\subsection{Datatype Declarations}
\label{sec-datatypes}

A datatype declaration\index{type declaration}\index{datatype!declaration}
\index{declaration!datatype} has the form
\startprog
data $T$ $\alpha_1$ \ldots $\alpha_n$ = $C_1$ $\tau_{11}$ \ldots $\tau_{1n_1}$ | $\cdots$ | $C_k$ $\tau_{k1}$ \ldots $\tau_{kn_k}$
\stopprog
and introduces a new $n$-ary \emph{type constructor}\index{type constructor}
$T$ and $k$
new (data) \emph{constructors} $C_1,\ldots,C_k$, where each $C_i$ has the type
\startprog
$\tau_{i1}$ -> $\cdots$ -> $\tau_{in_i}$ -> $T~\alpha_1\ldots\alpha_n$
\stopprog
($i=1,\ldots,k$). Each $\tau_{ij}$ is a
\emph{type expression}\index{type expression} built from the
\emph{type variables}\index{type variable} $\alpha_1,\ldots,\alpha_n$
and some type constructors. Curry has a number of built-in type constructors,
like \pr{Bool}, \pr{Int}, \pr{->} (function space), or lists and tuples,
which are described in Section~\ref{sec-builtin-types}.
Since Curry is a higher-order language,
the types of functions (i.e., constructors or operations)
are written in their curried form
\pr{$\tau_1$ -> $\tau_2$ -> $\cdots$ -> $\tau_n$ -> $\tau$}
where $\tau$ is not a functional type.
In this case, $n$ is called the \emph{arity}\index{arity}\index{function!arity}
of the function. 
For instance, the datatype declarations
\startprog
data Bool = True | False
data List a = [] | a : List a
data Tree a = Leaf a | Node (Tree a) a (Tree a)
\stopprog
introduce the datatype \pr{Bool} with the 0-ary constructors
(\emph{constants})\index{constant} \pr{True} and \pr{False},
and the polymorphic types \pr{List a} and \pr{Tree a} of lists
and binary trees. Here, ``\pr{:}'' is an infix operator, i.e.,
``\pr{a:List a}'' is another notation for ``\pr{(:) a (List a)}''.
Lists are predefined in Curry, where the notation ``\pr{[a]}''
is used to denote list types (instead of ``\pr{List a}'').
The usual convenient notations for lists are supported,
i.e., \pr{[0,1,2]} is an abbreviation for
\pr{0:(1:(2:[]))} (see also Section~\ref{sec-builtin-types}).

A \emph{data term}\index{data term} is a variable $x$ or a
constructor application
$c~t_1\ldots t_n$ where $c$ is an $n$-ary constructor and
$t_1,\ldots,t_n$ are data terms. An \emph{expression}\index{expression}
is a variable or a (partial) application $\varphi~e_1 \ldots e_m$
where $\varphi$ is an $n$-ary function or
constructor, $m\leq n$, and $e_1,\ldots,e_m$ are expressions.
A data term or expression is called \emph{ground}\index{ground expression}
\index{expression!ground}\index{ground term}\index{data term!ground}
if it does not contain any variable.
Ground data terms correspond to values in the intended domain,
and expressions containing operations
should be evaluated to data terms.
Note that traditional functional languages compute on ground expressions,
whereas logic languages also allow non-ground expressions.
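For instance, w.r.t.\ the datatype declarations above,
\startprog
Node (Leaf 1) 2 (Leaf 3)
\stopprog
is a ground data term of type \pr{Tree Int}, \pr{Leaf x} is a non-ground
data term, and \pr{Leaf (1+2)} is an expression containing the predefined
operation \pr{+}; its evaluation yields the data term \pr{Leaf 3}.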


\subsection{Function Declarations}
\label{sec-funcdecl}

A function is defined by a type declaration (which can be omitted)
followed by a list of defining equations.
A \emph{type declaration}\index{type declaration}\index{function declaration}
\index{declaration!function} for an $n$-ary function has the form
\startprog
$f$ :: $\tau_1$ -> $\tau_2$ -> $\cdots$ -> $\tau_n$ -> $\tau$
\stopprog
where $\tau_1,\ldots,\tau_n,\tau$ are type expressions
and $\tau$ is not a functional type.
The simplest form of a
\emph{defining equation}\index{equation}\index{defining equation}
(or \emph{rule})\index{rule}\index{function rule}\pindex{=}
for an $n$-ary function $f$ ($n \geq 0$) is
\startprog
$f~t_1\ldots t_n$ = $e$
\stopprog
where $t_1,\ldots,t_n$ are data terms and the
\emph{right-hand side}\index{right-hand side} $e$ is an expression.
The \emph{left-hand side}\index{left-hand side}
$f~t_1\ldots t_n$ must not contain multiple occurrences of a variable.
Functions can also be defined by
\emph{conditional equations}\index{conditional equation}%
\index{equation!conditional}\index{conditional rule}\index{rule!conditional}
which have the form
\startprog
$f~t_1\ldots t_n$ | $c$ = $e$
\stopprog
where the \emph{condition}\index{condition} $c$ is a constraint
(cf.\ Section~\ref{sec-equality}). In order to apply a conditional equation,
its condition must be solved. Conditional equations with
identical left-hand sides can be also written as
\startprog
$f~t_1\ldots t_n$ | $c_1$ = $e_1$
          \,$\vdots$
          \,| $c_k$ = $e_k$
\stopprog
which is an abbreviation for the $k$ conditional equations
\startprog
$f~t_1\ldots t_n$ | $c_1$ = $e_1$
$\vdots$
$f~t_1\ldots t_n$ | $c_k$ = $e_k$
\stopprog
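For instance, using an equational constraint (cf.\ Section~\ref{sec-equality})
as condition, one could define a function \pr{single} which yields the
element of a one-element list:
\startprog
single (x:xs) | xs =:= []  = x
\stopprog
The rule is only applicable if the condition \pr{xs =:= []} is solvable,
i.e., if the argument is a list with exactly one element.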
%
Note that one can define functions with a non-deterministic
behavior by providing several rules with overlapping left-hand sides
or \emph{free variables}\index{free variable}\index{variable!free}
(i.e., variables which do not occur in the left-hand
side) in the conditions or right-hand sides of rules.
For instance, the following non-deterministic function inserts an element
at an arbitrary position in a list:
\startprog
insert x []     = [x]
insert x (y:ys) = x : y : ys
insert x (y:ys) = y : insert x ys
\stopprog
Such \emph{non-deterministic functions}\index{non-deterministic function}
\index{function!non-deterministic}
can be given a perfect declarative semantics \cite{GonzalesEtAl96ESOP}
and their implementation causes no overhead in Curry
since techniques for handling non-determinism are already contained in
the logic programming part (see also \cite{Antoy97ALP}).
However, deterministic functions are advantageous since
they provide for more efficient implementations (like the
\emph{dynamic cut} \cite{LoogenWinkler95}).
If one is interested only in defining deterministic functions,
this can be ensured by the following restrictions:
\begin{enumerate}
\item
Each variable occurring in the right-hand side of a rule
must also occur in the corresponding left-hand side.
\item
If \pr{~$f~t_1\ldots t_n$ | $c$ = $e$~} and
\pr{~$f~t'_1\ldots t'_n$ | $c'$ = $e'$~} are rules defining $f$ and
$\sigma$ is a \emph{substitution}\index{substitution}\footnote{%
A \emph{substitution} $\sigma$ is a mapping from variables into expressions
which is extended to a homomorphism on expressions by defining
$\sigma(f~t_1\ldots t_n) = f~\sigma(t_1)\ldots \sigma(t_n)$.
$\{x_1 \mapsto e_1,\ldots,x_k \mapsto e_k\}$ denotes the substitution
$\sigma$ with $\sigma(x_i)=e_i$ ($i=1,\ldots,k$) and
$\sigma(x)=x$ for all variables $x\not\in\{x_1,\ldots,x_k\}$.}
with $\sigma(t_1\ldots t_n)=\sigma(t'_1\ldots t'_n)$, then at least
one of the following conditions holds:
\begin{enumerate}
\item $\sigma(e)=\sigma(e')$ (\emph{compatibility of right-hand sides}).
\item $\sigma(c)$ and $\sigma(c')$ are not simultaneously satisfiable
(\emph{incompatibility of conditions}). A decidable approximation
of this condition can be found in \cite{KuchenEtAl96NGC}.
\end{enumerate}
\end{enumerate}
These conditions ensure the confluence\index{confluence}
of the rules if they are considered as a conditional term rewriting
system \cite{SuzukiMiddeldorpIda95RTA}.
Implementations of Curry may check these conditions and warn the
user if they are not satisfied. There are also more general
conditions to ensure confluence \cite{SuzukiMiddeldorpIda95RTA}
which can be checked instead of the above conditions.
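For instance, a parallel variant of the Boolean disjunction could be
defined by rules with overlapping left-hand sides (the name \pr{por} is
only used for illustration):
\startprog
por True  x     = True
por x     True  = True
por False False = False
\stopprog
Although the first two rules overlap (both are applicable to the call
\pr{por True True}), their right-hand sides are compatible so that
\pr{por} still denotes a deterministic function.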
% Note that we do not require
% the termination of the rewrite system so that the computation with
% infinite data structures is supported as in lazy functional languages.
% In contrast to functional languages,
% \emph{extra variables}\index{extra variables} in the conditions
% (i.e., variables which do not occur in the left-hand
% side) are allowed. These extra variables provide the power of
% logic programming since a search for appropriate values is necessary
% in order to apply a conditional rule with extra variables.

Note that defining \emph{equations of higher-type},
\index{equation!higher-order}\index{higher-order equation}
e.g., \pr{f = g} if \pr{f} and \pr{g} are of type \pr{Bool -> Bool},
are formally excluded in order to define a simple operational semantics based
on first-order rewriting.\footnote{Note that this is also necessary
in an extension of Curry which allows higher-order rewrite rules,
since rules with lambda-abstractions in left-hand sides which are
not of base type may cause a gap between
the standard notion of higher-order rewriting and the corresponding equational
theory \cite{Nipkow91LICS}.}
For convenience, a defining equation \pr{f = g}
between functions is allowed but will be interpreted in Curry
as syntactic sugar for the corresponding defining equation \pr{f x = g x}
on base types.


\subsubsection{Functions vs.\ Variables}
\label{sec-variable-sharing}

In lazy functional languages, different occurrences of the same
variable are shared to avoid multiple evaluations of
identical expressions. For instance, if we apply the rule
\startprog
double x = x+x
\stopprog
to an expression \pr{double$\,t$}, we obtain the new expression
\pr{$t$+$t$} but both occurrences of $t$ denote the identical
expression, i.e., these subterms will be simultaneously evaluated.
Thus, \emph{several occurrences of the same variable are
always shared.} This is  necessary not only for efficiency reasons
but it has also an influence on the soundness of the operational semantics
in the presence of non-deterministic functions
(see also \cite{GonzalesEtAl96ESOP}). For instance,
consider the non-deterministic function \pr{coin} defined by the rules
\startprog
coin = 0
coin = 1
\stopprog
Thus, the expression \pr{coin} evaluates to \pr{0} or \pr{1}.
However, the result values of the expression \pr{(double\,\,coin)}
depend on the sharing of the two occurrences of \pr{coin}
after applying the rule for \pr{double}: if both occurrences
are shared (as in Curry), the results are \pr{0} or \pr{2},
otherwise (without sharing) the results are \pr{0}, \pr{1}, or \pr{2}.
The sharing of argument variables corresponds to the so-called
``call time choice'' in the declarative semantics
\cite{GonzalesEtAl96ESOP}: if a rule is applied to a function call,
a unique value must be assigned to each argument
(in the example above: either \pr{0} or \pr{1} must be assigned
to the occurrence of \pr{coin} when the expression
\pr{(double\,\,coin)} is evaluated). This does not mean that
functions have to be evaluated in a strict manner; rather, this
behavior can easily be obtained by sharing the different occurrences
of a variable.

Since different occurrences of the same variable are always shared
but different occurrences of (textually identical) function
calls are not shared, it is important to distinguish
between variables and functions. Usually, all symbols
occurring at the top-level in the left-hand side of some rule
are considered as functions and the non-constructor symbols
in the arguments of the left-hand sides are considered
as variables. But note that there is a small exception
from this general rule in local declarations
(see Section~\ref{sec-localdecls}).


\subsubsection{Conditional Equations with Boolean Guards}
\label{sec-bool-guards}

For convenience and compatibility with Haskell,
one can also write conditional equations with multiple guards
\startprog
$f~t_1\ldots t_n$ | $b_1$ = $e_1$
          \,$\vdots$
          \,| $b_k$ = $e_k$
\stopprog
where $b_1,\ldots,b_k$ ($k>0$) are expressions of type \pr{Bool}
(instead of \pr{Constraint}).
Such a rule is interpreted as in Haskell: the guards are successively evaluated
and the right-hand side of the first guard which is evaluated to \pr{True}
is the result of applying this equation. Thus, this equation
can be considered as an abbreviation for the rule
\startprog
$f~t_1\ldots t_n$ = if $b_1$ then $e_1$ else
            \,$\vdots$
            \,if $b_k$ then $e_k$ else \mbox{\it{}undefined}
\stopprog
(where {\it undefined} is some non-reducible function).
To write rules with several Boolean guards more nicely,
there is a Boolean function \pr{otherwise}\pindex{otherwise}
which is predefined as \pr{True}. For instance, the factorial
function can be declared as follows:
\startprog
fac n | n==0      = 1
      | otherwise = fac(n-1)*n
\stopprog
Since all guards have type \pr{Bool}, this definition is equivalent to
(after a simple optimization)
\startprog
fac n = if n==0 then 1 else fac(n-1)*n
\stopprog
%
To avoid confusion, it is not allowed to mix the Haskell-like
notation with Boolean guards and the standard notation
with constraints: in a rule with multiple guards,
all guards must be expressions of either type \pr{Constraint}
or type \pr{Bool} but not a simultaneous mixture of both.
The default type of a guard is \pr{Constraint}, i.e., in a rule like
\startprog
f c x | c = x
\stopprog
the type of the variable \pr{c} is \pr{Constraint}
(provided that it is not restricted to \pr{Bool} by the use of \pr{f}).


\subsection{Local Declarations}
\label{sec-localdecls}

\index{local declaration}\index{declaration!local}
Since not all auxiliary functions should be globally
visible, it is possible to restrict the scope of declared entities.
Note that the scope of parameters in function definitions
is already restricted since the variables occurring in parameters
of the left-hand side are only visible in the corresponding
conditions and right-hand sides. The visibility of other
entities can be restricted using \pr{let} in expressions
or \pr{where} in defining equations.

An expression of the form \pr{let {\it{}decls} in {\it{}exp}}
\index{local declaration}\index{declaration!local}\pindex{let}
introduces a set of local names. The list of local declarations
\emph{decls} can contain function definitions as well as
definitions of constants by pattern matching. The names
introduced in these declarations are visible in the expression
\emph{exp} and the right-hand sides of the declarations in \emph{decls},
i.e., the local declarations can be mutually recursive.
For instance, the expression
\startprog
let a=3*b
    b=6
in 4*a
\stopprog
reduces to the value \pr{72}.

Auxiliary functions which are only introduced to define another
function should often not be visible outside. Therefore,
such functions can be declared in a \pr{where}-clause
\index{local declaration}\index{declaration!local}\pindex{where}
added to the right-hand side of the corresponding function definition.
This is done in the following definition of a fast exponentiation
where the auxiliary functions \pr{even} and \pr{square}
are only visible in the right-hand side of the rule for \pr{exp}:
\startprog
exp b n = if n==0
          then 1
          else if even n then square (exp b (n `div` 2))
                         else b * (exp b (n-1))

       where even n = n `mod` 2 == 0
             square n = n*n
\stopprog
Similarly to \pr{let}, \pr{where}-clauses can contain
mutually recursive function definitions as well as
definitions of constants by pattern matching.
The names declared in the \pr{where}-clauses are only visible
in the corresponding conditions and right-hand sides.
As a further example, the following Curry program implements
the quicksort algorithm with a function \pr{split} which
splits a list into two lists containing the smaller and larger elements:
\startprog
split e []            = ([],[])
split e (x:xs) | e>=x = (x:l,r)
               | e<x  = (l,x:r)
               where
                 (l,r) = split e xs
~
qsort []     = []
qsort (x:xs) = qsort l ++ (x:qsort r)
             where
               (l,r) = split x xs
\stopprog
To distinguish between locally introduced functions and variables
(see also Section~\ref{sec-variable-sharing}),
we define a \emph{local pattern}\index{local pattern}\index{pattern!local}
as a (variable) identifier or an application where the top symbol
is a data constructor. If the left-hand side of a local declaration
in a \pr{let} or \pr{where} is a pattern, then all identifiers
in this pattern that are not data constructors are
considered as variables. For instance, the locally introduced
identifiers \pr{a}, \pr{b}, \pr{l}, and \pr{r} in the previous examples
are variables whereas the identifiers \pr{even} and \pr{square}
denote functions. Note that this rule excludes the
definition of 0-ary local functions, since a definition of the form
``\pr{where f = \ldots}'' is considered by this rule as the definition of
a local variable \pr{f}, which is usually the intended
interpretation (see previous examples).
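For instance, in a definition like
\startprog
cube x = y * x
         where y = x * x
\stopprog
the locally introduced identifier \pr{y} is considered as a variable
which is bound to the value of \pr{x*x} rather than as a 0-ary function.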


\subsection{Free Variables}
\label{sec-freevars}

Since Curry is intended to cover functional as well as logic
programming paradigms, expressions (or constraints, see
Section~\ref{sec-equality}) might contain free
(unbound, uninstantiated) variables.
\index{variable!free}\index{free variable}\index{unbound variable}
The idea is to compute values
for these variables such that the expression is reducible
to a data term or that the constraint is solvable.
For instance, consider the definitions
\startprog
mother John    = Christine
mother Alice   = Christine
mother Andrew  = Alice
\stopprog
Then we can compute a child of \pr{Alice} by solving the
equation (see Section~\ref{sec-equality})
\pr{mother x =:= Alice}. Here, \pr{x} is a free variable
which is instantiated to \pr{Andrew} in order to reduce
the equation's left-hand side to \pr{Alice}.
Similarly, we can compute a grandchild of \pr{Christine}
by solving the equation \pr{mother (mother x) =:= Christine}
which yields the value \pr{Andrew} for \pr{x}.

In logic programming languages like Prolog, all free variables
are considered as existentially quantified at the top-level.
Thus, they are always implicitly declared at the top-level.
In a language with different nested scopes like Curry,
it is not clear to which scope an undeclared variable belongs
(the exact scope of a variable becomes particularly important
in the presence of search operators, see Section~\ref{sec-encapsearch},
where existential quantifiers and lambda abstractions are often
mixed). Therefore, Curry requires that
\emph{each free variable $x$ must be explicitly declared}
using a declaration \index{declaration!free}\index{free declaration}
\index{variable!declaration}
of the form \pr{$x$ free}.\pindex{free} These declarations
can occur in \pr{where}-clauses or in a \pr{let} enclosing
a constraint. The variable is then introduced as unbound
with the same scoping rules as for all other local
entities (see Section~\ref{sec-localdecls}).
For instance, we can define
\startprog
isgrandmother g | let c free in mother (mother c) =:= g  = True
\stopprog
As a further example, consider the definition of the
concatenation of two lists:
\startprog
append []     ys = ys
append (x:xs) ys = x : append xs ys
\stopprog
Then we can define the function \pr{last} which computes
the last element of a list by the rule
\startprog
last l | append xs [x] =:= l  = x  where x,xs free
\stopprog
Since the variable \pr{xs} occurs in the condition but not in the
right-hand side, the following definition is also possible:
\startprog
last l | let xs free in append xs [x] =:= l  = x  where x free
\stopprog
%
Note that the \pr{free} declarations can be freely mixed with
other local declarations after a \pr{let} or \pr{where}.
The only difference is that a declaration like ``\pr{let x free}''
introduces an existentially quantified variable (and, thus,
it can only occur in front of a constraint) whereas
other \pr{let} declarations introduce local functions or parameters.
Since all local declarations can be mutually recursive,
it is also possible to use local variables in the bodies
of the local functions in one \pr{let} declaration.
For instance, the following expression is valid (where the functions
\pr{h} and \pr{k} are defined elsewhere):
\startprog 
let f x = h x y
    y free
    g z = k y z
in c y (f (g 1))
\stopprog
Similarly to the usual interpretation of local definitions
by lambda lifting \cite{Johnsson85}, this expression
can be interpreted by transforming the local definitions
for \pr{f} and \pr{g} into global ones by adding the non-local
variables of the bodies as parameters:
\startprog 
f y x = h x y
g y z = k y z
\ldots
  let y free in c y (f y (g y 1))
\stopprog



\subsection{Constraints and Equality}
\label{sec-equality}

A condition of a rule is a constraint which must be solved
in order to apply the rule. An elementary constraint\index{constraint} is an
\emph{equational constraint}\index{equational constraint}
\index{constraint!equational}\pindex{=:=}\index{equality!in constraints}
\pr{$e_1$=:=$e_2$} between two expressions
(of base type). \pr{$e_1$=:=$e_2$} is satisfied if both sides are reducible
to the same ground data term. This notion of equality, which is the
only sensible notion of equality in the presence of non-terminating functions
\cite{GiovannettiLeviMoisoPalamidessi91,MorenoRodriguez92}
and also used in functional languages, is also called
\emph{strict equality}.\index{strict equality}\index{equality!strict}
As a consequence, if one side is undefined
(nonterminating), then the strict equality does not hold.
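For instance, if the function \pr{loop} is defined by the single rule
\startprog
loop = loop
\stopprog
then the equational constraint \pr{loop =:= loop} does not hold since
neither side can be reduced to a ground data term.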
Operationally, an equational constraint $e_1\pr{=:=}e_2$ is solved
by evaluating $e_1$ and $e_2$ to unifiable data terms.
The equational constraint could also be solved in an incremental
way by an interleaved lazy evaluation of the expressions
and binding of variables to constructor terms (see
\cite{LoogenLopezFraguasRodriguezArtalejo93PLILP}
or Section~\ref{sec-equationsolving} in the appendix).

Equational constraints should be distinguished from standard Boolean functions
(cf.\ Section~\ref{sec-builtin-types})
since constraints are checked for satisfiability. For instance,
the equational constraint \pr{[x]=:=[0]} is satisfiable if the variable
\pr{x} is bound to \pr{0}. However, the evaluation of \pr{[x]=:=[0]}
does not deliver a Boolean value \pr{True} or \pr{False},
since the latter value would require a binding of \pr{x}
to all values different from \pr{0} (which could be expressed
if a richer constraint system than substitutions, e.g.,
disequality constraints \cite{ArenasGilLopez94PLILP}, is used).
This is sufficient since, similarly to logic programming,
constraints are only activated in conditions of equations
which must be checked for satisfiability.

Note that the basic kernel of Curry only provides
strict equations $e_1\pr{=:=}e_2$ between expressions
as elementary constraints. Since it is conceptually fairly easy to add
other constraint structures \cite{LopezFraguas92},
extensions of Curry may provide richer constraint systems
to support constraint logic programming applications.

Constraints can be combined into a \emph{conjunction} written as
$c_1 \cconj c_2$.
\index{conjunction}\index{conjunction!concurrent}
\index{conjunction!of constraints}\pindex{\&}
The conjunction is interpreted \emph{concurrently}:
if the combined constraint $c_1 \cconj c_2$ should
be solved, $c_1$ and $c_2$ are solved concurrently.
In particular, if the evaluation of $c_1$ suspends,
the evaluation of $c_2$ can proceed which may cause the reactivation
of $c_1$ at some later time (due to the binding of common variables).
In a sequential implementation, the evaluation of $c_1 \cconj c_2$
could be started by an attempt to solve $c_1$.
If the evaluation of $c_1$ suspends,
an evaluation step is applied to $c_2$.

%Syntactically, constraints are always enclosed
%in curly brackets.\pindex{\{\ldots\}}
%Thus, \pr{\{x = f y, y = h 0\}} denotes the concurrent conjunction
%of the equational constraints \pr{x = f y} and \pr{y = h 0}.
%There is also a predefined infix operator \verb+/\+\index{/\@\verb+/\+}
%for the concurrent conjunction of two constraint
%which can be considered as defined by
%\startprog
%c1 \verb+/\+ c2 = \{c1, c2\}
%\stopprog
%
It is interesting to note that parallel functional computation models
\cite{BreitingerLoogenOrtega95,ChakravartyEtAl98Goffin}
are covered by the use of concurrent constraints.
For instance, a constraint of the form
\startprog
x =:= f t1  \&  y =:= g t2  \&  z =:= h x y
\stopprog
specifies
a potentially concurrent computation of the functions \pr{f}, \pr{g} and
\pr{h} where the function \pr{h} can proceed with its computation
only if the arguments have been bound by evaluating the expressions
\pr{f t1} and \pr{g t2}. Since constraints
could be passed as arguments or results of functions
(like any other data object or function), it is possible
to specify general operators to create flexible communication
architectures similarly to Goffin \cite{ChakravartyEtAl98Goffin}.
Thus, the same abstraction facilities could be used for sequential
as well as concurrent programming. On the other hand,
the clear separation between sequential and concurrent computations
(e.g., a program without any occurrences of concurrent conjunctions
is purely sequential)
supports the use of efficient and optimal evaluation strategies
for the sequential parts \cite{Antoy97ALP,AntoyEchahedHanus94POPL},
where similar techniques for the concurrent parts are not available.

 

\subsection{Higher-order Features}

Curry is a higher-order language supporting the common
functional programming techniques by partial function applications
and lambda abstractions. \emph{Function application}\index{function application}
is denoted by juxtaposition
of the function and its argument. For instance, the well-known \pr{map}
function is defined in Curry by
\startprog
map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs
\stopprog
However, there is an important difference w.r.t.{} functional programming.
Since Curry is also a logic language, it also allows logical variables
for functional values, i.e., it is possible to evaluate the
equation \pr{map f [1,2] =:= [2,3]} which has, for instance, a solution
\pr{\{f=inc\}} if \pr{inc} is the increment function on natural numbers.
In principle, such solutions can be computed by extending (first-order)
unification to \emph{higher-order unification}
\cite{HanusPrehofer96RTA,NadathurMiller88,Prehofer95Diss}.
Since higher-order unification is a computationally expensive
operation, Curry delays the application of unknown functions
until the function becomes known \cite{Ait-KaciLincolnNasr87,Smolka95}.
Thus, the application operation can be considered as a function
(``\pr{@}'' is a left-associative infix operator)
\startprog
(@) :: (a -> b) -> a -> b
f@x = f x
\stopprog
which is ``rigid'' in its first argument (cf.\ Section~\ref{sec-opsem}).

In cases where a function is only used a single time, it is tedious
to assign a name to it. For such cases, anonymous functions
(\emph{$\lambda$-abstractions}\index{lambda-abstraction@$\lambda$-abstraction}),
denoted by
\startprog
\ttbs{}$x_1\ldots{}x_n$->$e$
\stopprog
are provided.
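For instance, the expression
\startprog
map (\ttbs{}x -> x+1) [2,3,4]
\stopprog
evaluates to \pr{[3,4,5]} without the need to introduce a name for the
function which increments its argument.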


\section{Operational Semantics}
\label{sec-opsem}

Curry's operational semantics is based on the lazy evaluation
of expressions combined with a possible instantiation
of free variables occurring in the expression.
If the expression is ground, the operational model
is similar to lazy functional languages,
otherwise (possibly non-deterministic) variable instantiations
are performed as in logic programming.
If an expression contains free variables, it
may be reduced to different values by binding the
free variables to different expressions. In functional programming,
one is interested in the computed \emph{value}, whereas logic programming
emphasizes the different bindings (\emph{answers}).
Thus, we define for the integrated functional logic language Curry an
\emph{answer expression}\index{answer expression}\index{expression!answer}
as a pair ``$\sigma \ans e$''
consisting of a substitution $\sigma$ (the answer computed so far)
and an expression $e$.
An answer expression $\sigma \ans e$ is \emph{solved} if $e$
is a data term.
Usually, the identity substitution in answer expressions is omitted,
i.e., we write $e$ instead of $\{\} \ans e$ if it is clear from
the context.

Since more than one answer may exist for expressions containing
free variables, in general, initial expressions are reduced to disjunctions
of answer expressions. Thus, a
\emph{disjunctive expression}\index{disjunctive expression}\index{expression!disjunctive}
is a (multi-)set of answer expressions
$\{\sigma_1 \ans e_1,\ldots, \sigma_n \ans e_n\}$.
For the sake of readability, we write concrete disjunctive expressions
like
\[
\{\{\pr{x}=\pr{0},\pr{y}=\pr{2}\} \ans \pr{2}~,~\{\pr{x}=\pr{1},\pr{y}=\pr{2}\} \ans \pr{3}\}
\]
in the form \pr{\{x=0,y=2\}\,2\,|\,\{x=1,y=2\}\,3}.
Thus, substitutions are represented by lists of equations enclosed in
curly brackets, and disjunctions are separated by vertical bars.

A single \emph{computation step}\index{computation step}
performs a reduction in exactly one unsolved
expression of a disjunction (e.g., in the leftmost unsolved
answer expression in Prolog-like implementations).
If the computation step is
deterministic, the expression is reduced to a new one.
If the computation step is non-deterministic,
the expression is replaced by a disjunction of new expressions.
The precise behavior depends on the function calls occurring
in the expression.
For instance, consider the following rules:
\startprog
f 0 = 2
f 1 = 3
\stopprog
The result of evaluating the expression \pr{f 1} is \pr{3}, whereas
the expression \pr{f x} should be evaluated to the disjunctive expression
\startprog
\{x=0\} 2 | \{x=1\} 3 .
\stopprog
To avoid superfluous computation steps and to apply
programming techniques of modern functional languages,
nested expressions are evaluated lazily, i.e., the leftmost outermost
function call is primarily selected in a computation step.
Due to the presence of free variables in expressions, this
function call may have a free variable at an argument position
where a value is demanded by the left-hand sides of the function's rules
(a value is \emph{demanded} by an argument position of the left-hand side
of some rule, if the left-hand side has a constructor at this
position, i.e., in order to apply the rule, the actual value at
this position must be the constructor).
In this situation there are two possibilities to proceed:
\begin{enumerate}
\item
Delay the evaluation of this function call until the corresponding
free variable is bound (this corresponds to the
\emph{residuation}\index{residuation} principle
which is the basis of languages like Escher \cite{Lloyd94ILPS,Lloyd95},
Le Fun \cite{Ait-KaciLincolnNasr87}, Life \cite{Ait-Kaci90},
NUE-Prolog \cite{Naish91}, or Oz \cite{Smolka95}).
In this case, the function is called \emph{rigid}\index{rigid}.
\item
(Non-deterministically) instantiate the free variable to the different values
demanded by the left-hand sides and apply reduction steps using
the different rules (this corresponds to the \emph{narrowing}\index{narrowing}
principle which is the basis of languages like
ALF \cite{Hanus90PLILP}, Babel \cite{MorenoRodriguez92},
K-Leaf \cite{GiovannettiLeviMoisoPalamidessi91}, LPG \cite{BertEchahed86},
or SLOG \cite{Fribourg85}).
In this case, the function is called \emph{flexible}\index{flexible}.
\end{enumerate}
Since Curry is an attempt to provide a common platform for different
declarative programming styles and the decision about the
``right'' strategy depends on the definition and
intended meaning of the functions, Curry supports both strategies.
The precise strategy is specified by \emph{evaluation annotations}
\index{evaluation annotation}
for each function.\footnote{%
Evaluation annotations are similar to coroutining
declarations \cite{Naish87Diss} in Prolog where the programmer
specifies conditions under which a literal is ready for
a resolution step.}
The precise operational meaning of evaluation annotations
is defined in Appendix~\ref{app-opsem}.
A function can be annotated as
\emph{rigid}\index{rigid}\index{annotation!rigid}\pindex{rigid} or
\emph{flexible}\index{flexible}\index{annotation!flexible}\pindex{flex}.
If an explicit annotation is not provided by the user, a default strategy
is used: functions with the result type \pr{Constraint}
are flexible and all other functions are rigid.
Functions with a polymorphic result type (like the identity)
are considered as rigid, although they can be used as a constraint function
in a particular context.
This default strategy can be changed by providing explicit
evaluation annotations for each function or by pragmas
(see Section~\ref{sec-pragmas}).

For instance, consider the function \pr{f} as defined above.
If \pr{f} has the evaluation annotation
\startprog
f eval flex
\stopprog
(i.e., \pr{f} is flexible),
then the expression \pr{f x} is evaluated by instantiating \pr{x} to
\pr{0} or \pr{1} and applying a reduction step in both cases.
This yields the disjunctive expression
\startprog
\{x=0\} 2 | \{x=1\} 3 .
\stopprog
However, if \pr{f} has the evaluation annotation
\startprog
f eval rigid
\stopprog
(i.e., \pr{f} is rigid),
then the expression \pr{f x} cannot be evaluated since the argument
is a free variable. In order to proceed, we need a ``generator''
for values for \pr{x}, which is the case in the following constraint:
\startprog
f x =:= y  \&  x =:= 1
\stopprog
Here, the first constraint \pr{f x =:= y}
cannot be evaluated and thus suspends,
but the second constraint \pr{x=:=1} is evaluated by binding \pr{x} to \pr{1}.
After this binding, the first constraint can be evaluated and the entire
constraint is solved. Thus, the constraint is solved by the following
steps:
\startprog
f x =:= y  \&  x =:= 1
$\leadsto$ \{x=1\} f 1 =:= y
$\leadsto$ \{x=1\} 3 =:= y
$\leadsto$ \{x=1,y=3\}
\stopprog
(The empty constraint is omitted in the final answer.)


\section{Types}

\subsection{Built-in Types}
\label{sec-builtin-types}

The following types are predefined in Curry:
\begin{description}
\item[Boolean values:]
\pindex{Bool}\pindex{True}\pindex{False}
They are predefined by the datatype declaration
\startprog
data Bool = True | False
\stopprog
The (sequential) conjunction\index{conjunction}\index{conjunction!sequential}
is predefined as the left-associative infix operator \pr{\&\&}\pindex{\&\&}:
\startprog
(\&\&) :: Bool -> Bool -> Bool
True  \&\& x = x
False \&\& x = False
\stopprog
Similarly, the (sequential) disjunction\index{disjunction}
\pr{||}\pindex{"|"|} and the negation\index{negation} \pr{not}\pindex{not}
are defined as usual (see Appendix~\ref{app-prelude}).
Furthermore, the function \pr{otherwise}\pindex{otherwise}
is predefined as \pr{True}
to write rules with multiple guards more nicely.

Boolean values are mainly used in conditionals, i.e., the
conditional function \pr{if_then_else} is predefined as
\startprog
if_then_else :: Bool -> a -> a -> a
if True  then x else y = x
if False then x else y = y
\stopprog
where \pr{if b then x else y} is syntactic sugar for the application
\pr{if_then_else b x y}.

A function with result type \pr{Bool} is often called
a \emph{predicate}\index{predicate} (in contrast to constraints
which have result type \pr{Constraint}, see below).\footnote{Predicates
in the logic programming sense should be considered as constraints
since they are only checked for satisfiability and usually
not reduced to \pr{True} or \pr{False} in contrast to Boolean functions.}
There are a number of built-in predicates for comparing objects,
like the predicate ``\pr{<}'' to compare numbers (e.g., \pr{1<2} evaluates
to \pr{True} and \pr{4<3} evaluates to \pr{False}).
There is also a standard predicate ``\pr{==}''\pindex{==}\index{equality!test}
\index{strict equality}\index{equality!strict}
to test the convertibility of expressions to identical data terms
(``strict equality'', cf.\ Section~\ref{sec-equality}):
$e_1 \pr{==} e_2$ reduces to \pr{True} if
$e_1$ and $e_2$ are reducible to identical ground data terms,
and it reduces to \pr{False} if $e_1$ and $e_2$ are reducible to
different ground data terms. The evaluation of $e_1 \pr{==} e_2$
suspends if one of the arguments is a free variable
(i.e., \pr{==} is a ``rigid'' function, cf.\ Section~\ref{sec-opsem}).
If neither $e_1$ nor $e_2$ is a free variable, $e_1 \pr{==} e_2$
is reduced according to the following rules:
\startprog
C         == C        \,\,= True \hfill\%{\rm for all $0$-ary constructors} C