-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathMachTypes.lhs
484 lines (423 loc) · 21.3 KB
/
MachTypes.lhs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
% -*- LaTeX -*-
% $Id: MachTypes.lhs 3206 2016-06-07 07:17:22Z wlux $
%
% Copyright (c) 1998-2016, Wolfgang Lux
% See LICENSE for the full license.
%
\subsection{Basic Types}
\begin{verbatim}
> module MachTypes where
> import Env
> import Error
> import Combined
> import IO(IOMode(..),Handle)
> import IOExts()
\end{verbatim}
\emph{The empty import of module \texttt{IOExts} is necessary in order
to import the instance declarations for the \texttt{Ref} when
compiling with hbc.}
In our abstract machine interpreter, we do not interpret the abstract
machine statements directly, but instead convert each statement into a
state transformer when the program is loaded. Using continuation
passing style, every statement is parameterized by the next
statement(s) to be executed. The whole program then finally delivers
some result. We use a combined
monad~\cite{LiangHudakJones95:ModInterp} which encapsulates the
current machine state and also encapsulates the error handling.
Finally, the combined monad is parameterized by an IO monad because we
make use of mutable references in the implementation of the machine.
Not very surprisingly, the final result of a computation is the final
state.
\begin{verbatim}
> type MachStateT a = StateT State (ErrorT IO) a
> type Instruction = MachStateT (Maybe State)
\end{verbatim}
\ToDo{Should we define \texttt{Instruction} using a \texttt{newtype}
declaration? This could make a lot of \texttt{Show} instances
derivable.}
In order to implement the heap we use mutable references, which are
provided by most Haskell implementations. Every node allocated in the
heap is also assigned an ``address''. We use a simple allocation
counter for that purpose, which is part of the machine's state.
\subsubsection{Functions}
A function is described by its name, entry point, and arity.
\begin{verbatim}
> type Function = (String,Instruction,Int)
\end{verbatim}
\subsubsection{Graph Nodes}
The heap contains nodes which are used to build a graph corresponding
to the final result of the goal being evaluated.
Integer and floating-point numbers are represented by the respective
cases of the \texttt{Node} type and comprise only the number's value.
A variable node represents an unbound variable and has two fields. The
first field holds a list of constraints for the variable and the
second holds the list of threads that are blocked by the variable.
Once the variable is instantiated the node is overwritten
destructively.
Closure and lazy application nodes represent functions and their
applications. A plain function corresponds to a closure node without
arguments. A closure node consists of the code to be executed for the
function, the arity, and the list of arguments to which the function
has been applied. In addition, the name of the function is included in
order to make the output more readable. Besides that it has no
semantic meaning. In contrast to closure nodes, lazy application nodes
are overwritten with their result after they have been
evaluated.
Partial applications of functions are represented by closure nodes as
well using fewer arguments than the function's arity. Partial
applications of data constructors cannot be represented directly.
However, when a program is loaded an auxiliary function with the same
name is implicitly defined for each data constructor for this purpose.
In order to prevent multiple threads from evaluating the same
application, lazy application nodes are overwritten by a queue-me node
when a thread starts their evaluation. A wait queue similar to
variable nodes is used to collect the threads that wait for the
application's result becoming available. Once evaluation of a lazy
application completes successfully, the queue-me node is overwritten
with (an indirection pointing to) the result of the application.
Indirection nodes are used when a variable or a lazy application node
is overwritten in order to preserve the sharing of nodes. An
indirection node solely consists of a pointer to the new node.
Global application and global variable nodes represent (possibly
unevaluated) applications and (possible unbound) logical variables,
respectively, that are passed as free variables to search goals. These
global reference nodes combine a pointer to the referenced node and
the search space in which that node is defined. \\
\textbf{Invariant:} \emph{The interpreter maintains the invariant that
the search space field of a global reference node to which a search
goal has direct access never refers to the goal's local search
space.}
Search continuation nodes represent the alternative continuations
returned by an encapsulated search. A search continuation saves the
goal application and variable of the goal being evaluated. In
addition, the state of all local threads and the corresponding search
space are saved.
\begin{verbatim}
> data NodeTag =
> CharTag Char | IntTag Integer | FloatTag Double
> | ConstructorTag Int String Int | VariableTag
> | ClosureTag | LazyTag | QueueMeTag
> | IndirTag | GlobalAppTag | GlobalVarTag | SearchTag
> deriving Show
> instance Eq NodeTag where
> CharTag c == CharTag d = c == d
> IntTag i == IntTag j = i == j
> FloatTag e == FloatTag f = e == f
> ConstructorTag t1 _ _ == ConstructorTag t2 _ _ = t1 == t2
> VariableTag == VariableTag = True
> ClosureTag == ClosureTag = True
> LazyTag == LazyTag = True
> QueueMeTag == QueueMeTag = True
> IndirTag == IndirTag = True
> GlobalAppTag == GlobalAppTag = True
> GlobalVarTag == GlobalVarTag = True
> SearchTag == SearchTag = True
> _ == _ = False
> data Node =
> CharNode Char | IntNode Integer | FloatNode Double
> | ConstructorNode Int String [NodePtr]
> | VarNode [Constraint] ThreadQueue
> | ClosureNode String Int Instruction [NodePtr]
> | LazyNode String Int Instruction [NodePtr]
> | QueueMeNode ThreadQueue
> | IndirNode NodePtr
> | GlobalAppNode NodePtr SearchSpace
> | GlobalVarNode NodePtr SearchSpace
> | SearchContinuation NodePtr NodePtr ThreadQueue SearchSpace
> instance Show Node where
> showsPrec p (CharNode c) =
> showParen (p >= 10) $ showString "CharNode " . shows c
> showsPrec p (IntNode i) =
> showParen (p >= 10) $ showString "IntNode " . shows i
> showsPrec p (FloatNode f) =
> showParen (p >= 10) $ showString "FloatNode " . shows f
> showsPrec p (ConstructorNode tag name args) = showParen (p >= 10) $
> showString "ConstructorNode " . shows tag . showChar ' ' .
> showString name . flip (foldr showArg) args
> where showArg arg = showChar ' ' . shows arg
> showsPrec p (VarNode constraints waitqueue) = showParen (p >= 10) $
> showString "VarNode " . shows constraints . showChar ' ' . shows waitqueue
> showsPrec p (ClosureNode name arity code args) = showParen (p >= 10) $
> showString "ClosureNode " . showString name . showChar ' ' .
> shows arity . flip (foldr showArg) args
> where showArg arg = showChar ' ' . shows arg
> showsPrec p (LazyNode name arity code args) = showParen (p >= 10) $
> showString "LazyNode " . showString name . showChar ' ' .
> shows arity . flip (foldr showArg) args
> where showArg arg = showChar ' ' . shows arg
> showsPrec p (QueueMeNode waitqueue) =
> showParen (p >= 10) $ showString "QueueMeNode " . shows waitqueue
> showsPrec p (IndirNode ptr) =
> showParen (p >= 10) $ showString "IndirNode " . shows ptr
> showsPrec p (GlobalAppNode ptr space) = showParen (p >= 10) $
> showString "GlobalAppNode " . shows ptr . showChar ' ' .
> showsPrec 10 space
> showsPrec p (GlobalVarNode ptr space) = showParen (p >= 10) $
> showString "GlobalVarNode " . shows ptr . showChar ' ' .
> showsPrec 10 space
> showsPrec p (SearchContinuation app var rq space) = showParen (p >= 10) $
> showString "SearchContinuation " . shows app . showChar ' ' .
> shows var . showChar ' ' . shows rq . showChar ' ' .
> showsPrec 10 space
> data NodePtr = Ptr Integer (Ref Node)
> instance Eq NodePtr where
> Ptr adr1 _ == Ptr adr2 _ = adr1 == adr2
> instance Ord NodePtr where
> Ptr adr1 _ `compare` Ptr adr2 _ = adr1 `compare` adr2
> instance Show NodePtr where
> showsPrec _ (Ptr adr _) = showString "node@" . shows adr
> nodeTag :: Node -> NodeTag
> nodeTag (CharNode c) = CharTag c
> nodeTag (IntNode i) = IntTag i
> nodeTag (FloatNode f) = FloatTag f
> nodeTag (ConstructorNode t c xs) = ConstructorTag t c (length xs)
> nodeTag (VarNode _ _) = VariableTag
> nodeTag (ClosureNode _ _ _ _) = ClosureTag
> nodeTag (LazyNode _ _ _ _) = LazyTag
> nodeTag (QueueMeNode _) = QueueMeTag
> nodeTag (IndirNode _) = IndirTag
> nodeTag (GlobalAppNode _ _) = GlobalAppTag
> nodeTag (GlobalVarNode _ _) = GlobalVarTag
> nodeTag (SearchContinuation _ _ _ _) = SearchTag
> nilTag, consTag, unitTag :: NodeTag
> nilTag = ConstructorTag 0 "Prelude.[]" 0
> consTag = ConstructorTag 1 "Prelude.:" 2
> unitTag = ConstructorTag 0 "Prelude.()" 0
> isTupleName :: String -> Bool
> isTupleName ('P':'r':'e':'l':'u':'d':'e':'.':'(':',':cs) =
> dropWhile (',' ==) cs == ")"
> isTupleName _ = False
\end{verbatim}
\subsubsection{Machine State}
The abstract machine uses a data stack for the arguments of a
function, a return stack to save the return context of a function
call, a choicepoint stack to implement global search via backtracking,
and a search context stack for (nested) encapsulated search
invocations. The state of the abstract machine is composed of the
following information:
\begin{verbatim}
> data State = State {
> tid :: Integer, -- thread id of running thread
> env :: LocalEnv, -- local environment
> ds :: DataStack, -- argument stack
> rs :: ContStack, -- return stack
> rq :: ThreadQueue, -- list of runnable threads
> hp :: Integer, -- ``allocation pointer'' in the heap
> bp :: FailureStack, -- choicepoint stack
> tp :: Trail, -- trail
> ct :: Integer, -- thread counter
> sc :: SearchContext, -- current search context
> ss :: SearchSpace -- pointer to current search space
> } deriving Show
\end{verbatim}
The local environment maps the names of the local variables onto the
addresses of the associated nodes.
\begin{verbatim}
> type LocalEnv = Env String NodePtr
\end{verbatim}
The data stack is implemented as a list of nodes addresses.
\begin{verbatim}
> type DataStack = [NodePtr]
\end{verbatim}
The return stack has to maintain the return address of a function and
also saves the caller's local environment.
\begin{verbatim}
> type ContStack = [Cont]
> data Cont = Cont Instruction LocalEnv
> instance Show Cont where
> showsPrec p (Cont ip env) = showParen (p >= 10) $
> showString "Cont <<Instruction>> " . showsPrec 10 env
\end{verbatim}
In a choice point, all machine registers have to be saved so that the
machine state can be restored upon backtracking. In addition, the next
instruction to be executed after backtracking must be saved here. Note
that choice points are used only in the global search space, so the
current search space does not have to be saved.
\begin{verbatim}
> type FailureStack = [Choicepoint]
> data Choicepoint = Choicepoint Instruction Integer LocalEnv DataStack
> ContStack ThreadQueue Trail
> instance Show Choicepoint where
> showsPrec p (Choicepoint ip tid env ds rs rq tp) = showParen (p >= 10) $
> showString "Choicepoint <<Instruction>> " . shows tid .
> showsPrec 10 env . showChar ' ' . shows ds . showChar ' ' . shows rs .
> showChar ' ' . shows rq . showChar ' ' . shows tp
\end{verbatim}
\subsubsection{Threads}
In order to implement the concurrent evaluation, a simple thread
facility is integrated into the abstract machine. Every computation is
executed within a thread. The abstract machine is executing the
running thread. A list of threads which are runnable but not active
are saved in the ready queue. Threads which are suspended due to an
access to an unbound variable or locked application node are collected
in the wait queues of the corresponding node.
A thread can be member of more than one queue which makes it possible
to wake a thread using one of a list of conditions. This is
implemented by including a \texttt{ThreadSurrogate} for the thread in
each of the queues. As soon as the thread is woken through one of the
queues, all of its surrogates are released.
A context switch will occur only if the current thread suspends itself
or exits. The current thread suspends itself if it finds an
uninstantiated variable in a demanded position during pattern matching
inside a rigid function or when it tries to evaluate a locked application
node. In addition, a thread may reschedule itself in order to allow
deterministic computations to be performed when the current thread can
otherwise proceed only non-deterministically.
The information about a thread which is not running is saved in a
thread node that is allocated in the heap. Every thread frame
includes a unique thread id, which just provides a nice means
to display threads during tracing and otherwise has no functionality.
For the running thread, this id is maintained in the \texttt{tid}
machine register.
\begin{verbatim}
> type ThreadQueue = [Thread]
> newtype ThreadPtr = ThreadPtr (Ref (Maybe Thread))
> data Thread = Thread Integer Instruction LocalEnv DataStack ContStack
> | ThreadSurrogate Integer ThreadPtr
> instance Show Thread where
> showsPrec p (Thread id ip env ds rs) = showParen (p >= 10) $
> showString "Thread " . shows id . showString " <<Instruction>> " .
> showsPrec 10 env . showChar ' ' . shows ds . showChar ' ' . shows rs
> showsPrec p (ThreadSurrogate id _) = showParen (p >= 10) $
> showString "ThreadSurrogate " . shows id
> instance Show ThreadPtr where
> showsPrec p (ThreadPtr _) = showString "<<ThreadPtr>>"
\end{verbatim}
\subsubsection{Local Search Spaces}
Local search spaces are used as a foundation for the implementation of
encapsulated search. They serve as a means to isolate the effects
of the different alternatives of a non-deterministic
computation. Logic variables introduced during the unification of
argument expressions are available only to the computation space in
which they were created and to local spaces defined inside that
space. They can be bound only by computations operating in
the same local space. Computations in a local space may have non-local
effects, though. They can force the evaluation of an application node
to weak head normal form. Because of the strong separation of the
search spaces, these evaluations need not be undone when the local
computation fails.\footnote{The same is in fact true for the usual
implementation using backtracking and a depth first search, but it is
not easy to detect such computations, while for local spaces simply
the membership in a given search space has to be tested.}
A new local space is introduced by the primitive function
\texttt{try}, which reduces its argument function in this space until
the computation either fails, succeeds, or splits
non-deterministically into several alternative computations. In these
cases \texttt{try} returns either an empty list, a singleton list
containing one search continuation in solved form, or a list with
one search continuation for each possible continuation.
In order to maintain the different bindings of variables and lazy
applications, each call to \texttt{try} uses a new search space with
its own set of bindings. Thus, \texttt{try} can be used to implement
different search strategies besides the depth-first search strategy
usually employed by implementations of logic and functional logic
languages. In addition, it allows -- to a certain extent -- to
encapsulate the non-determinism of a computation by lifting it into a
list of alternatives. See~\cite{HanusSteiner98:Control} and the
diploma thesis of Frank Steiner~\cite{Steiner97:Diplom} for a more
detailed account on the operational semantics of the encapsulated
search.
Whenever \texttt{try} is invoked, the current machine state including
the current search space has to be saved. A \texttt{SearchContext} is
allocated for that purpose on the control stack. This state does not
include the instruction pointer because the return address for the
\texttt{try} call is already saved in the corresponding environment
frame. In addition, the \texttt{SearchContext} saves the reference to
a (locked) lazy application, which will be updated when the evaluation
of the search goal finally succeeds, and the variable that was applied
to the goal in order to start its evaluation.
A search space maintains the bindings of all local variables and lazy
applications. As we use destructive updates on the graph, this means
that we have to save the old state of the variables, i.e., the one
outside the search space, when the variable is updated. This simply
means that the trail has to be saved. In addition, we must save the
state of the variables and lazy application nodes inside the search
space. This is handled by adding a second trail, called the script, to
the search space.\footnote{The name is borrowed from Amoz, the
abstract machine for Oz~\cite{MehlScheidhauerSchule95:Amoz}, which
uses a similar strategy.} We use the fact that the state of a search
space that was restored from a search continuation is shared by the
current search space and maintain a hierarchy of search spaces, where
the root of the tree is the space that was created when the search
goal closure was passed to the primitive function \texttt{try} for the
first time. This means that only the differences between the restored
space and the current search space have to be saved.
Since variables, which are local to a search goal, do not affect
computations outside an encapsulated search, it is possible to update
the local bindings of search goal lazily, i.e. when entering an
encapsulated search with a different search continuation of the same
goal than used the last time before. In order to implement this lazy
update strategy, we keep a reference to the space whose bindings are
in effect for every such tree of search spaces. For simplicity we add
this pointer to the search space nodes, but actually it will be used
only on the root space.\footnote{The reason for this double
indirection is that the root of search space may change -- when a
search continuation is restored, the current space becomes a child
of the restored space and therefore its root must be changed to the
one of the restored space -- and that the current reference must be
shared among all members of a single tree. We could have used a
\texttt{Ref (Ref SearchSpace)} instead.}
Note that neither search contexts nor search spaces save the current
choice point. This is because no choice points will be created inside a
local space and therefore the current choice point will not
change.\footnote{Actually, it would be possible to implement search
contexts as a special kind of choice point, but this would require
to check the invariant that all ``normal'' choice points must above
the search contexts in the control stack. It simplifies the code if we
let the type system ensure this property.}
\begin{verbatim}
> data SearchContext =
> IOContext
> | GlobalContext
> | SearchContext NodePtr NodePtr Integer DataStack ContStack ThreadQueue
> Trail SearchContext SearchSpace
> data SearchSpace =
> GlobalSpace
> | LocalSpace { spaceId :: Integer,
> root :: Ref SearchSpace,
> parent :: Ref SearchSpace,
> trail :: Trail,
> script :: Trail,
> active :: Ref SearchSpace }
> instance Eq SearchSpace where
> GlobalSpace == GlobalSpace = True
> GlobalSpace == LocalSpace _ _ _ _ _ _ = False
> LocalSpace _ _ _ _ _ _ == GlobalSpace = False
> LocalSpace id1 _ _ _ _ _ == LocalSpace id2 _ _ _ _ _ = id1 == id2
> instance Show SearchContext where
> showsPrec p IOContext = showString "IOContext"
> showsPrec p GlobalContext = showString "GlobalContext"
> showsPrec p (SearchContext goal var tid ds rs rq tp sc ss) =
> showParen (p >= 10) $
> showString "SearchContext " . showsPrec 10 goal . showChar ' ' .
> showsPrec 10 var . showChar ' ' . shows tid . showChar ' ' .
> shows ds . showChar ' ' . shows rs . showChar ' ' . shows rq .
> showChar ' ' . shows tp . showChar ' ' . showsPrec 10 sc .
> showChar ' ' . showsPrec 10 ss
> instance Show SearchSpace where
> showsPrec p GlobalSpace = showString "GlobalSpace"
> showsPrec p (LocalSpace id _ _ _ _ _) = showParen (p >= 10) $
> showString "LocalSpace " . shows id . showString " ..."
\end{verbatim}
\subsubsection{Constraints}
At present, the abstract machine supports equality and disequality
constraints. Equality constraints involving variables are implemented
by binding the variable to a term, disequality constraints by
collecting a list of terms at the variable.
\begin{verbatim}
> data Constraint = DisEq NodePtr deriving Show
\end{verbatim}
\subsubsection{The Trail}
The trail is used for saving the old value of a node when it is
overwritten as well when the state of a thread surrogate is
changed. During backtracking or when switching between search spaces
the old state of the computation may be recovered with the help of the
trail. We will simply save the whole node here. In a real
implementation only the updated field and its value need to be saved.
\begin{verbatim}
> type Trail = [UpdateInfo]
> data UpdateInfo =
> NodeBinding NodePtr Node
> | ThreadState ThreadPtr (Maybe Thread)
> deriving Show
\end{verbatim}