You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Aug 2, 2019. It is now read-only.
This issue discusses the design choices of allowing functions and Mu instructions to return more than one value (or a tuple of values).
Status quo
The one-return-value assumption has a chain of causes and effects to the design of the IR.
First of all, Mu functions have multiple parameters and exactly one return value. i.e. f : (P1,P2,P3,...,Pm) -> R This is the tradition of C-like languages as well as LLVM. In case it does not return useful values, it returns void
Following this tradition, all Mu instructions return exactly one value, too. e.g. %rv = CALL <...> @some_func (%a1 %a2 ... %am). In case more than one values need to be returned, a function shall return a struct<T1 T2 ... Tm>. In case no useful value is returned, it returns void. For example:
%result = CMPXCHG WEAK SEQ_CST SEQ_CST <T> %loc %expected %desired
// %result is a struct<T @i1>. The second result indicates if it is successful.
%oldval = EXTRACTFIELD <@cmpxchg_result_T 0> %result
%is_successful = EXTRACTFIELD <@cmpxchg_result_T 1> %result
Since all Mu instructions have a single return value, the SWAPSTACK instruction shall have exactly one return value, too. That return value is the value received from another stack (or injected by the client). Since one stack also passes value using the SWAPSTACK instruction, SWAPSTACK takes one parameter, too, since the peer only receives exactly one return value. e.g. in the sender: SWAPSTACK %receiver ... PASS_VALUE <@T> %val; in the receiver: %retval = SWAPSTACK %sender RET_WITH <@T> ...
Since a swapped-away stack always stops at so called "resumption point" (resumable by swapstack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL), all of which receive exactly one return value, then the state of a stack is READY<T> when swapped away, where T is a single type -- the return type of the resumption point. This makes one important special case: a newly-created frame (by NEWSTACK or push_frame (OSR)). Currently since a function takes multiple parameters but SWAPSTACK passes only one argument, there is no way to supply all arguments during SWAPSTACK. As a compromise, NEWSTACK initialises all arguments and the subsequent SWAPSTACK passes void.
As a side effect, the return value uniquely identifies the instruction. This can bring some profits: during OSR, the variable that holds the return value can identify the OSR-point (instructions that may be the "current instruction" in a READY<T> stack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL, all happen to be resumption points. This is actually intentionally designed as so.). This is especially useful for the TRAP instruction, in which case the trap handler always first identify which TRAP is triggered. Because of the one-to-one mapping between instructions and return values, the term "SSA variable of an instruction" and "the instruction" are used interchangeably.
Proposed change
The change starts with adopting "multiple return values".
First, functions now return 0 or more values. i.e. f : (P1,P2,P3,..,Pm) -> (R1,R2,...,Rn). For compatibility with C programs (in the unsafe native interface) , a 0-tuple is returned when the C function returns void. i.e. either c_function: (P1,P2,...,Pm) -> (R1) or c_function: (P1,P2,...,Pm) -> ().
Then Mu instructions may return 0 or more values. When no useful values are returned, the instruction returns (). The CALL instruction has exactly as many return values as the callee's return values. For example:
Then SWAPSTACK can pass more than one values to the swappee. The swappee expects to receive a list of values (statically decided at the particular SWAPSTACK instruction). If the swapper does not match the value required by swappee, it has undefined behaviour. For example:
Since all resumption point instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL) may receive multiple values (in fact, all instructions can), the state of a "swapped-away" stack is READY<T1 T2 ... Tn>. When binding a thread to a stack, exactly that many values need to be bound.
Then the state of a newly created frame is READY<T1 T2 ... Tn> where T1 T2 ... Tn are the parameter types of the function (also the entry block). NEWSTACK no longer supply arguments, but the first SWAPSTACK supplies the parameters to the new frame. For example:
Now not all instructions are identifiable. If an instruction does not return value, e.g. a trap that does not expect values from the client (which is the most common case): TRAP <> KEEPALIVE(...), then there is no identifiers for these instructions. As a compromise, explicit names can be given to instructions using a slightly different syntax:
() = [%the_important_trap] TRAP <> KEEPALIVE (%local %vars)
// or simply
[%the_important_trap] TRAP <> KEEPALIVE (%local %vars)
// it also works if there are return values:
(%rv1 %rv2) = [%the_important_call_site] CALL <@sig> @func (%arg1 %arg2)
And SSA variables and instructions completely divorce. Now an instruction has a list of results, each of which is an SSA variable. The new SSA variable hierarchy is:
SSA variable
global variable
constant, global cell, function, exposed function
local variable
parameter
one result of an instruction (was just "instruction")
This change also undoes #43, a previous proposal to make void more like "unit". Now we have the real "unit" return value: (), and the void type has only two use cases:
hybrid<void T> for hybrids without a fixed part, and
ref<void>, iref<void>, weakref<void> and ptr<void>, for referencess/pointers to "anything".
Coroutine Abstractions in High-level Languages
Similar to the old Mu model
Python's generator is similar to the old model. All arguments are supplied when the generator is created. The first g.send() must send None.
defgen(a,b,c):
print("Hello!", a, b, c)
d= (yield)
print("d = ", d)
returng=gen(1,2,3) # initialises a, b and cg.send(None) # must be None, otherwise TypeError: can't send non-None value to a just-started generator# prints Hello 1 2 3try:
g.send(4) # received by d. The generator throws StopIteration to the main coroutine.# prints d = 4exceptStopIterationase:
print("bye")
If the new model is desired, the program can be rewritten as:
defgen():
defactual_gen():
a,b,c= (yield)
print("Hello!", a, b, c)
d= (yield)
print("d = ", d)
returnag=actual_gen()
ag.send(None)
# now ag stops at the first yieldreturnagg=gen() # just creates the actual_geng.send((1,2,3)) # received by a,b,c at the first yield. a,b,c are not really parameters.# prints Hello 1 2 3try:
g.send(4) # received by d. The generator throws StopIteration to the main coroutine.# prints d = 4exceptStopIterationase:
print("bye")
Similar to the new model
Lua's coroutine model is similar to the new model. Arguments are supplied at the first coroutine.resume.
functiongen(a,b,c)
print("Hello", a, b, c)
locald=coroutine.yield()
print("d = ", d)
returnendco=coroutine.create(gen)
coroutine.resume(co, 1,2,3) -- initialises a, b, c. If less arguments are supplied, other parameters receive nil-- prints Hello 1 2 3coroutine.resume(co, 4) -- received by d-- prints d = 4
If the old model is desired, the code can be rewritten as:
functionmake_gen(a,b,c)
localfunctionactual_gen(a, b, c)
coroutine.yield()
print("Hello", a, b, c)
locald=coroutine.yield()
print("d = ", d)
returnendlocalcoro=coroutine.create(actual_gen)
coroutine.resume(coro, a,b,c) -- immediately initialises a,b,c. They are not parameters of actual_gen, but "up-values" of it.returncoroendco=make_gen(1,2,3)
coroutine.resume(co) -- It is stopping on its first empty yield() point-- prints Hello 1 2 3coroutine.resume(co, 4) -- received by d-- prints d = 4
In conclusion, both models of swapstack are equivalent. It just takes a few initial swapstack operations to mimic each other.
The text was updated successfully, but these errors were encountered:
This issue discusses the design choices of allowing functions and Mu instructions to return more than one value (or a tuple of values).
Status quo
The one-return-value assumption has a chain of causes and effects to the design of the IR.
First of all, Mu functions have multiple parameters and exactly one return value. i.e.
f : (P1,P2,P3,...,Pm) -> R
This is the tradition of C-like languages as well as LLVM. In case it does not return useful values, it returnsvoid
Following this tradition, all Mu instructions return exactly one value, too. e.g.
%rv = CALL <...> @some_func (%a1 %a2 ... %am)
. In case more than one values need to be returned, a function shall return astruct<T1 T2 ... Tm>
. In case no useful value is returned, it returnsvoid
. For example:Since all Mu instructions have a single return value, the SWAPSTACK instruction shall have exactly one return value, too. That return value is the value received from another stack (or injected by the client). Since one stack also passes value using the
SWAPSTACK
instruction, SWAPSTACK takes one parameter, too, since the peer only receives exactly one return value. e.g. in the sender:SWAPSTACK %receiver ... PASS_VALUE <@T> %val
; in the receiver:%retval = SWAPSTACK %sender RET_WITH <@T> ...
Since a swapped-away stack always stops at so called "resumption point" (resumable by swapstack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL), all of which receive exactly one return value, then the state of a stack is
READY<T>
when swapped away, where T is a single type -- the return type of the resumption point. This makes one important special case: a newly-created frame (by NEWSTACK or push_frame (OSR)). Currently since a function takes multiple parameters but SWAPSTACK passes only one argument, there is no way to supply all arguments during SWAPSTACK. As a compromise, NEWSTACK initialises all arguments and the subsequent SWAPSTACK passesvoid
.As a side effect, the return value uniquely identifies the instruction. This can bring some profits: during OSR, the variable that holds the return value can identify the OSR-point (instructions that may be the "current instruction" in a
READY<T>
stack) instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL, all happen to be resumption points. This is actually intentionally designed as so.). This is especially useful for the TRAP instruction, in which case the trap handler always first identify which TRAP is triggered. Because of the one-to-one mapping between instructions and return values, the term "SSA variable of an instruction" and "the instruction" are used interchangeably.Proposed change
The change starts with adopting "multiple return values".
First, functions now return 0 or more values. i.e.
f : (P1,P2,P3,..,Pm) -> (R1,R2,...,Rn)
. For compatibility with C programs (in the unsafe native interface) , a 0-tuple is returned when the C function returnsvoid
. i.e. eitherc_function: (P1,P2,...,Pm) -> (R1)
orc_function: (P1,P2,...,Pm) -> ()
.Then Mu instructions may return 0 or more values. When no useful values are returned, the instruction returns
()
. TheCALL
instruction has exactly as many return values as the callee's return values. For example:(%oldval %succ) = CMPXCHG WEAK SEQ_CST SEQ_CST %loc %expected %desired
:(%rv1 %rv2 %rv3) = CALL <..> @func (%a1 %a2 ...)
RET %rv1 %rv2 %rv3
() = BRANCH %blah(....)
or justBRANCH %blah(...)
Then SWAPSTACK can pass more than one values to the swappee. The swappee expects to receive a list of values (statically decided at the particular SWAPSTACK instruction). If the swapper does not match the value required by swappee, it has undefined behaviour. For example:
Since all resumption point instructions (CALL, TRAP, WATCHPOINT, SWAPSTACK, CCALL) may receive multiple values (in fact, all instructions can), the state of a "swapped-away" stack is
READY<T1 T2 ... Tn>
. When binding a thread to a stack, exactly that many values need to be bound.Then the state of a newly created frame is
READY<T1 T2 ... Tn>
where T1 T2 ... Tn are the parameter types of the function (also the entry block). NEWSTACK no longer supply arguments, but the first SWAPSTACK supplies the parameters to the new frame. For example:Now not all instructions are identifiable. If an instruction does not return value, e.g. a trap that does not expect values from the client (which is the most common case):
TRAP <> KEEPALIVE(...)
, then there is no identifiers for these instructions. As a compromise, explicit names can be given to instructions using a slightly different syntax:And SSA variables and instructions completely divorce. Now an instruction has a list of results, each of which is an SSA variable. The new SSA variable hierarchy is:
This change also undoes #43, a previous proposal to make void more like "unit". Now we have the real "unit" return value:
()
, and thevoid
type has only two use cases:hybrid<void T>
for hybrids without a fixed part, andref<void>
,iref<void>
,weakref<void>
andptr<void>
, for referencess/pointers to "anything".Coroutine Abstractions in High-level Languages
Similar to the old Mu model
Python's generator is similar to the old model. All arguments are supplied when the generator is created. The first
g.send()
must sendNone
.If the new model is desired, the program can be rewritten as:
Similar to the new model
Lua's coroutine model is similar to the new model. Arguments are supplied at the first
coroutine.resume
.If the old model is desired, the code can be rewritten as:
In conclusion, both models of swapstack are equivalent. It just takes a few initial swapstack operations to mimic each other.
The text was updated successfully, but these errors were encountered: