SIL Ownership Model Proposal (Refreshed)


(Michael Gottesman) #1

This is a refreshed proposal for the SIL ownership model that attempts to take what is currently in tree and rephrase the concepts into higher level concepts that are not based on the needs of the internal implementation details of the SIL ownership verifier. Below is an inline version and an online HTML version of the document is available here: https://gottesmm.github.io/drafts/sil-ownership-model/


Table of Contents

Abstract

Today, SIL does not express or validate ownership properties internally to a
SILFunction. This increases the difficulty of SILGening "ownership correct"
SIL and optimizing ownership in an effective, safe way. This document proposes
an ownership model for SIL based around an augmented form of SSA called
Ownership SSA (OSSA). This in combination with SIL expressing ownership
semantics along call graph edges will allow for entire Swift programs to be
verified as statically preserving ownership invariants. We describe OSSA
below:

Ownership SSA Form

Ownership SSA Form or OSSA is a derived form of SSA that augments the normal
SSA concepts (e.x. dominance) of SIL by defining the following rules: Let p be
a point in a SIL program then,

  1. Values are statically available or unavailable: For any dominating value
    v of p, there must exist a statically derivable mapping of v to one of
    {available, unavailable} at p.

  2. Unavailable values can not be used: A program is ill-formed if there
    exists a dominating value v of p that is both used at p and is
    unavailable at p.

  3. Values have static borrowed value sets: All dominating values v of p
    must be statically mappable to a (potentially) empty set of "borrowed" values
    at p.

  4. Values in a borrowed set must be available: A program is ill-formed if
    there exists dominating values v, v' of p with v' being unavailable
    at p and v' being an element of the borrowed set of v at p.

  5. Values with non-empty borrow sets must be available: A program is
    ill-formed if there exists a dominating value v of p that is both
    unavailable at p and has a non-empty borrow set at p.

  6. States are well-defined: A value v which dominates a block B must
    have the same incoming availability state and incoming borrow set from all
    predecessors of B.

  7. Values are never abandoned: Given a block B, a predecessor block
    Pred(B) of B, and a value v that does not dominate B, if v
    dominates Pred(B) then v must be unavailable in B. For the purposes of
    this rule, the terminators with no successors are treated as follows:

    • return, throw, and unwind are considered to have a destination which
      is never dominated by anything.
    • unreachable is considered to not have a destination and thus is
      unconstrained.
  8. Values cannot be resurrected: A dominating value v of p that is
    unavailable at p can not be available at any point reachable from p.

These rules naturally lead us to two important corollaries:

  • Uses can only cause values with empty borrow sets to become unavailable:
    If the evaluation of a use u of v causes v to become unavailable, then
    v must have an empty borrow set at u. Otherwise, we would be violating
    rule 4.

  • Uses can only add to borrow sets of available values. If the evaluation
    of a use u of v results in the addition of a value v' to v's borrow
    set, v must be available at u.

Using this model, we define 4 different categories of ownership semantics that a
value in SIL can have:

  • Any - An immortal value that after definition is always available. It can
    not have a non-empty borrow set and can not be placed into another value's
    borrow set. All trivially typed values have Any ownership as well as all
    SIL values in non-OSSA sil.

  • Owned - A value with a static set of joint-postdominating uses that
    consume the value. Owned values can have a borrow set, but are not allowed
    to be in another value's borrow set. This is used to model move only values as
    well as individual copies of reference counted values.

  • Guaranteed - A value v that is available within a single entry-multiple
    exit region of code and must be in the borrow set of a single value v' at
    all points within that region. This can be used to model "shared borrow" like
    constructs. Since a guaranteed value is in a borrow set at all points where it
    is available, naturally it cannot be consumed due to rule 4.

  • Unowned - A value that is only available up to the first side-effect
    having instruction after its definition. This is a special case needed for
    objc-compatibility and the ordering of the side-effect constraint is unmodeled
    in the ownership model today. We do model that unowned must be copied before
    being used in a guaranteed or owned context.

These are modeled as cases of the enum ValueOwnershipKind. Just like we
categorize the ownership semantics of values, we categorize the ownership
semantics of the effects of these upon their operand values:

  • Point - A read only use of an available value. The value can have any type
    of ownership semantics.

  • Consuming and Forwarding - A use of either an Any or Owned available
    value. If the input value has Any ownership, this is a point use since the
    input value is immortal. If the input value has Owned ownership, the value
    becomes unavailable after the use's execution.

  • Reborrow - A use that takes in an available value v with either Any
    or Guaranteed ownership. If v has Any ownership, this is a point use
    of the input value since it is immortal. If v has Guaranteed ownership,
    then the evaluation of the use adds all non-Any results of the use to all
    borrow sets containing v. The program is ill-formed if any of these results of
    the uses of v are available when v is unavailable. This results in recursive
    validation of the borrow scopes.

While Point and Consuming only effect the incoming value to the
instruction, Forwarding and Reborrow operands additionally constrain the
results of instructions as follows:

  • All Forwarding and Reborrow operands of an instruction must specify
    via an instruction specific rule the set of instruction results whose
    ownership semantics are dependent on the operand's ownership.

  • If all of the Forwarding and Reborrow operands mapped to a single
    result have Any ownership, then the result must have Any ownership.

  • If any of the Forwarding (Reborrow) operands mapped to a result have
    Owned (Guaranteed) ownership, then a non-trivially typed specified
    result must have Owned (Guaranteed) ownership. If the result value is
    trivially typed, then it will have Any ownership.

  • Ownership Consistency: A program is ill-formed if a Forwarding operand and
    Reborrow operand of an instruction specify that they effect the same
    instruction result.

These categories will be modeled in the compiler via the enum
OperandOwnershipUseKind.

SIL Changes for OSSA

To implement this proposal, several structural changes must be made to SIL. We
go through each below:

Functions

To distinguish in between functions in OSSA form and those that are not, we
define a new SILFunction attribute called [ossa] that signifies that a
function is in OSSA form and must pass the ownership verifier.

SILValue

We add two new methods to SILValue's API:

ValueOwnershipKind SILValue::getOwnershipKind() const;
bool SILValue::verifyOwnership() const;

The first API returns the ownership kind associated with the value. This is
implemented via a visitor struct that statically can map a value via its
ValueKind to the relevant ownership semantics.

The second API returns true if the value considered as a def is compatible with
its users.

Operand

The only changed required of Operand is the addition of a new API:

OperandOwnershipUseKind Operand::getOwnershipUseKind() const;

This returns the statically mapped use semantics of this operand.

Instruction

The addition of ownership does not change the textual or in memory
representation of SILInstructions but semantically adds new requirements upon
each instruction. Each of these ownership semantics will be documented in
SIL.rst and will be enforced via the ownership verifier. We describe the more
important categories of instructions below.

Borrow Instructions

The instructions are used to introduce new guaranteed values:

  • begin_borrow requires its argument to be available and adds its result
    value to the borrow set of its argument.
  • end_borrow requires that its argument value be an available result value
    of a begin_borrow that has an empty borrow set. Upon execution the
    end_borrow removes the begin_borrow value from /all/ borrow sets at
    the end_borrow.
  • load_borrow works analogously to begin_borrow.

Lifetime Maintenance Instructions

These instructions are used to manipulate the lifetime of non-trivial
values. Specifically:

  • copy_value acts as a Point use of its argument value and produces a
    new copy of the value with Owned ownership.
  • destroy_value acts as a consuming use of its argument.

Function Application/Coroutine Instructions

Function Application instructions such as apply, try_apply generally
have OperandOwnershipUseKind semantics that one would expect except for one
special case: function parameters that are passed @guaranteed. In this case
since the use of the @guaranteed value is immediate upon function
application, we treat the use as a point use.

In constrast, since coroutines are not evaluated instantaneously, we require
that coroutine instructions perform a form of "pseudo"
reborrowing. Specifically:

  • begin_apply can only accept available borrowed arguments in
    @guaranteed positions and adds itself to their borrow sets.
  • end_apply and abort_apply remove the begin_apply from the borrow
    sets of all its borrowed arguments. That the begin_apply is still in the
    borrow sets at that point is implied by the separate formation rules of
    begin_apply.

ForwardingOrReborrow Instructions

There are certain classes of instructions that we allow to have either
Forwarding or Reborrow operands. To ensure consistency, we only allow
for this property to be set when the instruction is constructed. Some examples
of these types of instructions:

  • Aggregate Forming Instructions - struct, tuple, enum.
  • Casts - unchecked_ref_cast, upcast, unconditional_checked_cast
  • Enum extraction operations - unchecked_enum_data.
  • Existential opening of classes - open_existential_ref

NOTE: That unchecked_enum_data is a special case since enums always contain
conceptually a single tuple as their payload.

Value Projections and Destructures

Value projections such as tuple_extract, struct_extract that allow for
one to access the internal fields of a value without destroying the value. This
implies that the original value can not be destroyed. As such we model these as
instructions with Reborrow semantics. In contrast, destructure operations
such as destructure_struct and destructure_tuple take in a single value
and split it up into the constituant first level fields of the value. As such,
we model destructure operations as having Forwarding ownership.

SILArgument

In order to support the ownership model, we require the following changes to
SILArgument:

  1. All SILArguments must specify a fixed ValueOwnershipKind on
    construction.

  2. All SILPhiArguments are treated like an instruction result of the set of
    predecessor terminators. This means that the rules around Forwarding and
    Reborrow operands apply to any such SILPhiArguments.

  3. Any SILArgument with Guaranteed ownership is not required to have pairing
    end_borrows. This is because a SILFunctionArgument naturally has an
    end_borrow like scope (the function itself) and SILPhiArguments with
    Guaranteed ownership rely on the end_borrow set of its incoming
    values.


(Andrew Trick) #2

I like that you were able to strip SIL ownership down to the fundamentals. Once we get this right, we can add more documentation and examples.

The first subsection in "Ownership SSA Form", with points 1-8, introduces new terminology that makes the following subsections harder to understand. It's talking about (owned) values and borrow sets, while subsequent definitions are trying to make the rules for individual SILValue instances. An attempt at formal SIL rules won't succeed until that's resolved. At least, I can't prove to myself that the subsequent rules are correct when thinking in only terms of the abstract formalism.

It is interesting to talk about the relationship between values and borrowed instances of that value, but it isn't actually necessary to define the rules in SIL. After the formal abstraction, we need to transition to talking about SILValues and scoped SIL operations rather than making reference to the borrow set terminology.

Once you have the notion of a scoped SIL operation. As in begin/end_borrow and begin/end_apply, that gives you a way to enforce the formal model one SILValue at a time. A value simply needs to be available at any use point, and scoped SIL operations extend the effective use up to the end of their scope.

For example, it's very strange to define "Guaranteed" ownership primarily in terms of a "single entry-multiple exit region of code", as opposed to defining it in terms of normal rules for SILValues and their uses.

The entire subsection of bullets after the phrase "Forwarding and Reborrow operands additionally constrain the results" is quite misleading. Note that if an operation is Forwarding, then it always takes ownership of its operand and produces an owned value. If an operation is Reborrowing, then it always reborrows a guaranteed value and produces a guaranteed value. This section should be replaced with an explanation that some SIL operations are capable of either Forwarding or Reborrowing their operand. The ownership effect is determined statically by their operand value's ownership. This in turn determines the ownership of the value produced by this operation: Owned or Guaranteed--actually a simple concept. An example would be good.

The "SIL Changes for OSSA" section looks great... except that some fully formed examples would help.

coroutine instructions perform a form of “pseudo” reborrowing

I don't know why this is a "pseudo" borrow instead of a normal reborrow. This would not be considered a special case if we just stopped talking about borrow sets and introduced the concept of a scoped use, which I think is a much more important concept for SIL developers to understand.

"ForwardingOrReborrow" could be "ForwardOrReborrow".


#3

Should ‘exit’ above read as ‘exists’?


(Michael Gottesman) #4

yes. Thanks.


(Andrew Trick) #5

The formalism in points 1-8 fundamentally distinguishes between owned and borrowed values, but the language doesn't always explicitly identify the kind of value that the rule applies to:

"[Owned] values have static borrowed value sets...All dominating [owned] values v of p..."

It may be worth introducing the concept of an owned and borrowed value before stating the rules. Multiple SSA values represent the same semantic value. A copy operation creates a new instance of that semantic value with independent lifetime. A borrow operation creates a new SSA value that refers to the same instance of the value and has dependent lifetime.

Later, we refer to borrowed values as guaranteed. If we're sticking with that terminology, then we need a transition between the abstract formalism in rules 1-8 and the SIL rules, where we would make statements like "borrowed values have guaranteed SIL ownership".

Also, as part of the transition, we should indicate that an end_borrow makes borrowed values unavailable and a consuming operation makes owned values unavailable. (The scoped borrow operation allows the compiler to model the dependent lifetime and enforce some subset of those formal rules).


(Michael Gottesman) #6

For anyone listening in, I am making some edits to the proposal given Andy's feedback.