Efficient yielding accessor ABI

Meaning, e.g. one whose type depends on a generic parameter? Otherwise, I can't imagine what this means.

I assume that would be necessary because the accessor itself may not be labeled async? Regardless, why wouldn't that be possible? I can understand not wanting to pay for generating two versions of the code, of course.

Sure, and no shade on you for taking that approach in the first place, but at some point we lock down ABIs and take the consequences. I'm really asking about the reasons for continuing to use coroutines going forward.

Edit: and can you offer an example of a situation with inversion-of-control problems?

In Swift, all struct and enum types except those declared @frozen are dynamically-sized across resilience boundaries.

As in Hylo. Thanks, so easy to forget.

The async pattern, for one, if you don't simply re-analyze it as a static scope within an async function. But more generally, imagine having a non-copyable type representing the access which you could just dynamically create and destroy.

John, I wish I could understand any of what you wrote here. I can vaguely guess at the latter one but really I have no clue. Would you mind elaborating? It sounds interesting

The coroutine model is essentially the accessor returning a continuation function that the caller will call to end the access. There’s a lot of complexity that’s trying to get the allocation of the coroutine’s local context off the heap, but fundamentally, conceptually, the accessor is returning a continuation function that the client can call whenever it wants as long as it maintains the invariants of the accessor (e.g. the validity and exclusivity of arguments) the entire time. You can imagine a sort of lens value (not really the right term) that stores this continuation and can be consumed to call it, making the scope of the access completely dynamic.

But not completely, right? To me that would mean it could escape. And if it can’t escape then you have the Hylo model.

I can certainly imagine an API which safely composes them dynamically. My real point is just that it’s more permissive in ways that we’ve already benefitted from. Certainly, if we accept that we only need to design around the current feature set of the language, you can optimize around that.

Could I get pointers to the relevant sources implementing code generation for accessors in swiftc? I have had a very hard time understanding LLVM’s coroutine API and thought it would help to look at Swift’s implementation, but I got lost the last time I tried.

Also I’d like to confirm/refute some of my understanding so far, since Hylo’s implementation probably overlaps a lot with Swift’s.

Let’s set the stage with an example. Feel free to skip to the end for my actual questions. Say I have a subscript like this in Hylo (let’s forget about exceptions for the moment):

subscript neg(x: inout Int): Int {
  inout {
    var y = -x
    yield &y
    &x = -y
  }
}

which in Swift could be written like this:

extension Int {
  var neg: Int {
    // Or I guess `yielding mutate` with SE-0474
    _modify {
      var y = -self
      yield &y
      self = -y
    }
    get { -self }
  }
}

During code generation I will emit neg(_:) roughly like this:

declare void @_val_slide(ptr, i1 zeroext)

define private { ptr, ptr } @neg(ptr %0, ptr noalias nofree %1) {
prologue:
  %2 = alloca %Int, align 8
  %3 = alloca %Int, align 8
  %4 = call token @llvm.coro.id.retcon.once(i32 8, i32 8, ptr %0, ptr @_val_slide, ptr @malloc, ptr @free)
  %5 = call ptr @llvm.coro.begin(token %4, ptr null)
  br label %b0

b0:                                               ; preds = %prologue
  call void @"Int.prefix-"(ptr %1, ptr %3)
  %6 = call i1 (...) @llvm.coro.suspend.retcon.i1(ptr %3)
  call void @"Int.prefix-"(ptr %3, ptr %2)
  call void @llvm.memcpy.p0.p0.i32(ptr %1, ptr %2, i32 8, i1 false)
  %7 = call i1 @llvm.coro.end(ptr %5, i1 false)
  unreachable
}

As we can see, we’re using the llvm.coro.id.retcon.once intrinsic. Looking at this test from swiftc I believe Swift is doing the same. In fact, Hylo’s output is very similar to the one expected from the Swift test.

Then I’m using the default optimization pipeline to let LLVM apply its splitting magic. Eventually the call site will look like this, which AFAICT is also quite similar to what eventually happens in Swift:

%2 = alloca %Int, align 8
; ...
%5 = alloca [8 x i8], align 8
%6 = call { ptr, ptr } @neg(ptr %5, ptr %2)
; ...
%8 = extractvalue { ptr, ptr } %6, 0
call void %8(ptr %5, i1 false)

With that context I have a couple of questions:

  1. The compilation scheme that @dabrahams suggested is not implemented by a _modify accessor, correct? So I’d have to study what swiftc does when I use the yielding mutate syntax instead?
  2. Is there a way I can ask swiftc to emit LLVM IR before coroutine splitting?
  3. Does swiftc use something other than the default pipeline to perform coroutine splitting, and if so, where can I study that?
  4. Looking at this test it seems that the first argument passed to llvm.coro.id.retcon.once is -1. As I failed to find that documented in LLVM, I wonder why. Perhaps that’s linked to my next question.
  5. I noticed Swift’s emitting llvm.coro.id.retcon.once.dynamic (emphasis on the last component). As I failed to find that variant of the intrinsic documented in LLVM, I wonder what it does.
  6. Are there any other relevant differences between the code that Hylo and Swift produce in the above example that I would have missed?
  7. I have been unable to observe the behavior @Joe_Groff described, about the context of the ramp being left unpopped on the call stack; all definitions in LLVM IR look like good old functions. Perhaps there’s a test somewhere that I could use to study this special behavior?

-emit-irgen will get the raw output of IRGen without running any passes. We do not use a nonstandard splitting pass.
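For anyone following along, an invocation might look like this (the file name is a placeholder, and on some toolchains the flag may need to be forwarded with -Xfrontend):

```shell
# Dump the LLVM IR as produced by IRGen, before coroutine splitting runs.
swiftc -emit-irgen Neg.swift
```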

I feel like @Alvae asked a lot of good questions that have gone unanswered. I'd still like to know the truth here.
