MemoryLayout<T>.offset for classes

Notably, the method for getting the offset of members in classes is not implemented. Is there a good reason for this? It would be useful for passing pointers to members to FFI. Also, some FFI functions return a pointer to such a member with no indication of what the parent object is; using the offset, you can reconstruct the pointer to the parent object.

Another question: does Unmanaged.toOpaque() give a raw pointer to the class instance, or a pointer inside the instance where the actual members are (past any metadata)?
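On the toOpaque() question, a quick check may help. As far as I can tell, toOpaque() returns the object reference itself reinterpreted as a raw pointer, so it points at the start of the heap object (header included), not at the first stored property. The class `Probe` below is a made-up example, and measuring a member address through `&probe.first` relies on the property being plain inline storage, which the language does not formally guarantee:

```swift
// Hypothetical class used only for this check.
final class Probe {
    var first: Int = 0
}

let probe = Probe()
let opaque = Unmanaged.passUnretained(probe).toOpaque()

// The opaque pointer is the same address that ObjectIdentifier reports.
let sameAsIdentity = Int(bitPattern: opaque) == Int(bitPattern: ObjectIdentifier(probe))

// The first stored property lives after the object header, so the
// measured distance from the opaque pointer is greater than zero.
let firstOffset = withUnsafeMutablePointer(to: &probe.first) {
    UnsafeMutableRawPointer($0) - opaque
}

print(sameAsIdentity, firstOffset > 0)
```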

This is strange indeed and looks like a bug:

class C {
    var x: Int = 0
    var y: Int = 0
    var z: Int = 0
}

print(MemoryLayout<C>.offset(of: \.x)) // nil
print(MemoryLayout<C>.offset(of: \.y)) // nil
print(MemoryLayout<C>.offset(of: \.z)) // nil

Here's a quick & dirty workaround, although it is not 1:1 as it needs a class instance vs a class itself and it wants explicit fields rather than keypaths:

let c = C()
let address = Int(bitPattern: ObjectIdentifier(c))
let xOffset = withUnsafePointer(to: &c.x) { Int(bitPattern: $0) } - address // 16
let yOffset = withUnsafePointer(to: &c.y) { Int(bitPattern: $0) } - address // 24
let zOffset = withUnsafePointer(to: &c.z) { Int(bitPattern: $0) } - address // 32
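Given an offset measured this way, the "reconstruct the parent from a member pointer" step from the original question could be sketched roughly as follows. `Node` and its fields are hypothetical, the FFI callback is only simulated, and the whole thing assumes the property is plain stored storage (no observers, final class) so that `&node.payload` really points into the instance; nothing in the language guarantees that layout:

```swift
// Hypothetical class standing in for an object handed to C code.
final class Node {
    var id: Int = 7
    var payload: Int = 42
}

let node = Node()
let base = Unmanaged.passUnretained(node).toOpaque()

// Measure the byte offset of `payload` at run time.
let payloadOffset = withUnsafeMutablePointer(to: &node.payload) {
    UnsafeMutableRawPointer($0) - base
}

// Pretend a C callback handed back only a pointer to `payload`:
// step back by the offset to recover the owning object.
let recovered: Node = withUnsafeMutablePointer(to: &node.payload) { memberPointer in
    let objectPointer = UnsafeMutableRawPointer(memberPointer) - payloadOffset
    return Unmanaged<Node>.fromOpaque(objectPointer).takeUnretainedValue()
}

print(recovered === node)
```

The offset is only meaningful for the exact class and build it was measured on, so it should not be hard-coded or persisted.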

It is not a bug; it is the documented behaviour.

A property has inline, directly addressable storage when it is a stored property for which no additional work is required to extract or set the value. Properties are not directly accessible if they trigger any didSet or willSet accessors, perform any representation changes such as bridging or closure reabstraction, or mask the value out of overlapping storage as for packed bitfields. In addition, because class instance properties are always stored out-of-line, their positions are not accessible using offset(of:).
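To make the three cases in that paragraph concrete, here is a small sketch with made-up types (`Record` and `Holder` are not from the documentation): a plain stored struct property, a struct property with an observer, and a class property.

```swift
// Hypothetical types for illustration only.
struct Record {
    var flag: Int8 = 0
    var count: Int32 = 0
    var observed: Int32 = 0 { didSet { } }   // has an observer
}

final class Holder {
    var value: Int = 0
}

// Plain stored struct properties report an offset.
let flagOffset = MemoryLayout<Record>.offset(of: \.flag)          // Optional(0)
let countOffset = MemoryLayout<Record>.offset(of: \.count)        // Optional(4)

// An observed property is not directly addressable, so no offset.
let observedOffset = MemoryLayout<Record>.offset(of: \.observed)  // nil

// Class instance properties are out-of-line from the reference, so no offset.
let classOffset = MemoryLayout<Holder>.offset(of: \.value)        // nil

print(flagOffset as Any, countOffset as Any, observedOffset as Any, classOffset as Any)
```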

Given that we have offset(of:), it seems reasonable that a version should exist for class instance members.

EDIT: The proposal explains a bit more about why they can't be supported:

Class properties are always stored out-of-line, and require runtime exclusivity checking to access, so their offsets would not be available by this mechanism.

Let's say you have a simple struct as a member of a class; you can be pretty sure it is stored inline in the class instance. Couldn't there be an offset method that returns the offset of a member when it is available, and nil for properties that have no inline storage? Shouldn't this be possible at compile time?

The "value" of a class is just a reference to the underlying instance storage, and so MemoryLayout<C> gives you the information about how the memory of that reference is laid out. Notably, the "size" of a class (as reported by MemoryLayout) is unrelated to the size of the members of the class:

class C {
    var x: Int = 0
    var y: Int = 0
}

struct S {
    var x: Int = 0
    var y: Int = 0
}

MemoryLayout<C>.size // 8
MemoryLayout<S>.size // 16

Adding more members to S will increase the value of MemoryLayout<S>.size, but this is not true for C. This is what the portion that Karl has quoted means: the properties are always stored out-of-line from the reference value. It is not talking about things like the fact that having an Array within a class might additionally store a buffer out-of-line from the enclosing class instance members.

There's also the exclusivity angle, which is kind of interesting.

If you read the proposal, mutation through a keypath and offset pointer are pitched as being exactly the same:

var root: T, value: U
var key: WritableKeyPath<T, U>

// Mutation through the key path...
root[keyPath: key] = value

// ...is exactly equivalent to mutation through the offset pointer...
withUnsafePointer(to: &root) {
  (UnsafeMutableRawPointer($0) + MemoryLayout<T>.offset(of: key))
    // ...which can be assumed to be bound to the target type
    .assumingMemoryBound(to: U.self).pointee = value
}

(Aside: I think there's a typo here - it should be withUnsafeMutablePointer(to: &root))

For structs this works, because in order to get the pointer to the root value, you must be in a scope where you are formally accessing it, and accessing the struct also accesses all of its stored properties. In other words, exclusivity is still enforced and these two ways of accessing the property are truly equivalent.
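For reference, a runnable version of the struct case might look like the sketch below. `Pair` is a made-up type, the optional offset is force-unwrapped because both fields are plain stored properties, and it uses withUnsafeMutablePointer since the pointee is written to:

```swift
// Hypothetical struct with two plain stored properties.
struct Pair {
    var first: Int32 = 1
    var second: Int32 = 2
}

var root = Pair()

// Mutation through the key path...
root[keyPath: \Pair.first] = 10

// ...and the same kind of mutation through the offset pointer.
let secondOffset = MemoryLayout<Pair>.offset(of: \.second)!
withUnsafeMutablePointer(to: &root) { rootPointer in
    (UnsafeMutableRawPointer(rootPointer) + secondOffset)
        .assumingMemoryBound(to: Int32.self)
        .pointee = 99
}

print(root.first, root.second)  // 10 99
```

The whole of `root` is formally accessed for the duration of the closure, which is what keeps the pointer write inside the exclusivity rules.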

For classes, this is not the case. Each property must be formally accessed individually, and mutating through an offset pointer would bypass that and not enforce exclusivity. So they would not be equivalent - mutating through an offset pointer would be less safe, and there is no way to opt back in to runtime exclusivity enforcement to get that safety back.

You might say that's okay - after all, you can only use these offsets via unsafe pointers anyway, so you already have to manually enforce lots of preconditions for safe access. Also, this is really all for the benefit of C interop, which is unsafe in all kinds of other ways.

Maybe it matters, maybe it doesn't; it's a very particular difference between stored properties in structs and classes, and I find it interesting that the proposal mentions it. Presumably that inherent unsafety makes it less desirable to add to the standard library.

I don't know too many details of how dynamic exclusivity enforcement works under the hood, but I wonder if we could expose withUnsafeExclusiveAccessTo(pointer) hooks or something that would allow the user who 'knows what they're doing' to write the exclusivity bounds themselves. 🤔

I agree it's not a bug. It just looks like a bug 🙂

I see. I must point out that it is totally non-obvious (when I first saw that in the documentation, I thought it was a bug in the documentation), but after your explanation I can see what they meant.

Have you tried the workaround I posted above?

Yes, I'm saying that's OK. When you are playing around with pointers and offsets, casting unrelated types back and forth, you are pretty much on your own. This includes potential aliasing problems. Also, when it comes to classes, I'd say that if you want the offset of an out-of-line property, just return the offset of the underlying metadata structure and let programmers do what they want with it.

Trying to engineer safety into primitives that are inherently unsafe doesn't make any sense.

Thank you for the suggestion. I'm not sure it helps me, because it requires a runtime calculation and an initialized instance in order to calculate the offset. I guess I could create a dummy instance during initialization, but the best solution would be to obtain the offset from the type alone.

Looking back, this still feels strange despite how it is explained in the docs and in the comments in this topic (we are merely talking about getting the offsets here, e.g. to print them out; accessing fields via unsafe means is a different story altogether). Wouldn't it be reasonable for MemoryLayout<SomeClass>.offset(of: \.someField) to be a compile-time error instead of returning nil at runtime?

A compile-time warning might be reasonable, but a compile-time error isn't. MemoryLayout<SomeClass>.offset(of: \.someField) is a valid expression and it would be odd to arbitrarily stop it from compiling.

This expression is as good as writing nil explicitly. MemoryLayout<SomeClass> is not very useful for anything (e.g. size, stride and alignment always return the same value, 4 or 8, as they do for UnsafeRawPointer). It's just not currently possible to restrict the T parameter of enum MemoryLayout<T> to value types only.
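A small sketch of that point, using a made-up class (`SomeClass` here is just a placeholder with a few fields): the layout reported for the class type matches that of a raw pointer regardless of how many members the class has.

```swift
// Hypothetical class; the number of fields does not affect the result.
final class SomeClass {
    var a = 0
    var b = 0
    var c = 0
}

let sameSize = MemoryLayout<SomeClass>.size == MemoryLayout<UnsafeRawPointer>.size
let sameStride = MemoryLayout<SomeClass>.stride == MemoryLayout<UnsafeRawPointer>.stride
let sameAlignment = MemoryLayout<SomeClass>.alignment == MemoryLayout<UnsafeRawPointer>.alignment

print(sameSize, sameStride, sameAlignment)  // true true true
```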

This is a very questionable design choice. Just because a property has "willSet" doesn't mean it suddenly "stops having any offset whatsoever" - an offset I could use, say, to print to the console:

struct S {
    var x: Int                      // offset 0, 8 bytes
    var y: Int { willSet { ... } }  // offset nil, 8 bytes (oops)
    var z: Int                      // offset 16, 8 bytes
}

It looks like the API "second-guesses" the developer's intentions one step too far here and tries to add extra safety to an API that is going to be used with the (inherently unsafe) UnsafePointer anyway.

I think it’s perfectly reasonable that the API doesn’t return the offset of a stored value that happens to back a computed property (which includes properties with will/didSet handlers). But there really ought to be a way to access the storage of a class instance.