StrictValueType protocol

henrikdvn · February 20, 2023, 2:55pm

I propose a protocol named StrictValueType which requires that conforming types are value-types in the traditional sense, i.e. types with fixed memory footprints that can be determined at compile time. With this protocol it will be possible to define types that can be:

stack allocated,
managed without ARC and
passed as parameters to low level C functions

using standard Swift protocols and generics.

Conforming types should include C value-types, primitive Swift types and Swift/C structs with conforming members. Reference-types and simulated value-types such as String, Array etc. should not be included. Neither should structs containing such types.

A reasonably safe, stack-allocatable array that needs no ARC management could be defined like this:

struct StrictValueArray<Storage: StrictValueType, Element: StrictValueType> {

	private var storage = Storage()

	var capacity: Int {
		return MemoryLayout<Storage>.stride / MemoryLayout<Element>.stride
	}
		
	subscript(index: Int) -> Element {
		get {
			assert(index >= 0 && index < capacity, "Index out of range")
			return withUnsafeBytes(of: storage) { (rawPtr) in
				return rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index]
			}
		}
		set(newValue) {
			assert(index >= 0 && index < capacity, "Index out of range")
			withUnsafeMutableBytes(of: &storage) { (rawPtr) in
				rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index] = newValue
			}
		}
	}
}
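A rough usage sketch for the array above, under a few assumptions that are not in the proposal: StrictValueType is written here as an ordinary protocol with an init() requirement (which `Storage()` needs in order to compile), conformances are declared by hand, and FourInts is a made-up storage type.

```swift
// Hypothetical stand-in for the proposed protocol. The init() requirement
// is an assumption so that `Storage()` compiles.
protocol StrictValueType { init() }

// Hand-written conformance; the proposal would presumably provide this.
extension Int32: StrictValueType {}

// Made-up storage type: four Int32 members, 16 bytes in total.
struct FourInts: StrictValueType {
    var a: Int32 = 0, b: Int32 = 0, c: Int32 = 0, d: Int32 = 0
}

// Same shape as the struct in the post above.
struct StrictValueArray<Storage: StrictValueType, Element: StrictValueType> {

    private var storage = Storage()

    var capacity: Int {
        return MemoryLayout<Storage>.stride / MemoryLayout<Element>.stride
    }

    subscript(index: Int) -> Element {
        get {
            assert(index >= 0 && index < capacity, "Index out of range")
            return withUnsafeBytes(of: storage) { (rawPtr) in
                return rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index]
            }
        }
        set(newValue) {
            assert(index >= 0 && index < capacity, "Index out of range")
            withUnsafeMutableBytes(of: &storage) { (rawPtr) in
                rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index] = newValue
            }
        }
    }
}

var numbers = StrictValueArray<FourInts, Int32>()
numbers[2] = 42
print(numbers.capacity, numbers[2])   // prints "4 42"
```

The whole array lives inline in the FourInts value, so no heap allocation or reference counting is involved.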

A string-convertible type with memory footprint equivalent to the C-type "char var[n]" could be defined like this:

struct StrictValueString<Storage: StrictValueType> {

	private var storage = Storage()

	var capacity: Int {
		return MemoryLayout<Storage>.stride
	}
	
	var stringValue: String {
		get {
			return withUnsafeBytes(of: storage) { (rawPtr) in
				return String(cString: rawPtr.baseAddress!.assumingMemoryBound(to: CChar.self))
			}
		}
		set(newValue) {
			assert(newValue.count < capacity, "String value overflow")
			strcpy(&storage, "\(newValue)")
		}
	}
}

ksluder · February 20, 2023, 3:53pm

Your proposed StrictValueType already exists as _Trivial.

henrikdvn · February 20, 2023, 5:03pm

Just tested _Trivial. String and class types were accepted without complaints.

_Trivial is not a protocol. According to the documentation it is a constraint that can be applied within a @_specialize function which "currently acts as a hint to the optimizer".

I guess _Trivial may (in principle) do the job at some point in the future, but a protocol would be a better option in my opinion.

Joe_Groff · February 20, 2023, 5:22pm

_Trivial is, as you noted, currently an internal layout constraint that's only used by specialization. But we do plan to expose it as a BitwiseCopyable generic constraint that can be used on generic parameters and protocol requirements soon.

tera · February 20, 2023, 5:24pm

I welcome this proposal. Not sure if the name is the bestest.
Is this the same as "POD" conceptually or will there be some differences between the two?

Let me clarify: this is a pseudo protocol, and the conformances would be automatic, right? No way to opt-in, and perhaps no need to have a way to opt-out either.

I understand that what you outlined is a sketch, a few suggestions though:

A small typo in capacity – should be "size" in the numerator ("stride" in denominator is good).
I'd also change "asserts" to "preconditions" so release builds are checked as well.
The setter in stringValue implementation looks incorrect. Try testing it with emoticons or just non-ascii symbols, etc.
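A small sketch of why the first and third suggestions above matter. MixedStorage is a made-up type. The string literal uses Unicode escapes so that the byte counts do not depend on how the source file is encoded.

```swift
// Made-up storage type with trailing padding.
struct MixedStorage {
    var word: Int32 = 0
    var flag: Int8 = 0
}

// size counts the bytes the value actually occupies;
// stride also includes padding up to the next aligned element.
let storageSize = MemoryLayout<MixedStorage>.size       // 5
let storageStride = MemoryLayout<MixedStorage>.stride   // 8

// "naïve 🙂": Character count and UTF-8 byte count differ
// as soon as the text is not plain ASCII.
let text = "na\u{EF}ve \u{1F642}"
let characterCount = text.count        // 7
let utf8ByteCount = text.utf8.count    // 11

print(storageSize, storageStride, characterCount, utf8ByteCount)
```

A check based on `count` would accept this string for an 8-byte buffer even though its UTF-8 form needs 11 bytes plus a terminator.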

How do you guys use _Trivial? It says "Error: Cannot find type '_Trivial' in scope" for me.

This is not good then.. StrictValueType should neither accept classes nor "structs with classes", nor "structs with structs with classes", etc.

ksluder · February 20, 2023, 5:36pm

What if the class is move-only? Then you can bitwise-copy the struct as long as you destroy the original.

Joe_Groff · February 20, 2023, 5:42pm

I think that question is a good example of why "BitwiseCopyable" or something like it is a more descriptive name than traditional C++ terms like "trivial" or "POD", or other more indirect terms. If a type contains move-only things, the container also can't be copied, so it wouldn't be bitwise copyable.

ksluder · February 20, 2023, 5:44pm

But I can still imagine wanting to bitwise copy a struct with a move-only class reference in it (or a move-only struct with a known-unique class reference) if I were implementing a custom allocator.

I guess these are both possible to implement with unsafe memory operations and a move-only constraint, rather than a bitwise-copyable constraint.

tera · February 20, 2023, 5:46pm

I don't know if the original StrictValueType had this property or not, but I believe it is important to have a notion of types that could be fwritten and recreated with fread after app relaunch - obviously that won't work with references. I.e. this is much stricter than just "bitwise copyable".

Joe_Groff · February 20, 2023, 6:02pm

That sounds more like a bitwise move to me, since copying the bits of the value with the reference necessarily needs to transfer ownership of that reference to the value in the new place.

Yeah, that would have to be a stricter constraint, since things like pointers, metatypes, and other process-specific values can be bitwise-copied within the context of a process, but aren't serializable.
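To make the fwrite/fread idea concrete, here is a minimal sketch that round-trips a value through a plain byte array. Sample is a made-up type, and `loadUnaligned(as:)` assumes Swift 5.7 or later.

```swift
// Made-up type containing only fixed-size integers.
struct Sample {
    var x: Int32
    var y: Int32
}

let original = Sample(x: 3, y: -4)

// Copy the raw bytes out; these could just as well be written to a file.
let bytes: [UInt8] = withUnsafeBytes(of: original) { Array($0) }

// Rebuild a value from the bytes alone.
let restored = bytes.withUnsafeBytes { $0.loadUnaligned(as: Sample.self) }

print(bytes.count, restored.x, restored.y)   // prints "8 3 -4"
```

The same round trip would also "work" for a struct holding a pointer, but only within one process run: after a relaunch the stored address would refer to nothing meaningful.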

bbrk24 · February 20, 2023, 6:03pm

Perhaps a bit like Sendable: automatic for internal types, but explicit for public types so as to prevent accidentally guaranteeing something it shouldn’t.

tera · February 20, 2023, 7:33pm

This deserves an example. E.g. you are making your own type like CGPoint and don't want it to be StrictValueType (why?).

On the naming front, "serialisable" to me suggests "an ability to serialise / deserialise", e.g. it could be achieved with some "var serialiseToBytes: Data; init(serialisedData: Data)" requirement, or the same way as Codable is done. In this regard, using a name like "Serialisable" instead of "POD" would be quite confusing.

The term Passive data structure (PDS) is also used in the industry.

To sum up, we are discussing a few quite different things here:

POD/PDS values (those can be recreated as "copies" from bits in a different process).
values that can be recreated as copies from bits in the same process only.
a value that can be recreated as a unique copy from bits in the same process (original value is destroyed). Attempt to create a second copy from the same bits should fail (somehow), or successfully create a new unique copy with the previously unique copy destroyed the moment the new unique copy is created (Edited).

Coming from C++ the 1st option is the simplest to grasp.

bbrk24 · February 20, 2023, 9:07pm

Let me take an example from a library I've worked on:

public struct Scope: Equatable {
    internal let value: UnderlyingType
    
    public static let transient = Scope(value: .transient)
    public static let singleton = Scope(value: .singleton)
    public static let weak = Scope(value: .weak)
    
    internal enum UnderlyingType: Equatable {
        case transient
        case singleton
        case weak
    }
}

Here, we've intentionally hidden the underlying enum. Only a wrapper struct is exposed. If, in the future, I add a new case to the enum:

case foo(String)

Suddenly the wrapper struct is no longer StrictValueType, and it could be a breaking change to clients.
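The effect can be observed today with the standard library's underscored `_isPOD` function. That is an unofficial, unstable API used here only for illustration, and the type names below are made up to show the "before" and "after" states side by side.

```swift
// The hidden enum before and after the new case is added.
enum UnderlyingBefore { case transient, singleton, weak }
enum UnderlyingAfter { case transient, singleton, weak, foo(String) }

// The public wrapper in each state.
struct ScopeBefore { let value: UnderlyingBefore }
struct ScopeAfter { let value: UnderlyingAfter }

// _isPOD reports whether a type can be copied bit-for-bit
// with no reference counting involved.
let before = _isPOD(ScopeBefore.self)
let after = _isPOD(ScopeAfter.self)

print(before, after)   // prints "true false"
```

Nothing in the wrapper's public interface changed, yet its layout property did, which is the argument for making the conformance explicit on public types.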

tera · February 20, 2023, 11:33pm

Note that we have a similar simple use case already:

Library defines:

public enum E { case e }

Client starts using it:

struct S: Equatable { var e: E }

Then the library adds a new enum case:

public enum E {
    case e
    case n(NSObject)
}

breaking the client code as "E" is no longer Equatable.

Note that we don't have an opt-out from the automatic equatable conformance now. Perhaps there are other similar cases.

Jumhyn · February 21, 2023, 12:37am

IMO the implicit conformance of enums to Equatable is bad and I would love to see it removed if there’s a way to make the (likely significant) source break manageable. I certainly wouldn’t want to see it used as precedent to introduce additional implicit conformances. (Though as with Sendable it’s not really an issue for non-public types.)

bbrk24 · February 21, 2023, 1:03am

Note that for public enums, adding a new case is already source-breaking. That's why we made that enum internal.

tera · February 21, 2023, 2:07am

This sounds like a reasonable compromise.

How does this mechanism work exactly? For example this compiles:

internal class S1 {}
public class S2 {}

func foo() {
    let s1: Sendable = S1()
    let s2: Sendable = S2() // compiles just fine
}

Shouldn't S2 require an explicit "Sendable" marker here?

henrikdvn · February 21, 2023, 11:56am

Ok. I guess that will do the trick. Good news for grumpy old C programmers like me.

Yes. I now realise that "stride" in the numerator only works when Storage is a homogenous tuple consisting of Element values.

So: any chance of getting a HomogenousTuple constraint any time soon?

Something like this would of course also be useful:

HomogenousTuple.init(repeating: Element, count: Int)

It should be BitwiseCopyable if Element is BitwiseCopyable.

henrikdvn · February 21, 2023, 2:44pm

I suggest opt-in for custom structs and automatic for primitive values. Not sure if it is possible to enforce protocol conformance for C types, but it would certainly be useful.

Agreed. But, as noted above, I think "stride" in the StrictValueArray struct should be ok when Storage is a homogenous tuple consisting of Element values. StrictValueString capacity, however, should definitely be based on "size".
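A quick check of the homogeneous-tuple case, using a made-up element type whose size is smaller than its stride. The numbers follow from Swift's layout rules for Int32 and Int8.

```swift
// Made-up element type: 5 bytes of data, padded to a stride of 8.
struct PaddedElement {
    var word: Int32 = 0
    var flag: Int8 = 0
}

typealias PairStorage = (PaddedElement, PaddedElement)

let elementStride = MemoryLayout<PaddedElement>.stride   // 8
let pairSize = MemoryLayout<PairStorage>.size            // 13
let pairStride = MemoryLayout<PairStorage>.stride        // 16

// Element count derived from each numerator.
let countFromStride = pairStride / elementStride   // 2
let countFromSize = pairSize / elementStride       // 1

print(countFromStride, countFromSize)   // prints "2 1"
```

For a homogeneous tuple the stride-based division gives the real element count, while the size-based one drops the last element whenever the element type has trailing padding.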

Point taken

I think the setter works, but the size calculation is wrong. It should be changed to

newValue.lengthOfBytes(using: .utf8)
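For illustration, a self-contained sketch of a setter that checks the UTF-8 byte length. It is not the code from the original post: FixedCString16 is a made-up type with a fixed 16-byte tuple as storage, and it uses `utf8.count` from the standard library, which gives the same number as Foundation's `lengthOfBytes(using: .utf8)`.

```swift
// Made-up fixed-capacity string: 16 bytes stored inline, NUL-terminated.
struct FixedCString16 {

    private var storage: (UInt64, UInt64) = (0, 0)

    var capacity: Int {
        return MemoryLayout.size(ofValue: storage)
    }

    var stringValue: String {
        get {
            // Decode the bytes up to (not including) the first NUL.
            return withUnsafeBytes(of: storage) { raw in
                String(decoding: raw.prefix(while: { $0 != 0 }), as: UTF8.self)
            }
        }
        set(newValue) {
            let utf8 = Array(newValue.utf8)
            // Compare bytes, not Characters, and leave room for the NUL.
            precondition(utf8.count < capacity, "String value overflow")
            withUnsafeMutableBytes(of: &storage) { raw in
                raw.copyBytes(from: utf8)
                raw[utf8.count] = 0
            }
        }
    }
}

var label = FixedCString16()
label.stringValue = "h\u{E9}llo \u{1F642}"   // 7 Characters, 11 UTF-8 bytes
print(label.stringValue == "h\u{E9}llo \u{1F642}")   // prints "true"
```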

Here is my test code:

class Product {
	var name: String

	init(name: String) {
		self.name = name
	}
}

struct TrivialStorage<Storage, Element> {

	var storage: Storage

	@_specialize(where Storage: _Trivial, Element: _Trivial)
	init(storage: Storage) {
		self.storage = storage
	}
}

var products = TrivialStorage<Array<Product>, Product>(storage: [Product(name: "Chair")])

print("\(products.storage[0].name)")
// Chair

henrikdvn · February 21, 2023, 3:56pm

I fear this is just wishful thinking, but given a BitwiseCopyable protocol which corresponds to _Trivial and a HomogenousTuple protocol, it would be possible to define a safe and storage-efficient generic BitwiseCopyableArray like this:

// Type "without any reference counted properties"
protocol BitwiseCopyable {
}

// Tuple containing only Element values
protocol HomogenousTuple {
	associatedtype Element
	init()
}

struct BitwiseCopyableArray<Storage: HomogenousTuple, Element: BitwiseCopyable> where Storage.Element == Element  {
	
	internal var storage = Storage()
	
	var capacity: Int {
		return MemoryLayout<Storage>.stride / MemoryLayout<Element>.stride
	}
	
	subscript(index: Int) -> Element {
		get {
			precondition(index >= 0 && index < capacity, "Index out of range")
			return withUnsafeBytes(of: storage) { (rawPtr) in
				return rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index]
			}
		}
		set(newValue) {
			precondition(index >= 0 && index < capacity, "Index out of range")
			withUnsafeMutableBytes(of: &storage) { (rawPtr) in
				rawPtr.baseAddress!.assumingMemoryBound(to: Element.self)[index] = newValue
			}
		}
	}
}