Yet another request for AnyValue, but along with Copyable this time

Here's the iterator type for a wrapping sequence that removes any duplicate value, even if the subsequent elements aren't next to the first one:

/// An iterator wrapping another and vending each element value at most once.
public struct DuplicateRemovalIterator<Base: IteratorProtocol, Set: SetAlgebra>
where Base.Element == Set.Element {
    /// The iterator to be filtered.
    var base: Base
    /// The elements to exclude from vending.
    var denyList: Set
}

extension DuplicateRemovalIterator: IteratorProtocol {
    public mutating func next() -> Base.Element? {
        while let upcoming = base.next() {
            if denyList.insert(upcoming).inserted {
                return upcoming
            }
        }
        return nil
    }
}

The code works just fine. But what happens if the Set argument is a reference type? Copying it wouldn't actually make an independent object. My insertions into the set for accounting purposes have just corrupted an application-level object. Oops. I guess the luser should not have used class types outside of stuff that needs to inherit from or use Apple's APIs. (Even then, Apple is trying to minimize that starting with SwiftUI.)
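To make the failure concrete, here is a minimal sketch of it. `SharedIntSet` is a hypothetical class-based `SetAlgebra` type invented for this illustration (it is not in the standard library or in my package), and the iterator is restated in compact form so the snippet stands alone.

```swift
/// The iterator from above, restated so this sketch is self-contained.
struct DuplicateRemovalIterator<Base: IteratorProtocol, Set: SetAlgebra>: IteratorProtocol
where Base.Element == Set.Element {
    var base: Base
    var denyList: Set

    mutating func next() -> Base.Element? {
        while let upcoming = base.next() {
            if denyList.insert(upcoming).inserted { return upcoming }
        }
        return nil
    }
}

/// Hypothetical reference-type set, assumed only for this demonstration.
final class SharedIntSet: SetAlgebra {
    typealias Element = Int
    private(set) var members: Swift.Set<Int>

    init() { members = [] }
    init<S: Sequence>(_ sequence: S) where S.Element == Int { members = Swift.Set(sequence) }
    init(arrayLiteral elements: Int...) { members = Swift.Set(elements) }
    private init(members: Swift.Set<Int>) { self.members = members }

    static func == (lhs: SharedIntSet, rhs: SharedIntSet) -> Bool { lhs.members == rhs.members }
    func contains(_ member: Int) -> Bool { members.contains(member) }
    func union(_ other: SharedIntSet) -> SharedIntSet { SharedIntSet(members: members.union(other.members)) }
    func intersection(_ other: SharedIntSet) -> SharedIntSet { SharedIntSet(members: members.intersection(other.members)) }
    func symmetricDifference(_ other: SharedIntSet) -> SharedIntSet { SharedIntSet(members: members.symmetricDifference(other.members)) }
    @discardableResult func insert(_ newMember: Int) -> (inserted: Bool, memberAfterInsert: Int) { members.insert(newMember) }
    @discardableResult func remove(_ member: Int) -> Int? { members.remove(member) }
    @discardableResult func update(with newMember: Int) -> Int? { members.update(with: newMember) }
    func formUnion(_ other: SharedIntSet) { members.formUnion(other.members) }
    func formIntersection(_ other: SharedIntSet) { members.formIntersection(other.members) }
    func formSymmetricDifference(_ other: SharedIntSet) { members.formSymmetricDifference(other.members) }
}

/// Runs the iterator with an application-owned set as the deny list and
/// reports what the iterator vended and what the application's set holds afterwards.
func demonstrateSharedDenyList() -> (vended: [Int], appSetAfterwards: [Int]) {
    let appSet: SharedIntSet = [10, 20]
    var iterator = DuplicateRemovalIterator(base: [1, 2, 1, 3, 2].makeIterator(),
                                            denyList: appSet)
    var vended: [Int] = []
    while let value = iterator.next() { vended.append(value) }
    // The iterator stored a reference, not a copy, so appSet now also
    // contains every element the iterator saw.
    return (vended, appSet.members.sorted())
}
```

With a `Swift.Set<Int>` passed instead, the caller's set would be untouched, because the iterator would own an independent value.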

I can/should document it for my wrapping types, but Swift is so weighted toward both using value types and assuming that user-submitted types are value types that it'll require a lot of documentation. The AnyValue protocol concept is less about finding a common interface for value types than about poisoning the use of reference types.

I think both Any and AnyObject can be used as existential box types. I guess we should do the same for AnyValue; it basically has an implementation like Any, with any optimizations for storing reference-type instances possibly removed.

Yes, I know we've discussed that what we really need to watch out for is value semantics, not value types. (There are ways to provide one without the other.) That's where the request to copy objects comes from. Right now, there is no definitive way to copy objects, because the Swift library assumes that value types and their built-in assignment (along with copy-on-write for inner class-based representations) will be enough. I think only three protocols (FloatingPoint, BinaryInteger, and RangeReplaceableCollection) provide ways to create independent copies of a value. There should be a new set of protocols to flag copy-construction, so we can use said semantics to ensure independent copies.

/// A type whose instances can make independent copies of themselves, but not
/// necessarily detached references to contained objects.
protocol ShallowCopier {

    /// Creates a copy of the given instance.  Properties or elements that are
    /// of reference types may still point to the same instances from the
    /// source; otherwise, this instance and the original can modify
    /// sub-objects independently.
    ///
    /// The initializer should return `nil` only if there's a connected
    /// reference-based resource that shouldn't be shared, like a network port
    /// or a large file in virtual memory.
    init?(copying original: Self)

}

/// A type whose instances can always make independent copies of themselves,
/// except that reference-based sub-objects may not be necessarily detached.
protocol StrongShallowCopier: ShallowCopier {

    /// Creates a copy of the given instance.  Properties or elements that are
    /// of reference types may still point to the same instances from the
    /// source; otherwise, this instance and the original can modify
    /// sub-objects independently.
    init!(copying original: Self)

}

/// A type whose instances can make independent copies of themselves, including
/// at the sub-object level.
protocol Copier: ShallowCopier {

    /// Creates a copy of the given instance.  Even the sub-objects (*i.e.*
    /// properties or elements) should be independent copies from the analogous
    /// sub-objects in the source.
    ///
    /// The initializer should return `nil` if there's a connected resource that
    /// can't have an independent duplicate, like a network port or a large
    /// read-write file in virtual memory.
    init?(deeplyCopying original: Self)

}

extension Copier {

    // So types conforming to Copier only have to implement one initializer.
    public init?(copying original: Self) {
        self.init(deeplyCopying: original)
    }

}

/// A type whose instances can always make independent copies of themselves,
/// including at the sub-object level.
protocol StrongCopier: Copier, StrongShallowCopier {

    /// Creates a copy of the given instance.  Even the sub-objects (*i.e.*
    /// properties or elements) should be independent copies from the analogous
    /// sub-objects in the source.
    init(deeplyCopying original: Self)

}
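As a rough illustration of what a conforming type might look like, here is a sketch with a hypothetical `Playlist` struct holding hypothetical `Track` objects. Only the two failable requirements are restated from the protocols above; everything else is an assumption made for the example.

```swift
// Restated from the protocols above, minus the documentation comments.
protocol ShallowCopier {
    init?(copying original: Self)
}

protocol Copier: ShallowCopier {
    init?(deeplyCopying original: Self)
}

/// Hypothetical reference-type element.
final class Track {
    var title: String
    init(title: String) { self.title = title }
}

/// Hypothetical value type that stores reference-type elements.
struct Playlist: Copier {
    var tracks: [Track]

    init(tracks: [Track]) { self.tracks = tracks }

    /// Shallow: the new playlist re-references the same Track objects.
    init?(copying original: Playlist) {
        self.tracks = original.tracks
    }

    /// Deep: every Track is rebuilt, so the two playlists share nothing.
    init?(deeplyCopying original: Playlist) {
        self.tracks = original.tracks.map { Track(title: $0.title) }
    }
}
```

Renaming a track through a shallow copy is visible from the original playlist, while a deep copy made beforehand keeps the old title.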

The default numeric and string types should conform to StrongCopier. The default collection types should conform to ShallowCopier, upgrading to one of the others when all of the generic arguments match. The automatic conformance that structure, enumeration, and tuple types can get for Equatable, etc. could be extended to ShallowCopier or higher. Yes, this is a lot of busy work for both the Swift and Apple-platform teams, but it should be relatively quick and represents a facility that should have been in Swift 1 or 2.

(If we go up to an ABI Version 2, or add retroactive base protocols, RangeReplaceableCollection should refine ShallowCopier. And conditionally higher if the Element type matches.)

Here's some previous threads I found:

  • Oh dear, I'm having an NSCopy/NSZone flashback.
  • Maybe you can still use Set<Element> here?

In all seriousness, I don't think an NSCopy-style operation works for Swift. The act of copying is so ingrained in the language that you'd expect most structs to handle it properly, either by being trivial or by adopting CoW.
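For readers who haven't written one, the copy-on-write pattern mentioned here can be sketched roughly as follows. `Tally` and `TallyStorage` are hypothetical names made up for the example; `isKnownUniquelyReferenced` is the standard library function usually used for this check.

```swift
/// Hypothetical class-based backing store.
final class TallyStorage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

/// Hypothetical value type that keeps value semantics on top of a class.
struct Tally {
    private var storage = TallyStorage(values: [])

    init() {}

    var values: [Int] { storage.values }

    mutating func append(_ value: Int) {
        // Make a private copy of the storage only if another Tally
        // still shares it; otherwise mutate in place.
        if !isKnownUniquelyReferenced(&storage) {
            storage = TallyStorage(values: storage.values)
        }
        storage.values.append(value)
    }
}
```

Plain assignment (`var second = first`) then behaves like an independent copy, even though both values briefly share one `TallyStorage` object.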

I don't define any zones; whatever default allocation routines Swift uses for instances stay as-is.

The very problem is that although we expect copying to be properly handled, and a lot of library code expects the same, it technically isn't. That expectation breaks when the user innocently swaps in a reference type.

BTW, my default functions do use Set. The core code intentionally triggers on SetAlgebra instead so non-Hashable types can work and non-equality comparisons (e.g. case-insensitive strings) can work.

Do I have the sub-typing correct? Right now, a shallow-copying type can implement its copying by doing deep-copying 100% of the time. This allows the deep-copying protocols to refine the shallow-copying ones.

  • Is it workable that deep-copying types can pose as shallow-copying types?
  • Should it be flipped so shallow-copying types can pose as deep-copying types?
  • Or should the two capability lines be separated?

I was thinking that assignment-copying an Array does a shallow-copy on reference-type Element instantiations, and we need to take extra steps for deep copying. That's where I got the idea for the current hierarchy.
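A small sketch of that observation, using a hypothetical `Label` class: assigning an array copies the array's own structure, but class-type elements are still shared between the two arrays.

```swift
/// Hypothetical reference-type element.
final class Label {
    var text: String
    init(text: String) { self.text = text }
}

/// Returns the first label's text and the first number, as seen through
/// the ORIGINAL arrays after their copies were modified.
func demonstrateArrayAssignment() -> (labelText: String, number: Int) {
    let labels = [Label(text: "draft")]
    let labelsCopy = labels            // copies references, not objects
    labelsCopy[0].text = "final"       // mutates the shared Label

    let numbers = [1, 2, 3]
    var numbersCopy = numbers          // value-type elements are independent
    numbersCopy[0] = 99

    return (labels[0].text, numbers[0])
}
```

The original `labels` array sees the new text, while the original `numbers` array is unaffected.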

I’m not talking about the zone part. Even back in the day, zones were largely left as nil. Adding a .copy function is theoretically fine and all, but people will just do:

let copy = value

Also, while explicit copying is different from CoW, I think the features overlap far too much, and Swift is known to favor the latter.

The way you phrase it, that a shallow copy may leave the references intact, should mean that deep copy refines shallow copy. It’s not inherent to the notion though; strictly-shallow copying might be required in some scenarios.

Hmm, I guess I should sever the relationship, and not have either copying protocol refine the other.

...

That was two days ago. The next day, I rethought the design into:

/// Returns a new object that is observationally equivalent to the given value,
/// where subsidiary objects of reference type are themselves cloned (instead of
/// re-referenced) to the given depth.
///
/// Objects that can be readily cloned are either of a value type or conform
/// to `Cloner` (or both).  See the documentation for `Cloner` for more details
/// on the copying semantics.
///
/// Unlike user-defined generic functions, the generic argument is always
/// pierced to determine if that type conforms to `AnyObject` and/or `Cloner`.
///
/// - Precondition: `depth >= 0`.
///
/// - Parameters:
///   - original: The source object to be copied.
///   - depth: When equal to zero, the returned object should re-reference the
///     same subsidiary objects of class type as `original`.  Otherwise, those
///     subsidiary objects should be cloned akin to a call to
///     `clone(subObject, depth - 1)` for the analogous positions in the
///     returned object.
/// - Returns: A new object as created by `init?(copying:toDepth:)` if the source
///   type conforms to `Cloner`.  Otherwise, a copy created akin to a constant
///   declaration when the source is a value type, or `nil` if it's a reference
///   type.
func clone<T>(_ original: T, toDepth depth: Int) -> T? {
    // PUT COMPILER MAGIC HERE
}

/// A type where an instance may be an independent copy of another instance,
/// with various levels of copying for sub-objects.
protocol Cloner {

    /// Creates a copy of the given instance, where sub-objects are themselves
    /// cloned to the given depth.
    ///
    /// The cloning of sub-objects should target those that act like elements of
    /// a collection or a wrapped object of an `Optional`.  Other sub-objects
    /// that are needed to manage the top object's accounting of its data and
    /// its invariants should apply a copying philosophy that keeps the top
    /// object's invariants stable.
    ///
    /// For instance, take a value type like `Array` that stores its elements
    /// with a copy-on-write philosophy upon an auxiliary sub-object of a
    /// `class` type.  If the cloning `depth` is zero, then the elements act as
    /// if they were shallow-copied and simply reusing a reference to the
    /// source's auxiliary object is sufficient until either the source or copy
    /// invokes C.o.W. on a mutating call.  When the `depth` is above zero, then
    /// the new object needs to make a new auxiliary sub-object anyway to store
    /// all the cloned elements.
    ///
    /// The required initializer is fail-able because certain instances may not
    /// be copyable.  For instance, a type like `Data` may link its stored bytes
    /// into memory, or into a page file representing virtual memory.
    /// Sometimes, the page file(s) may be too big to copy, causing that
    /// instance to fail to be copied.  If a sub-object's copying fails, so must
    /// the containing object(s)' copying.
    ///
    /// - Precondition: `depth >= 0`.
    ///
    /// - Parameters:
    ///   - original: The source object to be copied.
    ///   - depth: When equal to zero, the new object should simply copy all the
    ///     subsidiary objects across akin to a constant declaration, even if
    ///     that would result in a shallow copy.  Otherwise, the subsidiary
    ///     objects should be cloned across as if by a call to
    ///     `clone(subObject, depth - 1)`.
    /// - Postcondition: The new object is observationally equivalent to
    ///   `original`, except element(-like) sub-objects are cloned when `depth`
    ///   is above zero.  Fails if either at least one sub-object is of a
    ///   reference type that does not conform to `Cloner` and/or at least one
    ///   sub-object conforming to `Cloner` fails its initializer.
    init?(copying original: Self, toDepth depth: Int)

}

/// A type where an instance can always be made as an independent copy of
/// another instance, essentially ignoring the sub-object cloning depth.
protocol StrongCloner: Cloner {

    /// Creates a copy of the given instance, where sub-objects themselves are
    /// either always successfully cloned or are used in a context where their
    /// clone status doesn't matter.
    ///
    /// Since cloning must always succeed, the element(-like) sub-objects of
    /// the conforming type must have types that match (at least) one of:
    ///
    /// - A value type.
    /// - Conforms to `StrongCloner` themselves.
    /// - Conforms to `Cloner`, but used in a context where `depth` isn't
    ///   reduced.
    /// - A reference type that can't clone, but used in a context where cloning
    ///   isn't applied.
    ///
    /// As with `Cloner`, cloning shouldn't be applied to instance properties
    /// that are not user data, but are instead used for accounting to maintain
    /// invariants.
    ///
    /// - Precondition: `depth >= 0`.
    ///
    /// - Parameters:
    ///   - original: The source object to be copied.
    ///   - depth: The copying depth of sub-objects, which should be ignored for
    ///     conforming types.
    /// - Postcondition: The new object is observationally equivalent to
    ///   `original`.
    init!(copying original: Self, toDepth depth: Int)

}

Instead of separate shallow- and deep-copying protocols, they're fused into one where the cloning depth is a function parameter. For the guys that work on the Swift runtime and compiler, note the part for the new global function where I say:

Unlike user-defined generic functions, the generic argument is always pierced to determine if that type conforms to AnyObject and/or Cloner.

Is this implementable? Or at least implementable most of the time? (If it's just most of the time, then we can define the function to return nil when conformance testing fails.)
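For what it's worth, part of this can apparently be approximated in user code with dynamic casts, without compiler magic. The sketch below assumes Swift 5.7 or later (for `any Cloner`). The `cloned(toDepth:)` helper and the `Box` and `Plain` classes are hypothetical names added for the example. It does not recurse into the sub-objects of plain value types, which is the part that would still need help from the compiler or runtime.

```swift
// Restated from the design above, minus the documentation comments.
protocol Cloner {
    init?(copying original: Self, toDepth depth: Int)
}

extension Cloner {
    /// Hypothetical helper so the initializer can be reached through an existential.
    func cloned(toDepth depth: Int) -> Self? {
        Self(copying: self, toDepth: depth)
    }
}

/// User-space approximation of the proposed global function.
func clone<T>(_ original: T, toDepth depth: Int) -> T? {
    precondition(depth >= 0)
    // Conforming types decide for themselves how to copy.
    if let cloner = original as? any Cloner {
        return cloner.cloned(toDepth: depth) as? T
    }
    // A reference type without a Cloner conformance cannot be cloned.
    if type(of: original as Any) is AnyClass {
        return nil
    }
    // Otherwise assume a value type and rely on ordinary copying.
    return original
}

/// Hypothetical class that opts in to cloning.
final class Box: Cloner {
    var value: Int
    init(value: Int) { self.value = value }
    init?(copying original: Box, toDepth depth: Int) { self.value = original.value }
}

/// Hypothetical class that does not opt in.
final class Plain {
    var value = 0
}
```

Under these assumptions, `clone(5, toDepth: 0)` returns `5`, cloning a `Box` returns a distinct object with the same `value`, and cloning a `Plain` instance returns `nil`.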