If this was in response to my suggestion to remove implicit copying, I'm not suggesting that Vector be made non-copyable.
For a long time, the design for @noImplicitCopy included a type-level annotation for structs that were potentially expensive to copy. You can still copy them, but the compiler will tell you when those copies happen and give you an opportunity to avoid them. I still think it could make sense here - a lot of things in the language do just borrow by default (e.g. arguments to normal functions).
I think we have a general problem in Swift, with an unfortunate combination of implicit copying and very large (or just expensive to copy) value types. The takeaway of this discussion for me is a greater appreciation of how much we have depended on copy-on-write indirect storage to make implicit copies in the language model viable until now.
For example, slices and collection wrappers (ReversedCollection, and all the stuff in swift-algorithms). We're going to need to replace all of this stuff with types that borrow rather than copy - both for non-copyable collections, and possibly-expensive-to-copy collections like Vector - which is a huge amount of work but definitely seems worth it.
But this problem isn't limited to Vector or the Collection family of protocols - vectors are just easy ways to create enormous value types.
I mentioned recently that model structs in particular can suffer from similar issues because they are large and complex to copy, which is bad news for SwiftUI's result-builder syntax, since it makes heavy use of initialisers, which are consuming, and hence cause implicit copies:
Some of the replies noted that indirect structs could be a good solution for that problem.
I wonder if there is some way that we could allow for individual vectors to be annotated as indirect? So you could have an indirect Vector<10_000, String> and that would put it in a COW box.
AIUI, the current design being discussed for indirect structs allows both the entire struct and individual fields to be marked indirect, just like enums allow for the entire enum to be marked indirect or individual cases.
This means that if you have a var names: Vector<10_000, String> field in a struct, then you can just mark it indirect, and it should be possible to define something like:
struct Box<T: ~Copyable>: ~Copyable {
    indirect var wrapped: T
}
extension Box: Copyable where T: Copyable {
    // implement copy-on-write here (if indirect doesn't do it automagically)
}
For when you need an indirect local variable. Although whether that should be a struct like this or a property wrapper is open to debate.
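For comparison, here is a minimal sketch of what a hand-written copy-on-write box looks like in today's Swift, without the `indirect` syntax under discussion. The names `Storage` and `CoWBox` are hypothetical and only for illustration; the one standard-library call it relies on is `isKnownUniquelyReferenced`.

```swift
// Hypothetical heap storage for the boxed value (illustrative name).
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Hypothetical copy-on-write box: copying the struct copies one reference,
// and the payload is only duplicated when a shared box is written to.
struct CoWBox<T> {
    private var storage: Storage<T>

    init(_ value: T) {
        storage = Storage(value)
    }

    var wrapped: T {
        get { storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // Nobody else shares this storage, so overwrite it in place.
                storage.value = newValue
            } else {
                // The storage is shared, so give this box its own storage.
                storage = Storage(newValue)
            }
        }
    }
}

let first = CoWBox([1, 2, 3])
var second = first          // copies a reference, not the payload
second.wrapped.append(4)    // storage is shared, so a new Storage is allocated
```

After the write, `first.wrapped` is still `[1, 2, 3]` while `second.wrapped` is `[1, 2, 3, 4]`. Whether `indirect` would generate something equivalent automatically is exactly the open question above.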
This is, to be clear, a "vibes" thing rather than a formal analysis of memory safety. But the concern I have is that Vector is a new Collection type that it is really easy to pass into contexts where someone would treat it as a collection, in ways that tuple is rarely used as-is, because of the inconvenience of accessing its elements collectively. It's unlikely you are going to find an API that takes a 512-element tuple! The standard playbook for doing any sort of collection-y thing with an imported C array is that you take the tuple and form an UnsafeBufferPointer out of it, in a way that is IMO pretty noisy and easy to search for. If you pass this to an API that expects a collection and it misbehaves, I think it's not generally difficult to see that it was passed an UnsafeBufferPointer and realize that some of its elements may not be initialized, and that special care needs to be taken to handle it. This situation is called out repeatedly in its documentation, for example.
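As a rough illustration of that playbook, the sketch below views a homogeneous tuple through an UnsafeBufferPointer. The `CInt4` alias is a hypothetical stand-in for a C field such as `int32_t values[4]` as the importer would present it, and `sum(of:)` is an illustrative helper rather than anything from a real API.

```swift
// Hypothetical stand-in for a C array imported as a homogeneous tuple.
typealias CInt4 = (Int32, Int32, Int32, Int32)

// Illustrative helper: treats the tuple's storage as a buffer of Int32.
func sum(of values: CInt4) -> Int32 {
    withUnsafeBytes(of: values) { raw in
        // The "unsafe" step is spelled out here and is easy to grep for.
        let buffer = raw.bindMemory(to: Int32.self)
        return buffer.reduce(0, +)
    }
}

let imported: CInt4 = (1, 2, 3, 4)
let total = sum(of: imported)
```

The point is less the arithmetic than the shape of the code: the collection-like view only exists inside a closure with "unsafe" in its name, so it is hard for the tuple to wander far from where it was produced.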
On the flip side, Vector, or whatever we end up calling it, seems to be generally useful for a lot of things that will never leave it uninitialized, and also a likely currency type for those operations. I think it's a lot less likely that someone who gets a Vector that has been handed around half a dozen times is thinking about whether it originated from a C API where some of the elements were left uninitialized. In most Swift codebases, unsafe types quickly "expire" as people write safe layers on top of unsafe C APIs. Nobody wants to expose an UnsafePointer in their Swift interface! But a Vector seems attractive to plumb through directly.
I don't disagree with you that calling any C function is, in theory, inherently unsafe. But in practice when tracking down memory safety errors a C array imported as a tuple doesn't usually go very far from the place that produced it, because its shape makes it practically impossible to accidentally pass into an API that could accept it as-is. The type that people reach for to represent it instead has "unsafe" in its name. For Vector, you don't need to do this, and you can basically just hand it directly to any function that is willing to operate on one. I see this as allowing it to travel very far in a codebase under the guise of a type that doesn't obviously indicate that it is partially initialized. Fundamentally, and somewhat counterintuitively, a homogeneous tuple is not actually very useful, while a Vector is. So my concern centers around what happens when someone tries to actually use one of these because it's a lot easier to do so and it's in a state where this is unsafe.
Would CArray differ from Vector in that it may contain uninitialized elements? Types like CInt can't have an invalid or uninitialized representation, so CArray would probably need Unsafe in the name to indicate this difference. At that point, the "C" becomes superfluous, as the type could be named UnsafeVector to convey the same meaning in a language-agnostic way.
Long needed addition, and imo the name is right: Vector<Double> will basically be what many people would recognize as a vector in math or 3d-graphics context, and we should try to do better than other languages rather than repeating their errors.
Although Swift already has some serious surprises for mathematicians (Set vs Sequence... and + does what when used on arrays?!? ;-), it is never too late to embrace established conventions.
I want to share a semi-useful thought that occurred to me several times while following this thread:
Software development != Mathematics
What I mean by that is that - while certainly connected - the craft of software development is not the craft of mathematics. Each has their own historically grown and established lingo, and often concepts with the same name don't fully match.
So, without any desire to discuss any specific term, I think "not surprising a mathematician" is much less a goal than "not surprising a software developer".
It is basically two languages which have a Vector that does not match the original "specification", isn't it? Programming languages rise and fall, but math has been here for hundreds of years, and it will (hopefully) still be taught to students when hardly anyone remembers C++, Java or even Swift.
Even today, I don't think it is that common to learn Swift after learning C++, whereas every student should know how to do basic calculations with vectors.
Really looking forward to Vector and Span, and being able to forget all about Unsafe[...]Pointer! (sic)
As far as safely interoperating with C libraries goes, the client of a Swift wrapper should expect that whatever they are handed is safe to use. Checking whether or not a C type's been properly initialised before sticking it inside a Swift type is kind of the main point of wrapping C code, so a library that doesn't do this isn't really doing its job... It's not the Vector that implies the safety, it's the wrapper.
On the point about accidentally copying huge Vectors all the time, it's obviously something to mention in bold type in the documentation, but as Karl pointed out, it's a problem with the design of Swift collections as a whole. And also a question of choosing the right tool for the job. Are there really any egg-laying woolly-milker-pig types in the standard library? I thought that's why people keep coming up with new ones.
As for the name, I'm afraid I just don't see the problem:
You can turn a Vector into a proper mathematical vector with a couple of lines in an extension (where Element == SomeCoordinateType), but it'd take a lot more effort to turn it into a fixed-length Array. That's how metaphors are supposed to work.
Vectors imply slots, arrays feel more like strings (it'd be weird to implement a vector type as a linked list; for an array, it's just ill-advised).
I've been confused by the STL's usage for thirty years (interesting to learn that Stepanov sympathises).
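To make the first point concrete, here is a sketch of how small such an extension can be. Since the proposed Vector isn't available to compile against, `Vector3` below is a hypothetical three-slot stand-in, and Double plays the role of SomeCoordinateType; none of these names come from the proposal.

```swift
// Hypothetical stand-in for the proposed Vector, fixed at three slots
// so that the sketch compiles without the proposal.
struct Vector3<Element> {
    var storage: (Element, Element, Element)

    init(_ x: Element, _ y: Element, _ z: Element) {
        storage = (x, y, z)
    }
}

// The "couple of lines" that add mathematical behaviour for one element type.
extension Vector3 where Element == Double {
    // Element-wise addition.
    static func + (lhs: Vector3, rhs: Vector3) -> Vector3 {
        Vector3(lhs.storage.0 + rhs.storage.0,
                lhs.storage.1 + rhs.storage.1,
                lhs.storage.2 + rhs.storage.2)
    }

    // Dot product.
    func dot(_ other: Vector3) -> Double {
        storage.0 * other.storage.0
            + storage.1 * other.storage.1
            + storage.2 * other.storage.2
    }
}

let a = Vector3(1.0, 2.0, 3.0)
let b = Vector3(4.0, 5.0, 6.0)
let sum = a + b
let product = a.dot(b)
```

Going the other way, from slots to something that grows and shrinks like an Array, would need a lot more than one constrained extension.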
Anyway, whatever it ends up being called, please don't stick bits of fruit all over it, just to handle a few edge cases. It's a simple thing.
I'm coming a little late to this but +100 to this proposal. The name is fine, it reflects what it does perfectly and docs will clear anything up if there is confusion. This feature is utterly necessary for embedded systems and being able to reason about performance and memory use on those systems. Also +1 on the C interop, not having to deal with an n-length tuple will be a huge help.