Custom iterator

Is this correct iterator implementation?

// const unsigned char* _Nonnull getMemory(NSInteger* _Nonnull count);
// void releaseMemory(const unsigned char* _Nonnull memory);

struct TestIterator: IteratorProtocol {
    var bytes: UnsafePointer<UInt8>?
    var index = 0, count = 0
    
    init() {
        bytes = getMemory(&count)
        precondition(bytes != nil && count >= 0)
    }

    mutating func next() -> UInt8? {
        guard let bytes else { return nil }
        guard index < count else {
            releaseMemory(bytes)
            self.bytes = nil
            return nil
        }
        index += 1
        return bytes[index - 1]
    }
}

It seems like a perfectly good Iterator, but not a very good owner of memory, because (a) you don't have to finish iterating through an iterator; you can just stop, leading to a leak…and (b) you can copy iterators (even though it would have been nice to disallow that), leading to a potential double release.

EDIT: if you are adapting a C API for your own project, the safest and most correct thing to do would be to use a class that owns the buffer and have struct iterators that keep references to it, but if that's too much trouble you can just call it "UnsafeTestIterator" and document how you expect it to be used. Not every project has to uniformly be held to the same standards as a public API.

1 Like

Is it possible to disallow iterator copying?

I’m sure the Language folks are working on something, but IteratorProtocol itself is frozen at this point and that means it has an implicit Copyable requirement. It’s also questionable to begin with, since some iteration wants to consume a collection and some wants to borrow from it, and only the former can currently be expressed in Swift.*

* This is not strictly true, a mutating _read accessor would allow for “borrowing iteration”. But it sure would be confusing to have a property with a side effect like that. You could also have separate value and advance operations, but then the iterator has to hold the current value; something dynamic like the sequence functions wouldn’t work, or at least not as well. Again, I’m sure the Language team is weighing alternatives.

1 Like

Interesting. I wonder, if we didn't have it already and did it from scratch, would we make iterator a reference type?

I think we would make it a non-copyable protocol. It’s not really expected to copy iterators, after all. But the compiler won’t stop you.

1 Like

Huh, indeed, changing struct TestIterator to class TestIterator and removing mutating from next works, and now copying this iterator is safe indeed (no crashes, no leaks, etc). Looks like a good choice in this case considering the alternative of making an extra reference type to keep the buffer as you suggested above.

"Copying" iterator prohibition bothers me slightly because it's not enforced and it would be easy to make this mistake and shoot oneself in the foot. Making it safe in the presence of accidental copying feels safer (although if we really want to prohibit copying I'd prefer having a runtime assertion at least).

Right, just using a class is now memory-safe, but “copies” of the iterator share their position, which is…probably not what a client would expect/want, but also not likely to come up in practice.

When it is a class and I user is doing let iterator2 = iterator, he/she is (consciously) not making a copy... a copy would be something like let iterator2 = iterator.copy() which is unavailable.. To put it differently, making it a class effectively prohibits creating "copies".. IRT shared position, IMHO that would be expected and behave similarly to FileHandle, which is a class and acts like an iterator with a slightly different API.

That is not quite right. Assigning to a local variable is still a semantic copy which just turns out to be a swift_retain under the hood for reference types. You can check if the object is uniquely owned and do a real copy if you want to. That's how Copy on Write (CoW) works. You can see this easily if you use an ownership modifier which will force you to use explicit copy operations in code e.g.:

final class Foo {}

func foo(_ foo: borrowing Foo) {
    // copy operator required or otherwise we get a compiler error 
    let copyOfFoo = copy foo 
}

I would recommend separating the iteration state, which should be per iterator, from the memory owner. You can do this by creating a separate class that allocates and deallocates the memory and your iterator is just a struct that holds an instance of this class + the index. If you access the pointer outside of the class that owns it you will need to use withExtendedLifetime to make sure that the memory doesn't get deallocated.

I have implemented something similar for NIOSSL a while a ago: Add `NIOSSLCertficiateExtensions` by dnadoba · Pull Request #359 · apple/swift-nio-ssl · GitHub

One big difference is that this implements RandomAccessCollection instead of IteratorProtocol directly. If you implement RandomAccessCollection you get a default implementation for Iterator which should be sufficient for your use case.

2 Likes

I actually like the flexibility of FileHandle model that is the reference object containing not just the "thing" but everything else including the "reading/writing position" as well:

  • if I need two independent "copies" I make two independent FileHandles:
let a = FileHandle(forReadingFrom: file)
let b = FileHandle(forReadingFrom: file)

and can iterate them independently (one file handle position increment won't effect another).

  • otherwise I could copy the reference:
let a = FileHandle(forReadingFrom: file)
foo(a)
bar(a)
func foo(_ fileHandle: FileHandle) { try! fileHandle.seekToEnd() }
func bar(_ fileHandle: FileHandle) { fileHandle is at end }