How to define self-referential types with pointers which are never nil

For a circular doubly linked list this would be a simple node type:

class Node {
    let value: Int
    var pred : Node?
    var succ : Node?

    init(value: Int) {
        self.value = value
    }
}

Now let's assume that every node has always a predecessor and a successor, i.e. it is always a member of a circular doubly linked list (possibly consisting of this node alone). The pred/succ pointers are never nil, so I would like to define them as non-optionals, initialized to self. This naive approach does not compile:

class Node {
    let value: Int
    var pred : Node
    var succ : Node

    init(value: Int) {
        self.value = value
        self.pred = self // Variable 'self.pred' used before being initialized
        self.succ = self // Variable 'self.succ' used before being initialized
    }
}

I understand the error message, but I wonder what the best/Swiftiest solution would be.

If I leave them as optionals then I have always to unwrap the pointers, e.g.

self.pred!.succ = self.succ
self.succ!.pred = self.pred

when removing a node from the list. I can define them as implicitly unwrapped optionals:

class Node {
    let value: Int
    var pred : Node!
    var succ : Node!
    // ...
}

That solves most of the problems, but I still have to unwrap them when using intermediate variables (where the IUO becomes a strong optional):

let left = self.pred
left!.succ = self.succ

One could also define private stored IUO properties with public computed wrappers:

class Node {
    let value: Int
    private var _pred : Node!
    private var _succ : Node!
    
    var pred: Node {
        get { return _pred }
        set { _pred = newValue }
    }
    var succ: Node {
        get { return _succ }
        set { _succ = newValue }
    }

    init(value: Int) {
        self.value = value
        self._pred = self
        self._succ = self
    }
}

That seems to solve all problems, but is a lot of code for such a simple problem.

Are there other/better approaches? How would you define self-referential class types with pointers that cannot be nil?

Regards, Martin

Technically speaking the value needs to be nil at some point for deinitialization. So any solution will likely have optional-ness somewhere.

With that said, I’d use property wrapper.

@propertyWrapper
struct NeverNil<Value> {
  private var storage: Value?
  var wrappedValue: Value { ... }

  func reset() { storage = nil }
}

class Node {
  @NeverNil var pred: Node
  ...
}
1 Like

You can hack it up like this, as long as you don’t mind a little unsafe bit-casting:

class Node {
  var value: Int
  var previous: Node
  var next: Node
  
  init(value: Int) {
    self.value = value
    
    // Warning, unsafe temporary placeholder values
    self.previous = unsafeBitCast(0, to: Node.self)
    self.next = unsafeBitCast(0, to: Node.self)
    
    // Replace the placeholders with real values
    self.previous = self
    self.next = self
  }
}

How would one deinit an instance though? Perhaps having a ground (immortal) instance to assign the variables to when removing?

2 Likes

Good point.

It’s worth noting that even if the links are Optional, one must still manually clear those links, or else the instance will not be deinitialized when it goes out of scope.

Using an immortal object works…although it is still possible for users to make their own objects leak if they try. Even if the immortal instance is private, making a delete method that assigns the immortal instance to self.previous and self.next means the user could call delete and then start assigning things to the links of the immortal object via, eg. self.next.next.

Thus, even though the immortal instance is supposed to always link exclusively to itself, users can get access to it and relink it elsewhere, thus keeping other instances alive as well.

How about lazy properties?

class Node {
    let value: Int
    lazy var pred = self
    lazy var succ = self

    init(value: Int) {
        self.value = value
    }
}

…but as @Lantua points out there's no way to deinit an instance.

Technically these could be nil, but how about this?


class Node {
	let value: Int
	weak var pred: Node!
	weak var succ: Node!
	
	init(value: Int) {
		self.value = value
		self.pred = self
		self.succ = self
	}
}

The links shouldn’t be weak because then the instances would die immediately.

And the OP already discussed IUOs.

Of course, IUOs are probably the right answer, unless the OP wants to create a dummy superclass with no properties. But that’s just a roundabout way of doing essentially the same thing as an IUO anyway.

3 Likes

Indeed: Deallocation and avoiding retain cycles is what I completely forgot. Thanks to all for their feedback, this gives me something to think about.

In fact, this is a larger problem than the one posed in this thread, because this remains true when there is more than one node in the list. It's a semantic design flaw (perhaps) resulting from the attempt to use Node as the type for both an individual node and the list as a whole.

On the original problem, another approach would be to do something like this:

class Node {
    private static let noNode = Node(value: 0)
    let value: Int
    var pred : Node
    var succ : Node
    init(value: Int) {
        self.value = value
        self.pred = Node.noNode
        self.succ = Node.noNode
        self.pred = self
        self.succ = self
    }
}

This sort of thing is problematic if instances are large, or expensive to maintain, or have side effects, but it's a fairly simple way of making the compiler happy.

This is undefined behavior. Don't do it.

IUOs have historically been the answer for handling circular references, for better or worse, because at some point you fundamentally need to have a transient nil in order to set up the objects. These days, a property wrapper that internally contained an Optional, but presented a non-optional interface, and dynamically enforced that you assign into the variable before trying to read from it, would probably be the best way to go.

4 Likes

How exactly is it UB?

We are setting the bit-pattern of a class instance (aka. a pointer) to all zeros.

We never read from that pointer. We never even read the value of the pointer itself. We immediately overwrite the pointer with a valid pointer.

It seems completely well-defined to me.

Moreover, if it is UB, then I propose that we make it not be UB. It is a bog-standard usage of unsafe constructs: writing bits directly to memory. When using such unsafe constructs, it is the responsibility of the person using them to ensure that the memory ends up in a valid state, before it is used in a type-safe manner. And we do exactly that, by subsequently assigning self in place of the temporarily-invalid data.

Zero is not a valid value for a non-optional class reference. You aren't writing zero values to memory, you're creating a Node value that has an invalid representation then assigning that to a variable. The compiler may choose to use a null-unsafe entry point to release the reference when the invalid temporary Node value goes out of scope or when the value is implicitly read from during assignment to release the old value, or it may make other arbitrary transformations to the code that assume that the reference is not null. unsafe constructs don't check validity, but they must still follow the language rules for how values are represented and memory is accessed.

Okay, so what is the correct unsafe way to manually construct a class instance like this byte-by-byte from scratch, with the programmer taking full responsibility for ensuring that the final result is a valid instance?

What's your use case?

Totally forgot that it does that (which makes sense). The program needs to release the old 0x00 class during assignment.

Constructing a valid class instance byte-by-byte from scratch.

Doing that in the general case depends on a lot of variables, and it should almost never be necessary. If you have a specific reason to do so, it'd be easier to answer the question.

The specific reason is so that I can learn how to do it correctly.

I think in Swift, as the saying goes there's always a better way. I personally don't even touch unsafe unless it's a struct with well-known layout.