How do you understand two types depending on each other?

rayx · August 30, 2022, 5:34am

I have been using an identifier type like the following. I use it in more complex scenarios (e.g., generic struct, protocol, etc) and it works fine.

struct Identifier<Value>: Hashable, Codable {
    let uuid: UUID = UUID()
}

struct Foo {
    let id: Identifier<Foo>
    // ...
}

However, I find I always have difficulty in understanding it on concept level, because there are mutual dependence between the two types in the above example: Foo <--> Identifier<Foo>.

I used to do a lot of OOP programing and I never thought too much about cyclic dependence between classes, because in my opinion it's typical in OOP. I mainly use value types nowadays and I'm aware that I might introduce mutual dependency unconsciously in value type programming too, like the following. However, for unclear reason I don't think too much about it either.

struct Foo {
    var x: Int
    // Foo references Bar here.
    var y: Bar
}

struct Bar {
    var a: Int

    // Bar's method parameter references Foo.
    func doSomething(foo: Foo) {
        // ...
    }
}

But I always feel a bit uneasy when I use the above identifier type, although I know it will just work. Below is another (a bit more complex) example when I refactoring my code today.

First, the initial code are straightforward:

// MARK: Foo
struct FooEditableProperties {
    var y: Int
}

struct Foo {
    var x: Int
    var y: Int

    mutating func modify(props: FooEditableProperties) {
        y = props.y
    }
}

// MARK: Bar
struct BarEditableProperties {
    var y: String
}

struct Bar {
    var x: String
    var y: String

    mutating func modify(props: BarEditableProperties) {
        y = props.y
    }
}

In my code I put Foo and Bar in seprate arrays. I extend array to add custom APIs and I'd like to share the code for the two arrays. That means I need general types (either protocol or generic struct) to represent Foo and Bar, FooEditableProperties and BarEditableProperties, respectively.

The best way in this case is to use generic struct for both, as below:

struct EditableProperties<Value> {
    var y: Value
}

struct Entity<Value> {
    var x: Value
    var y: Value

    mutating func modify(props: EditableProperties<Value>) {
        y = props.y
    }
}

Just for dicussion purpose (BTW, it's actually true in my real code), let me suppose I don't want to use generic type for Foo and Bar, the code will be like the following:

protocol EntityProtocol {
    associatedtype Value
    var x: Value { get set }
    var y: Value { get set }
}

struct EditableProperties<EntityProtocol> {
    var y: Int
}

// MARK: Foo
struct Foo: EntityProtocol {
    var x: Int
    var y: Int

    mutating func modify(props: EditableProperties<Foo>) {
        y = props.y
    }
}

The code works fine. But again I have some difficulty in understanding
the relation between the types. I mean, the relation is not as straightforward as the original code. Actually I'm not even sure if there is cyclic dependence in this example. If we think Foo depends on EditableProperties, and EditableProperties depends on EntityProtocol. That looks simple. But if we think Foo depends on EditableProperties<Foo>, then we have cyclic dependence: Foo -> EditableProperties<Foo> -> Foo.

While my main purpose of the question is to look for a simple way to understand cyclic relations on concept level, I suspect my confusion is also because I don't know how compiler handles these situations internally. I wonder how you guys think about it? Should cyclic dependency among types be avoided unless it's really necessary, or is it just normal?

gwendal.roue · August 30, 2022, 6:47am

I don't know if my answer will help, but I'll try anyway.

When contemplating cyclic dependencies in types, one can wonder how the compiler makes sense of them. How can the compiler understand the type A if it depends on type B, which can't be understood until the type A is defined?

The answer lies in the fact that the compiler "understanding" of our code in performed in multiple steps.

When the compiler parses our code, struct Foo builds a struct definition named "Foo", and the Foo in let id: Identifier<Foo> is just a reference to some type identified as "Foo". At the end of this stage, the compiler has a soup of definitions and references. Types do not exist yet, and there is no cyclic dependency because "dependencies" are just references by name.

When the compiler later builds actual types, it picks from the above soup the required definitions and references. At this stage, when the compiler deals with let id: Identifier<Foo> nested inside struct Foo { ... }, the Foo type has been already created (though in an incomplete state), and the reference to the type named "Foo" can be resolved. Now the cyclic dependency has been created. And it is totally OK.

At the end of the previous stage, "types" are defined, but they are still incomplete: we need to compute their memory layout, which defines how a runtime value of this type is represented in actual computer memory, when our programs run.

In your specific example, Identifier<Value> has a known memory layout, which does not depend on the Value type parameter: it is the size of a UUID. So the compiler has no problem computing the memory layout of Foo: its id property requires as many bytes as Identifier<Foo> (the bytes of a UUID).

However, some cyclic dependencies make it impossible for the compiler to compute memory layouts. Those create compiler errors:

struct Container<Value> {
    var value: Value
}

struct MyStruct {
    // Error: Value type 'MyStruct' cannot have a
    // stored property that recursively contains it
    var container: Container<MyStruct>
}

struct AStruct {
    // Error: Value type 'AStruct' cannot have a
    // stored property that recursively contains it
    var b: BStruct?
}

struct BStruct {
    // Error: Value type 'BStruct' cannot have a
    // stored property that recursively contains it
    var a: AStruct?
}

struct CStruct {
    // Error: Value type 'CStruct' cannot have a
    // stored property that recursively contains it
    var c: CStruct?
}

// Error: Recursive enum 'MyEnum' is not marked 'indirect'
enum MyEnum {
    case a
    case b(MyEnum)
}

gwendal.roue · August 30, 2022, 6:50am

By the way, you may be interested in Tagged, a library that had your exact same idea, and maybe has pushed the convenience a little farther.

rayx · August 30, 2022, 7:15am

Thanks. I think this is called forward declaration, a term I learned today when investigating this. I think what you described is similar to how we app programmers initialize two instances which reference to each other. Both perform multiple steps to set up the cyclic relation. That said, I haven't found a good way to think about the relation on concept level.

I got the idea for the identifier type from this article. I'll certainly take a closer look at the library you suggested in future. Thanks.

filip-sakel · August 30, 2022, 7:25am

Also, I’m not sure if it helps, but cyclic dependencies are usually identified through directed acyclic graphs. In your case, the compiler could use a DAG to build memory layouts as @gwendal.roue said, although I’m not sure about what the actual implement uses. You can play around with these graphs in Swift, with libraries such as GitHub - fireblade-engine/graph: A lightweight, fast and easy to use directed acyclic graph (DAG) implementation in Swift. This library in particular is used for a game engine; other times you’ll see them used in declarative UI frameworks. All in all, they’re pretty cool data structures.

gwendal.roue · August 30, 2022, 7:50am

A great investigation :-)

Now, in the Swift compiler, I'm not sure compiler engineers would describe this as "forward declarations".

A common understanding of the term "forward declaration" is something which is materialized in the source code, something that the programmer must do for the program to compile. For example, in C and Objective-C, one can not use a function or a type unless it is defined, or forward-declared, prior to its use in a considered file:

In C:

// Forward declaration of function foo before its use
int foo(int);

int g() {
    // OK
    return foo(1);
}

In Objective-C:

// Forward declaration of class Foo before its use
@class Foo;

@interface Bar: NSObject
// OK
@property Foo* foo;
@end

C and Objective-C headers (files in .h) are essentially made of forward-declarations.

Swift does not need such forward declarations, because it does not use the same compilation techniques as a C or Objective-C compiler. A C compiler is (at least conceptually) a one-pass compiler that requires forward declaration in order to make sense of our code.

Languages with multi-pass compilers, such as Swift or Java, do not need forward-declaration, and often provide no syntax for it. When interpreting our code, the Swift compiler is able to consider all functions and types defined in a module, without needing any help from the programmer.

In our initial context of cyclic type dependencies in the compiler, I wouldn't say "forward declarations". For example, the result of the parsing phase contains types definitions identified by their names. Those types definitions may themselves refer to other types definitions identified by their names. There is no need for the concept of "forward declaration" in order to understand the previous sentences.

Nickolas_Pohilets · August 30, 2022, 1:00pm

I think your confusion comes from not separating instantiations of generic types (EditableProperties<Foo>) and archetype declarations (EditableProperties).

Foo -> EditableProperties<Foo> -> Foo is indeed a cycle. But Foo -> EditableProperties is not. Generic archetype EditableProperties remains reusable.

Cycles usually are undesirable, because they prevent code reuse. But even if there is no reuse required, preventing cycles is still useful as a guiding constraint for structuring code.

But if two types are parts of the same entity, then cycles are ok. For example, I have a few entities which need to be used in Objective-C, so they are modelled not as a struct, but as an immutable class + builder class, depending on each other.

rayx · August 30, 2022, 2:04pm

I was aware of the difference. But I wasn't very sure which one I should use to understand the relation between the types in that example. My guess is the latter one.

That's how I understand it too. I think a simple way to understand types with cyclic dependence is to consider them as a "single" type. For example, they can't be in different modules.

I agree on the code reuse part but I'm not sure if cyclic dependence is a bad practice in general (although I did find it make it difficult to understand the code - that's why I brought up the question). As I mentioned in my original question, I think people can easily introduce cyclic dependence unconsciously. If it's a bad practice, I'd expect there are tools included in IDEs to detect them. But I never heard about such tools.