I'm making a directory app, where the data will only be updated occasionally, but the dataset is complex and full of inter-relationships.
To seed the data, I implemented a domain-specific language, but I want to expand upon it. I want to both:
- Easily collect model instances (in an array, for example).
- Specify relationships between model instances.
After the initial creation of these model instances, they might be persisted using SwiftData, or similar. However, for now I'm interested in the initial data entry.
Initial implementation
Let's say I have builder structs Teacher and School, and several result builders, that I use like this:
@ModelsBuilder
func seedData() -> [ModelBuilder] {
    Teacher("John", "Smith")
        .contacts {
            Address("1 The Road", "London")
            Tel("020 1234 5678")
            Email("johnsmith@school.com")
        }
        .payCategory(.newlyQualified)
    Teacher("Shirley Anne", "Waters")
        .id("shirleywaters")
        .payCategory(.headteacher)
    School("Trinity")
        .staff {
            Role(department: "administration", id: "shirleywaters") // <--- String IDs used here!
            Role(department: "science", id: "johnsmith") // <--- String IDs used here!
        }
}
This is a contrived example, but analogous to my current implementation. The idea is that the data entry is readable but succinct.
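To make the collection step concrete, here is a minimal sketch of how a result builder like this could gather builders into an array. Everything in it is a simplified stand-in rather than my real code: the `modelID` requirement, the `explicitID` property and the ID inference rule (lowercased name with whitespace removed) are assumptions made for illustration.

```swift
// Hypothetical protocol; stands in for the real builder protocol.
protocol ModelBuilder {
    var modelID: String { get }
}

struct Teacher: ModelBuilder {
    var first: String
    var last: String
    var explicitID: String? = nil

    init(_ first: String, _ last: String) {
        self.first = first
        self.last = last
    }

    // Explicit ID if one was given, otherwise inferred from the name.
    var modelID: String {
        explicitID ?? (first + last).lowercased().filter { !$0.isWhitespace }
    }

    // Modifier in the SwiftUI style: returns a modified copy.
    func id(_ id: String) -> Teacher {
        var copy = self
        copy.explicitID = id
        return copy
    }
}

@resultBuilder
enum ModelsBuilder {
    // Wrap each expression statement in a one-element array...
    static func buildExpression(_ builder: ModelBuilder) -> [ModelBuilder] {
        [builder]
    }
    // ...and flatten all statements of the block into a single array.
    static func buildBlock(_ parts: [ModelBuilder]...) -> [ModelBuilder] {
        parts.flatMap { $0 }
    }
}

@ModelsBuilder
func seedData() -> [ModelBuilder] {
    Teacher("John", "Smith")
    Teacher("Shirley Anne", "Waters")
        .id("shirleywaters")
}

let builders = seedData()
```

Here `builders` holds both teachers in declaration order, one with an inferred ID and one with an explicit ID.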
Processing the builders proceeds as follows:
- The builders are collected in an array by the @ModelsBuilder result builder.
- The array of builders is iterated to create the model instances.
- Each model instance either has a string ID specified ("shirleywaters" in the example above), or infers a string ID ("johnsmith" above).
- A dictionary is used to map string IDs to model instances.
- The array of builders is iterated again to resolve string ID references.
- For each string ID reference, the dictionary is used to find the target instance and the relationship is created.
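The two passes above can be sketched roughly as follows. The types here (`TeacherModel`, `SchoolModel`, `TeacherBuilder`, `SchoolBuilder`, `SeedError`) are hypothetical, heavily simplified stand-ins, not my actual models, and the error handling is only one possible choice.

```swift
// Hypothetical, simplified stand-ins for the real models and builders.
final class TeacherModel {
    let name: String
    init(name: String) { self.name = name }
}

final class SchoolModel {
    let name: String
    var staff: [TeacherModel] = []
    init(name: String) { self.name = name }
}

struct TeacherBuilder {
    let id: String
    let name: String
}

struct SchoolBuilder {
    let name: String
    let staffIDs: [String]   // string ID references, as in Role(..., id: "johnsmith")
}

enum SeedError: Error, Equatable {
    case duplicateID(String)
    case unknownID(String)
}

func resolve(teachers: [TeacherBuilder], schools: [SchoolBuilder]) throws -> [SchoolModel] {
    // Pass 1: create the instances and index them by string ID.
    var teachersByID: [String: TeacherModel] = [:]
    for builder in teachers {
        guard teachersByID[builder.id] == nil else {
            throw SeedError.duplicateID(builder.id)
        }
        teachersByID[builder.id] = TeacherModel(name: builder.name)
    }

    // Pass 2: look up each string ID reference and create the relationship.
    return try schools.map { builder -> SchoolModel in
        let school = SchoolModel(name: builder.name)
        for id in builder.staffIDs {
            guard let teacher = teachersByID[id] else {
                throw SeedError.unknownID(id)
            }
            school.staff.append(teacher)
        }
        return school
    }
}

let teachers = [
    TeacherBuilder(id: "johnsmith", name: "John Smith"),
    TeacherBuilder(id: "shirleywaters", name: "Shirley Anne Waters"),
]
let trinity = SchoolBuilder(name: "Trinity", staffIDs: ["shirleywaters", "johnsmith"])
let schools = try resolve(teachers: teachers, schools: [trinity])
```

A mistyped ID such as "jonsmith" compiles fine and only fails in the second pass at runtime, which is exactly the class of error I'd like to eliminate.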
This is all particularly error-prone because of the size and complexity of the data: a missing or mistyped string ID only shows up at runtime.
Option 1
I'm thinking of leveraging Swift's type system to refer to model instances at compile-time, to avoid runtime errors with missing or incorrect string IDs. Maybe to allow a syntax something like this:
School("Trinity")
    .staff {
        Role(department: DeptAdministration.self, id: ShirleyWaters.self)
        Role(department: DeptScience.self, id: JohnSmith.self)
    }
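For illustration, this is roughly how I imagine the metatype references being checked by the compiler, written out by hand without any macros. The protocols `TeacherReference` and `DepartmentReference`, and the generic `Role` initialiser, are hypothetical names invented for this sketch.

```swift
// Hypothetical marker protocols: one conforming type per referenced instance.
protocol TeacherReference {
    static var id: String { get }
}
protocol DepartmentReference {
    static var name: String { get }
}

// Caseless enums cannot be instantiated, so they act purely as names.
enum ShirleyWaters: TeacherReference { static let id = "shirleywaters" }
enum JohnSmith: TeacherReference { static let id = "johnsmith" }
enum DeptAdministration: DepartmentReference { static let name = "administration" }
enum DeptScience: DepartmentReference { static let name = "science" }

struct Role {
    let department: String
    let teacherID: String

    // Taking metatypes means a misspelled or missing reference fails to compile.
    init<D: DepartmentReference, T: TeacherReference>(department: D.Type, id: T.Type) {
        self.department = D.name
        self.teacherID = T.id
    }
}

let roles = [
    Role(department: DeptAdministration.self, id: ShirleyWaters.self),
    Role(department: DeptScience.self, id: JohnSmith.self),
]
```

Writing `JonSmith.self` here would be a compile-time error instead of a failed dictionary lookup, but each reference type still has to be declared somewhere, which is the boilerplate I'd hope a macro could generate.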
I wondered about adopting the model used by SwiftUI for my builders and using macros to avoid boilerplate. Maybe something like this (if it's even possible):
@Teacher("John", "Smith")
    .contacts {
        Address("1 The Road", "London")
        Tel("020 1234 5678")
        Email("johnsmith@school.com")
    }
    .payCategory(.newlyQualified)
Expanding to:
struct JohnSmith : Teacher {
    init() {
        super.init("John", "Smith")
    }
    var model: some ModelBuilder {
        self
            .contacts {
                Address("1 The Road", "London")
                Tel("020 1234 5678")
                Email("johnsmith@school.com")
            }
            .payCategory(.newlyQualified)
    }
}
So every builder instance would become a separate type, allowing compile-time checking of relationships. Circular references (and there are many in the real dataset) would be handled, too.
Problems:
- I'd need a new way to collect the builders, as these could not be defined in a result builder. I definitely wouldn't want to curate a list of builder types manually, because that would create a new source of errors (models missing from the list could be referenced at compile-time, but wouldn't actually be created at runtime).
- I don't know if Swift has a practical limit for the number of types defined... this could potentially run to approximately 20,000 types. That might brick the compiler.
- Maybe it harms readability to use Swift macros to create underlying code so different from the code as written.
Questions:
- Is there a way to use reflection to iterate over all implementations of a protocol (of Teacher, say)? That would avoid the need for a result builder to collect model instances.
- Alternatively, could the macro create some peer definition or statement that would append that type to a global list, or allow reflection or similar?
Option 2
I wondered about the possibility of a "ModelID" macro, just for those model instances that will be referenced by other instances. Maybe to allow something like this:
@ModelsBuilder
func seedData() -> [Model] {
    @ModelID
    Teacher("John", "Smith")
        .contacts {
            Address("1 The Road", "London")
            Tel("020 1234 5678")
            Email("johnsmith@school.com")
        }
        .payCategory(.newlyQualified)
    @ModelID("ShirleyWaters")
    Teacher("Shirley", "Waters")
        .payCategory(.headteacher)
    ...
}
Maybe the macro would create a helper type that could be used in references, something like this:
struct JohnSmith : ModelReference {
    let id = "johnsmith"
}
Problems:
- I don't think it's possible for a macro here to create a type that would be visible globally.
Other options
Maybe I'm going about this in the wrong way? I could persist the data in JSON, I suppose, and write a parser, but that seems verbose and error-prone. I like the idea of leveraging the compiler to catch as many data-entry errors as possible at compile time. I'm the one creating the dataset, and it's already very labour-intensive. Are there other ways of seeding complex data?
This question is too long and probably too vague, for which I apologise. I'm not sure my contrived example does justice to the problem.
I wondered if the named references in the RegexBuilder DSL provide an analogous solution, but I can't work out how they would help me.
Any thoughts or suggestions very gratefully received. Thank you.