Set of "no less than one" member enforced by the compiler


(Gwynne Raskind) #1

I’ve been spinning my wheels on this one for a couple of hours and can’t come up with a way to do it which doesn’t require checking for at least one potential data inconsistency at runtime with precondition() or assert(), so I’m wondering if anyone else has a solution.

I have a data type which is conceptually a Set (not an OptionSet, it’s not a bitfield and has no concrete representation, simply a Set). This Set may contain members chosen from a limited set of choices, which in Swift I represent as an enum:

enum Kind {
  case FirstKind
  case SecondKind
  case ThirdKind
}

typealias Kinds = Set<Kind>

The data to which this applies is a structure-

struct DataItem {
  let kinds: Kinds
  // other fields
}

This structure demands that the Kinds set must always contain at *least* one element, because conceptually "no kind" is not a sensible state for a data item. However, I can’t use just the enum because the data can be of more than one kind at a time. It can just never be of no kinds - in that case, there is no data at all, and I don’t need a state *within* the data to represent that, I would simply have no DataItem in the collection which holds the list of data items.

In a relational dataset, I would say that the "no kind" state was denormalized, because it can be represented by there being no row in a table. The other obvious alternative, to duplicate the DataItem for each Kind, is also denormalized (multiple rows whose data is not solely defined by the candidate keys). The less obvious alternative of having a map of [Kind : [DataItem]] (in a database this would be akin to having a kindID column in the DataItem table with a separate Kinds lookup table), while fully normalized, is not only inefficient and difficult to manipulate but also disassociates a critical piece of information about the item from the item itself. My in-memory data model is not a relational database.

Half-digression into relational theory aside, I can’t figure a way to make the compiler enforce this. An enum is "exactly one", a Set is "zero or more", a structure or class is "this group", an Optional is "zero or one". Obviously, I can do "precondition(kinds.count > 0)" to check this at runtime, but that admits that the data can exist in an inconsistent state. Is there some way, even a hacky one, to express in the data model itself (such that the compiler will enforce it) that there must be at least one "kind", as the enum already does for "only one at a time"?

There’s probably some bit of type theory which explains why this is either difficult or impossible in practice, but I’m hoping maybe it’s just a corner of knowledge I’ve yet to previously delve.

-- Gwynne Raskind


(David Sweeris) #2

I've been thinking about this, too. At one point I had a reason to want an Array that couldn't be empty (though I can't recall what the reason was now).

Anyway, I think I'd do something like this:
struct NonEmptyCollection <T : CollectionType> : CollectionType {
    private var _head: T.Generator.Element? //optional because it needs to have a default value in case the init has to fail. Seems like there ought to be a work-around that doesn't have the overhead, but I'm drawing a blank.
    var head: T.Generator.Element {
        get { return _head! } //always safe to unwrap because there's no way to say `_head = nil` once the init succeeds
        set { _head = newValue }
    }
    var tail: T
    init?(_ c: T) {
        // bail if c is empty
    }
    // whatever else CollectionType needs
}

There's undoubtedly a better way to do it, but I *think* that should work.

HTH,
- Dave Sweeris

···

On Jun 25, 2016, at 18:54, Gwynne Raskind via swift-dev <swift-dev@swift.org> wrote:

I’ve been spinning my wheels on this one for a couple of hours and can’t come up with a way to do it which doesn’t require checking for at least one potential data inconsistency at runtime with precondition() or assert(), so I’m wondering if anyone else has a solution.

I have a data type which is conceptually a Set (not an OptionSet, it’s not a bitfield and has no concrete representation, simply a Set). This Set may contain members chosen from a limited set of choices, which in Swift I represent as an enum:

enum Kind {
   case FirstKind
   case SecondKind
   case ThirdKind
}

typealias Kinds = Set<Kind>

The data to which this applies is a structure-

struct DataItem {
   let kinds: Kinds
   // other fields
}

This structure demands that the Kinds set must always contain at *least* one element, because conceptually "no kind" is not a sensible state for a data item. However, I can’t use just the enum because the data can be of more than one kind at a time. It can just never be of no kinds - in that case, there is no data at all, and I don’t need a state *within* the data to represent that, I would simply have no DataItem in the collection which holds the list of data items.

In a relational dataset, I would say that the "no kind" state was denormalized, because it can be represented by there being no row in a table. The other obvious alternative, to duplicate the DataItem for each Kind, is also denormalized (multiple rows whose data is not solely defined by the candidate keys). The less obvious alternative of having a map of [Kind : [DataItem]] (in a database this would be akin to having a kindID column in the DataItem table with a separate Kinds lookup table), while fully normalized, is not only inefficient and difficult to manipulate but also disassociates a critical piece of information about the item from the item itself. My in-memory data model is not a relational database.

Half-digression into relational theory aside, I can’t figure a way to make the compiler enforce this. An enum is "exactly one", a Set is "zero or more", a structure or class is "this group", an Optional is "zero or one". Obviously, I can do "precondition(kinds.count > 0)" to check this at runtime, but that admits that the data can exist in an inconsistent state. Is there some way, even a hacky one, to express in the data model itself (such that the compiler will enforce it) that there must be at least one "kind", as the enum already does for "only one at a time"?

There’s probably some bit of type theory which explains why this is either difficult or impossible in practice, but I’m hoping maybe it’s just a corner of knowledge I’ve yet to previously delve.

-- Gwynne Raskind

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev


(Brent Royal-Gordon) #3

Half-digression into relational theory aside, I can’t figure a way to make the compiler enforce this. An enum is "exactly one", a Set is "zero or more", a structure or class is "this group", an Optional is "zero or one".

You can write an enum that's "one or more":

  enum OneOrMore<Element> {
    case one (Element)
    indirect case more (first: Element, rest: OneOrMore<Element>)
  }

Or, equivalently:

  struct OneOrMore<Element> {
    var first: Element
    var rest: [Element]
  }

`first` could instead be `last`, of course; I've done it this way because `reduce` is a left fold.

It should be possible to conform either of these to Collection or even MutableCollection, but not to RangeReplaceableCollection, which assumes you can create an empty instance.

···

--
Brent Royal-Gordon
Architechies