[Pitch] Swift Predicates

Oh, and will these APIs be offered from Foundation as extensions to Collection or anything? It would be nicer to use these as if they were instance functions rather than globals. That is

let result = collection.applying(predicate: #predicate(...)) // Not sure what the verb would be here.

would feel much nicer than

let predicate = #predicate(...)
let result = predicate.evaluate(collection)

If that sort of functionality isn't offered it seems like something that will be added pretty frequently, for predicate users at least. Best to nip that in the bud right away.

I'm also not sure this should be in Foundation, but... :man_shrugging:


Operators are resolved like global functions, so any new version of + or == or other standard operator is considered an overload, even if they operate on completely new types. That's the real pitfall of global operators -- adding new operators over types that are not used anywhere can still break existing code with type checker timeouts.

The problem with diagnostics in the operator overloading approach is specifically that the error messages will mention types that are specific to the predicate representation, which the programmer doesn't really care about. As the expression gets more complex, so does the type that attempts to encode the structure of the expression tree. You'd end up getting error messages that mention highly nested generic types, such as PredicateExpressions.Equal<PredicateExpressions.Value<String>, PredicateExpressions.Variable<Int>>, and the programmer has to mentally parse this giant type to figure out the types that they wrote that didn't match were String and Int.
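To make the nesting concrete, here is a minimal sketch of an operator-overloading approach. Every name in it (ExpressionNode, Constant, Property, Equal, Message) is a hypothetical stand-in and not the pitch's PredicateExpressions API; it only shows how the static type of an expression grows with the expression itself.

```swift
// Hypothetical miniature of a typed expression tree; not the pitch's real API.
protocol ExpressionNode {
    associatedtype Input
    associatedtype Output
    func evaluate(_ input: Input) -> Output
}

// A fixed value that ignores the input.
struct Constant<Input, Output>: ExpressionNode {
    let value: Output
    func evaluate(_ input: Input) -> Output { value }
}

// Reads a property of the input through a key path.
struct Property<Input, Output>: ExpressionNode {
    let keyPath: KeyPath<Input, Output>
    func evaluate(_ input: Input) -> Output { input[keyPath: keyPath] }
}

// Equality node: its type records the types of both operands.
struct Equal<LHS: ExpressionNode, RHS: ExpressionNode>: ExpressionNode
where LHS.Input == RHS.Input, LHS.Output == RHS.Output, LHS.Output: Equatable {
    typealias Input = LHS.Input
    typealias Output = Bool
    let lhs: LHS
    let rhs: RHS
    func evaluate(_ input: LHS.Input) -> Bool {
        lhs.evaluate(input) == rhs.evaluate(input)
    }
}

// A new global overload of ==, which joins the overload set of every == in the module.
func == <LHS: ExpressionNode, RHS: ExpressionNode>(lhs: LHS, rhs: RHS) -> Equal<LHS, RHS>
where LHS.Input == RHS.Input, LHS.Output == RHS.Output, LHS.Output: Equatable {
    Equal(lhs: lhs, rhs: rhs)
}

struct Message {
    let sender: String
    let subject: String
}

// The static type of `tree` encodes the whole expression structure.
let tree = Property(keyPath: \Message.sender) == Constant<Message, String>(value: "alice")
print(type(of: tree))
print(tree.evaluate(Message(sender: "alice", subject: "hi")))
```

With this design, a mismatch between operands surfaces as a failure to satisfy the constraints on Equal, so the diagnostic is phrased in terms of these wrapper types rather than the String and Int the programmer wrote.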

Operators don't provide all of the same expressivity as macros. For example, a macro allows you to represent syntax in an expression tree that cannot be overloaded by operators, such as coercions and the ternary operator. Custom operators also rely on overload resolution and bidirectional type inference, which are incredibly difficult features to understand. With the compile-time performance and diagnostics implications, I personally don't think there's any usability win for the operator approach over macros.

To my mind, the beauty of the macro approach here is that the code in a #predicate closure will type check against the existing operator overloads that programmers are already familiar with, not against custom operators over predicate-specific types. So, the semantics of a predicate expression use concepts that the programmer is already familiar with. If and when the programmer wants to understand how predicates are represented, e.g. to write custom operations, only then do they need to dig deeper into the PredicateExpression representation.

A macro expansion may also have an opportunity to produce more actionable diagnostics that are domain-specific, which is something that has been a frequent pain point of using library-defined DSLs with standard Swift type checker diagnostics; the @available trick is not powerful enough to identify many common mistakes. If we go down the route of semantic macros, a macro will have more than enough information to produce errors about API misuse, possibly even with custom fix-its that are specific to the library. This is something that I would love to see explored more as part of the macro evolution discussions!


There is already one extension to Sequence in the pitch, pretty much what you suggest:

extension Sequence {
    public func filter(_ predicate: Predicate<Element>) throws -> [Element]
}

We can add others as needed.
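As a rough illustration of how such an extension could sit on top of evaluate, here is a minimal sketch. MiniPredicate is a hypothetical closure-backed stand-in for the pitched Predicate type, used only so the example is self-contained; the real type stores an expression tree rather than a closure.

```swift
// Hypothetical stand-in for the pitched Predicate type.
struct MiniPredicate<Input> {
    let body: (Input) throws -> Bool

    func evaluate(_ input: Input) throws -> Bool {
        try body(input)
    }
}

extension Sequence {
    // Same shape as the extension in the pitch, but taking the stand-in type.
    func filter(_ predicate: MiniPredicate<Element>) throws -> [Element] {
        var result: [Element] = []
        for element in self {
            if try predicate.evaluate(element) {
                result.append(element)
            }
        }
        return result
    }
}

let isEven = MiniPredicate<Int> { $0 % 2 == 0 }

// Instance-method style, as requested earlier in the thread.
let evens = (try? [1, 2, 3, 4, 5, 6].filter(isEven)) ?? []
print(evens)
```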

let predicate = Predicate<Message> {} //<--- better
let predicate = #predicate<Message> {}

To me, the first line looks better.

Should the design of a language and its features really depend on possible compile-time issues?

I think compilers evolve, and compile time can be optimized under the hood in the future.


My biggest concern is that this has any relation to Foundation. I’m not familiar with FoundationEssentials but I would greatly appreciate any new proposals and code being as far away from legacy monolithic constructs like Foundation as possible.


FoundationEssentials is a new effort regarding the open source version of Foundation's Swift implementations. You can read more about this here if you're curious about the details, but this effort does involve breaking up the Foundation module into various smaller components to avoid the monolithic structure we currently have. Our idea is for the predicate APIs to land in the core/smaller FoundationEssentials package, but other Foundation APIs related to XML, networking, internationalization, etc. will be broken out into separate packages.

There is a typo in one of the examples. An extra quote is in the middle of this string.

NSPredicate(format: "SUBQUERY(recipients, $recipient, recipient.firstName == sender.firstName").@count > 0")

Hi all, thank you everyone for your feedback so far! We've made some small updates to the pitch, and I've updated the pitch document linked in this post (the document here for reference). The main changes include:

  • Adding the full definitions of each expression operator
  • Renaming #predicate to #Predicate to align with capitalized names typically used for invoking a type's initializer
  • Substituting a new build_KeyPath function invoked by the macro for previous uses of dynamic member lookup

We appreciate all of your input, and we'd love any feedback you may have with these new revisions. Feel free to let us know if you have any comments or questions!


I don't think we've established macro naming as part of the guidelines, but this seems to be an odd choice to me. If it starts with # it's a macro, whether or not that macro initializes a type under the hood. I'd expect macros to be lower cased. If you want to initialize a type, can't you make a normal Predicate type that takes the macro closure in one of the initializers? Frankly I'd prefer that rather than seeing # hanging around so much.


This is very interesting; having an example of multiple cutting-edge language features (variadic generics, macros) working in unison to tighten type-safety is compelling.

I'm curious how the set of operations supported by the Predicate macro system will be understood by developers. I see a couple avenues where this may arise:

  1. Diagnostics, when a developer attempts to use an operation not supported by the macro transform;
  2. Autocomplete support when typing a macro, as stated in goal (2) of the proposal's Motivation section.

For example, suppose a developer is interested in achieving the semantics of this example from the proposal:

let predicate = #Predicate<Message> { message in
    message.recipients.filter {
        $0.firstName == message.sender.firstName
    }.count > 0
}

but, seeing what looks like freely-written Swift in the provided closure, instead attempts to write:

let predicate = #Predicate<Message> { message in
    message.recipients.map(\.firstName).filter {
        $0 == message.sender.firstName
    }.count > 0
}

where map is an example of an unsupported operator (and presumably this fails to compile).

Can diagnostics & autocomplete inform the developer that e.g. the only supported operations on sequences are filter(_:), contains(_:), contains(where:), allSatisfy(_:), etc.?

If supported in autocomplete, how is that relationship between macros and the IDE communicated? I'm guessing the macro definition itself isn't enough, but if that relationship is described in one of the macro proposals and I've missed it, my apologies.

(I recognize some developer experience-type questions may be outside the scope of this pitch, but thought I'd ask!)


You're right, we haven't quite established naming as part of the guidelines (tagging @Douglas_Gregor since we've briefly discussed this). I don't think dropping the # altogether is a direction we want to go towards. While the semantics of the pre- and post-expansion code are the same, there's quite a bit of heavy lifting going on in the macro here that we'd like to be clearly evident to the developer by writing #Predicate at the construction site rather than hiding the macro invocation in the declaration of the initializer. Given the choice between something like #predicate and #Predicate, we felt that #Predicate looked more natural to indicate that the macro initializes a type, rather than something like #assert which acts more like a function call. Doug might have some more thoughts here.


Potentially a bit out of scope for the pitch here, but still an important question nonetheless! Currently diagnostics are our main tool here, and in fact this is one of the compelling reasons why we'd like to use macros instead of a solution like operator overloading. The macro will produce diagnostics for this case that tell the developer that the function is not able to be used in the context of the predicate they are creating. For common functions, we also have the opportunity to provide fix-its or suggestions as applicable within these diagnostics, but in general the diagnostic will just alert the developer that this function can't be used. Currently, our main source of truth for developers to see a list of supported operations would be the documentation (for example, predicates support all expressions that conform to StandardPredicateExpression). Macros don't currently have a way to influence the autocomplete results, but that is an avenue that could be interesting to bring up on the macro proposal.


This is a much larger discussion that doesn't really need to be here, so I've threaded it.

Macro Naming

One simple objection is: why is a macro that returns a type (or otherwise looks like an init) different from a global or other function that does the same thing? If we had a global `predicate` factory, by this logic shouldn't it be `Predicate` (barring the inevitable collision with the real type)? Also, given that result builder closures are unmarked (despite feedback in the review, IIRC), why wouldn't we allow or expect macro closures to also be unmarked? If we're really that concerned, adding a `#{}` form for macros (where the actual `#predicate` type is inferred) seems logical so we can have `Predicate {}` and `Predicate #{}` rather than `Predicate {}` and `#Predicate {}`. But then I may be in the minority that really doesn't want yet another marker / syntax for macros.

This is an interesting pitch which provides a lot of useful functionality!

So I have one fundamental question - what would be the approach for allowing for a UI that edits a representation of a predicate and transforms it back to the serialised entity that can be sent over to another process?

Specifically, we'd like to use predicates as a generic filter mechanism, where e.g. a frontend can edit a predicate (including values that are used as part of the expressions) and get a serialised representation that can be sent over the network to another process that then applies it (either directly or using the described support for transform to e.g. SQL).

It seems the needed mechanisms are halfway there ("Tree walking" section), but it is not (yet) quite clear to me if this would be flexible enough of a hook to go from serialised predicate -> UI. The reverse (going from some flexible/dynamic internal representation that allows a human being to interact with it, rather than coding stuff, over to the Predicate format) seems out of scope currently (but would be a critical piece to make this deeply more useful for many use cases).

Perhaps I'm missing something, but would be interested to hear your ideas here.
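To make the question concrete, here is a minimal sketch of the kind of editable, serialisable representation a UI might work with. FilterNode, its Kind cases, and roundTrip are all hypothetical names invented for illustration; nothing here comes from the pitch, and mapping such a structure to or from the pitched Predicate type is exactly the open part.

```swift
import Foundation

// Hypothetical, serialisable filter description that a UI could edit field by field.
struct FilterNode: Codable, Equatable {
    enum Kind: String, Codable { case equals, and, or }

    var kind: Kind
    var field: String? = nil          // used by .equals
    var value: String? = nil          // used by .equals
    var children: [FilterNode] = []   // used by .and / .or

    // Evaluate against a simple string-keyed record.
    func matches(_ record: [String: String]) -> Bool {
        switch kind {
        case .equals:
            guard let field = field, let value = value else { return false }
            return record[field] == value
        case .and:
            return children.allSatisfy { $0.matches(record) }
        case .or:
            return children.contains { $0.matches(record) }
        }
    }
}

// Simulates sending the filter to another process as JSON and reading it back.
func roundTrip(_ node: FilterNode) throws -> FilterNode {
    let data = try JSONEncoder().encode(node)
    return try JSONDecoder().decode(FilterNode.self, from: data)
}

let filter = FilterNode(kind: .and, children: [
    FilterNode(kind: .equals, field: "sender", value: "alice"),
    FilterNode(kind: .equals, field: "folder", value: "inbox"),
])

if let received = try? roundTrip(filter) {
    print(received.matches(["sender": "alice", "folder": "inbox"]))
}
```

Because the nodes are plain values, a frontend can change the field or value of any node and re-encode the result, which is the flexibility being asked about here.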


What do you see as the obstacles to doing what you propose? I think it's an entirely reasonable goal we want to support.

Not sure there are obstacles, I'm just trying to understand how to do a few things based on the pitch description (it's harder when you can't play with it :slight_smile: ) - maybe two questions to begin with:

  1. To confirm: to programmatically (dynamically at runtime) generate a custom predicate tree, we'd use the low-level building blocks as outlined in the section "Macro processing" (that shows how the macros are expanded)?
  2. In the 'tree walking' section, it is outlined how to add custom predicate processing - but how do you actually trigger a tree walk of myPredicate - is it by simply calling evaluate? In that case, will a full evaluation always be done of all nodes in the tree, or can logic be short-circuited before all nodes are traversed in the tree?
2 Likes
  1. Yes, you'd construct the tree using the public initializers on the various PredicateExpression types.
  2. As the tree walking section shows, this is typically done by defining a protocol, and then conforming Predicate and all the PredicateExpression types you want to handle to that protocol via extensions. If the protocol is (e.g.) ParseToResult like so:
protocol ParseToResult {
    func parse() -> Result
}

after writing extensions to conform Predicate et al. to ParseToResult, you'd then do this:

let myResult = aPredicate.parse()

Note that the parse function could have an argument that would act as global state that could be passed down to component PredicateExpression values. While the example works if you have your own Predicate type, to do it for standard Predicate you need to cast:

extension Predicate: ParseToResult {
    func parse() -> Result {
        // Forward to the root expression; this traps if it doesn't conform.
        return (expression as! ParseToResult).parse()
    }
}

If you wanted to allow for the case where you don't handle every PredicateExpression type, you could make parse() return an optional and use as? instead.
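The optional-returning variant could look roughly like the sketch below. All of the types (Field, Literal, EqualTo, Negated, TinyPredicate) and the SQLConvertible protocol are hypothetical stand-ins for Predicate and the PredicateExpression types, chosen so the example compiles on its own; the SQL it produces is deliberately naive and does no escaping.

```swift
// The walking protocol; returns nil when some node in the tree isn't handled.
protocol SQLConvertible {
    func sql() -> String?
}

// Hypothetical stand-ins for expression types.
struct Field { let name: String }
struct Literal { let text: String }
struct EqualTo<LHS, RHS> { let lhs: LHS; let rhs: RHS }
struct Negated<Wrapped> { let wrapped: Wrapped }   // deliberately left unhandled

extension Field: SQLConvertible {
    func sql() -> String? { name }
}

extension Literal: SQLConvertible {
    func sql() -> String? { "'\(text)'" }   // no escaping; illustration only
}

extension EqualTo: SQLConvertible {
    func sql() -> String? {
        // `as?` yields nil for operand types that have no conformance.
        guard let left = (lhs as? SQLConvertible)?.sql(),
              let right = (rhs as? SQLConvertible)?.sql() else { return nil }
        return "\(left) = \(right)"
    }
}

// Stand-in for a predicate that stores a root expression of some generic type.
struct TinyPredicate<Expression> { let expression: Expression }

extension TinyPredicate: SQLConvertible {
    func sql() -> String? { (expression as? SQLConvertible)?.sql() }
}

let supported = TinyPredicate(
    expression: EqualTo(lhs: Field(name: "sender"), rhs: Literal(text: "alice")))
let unsupported = TinyPredicate(
    expression: EqualTo(lhs: Negated(wrapped: Field(name: "read")), rhs: Literal(text: "true")))

print(supported.sql() ?? "not convertible")    // prints "sender = 'alice'"
print(unsupported.sql() ?? "not convertible")  // prints "not convertible"
```

Any extra state that the walk needs, such as a table name or a parameter list, could be threaded through as an argument to sql(), in the same way as the parse argument mentioned above.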

evaluate() is not involved in tree walking. You only use that if you wish to supply input to the predicate and evaluate it.


Thanks for the clarification @dgoldsmith (we missed parse() somehow, it was the missing piece for us - EDIT: in fact, looking at the pitch, I can't find it, perhaps it can be added to the Tree Walking section as an example? It would be clearer than spotlightQuery at least for us) - we've discussed the pitch with our team internally and overall we think it would be a great addition and definitely would make heavy use of this, looks really promising.

Our only remaining major concern (which is obviously hard to judge from a pitch, so please view this as an open question) is, as mentioned, the performance of evaluate(), specifically its use of key paths. SE-0161 has the following note on performance:

The performance of interacting with a property/subscript via KeyPaths should be close to the cost of calling the property directly.

There are a few references so far to (quite significant) performance issues with the current keypath implementation, e.g.:

and

I understand the optimization of keypath handling is handled by a different set of priorities, I just wanted to point it out if there is anything that can be done to minimize that possible impact.

I guess it's only tangential to the pitch, but wanted to at least call it out as performance of evaluate() is critical to the usability of Predicates (at least for us) - and the pitch overall really looks great and we'd be super happy to use it (as long as performance is ok).
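For anyone who wants to check this on their own toolchain, a micro-benchmark along the following lines might be a starting point. It is only a sketch: Sample, measure, and the iteration count are arbitrary choices made for illustration, it measures plain key-path reads rather than the pitched evaluate(), and the numbers will vary with optimization level and compiler version.

```swift
import Foundation

struct Sample {
    let score: Int
}

let samples = (0..<100_000).map { Sample(score: $0 % 7) }

// Runs `work` once, prints the elapsed wall-clock time, and returns its result.
func measure(_ label: String, _ work: () -> Int) -> Int {
    let start = Date()
    let result = work()
    print(label, Date().timeIntervalSince(start), "s")
    return result
}

// Direct property access.
let direct = measure("direct") {
    samples.reduce(0) { $0 + $1.score }
}

// The same read through a key path stored in a variable.
let scorePath: KeyPath<Sample, Int> = \.score
let viaKeyPath = measure("key path") {
    samples.reduce(0) { $0 + $1[keyPath: scorePath] }
}

print(direct == viaKeyPath)
```

Building with optimizations enabled is probably the fairer comparison, since debug builds tend to exaggerate the cost of both paths.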

So, anyway - big +1 for the pitch overall.


Thanks for the explanation! I played around with expression macros and (although I got quite a few errors and couldn’t run anything) I understand why they’re used in the pitch. My only concern with macros in general, which does extend to this pitch, is the risk of duplicating common functionality across different projects. For example, even after the advent of macros, property wrappers are still a great way of adding behavior to properties in a way most Swift programmers understand. I think this is the case with the proposal’s macros, as they are mainly used for custom operators. However, this behavior is not unique to predicates: the power assert discussed in the expression-macro threads is another great example, and I imagine scoped operators being used for numerical computing too. In other words, you made a great case for why predicates would benefit from scoped operators instead of overloading, but we should generalize this feature to extend beyond Foundation. Otherwise the Swift ecosystem will become fragmented, with each library author choosing their own version of scoped operators. The following is a simple, generic design we could use for this feature:

macro Predicate<R>(body: () -> R) = #scopedOperators(
  OperatorDescriptor(
    infix: "==",
    implementation: PredicateExpressions.Equal.init
  ),
  body
)

Quite an interesting idea. A similar idea, but for result builders, was recently pitched here:

Maybe a more general feature could solve both problems. I guess namespaces, and some way of using them automatically, could be a nice thing, but of course that would be a lot of work to design and implement. I imagine this would also make autocomplete work more seamlessly.
