Require parameter names when referencing to functions

I oppose any implementation that forces reference to init(_:) or max(_:_:) but support an option to disambiguate when needed.

7 Likes

We could also allow the bare function name to refer to any functions which have no parameter labels; that is, foo could reference foo, foo(_:), foo(_:_:), etc. without allowing it to reference foo(bar:). That change would be slightly more breaking, but I think has more justifiable semantics. We don't currently allow typing as much of a property name as is required to disambiguate:

struct Baz {
    var foo: Int = 0
    var bar: Int = 1
}

print(Baz().f) // we only need the 'f' to know we're talking about foo!

I view this proposal as disallowing the same sort of construction for method/function names.

1 Like

I don't like this proposal.

The source of the problem comes from a lack of standardization of naming. As it is, the standard library often uses the name of return values as method names – that is, the name of what the method would be if it were a property. (first is an example.) There will be edge cases, but generally, disallowing that eliminates the confusion; my general solution is to prefix all such methods with get.

Any language with functions that can be referred to with constants/variables needs to tackle this. It's not obvious until you come across your own related problem – like Soroush has!

I want the method. My proposal to get it in is to start a precedent for better naming. As a property, it's count. As a method, it should be getCount.

I like this approach — disambiguate as needed.

Although expectation is named incorrectly (it is more accurately makeExpectation), I don't think it this is a problem in practice, because we make an expectation using the XCTestCase instance:

let expectation = self.expectation(description: "")

And while better naming would eliminate self for common usage, we'd still need to disambiguate using what Soroush is proposing, if we cared to store the method, and there were overloads.

extension XCTestCase {
  func makeExpectation(description: String) -> XCTestExpectation {
    expectation(description: description)
  }

  func makeExpectation(
    for predicate: NSPredicate,
    evaluatedWith object: Any?,
    handler: XCTNSPredicateExpectation.Handler? = nil
  ) -> XCTestExpectation {
    expectation(
      for: predicate,
      evaluatedWith: object,
      handler: handler
    )
  }
}

final class TestCase: XCTestCase {
  func test() {
    let expectation = makeExpectation(description: "")
    let makeExpectation = makeExpectation(description:)
  }
}

It's because of cases where overload disambiguation isn't needed, that I'm against this proposal.

let makeExpectation: (String) -> XCTestExpectation
makeExpectation = self.makeExpectation

I'm a big proponent of this (clarity at the point of use, even over brevity), but I think people have made really good points about it flipping from helpful to noise for functions with unlabeled arguments, like operators or max or even converting initializers. The danger in those cases is around overloading based on type, and that's always going to be a concern for Swift.

I'm against the "disambiguate as needed" approach because it makes it harder for library authors to change their code. If a method has a sensible base name, and you want to add functionality that would share the same base name but uses different argument labels, you shouldn't have to worry about whether someone might be referring to the method with just the base name. I think a lot of the reason this feels weird is because argument labels aren't considered part of a method's name in many other languages, but we do think they are in Swift and have been moving the compiler in that direction for a long time.

(I'm also against the "methods and properties should not have overlapping names" suggestion, but I think that's only one of the motivations here.)

EDIT: I forgot to post my conclusion: "so I support the middle ground where foo can refer to foo(), foo(_:), foo(_:_:), etc., but not to foo(bar:) or foo(_:baz:)".

DOUBLE EDIT: …even though, ugh, this is going to be annoying to implement.

30 Likes

What if we stop trying to do the 'fully disambiguated' case inline? Let's make a syntax for getting a fully qualified and unambiguous function/method that isn't necessarily meant to be used inside a call to map but can be used before and yields a value that you can pass to map

Love it!

I often run into stuff like this:


Would that also get fixed?

10 Likes

Another reason to support the bare name only for unlabeled arguments is that once argument labels make their way to local variable declarations, it would be pretty annoying to have to write:

let foo(_:_:_:) = bar(arg1:arg2:arg3)
4 Likes

I have a PR open to fix this ([Typechecker] Allow initialising a variable with a function with the same name as the variable by theblixguy · Pull Request #23695 · apple/swift · GitHub), just need to address some comments. I’ll try to find some time soon.

3 Likes

We already have the *option* to provide argument labels when desired.

The ability to omit them is a convenience which people are able to use when they want to, but they are not forced to do so.

Any line of argument related to compilation time can immediately be rebutted with “if the unlabeled version makes an expression slow to compile, then add labels”.

• • •

The matter of:

let foo = foo(x)

Seems a separate issue to me. It should simply work.

2 Likes

That’s a poor argument. The design of APIs should not have to consider compilation performance. They should be designed for call-site clarity. If the optimal call-site clarity calls for no labels, then that is the correct API design.

5 Likes

That is exactly my point. The pitch in this thread would *prohibit* people from writing the best, clearest code in many situations.

The argument for doing so is a weak one, related to the complexity of the compiler.

The counterargument is straightforward: if there is a situation where the compiler is too slow, we already have the ability to work around it in exactly the way the pitch would force us to use everywhere.

The pitch does not gain us anything, it only takes things away.

4 Likes

It seems a bit odd to allow code to link that would not actually re-compile to the same result. Meaning, if you refer to an unambiguous foo in some library, and that library later adds a second foo (differing only in argument labels), presumably that's binary compatible because the originally referenced foo remains with unchanged mangled name? If so, then new compilation of that exact same client code, against the updated library, should intuitively also work fine (and have the same behaviour; resolve to the same foo as before).

I don't actually care about the source compatibility - in principle it's fine to me that a new version of a library might require some source changes to adopt (no different from enums changing, for example), but if that's true, then logically the library should break ABI compatibility at the same time.

Otherwise, I do like the approach of requiring user intervention only upon actual ambiguity. I don't like the idea of a passive-aggressive compiler, i.e. one that just runs really slowly if it turns out I wrote something unclear to it. That's making a significant developer pain point (compile times) even harder to fix.

If I wrote let fn = count in the presence of func count() and func countWhere(_ condition: ...), nobody would find that ambiguous - I clearly meant the first one. I don't see why it should be any different if the 'where' part happens to be on the other side of the opening parenthesis. If I meant to refer to the other, I should have written its actual name - the compiler is being very nice to me if it let me use the abbreviation count at some point, and that niceness should immediately end as soon as there's any ambiguity (to a human - it's no good the compiler being a genius at divining the only technically valid option if anyone actually reading the code doesn't know what the code's doing). The only question in my mind is whether the compiler should have been so nice to me to begin with.

If there's some other identifier under the name count in scope (e.g. a variable), I'd actually expect the compiler to just error out. It's confusing to a reader to have duplicates like that, and burdening every reader with arbitrary rules on conflict resolution (even if it then makes the compiler's life easy) seems far worse than just requiring unambiguous names.

In the max case, it's not apparent to me why there's two versions in the standard library to begin with - why does the two-argument case need to be specialised over the N-argument one?

And in playing with max briefly, it seems apparent that it's already seriously broken - e.g. this doesn't even compile:

[(3, 5, 7), (2, 6, 1), (1, 0, 2)].map(max)

Yet this does:

[(3, 5), (2, 6), (1, 0)].map(max)

That makes no sense.

4 Likes

Wait, what? How?

Not sure how this is supposed to be affected by default parameters.

Default parameters only affect call sites, and this pitch is not about call sites.

We don't have the option to provide the lack-of-argument-labels when that is desired. For instance, the problem with count(where:) did not concern performance when you actually wanted count(where:); it was about performance when you wanted plain count. The compiler had to consider the possibility that you might have been talking about count(where:), and there was no way to rule out that possibility.

Thinking about it a little, I am somewhat concerned about this. It would be nice if code completions and diagnostics could notice that a candidate matches only in its uncalled state and fill it in with the parameter labels only instead of inserting placeholders for the parameters.

(But keep in mind that point-free style is a mildly advanced feature and users who don't know about it can always write a closure with an explicit call.)

Sure, but map(max(_:_:)) is actually more clear than map(max). The latter could be referring to a different max, like the self.max() and self.max(by:) that would be available in an extension on any Collection type. The former clearly doesn't refer to either of those.

(This is especially important for something like max(_:_:) because it's actually kind of weird that it's a candidate at all.)

Let's be clear here: Omitting the labels doesn't always improve clarity—it improves brevity. Improving brevity can sometimes improve clarity by allowing you to omit unimportant details, but sometimes—arguably more often—it hinders clarity by making the code vague and ambiguous, making it impossible to be completely certain what it means without actually type-checking it. And we already encourage making the names and labels brief, so you're not talking about adding much extra text.


I invite you to consider an example that's very similar to Nevin's:

a.map(String.init)

Do you know which initializer this calls? It ends up depending on the type of a.Element, and in many cases it's ambiguous and already disallowed by the compiler. Here's the list generated by referencing String.init in a file that imports Foundation (I've deleted entries with Builtin types):

init()
init(_ c: Character)
init(_ cocoaString: NSString)
init(_ content: Substring.UnicodeScalarView)
init(_ scalar: Unicode.Scalar)
init(_ substring: __shared Substring)
init(_ unicodeScalars: String.UnicodeScalarView)
init(_ utf16: String.UTF16View)
init(_ utf8: String.UTF8View)
init(_cocoaString: AnyObject)
init(_sel: Selector)
init(cString: UnsafePointer<CChar>)
init(cString: UnsafePointer<UInt8>)
init(contentsOf url: __shared URL) throws
init(contentsOf url: __shared URL, encoding enc: String.Encoding) throws
init(contentsOfFile path: __shared String) throws
init(contentsOfFile path: __shared String, encoding enc: String.Encoding) throws
init(extendedGraphemeClusterLiteral value: Self.StringLiteralType)
init(format: __shared String, arguments: __shared [CVarArg])
init(format: __shared String, locale: __shared Locale?, arguments: __shared [CVarArg])
init(from decoder: Decoder) throws
init(repeating repeatedValue: Character, count: Int)
init(repeating repeatedValue: String, count: Int)
init(stringInterpolation: DefaultStringInterpolation)
init(stringLiteral value: String)
init(unicodeScalarLiteral value: Self.ExtendedGraphemeClusterLiteralType)
init(utf16CodeUnits: UnsafePointer<unichar>, count: Int)
init(utf16CodeUnitsNoCopy: UnsafePointer<unichar>, count: Int, freeWhenDone flag: Bool)
init?(_ codeUnits: Substring.UTF16View)
init?(_ codeUnits: Substring.UTF8View)
init?(bytesNoCopy bytes: UnsafeMutableRawPointer, length: Int, encoding: String.Encoding, freeWhenDone flag: Bool)
init?(cString: UnsafePointer<CChar>, encoding enc: String.Encoding)
init?(data: __shared Data, encoding: String.Encoding)
init?(utf8String bytes: UnsafePointer<CChar>)
init?(validatingUTF8 cString: UnsafePointer<CChar>)

Give yourself a minute to soak in this list.

Did you realize that map(String.init) could unsafely read arbitrary amounts of memory if there's an UnsafePointer? Did you realize that the specifics of this behavior can actually vary slightly depending on whether the result type is optional?

Did you realize that map(String.init) could perform blocking network I/O if it ran on a URL and there happened to be a try on an enclosing expression?

Did you realize that map(String.init) could call underscored initializers that aren't even listed in the documentation?

Did you realize that you can't use map(String.init) for the vast majority of types (init(describing:) and init(reflecting:) could both match most types), and it will only work on types that conform to LosslessStringConvertible, TextOutputStreamable, or CustomStringConvertible? (Recall that CustomStringConvertible should almost never be something you constrain a generic parameter to—it's supposed to be a hook that the string formatting machinery uses, not a behavior-changing conformance.)

And that one used to be even worse—in Swift 4.2 and earlier, most calls to String.init ended up resolving to String.init(stringInterpolationSegment:), which is not supposed to be called directly at all. Many, many code bases were calling it without realizing they were. Alamofire was one of them.

This lack of clarity is frankly dangerous. It's the kind of thing that could not only cause bugs, but bugs with CVE numbers. In the face of that reality, I think the minor gains in brevity it offers have to yield.

It sure would be nice to know that a map onto String can only perform network I/O if you actually see String.init(contentsOf:) in the source code, wouldn't it?

41 Likes

Thank you for writing this, it makes clear what problem the pitch intends to address, in a way that I did not glean from the earlier description.

4 Likes

What about a rule like this one:

  • When there is no overload, the short form can be used to match the function regardless of the parameter labels.

  • In the presence of overloads, the short form can only be used to match functions with no parameter label.

The first part is to keep things convenient when there can be no complication, for instance when you use your own uniquely named function as a callback or selector.

The second part mimics the expectations of those designing an API: with no label you match by the number of parameters and their types (which can be easily inferred), whereas with labels you are expected to actually have to write the labels at the call site (the labels are never inferred at a call site).

I think these rules would work nicely to avoid the scary behaviors of map(String.init). The API was designed with the idea that labels couldn’t be omitted, but, while true at a regular call site, that assumption doesn’t hold when resolving to a function type. This broken assumption is the danger here.

1 Like