Scanner with Swift 3


(Oliver Drobnik) #1

Hello,

This is a repeat of an earlier post where I did not get a single response... might have fallen through the cracks...

Earlier I posted this proposal to the Swift Evolution mailing list. Then I looked at the NSScanner implementation in core libs and found experimental API that uses the return value for returning the scanned results. See my comments on that below:

Working on a function for Foundation’s Scanner I stumbled on this LLVM crash: https://bugs.swift.org/browse/SR-3295

This got me thinking about a workaround and I would like to propose this:

When importing Foundation into Swift 3, all

AutoreleasingUnsafeMutablePointer<T?>?

should instead be exposed as simple:

inout T?

e.g.

open func scanString(_ string: String, into result: AutoreleasingUnsafeMutablePointer<NSString?>?) -> Bool

would become

open func scanString(_ string: String, into result: inout String?) -> Bool

The call would stay exactly the same for normal use cases where you specify a receiving variable:

var string: String?
scanString("=", into: &string)

because inout parameters require a &.

For the use case where you don’t require a receiving parameter, a second method without the result parameter would be generated:

open func scanString(_ string: String) -> Bool

This is necessary because you cannot specify nil or an immutable value for an inout parameter.

A fixit/migration would change calls to

scanString("foo", into: nil)

into

scanString("foo")

The normal call with receiving variable would stay the same. But the case without return would become more concise.
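To make the proposed shape concrete, here is a minimal sketch. MiniScanner is a hypothetical stand-in written only for this illustration; it is not Foundation’s Scanner and makes no claim about how the importer would generate these methods. It only shows the inout overload next to the result-free overload:

```swift
// MiniScanner is a hypothetical stand-in, NOT Foundation's Scanner.
// It only illustrates the two method shapes proposed above.
final class MiniScanner {
    private let text: String
    private var index: String.Index

    init(_ text: String) {
        self.text = text
        self.index = text.startIndex
    }

    // Proposed shape: the result is an inout Optional rather than an
    // AutoreleasingUnsafeMutablePointer.
    func scanString(_ string: String, into result: inout String?) -> Bool {
        guard text[index...].starts(with: string) else { return false }
        index = text.index(index, offsetBy: string.count)
        result = string
        return true
    }

    // Proposed companion overload for callers that do not need the value.
    func scanString(_ string: String) -> Bool {
        var ignored: String?
        return scanString(string, into: &ignored)
    }
}

let scanner = MiniScanner("key=value")
var key: String?
let gotKey = scanner.scanString("key", into: &key)  // with receiving variable
let gotEquals = scanner.scanString("=")             // result not needed
let gotBogus = scanner.scanString("bogus")          // no match, position unchanged
```

The first call site reads the same as it does with the pointer-based signature; only the second overload is new.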

What do you think?

kind regards
Oliver Drobnik

The new experimental API does not consider the situation where one just wants to scan a token in a long string; since I know what I am scanning, I don’t care about the result. For example, in my scanning I would like to proceed if I hit a “foo” …

The experimental API would force me to do this:

if let _ = scanString("foo") { }

My proposal is more elegant:

if scanString("foo") {}

And if you do lots and lots of these scans then performance would benefit from the compiler not having to pack the return for an autoreleasing value which would have to be released right away due to assignment to _.

IMHO my proposal is more in line with the spirit of the original API: 1) return Bool if the scan was successful, 2) optionally return the scanned value.
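The two call styles can be compared side by side in a small sketch. TokenScanner and both of its method names are hypothetical; the Optional-returning method only mimics the style of the experimental API as described here, not its actual signature:

```swift
// TokenScanner is a hypothetical type used only to compare call styles.
struct TokenScanner {
    private let text: String
    private var index: String.Index

    init(_ text: String) {
        self.text = text
        self.index = text.startIndex
    }

    // Optional-returning style (assumed shape): the scanned value is the result.
    mutating func scanStringReturningValue(_ string: String) -> String? {
        guard text[index...].starts(with: string) else { return nil }
        index = text.index(index, offsetBy: string.count)
        return string
    }

    // Bool-returning style: success only, no value handed back.
    mutating func scanString(_ string: String) -> Bool {
        return scanStringReturningValue(string) != nil
    }
}

var first = TokenScanner("foo bar")
var optionalStyleHit = false
if let _ = first.scanStringReturningValue("foo") {  // value is bound and discarded
    optionalStyleHit = true
}

var second = TokenScanner("foo bar")
var boolStyleHit = false
if second.scanString("foo") {                       // reads as a plain condition
    boolStyleHit = true
}
```

The sketch says nothing about the performance argument; it only shows the difference at the call site.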

kind regards
Oliver Drobnik


(Philippe Hausler) #2

It is worth noting that this is not the only case that has this “flaw”. Many of the AUMPs (AutoreleasingUnsafeMutablePointer parameters) exposed in Foundation are really either “out” parameters or, in a few cases, “inout” parameters (and I think there are a handful of “in” style parameters too).

The major issue for refining this in the overlay is the implementation of subclassers.

Let’s say you have an NSScanner subclass (I am dubious about how common this is, but it is a decent example).

If we were to add a Swift refinement of scanString, the problem would be that the refinement would not be called as subclassers expect.

Let’s say we refine via the overlay to:

class Scanner : NSObject {

func scanString(_ string: String, into result: inout String?) -> Bool
// which hides this NS_REFINED_FOR_SWIFT as:
func __scanString(_ string: String, into result: AutoreleasingUnsafeMutablePointer<NSString?>?) -> Bool

}

class MyScanner : Scanner {
override func scanString(_ string: String, into result: inout String?) -> Bool { … }
}

When a MyScanner is passed into an Objective-C API, calling the method -[NSScanner scanString:intoString:] will invoke the __scanString Swift method instead of the override of scanString (effectively making the override final).
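The dispatch problem can be modeled in pure Swift, without the Objective-C runtime. In this sketch all names are hypothetical: legacyScanString stands in for the hidden __scanString entry point, and objcStyleCaller stands in for Objective-C code that only knows the original selector:

```swift
// Pure-Swift model of the problem; no Objective-C runtime is involved.
class BaseScanner {
    // Stands in for the hidden, Objective-C-visible __scanString(_:into:).
    func legacyScanString(_ string: String) -> String? {
        return "base:" + string
    }

    // Stands in for the Swift refinement added by the overlay.
    func scanString(_ string: String, into result: inout String?) -> Bool {
        result = legacyScanString(string)
        return result != nil
    }
}

class MyScanner: BaseScanner {
    // The subclasser overrides only the refined Swift method.
    override func scanString(_ string: String, into result: inout String?) -> Bool {
        result = "override:" + string
        return true
    }
}

// Stands in for Objective-C code calling the original selector.
func objcStyleCaller(_ scanner: BaseScanner) -> String? {
    return scanner.legacyScanString("foo")
}

let scanner: BaseScanner = MyScanner()
var viaSwift: String?
_ = scanner.scanString("foo", into: &viaSwift)  // reaches the override
let viaLegacy = objcStyleCaller(scanner)        // bypasses the override
```

Swift callers see the subclass behavior, while the caller that goes through the legacy entry point never reaches it.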

This presents a big issue for NSFormatter, which is arguably subclassed more often than Scanner. So the funnel API of

func getObjectValue(_ obj: AutoreleasingUnsafeMutablePointer<AnyObject?>?, for string: String, errorDescription error: AutoreleasingUnsafeMutablePointer<NSString?>?) -> Bool

probably should be written as:

func getObjectValue(_ obj: inout Any?, for string: String, errorDescription error: inout String?) -> Bool

or better yet:
class Formatter<Subject> : NSObject {
...
func getObjectValue(_ obj: inout Subject?, for string: String, errorDescription error: inout String?) -> Bool
...
}
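As a rough illustration of the generic variant, here is a pure-Swift sketch. TypedFormatter and IntFormatter are hypothetical names; this is not Foundation’s Formatter and it does not address Objective-C interoperability at all:

```swift
// TypedFormatter is a hypothetical pure-Swift base class, NOT Foundation's Formatter.
class TypedFormatter<Subject> {
    func getObjectValue(_ obj: inout Subject?, for string: String,
                        errorDescription error: inout String?) -> Bool {
        error = "not implemented"
        return false
    }
}

// A subclass fixes Subject, so callers get an Int? instead of Any?.
final class IntFormatter: TypedFormatter<Int> {
    override func getObjectValue(_ obj: inout Int?, for string: String,
                                 errorDescription error: inout String?) -> Bool {
        guard let value = Int(string) else {
            error = "'\(string)' is not an integer"
            return false
        }
        obj = value
        return true
    }
}

let formatter = IntFormatter()

var parsed: Int?
var parseError: String?
let ok = formatter.getObjectValue(&parsed, for: "42", errorDescription: &parseError)

var rejected: Int?
var rejectError: String?
let bad = formatter.getObjectValue(&rejected, for: "x", errorDescription: &rejectError)
```

The caller receives a typed value without casting from Any.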

Short of re-implementing all of these classes in Swift, it would be difficult with the current importing mechanisms to correct this problem. Ideally I think we should have some mechanism to annotate in, out, and inout parameters so that the intent of the API is properly exposed as well as being more Swift type friendly.

One thought is that we could re-use the distributed object annotations for parameters as type annotations that tell the compiler the proper semantics for handling the parameters.

e.g.

- (BOOL)scanString:(NSString *)string intoString:(out NSString * _Nullable * _Nullable)result;

could tell the compiler that the parameter result, with the label into, is one that this method is responsible for populating and, similarly to error cases, that the Boolean return value indicates whether the populated value was actually set (otherwise it is nil).
* If out is not an option, then we could still interpret both inout and out as Swift’s inout.

If this were the case it would import as:

func scanString(_ string: String, into result: out String?) -> Bool

this would mean that usage could then be

let str: String?
if myScanner.scanString("foo", into: &str) {
}

Overall this would be safer because the variable str would be protected idiomatically from any further mutation past its initialization via the method scanString. It would also potentially be more efficient, since it could avoid autoreleasing an object. It would be valid for subclassing in that the method could be called by Objective-C in a reasonable manner. And as a nice bonus, the compiler could know about the out parameter’s optional value inside the scope of the if (similar to guard/let, but that is perhaps icing on the cake if it would work).
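Swift has no out parameter today, so the snippet above is hypothetical syntax. The closest existing way to get a let that is initialized exactly once is to have the function return the value. In this sketch, scanPrefix and demo are hypothetical helpers, not part of any scanner API:

```swift
// scanPrefix is a hypothetical free function standing in for a scanner method.
func scanPrefix(_ prefix: String, in text: String) -> String? {
    return text.hasPrefix(prefix) ? prefix : nil
}

func demo() -> Bool {
    let str: String?                       // declared, not yet initialized
    str = scanPrefix("foo", in: "foobar")  // the single permitted initialization
    // str = "other"                       // would not compile: str is immutable

    if let value = str {                   // non-optional inside this scope
        return value == "foo"
    }
    return false
}

let matched = demo()
```

This gives the same protection from later mutation, but only by moving the value into the return position, which is the trade-off discussed in this thread.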

tl;dr - I am of the opinion that most of Foundation’s APIs that are imported as AUMP are not exactly “intended” to be that and probably would be better off as something else.

On Dec 5, 2016, at 10:55 AM, Oliver Drobnik via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote the message reproduced in full as post #1 above.