String initializers and developer ergonomics


(Austin Zheng) #1

Hello Swift users,

I wanted to run something past you folks and get some opinions/feedback.

About a month ago on Hacker News I saw someone commenting about how Swift's string-handling code was unbearably slow (3 seconds to run a code sample, vs. 0.8 seconds in Java). I asked him to provide the code, and he obliged. Unfortunately, I didn't have time to dig into it until this morning. The code in its entirety can be found here: https://gist.github.com/austinzheng/d6c674780a58cb63832c4df3f809e683

At line 26 we have the following code:

result.append(begin == eos ? "" : String(cs[begin..<end.successor()]))

'cs' is a UTF16 view into an input string, while 'result' is a [String]. When I profiled the code in Instruments, I noticed that it was spending significant time within the reflection machinery.

It turns out that the initializer to make a String out of a utf16 view looks like this, and I believe this is the initializer the author intended to call:

init?(_: String.UTF16View)

However, the actual initializer being called was this String initializer in the Mirror code:

public init<Subject>(_ instance: Subject)

This seems like a tricky gotcha for developers who aren't extremely familiar with both the String and reflection APIs. His code looked reasonable at first glance and I didn't suspect anything was wrong until I profiled it. Even so, I only made the connection because I recognized the name of the standard library function from poking around inside the source files.
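
The mechanism can be reproduced in isolation. What follows is a minimal sketch, not the standard library's code: Label is a hypothetical type standing in for String, with one failable unlabeled initializer and one generic unlabeled initializer, and the via property exists only to record which one ran. It is written in current Swift syntax, which is an assumption on top of the thread's 2016 toolchain.

```swift
// Hypothetical stand-in for String; not part of the standard library.
struct Label {
    let text: String
    let via: String  // records which initializer ran

    // Failable and unlabeled, analogous to String.init?(_: String.UTF16View).
    init?(_ units: [UInt16]) {
        guard !units.isEmpty else { return nil }
        self.text = String(decoding: units, as: UTF16.self)
        self.via = "failable"
    }

    // Generic and unlabeled, analogous to String.init<Subject>(_:).
    init<Subject>(_ instance: Subject) {
        self.text = String(describing: instance)
        self.via = "generic"
    }
}

let units: [UInt16] = [0x61, 0x62, 0x63]  // "abc" as UTF-16 code units

// A non-optional Label is required here, so the failable initializer
// (which returns Label?) does not type-check and the generic one is used.
let surprising: Label = Label(units)

// An optional result is acceptable here, so the failable initializer is used.
let intended: Label? = Label(units)

print(surprising.via, surprising.text)
print(intended?.via ?? "nil", intended?.text ?? "nil")
```

The call site is identical in both cases; only the type expected by the surrounding context differs, which is why the problem is hard to spot without profiling.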

What do other people think? Is this something worth worrying about, or is it so rare that it shouldn't matter? Also, any suggestions as to how that code sample might be improved would be appreciated - my naive first attempt wasn't any better.

Best,
Austin


(Dennis Weissmann) #2

Hi Austin,

I further “swiftyfied” the code to this (swift-DEVELOPMENT-SNAPSHOT-2016-05-03-a):

import Foundation

let N = 1_000_000

func generateTestData() -> [String] {
  return (0..<N).map { _ in ",abc, 123 ,x, , more more more,\u{A0}and yet more, " }
}

func splitAndTrim(s: String, sep: Character) -> [String] {
  return s.characters.split(separator: sep, omittingEmptySubsequences: false)
                     .map(String.init)
                     .map { $0.trimmingCharacters(in: .whitespacesAndNewlines()) }
}

func doSplits(data: [String]) -> Int {
  return data.reduce(0) { $0 + splitAndTrim(s: $1, sep: Character(",")).count }
}

let data = generateTestData()
let start = NSDate()
let sum = doSplits(data: data)
print("elapsed: \(NSDate().timeIntervalSince(start))")
print("sum: \(sum)")

I didn’t expect it to run as fast as the original (let alone faster), but the code above takes 13 seconds to run… But that’s just an aside :)

To actually come back to the problem you mentioned:

You can disambiguate the initializer to call in the following ways (probably there are other ways):

Add an ! after the init call. The initializer that should be used is failable (the one that uses reflection is not), so the compiler chooses the “right” one.

result.append(begin == eos ? "" : String(cs[begin..<end.successor()])!)

Or you can reference the initializer and use it afterwards (more future-proof):

let utf16View = cs[begin..<end.successor()]
let initializer = String.init as (String.UTF16View -> String?)
result.append(begin == eos ? "" : initializer(utf16View)!)

This reduces the time to 2.2 seconds on my machine (from ~2.9), but that’s still far from 0.8.
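
Both approaches can be sketched on a self-contained type. This is only an illustration under stated assumptions: Piece is a hypothetical stand-in for String, its via property exists only to show which initializer ran, and the function type is spelled ([UInt16]) -> Piece? as current Swift requires, rather than in the 2016 form used above.

```swift
// Hypothetical stand-in for String; not part of the standard library.
struct Piece {
    let via: String  // records which initializer ran

    // Failable and unlabeled: the initializer we actually want.
    init?(_ units: [UInt16]) {
        guard !units.isEmpty else { return nil }
        self.via = "failable"
    }

    // Generic and unlabeled: the catch-all we want to avoid.
    init<Subject>(_ instance: Subject) {
        self.via = "generic"
    }
}

let units: [UInt16] = [0x61]
var result: [Piece] = []

// Approach 1: force-unwrap. Only the failable initializer returns an
// optional, so it is the only candidate that "!" can be applied to.
result.append(Piece(units)!)

// Approach 2: name the initializer with an explicit function type,
// then call it and handle the optional.
let makePiece = Piece.init as ([UInt16]) -> Piece?
if let piece = makePiece(units) {
    result.append(piece)
}

print(result.map { $0.via })
```

The second form is more verbose, but it spells out the intended signature, so it does not depend on how the surrounding expression happens to be typed.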

> What do other people think? Is this something worth worrying about, or is it so rare that it shouldn't matter? Also, any suggestions as to how that code sample might be improved would be appreciated - my naive first attempt wasn't any better.

It might be very surprising behavior and we could think about adding an external label to all those initializers that cause trouble. The performance is of course a little worrying for a zero-cost abstractions language, but I think there are currently more important goals (getting the design right, etc.). We are still at the early stages with a lot of stuff changing and I think when the syntax, ABI, etc. are settled, performance / optimization will get more attention :)

- Dennis

···

On May 7, 2016, at 7:39 PM, Austin Zheng via swift-users <swift-users@swift.org> wrote:

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users


(Joe Groff) #3

This definitely strikes me as a problem. The String<T>(_:) constructor is very easy to call by accident if you're trying to hit another unlabeled initializer. It also strikes me as not particularly "value-preserving", since stringifying many types loses information. Perhaps we should propose giving it a label, String(printing:) maybe?
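
To make the effect of a label concrete, here is a rough sketch on a hypothetical type. Caption is not a standard library type, and printing: is simply the label floated above, not a settled name. Once the generic initializer carries a label, an unlabeled call can only resolve to the failable conversion.

```swift
// Hypothetical stand-in for String; not part of the standard library.
struct Caption {
    let text: String

    // Failable and unlabeled: a direct conversion from UTF-16 code units.
    init?(_ units: [UInt16]) {
        guard !units.isEmpty else { return nil }
        self.text = String(decoding: units, as: UTF16.self)
    }

    // Generic, now labeled: it can no longer be reached by an unlabeled call.
    init<Subject>(printing instance: Subject) {
        self.text = String(describing: instance)
    }
}

let units: [UInt16] = [0x61, 0x62]  // "ab" as UTF-16 code units

// The unlabeled call has exactly one candidate, so the result is Caption?.
let converted = Caption(units)

// With the label in place, the line below would be rejected by the compiler
// instead of silently selecting the generic initializer:
// let wrong: Caption = Caption(units)

// Printing an arbitrary value is still available, but has to be asked for.
let printed = Caption(printing: 42)

print(converted?.text ?? "nil", printed.text)
```

The accidental call turns into a compile-time error at the call site, so it no longer has to be found with a profiler.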

-Joe


(Jacob Bandes-Storch) #4

+1

···

On Mon, May 9, 2016 at 10:25 AM, Joe Groff via swift-users <swift-users@swift.org> wrote:


(Zachary Waldowski) #5

I've been wondering about this since String(reflecting:) came about, as
plain print() is capable of doing about the same amount of reflection as
debugPrint(). I don't want to get into the bikeshedding of it, but I'd
really like to see the printing String initializer get a verb label. The
current unlabeled form is pretty strongly against the API guidelines for
initializers.

Sincerely,
Zachary Waldowski
zach@waldowski.me


(Chris Lattner) #6

I agree, +1.

-Chris


(Daniel Dunbar) #7

+1

- Daniel

···

On May 9, 2016, at 10:28 AM, Jacob Bandes-Storch via swift-users <swift-users@swift.org> wrote:


(Austin Zheng) #8

Thanks to all who commented. I'll put together a proposal to rename that
initializer tonight, unless someone else wants to do it. This seems like a
pretty straightforward clarity gain, and it might even help a bit with
compilation times.

Austin

···

On Mon, May 9, 2016 at 10:54 AM, Daniel Dunbar via swift-users <swift-users@swift.org> wrote:


(Karl) #9

Now that I see it, I realise I’ve run into that a fair few times myself; I didn’t even know the string-view initialisers were failable.

I’m not sure I understand why reflection is exposed as an initialiser on String in the first place - it might make more sense as a global function.

Karl

···

On 9 May 2016, at 20:43, Austin Zheng via swift-users <swift-users@swift.org> wrote:
