Quick, spot the UB (undefined behavior) in this code:


(Drew Crawford) #1

let completeFile: [CChar] = [112, 114, 105, 110, 116, 40, 34, 104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 34, 41]
let str = String(validatingUTF8: completeFile)

Did you see it? No?

What if our bytes are not UTF-8? Well, one would hope that the constructor, um, validates them.

Turns out it does validate them, *but only if the bytes are null-terminated*. If they are not null-terminated, we get UB <https://github.com/apple/swift/blob/510f29abf77e202780c11d5f6c7449313c819030/stdlib/public/core/CString.swift#L41>.

IMO:

1. If this constructor insists on null-terminated bytes, it should say so in the name (e.g. validatingNullTerminatedUTF8:), and it should crash deterministically if it gets non-terminated bytes, or
2. It should not require null-terminated bytes

Drew
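
One way to sidestep the problem is to supply the terminator explicitly before calling the initializer. This is only a minimal sketch, assuming a toolchain where `String(validatingUTF8:)` is still available (newer ones may flag it as deprecated); the name `terminatedFile` is made up for illustration:

```swift
// The same bytes as in the example, typed as CChar so they can be passed
// where a C string is expected.
let fileBytes: [CChar] = [112, 114, 105, 110, 116, 40, 34, 104, 101, 108,
                          108, 111, 32, 119, 111, 114, 108, 100, 34, 41]

// Append the 0 terminator the initializer expects, so the scan for the end
// of the string stops inside memory we own.
let terminatedFile = fileBytes + [0]

// Returns nil if the bytes before the terminator are not valid UTF-8.
let validated = String(validatingUTF8: terminatedFile)
print(validated ?? "invalid UTF-8")
```

The cost is one extra copy of the array, which is usually acceptable for small inputs.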


(Dmitri Gribenko) #2

> and it should crash deterministically if it gets non-terminated bytes, or

It can't. How would you check for this, given only a pointer?

> 2. It should not require null-terminated bytes

This operation converts a C string to a Swift string, so (2) is a non-starter.

Dmitri
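
To illustrate the point about pointers: a bare pointer carries no length, so the callee cannot know where the caller's storage ends. A check becomes possible only when the length travels with the pointer. A rough sketch, where `containsTerminator` is a hypothetical helper and not a standard library function:

```swift
// Hypothetical helper: a buffer pointer knows its count, so scanning for a
// terminator never reads past the end of the storage.
func containsTerminator(_ buffer: UnsafeBufferPointer<CChar>) -> Bool {
    return buffer.contains(0)
}

let withoutNul: [CChar] = [104, 105]     // "hi" without a trailing 0
let withNul: [CChar] = [104, 105, 0]     // "hi" with a trailing 0

let firstCheck = withoutNul.withUnsafeBufferPointer { containsTerminator($0) }
let secondCheck = withNul.withUnsafeBufferPointer { containsTerminator($0) }
print(firstCheck, secondCheck)  // false true
```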

···

On Wed, Apr 6, 2016 at 9:16 PM, Drew Crawford via swift-dev <swift-dev@swift.org> wrote:

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


(Daniel Dunbar) #3

Could we get a method that takes a [UInt8] directly and performs the same basic function? In my experience I have frequently wanted such a thing (primarily when debugging things) when working with binary protocols that have embedded ASCII data.

- Daniel
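
For the debugging case described here, the standard library's `String(decoding:as:)` accepts any collection of code units and needs no terminator. Note that it repairs rather than validates: invalid sequences are replaced with U+FFFD instead of producing nil. A small sketch; the packet layout (a two-byte header, a five-byte ASCII tag, one trailing byte) is invented for illustration:

```swift
// Invented packet: two header bytes, the ASCII tag "HELLO", one trailing byte.
let packet: [UInt8] = [0x01, 0x05, 72, 69, 76, 76, 79, 0xFF]

// Decode only the slice that holds the embedded ASCII text.
let tag = String(decoding: packet[2..<7], as: UTF8.self)
print(tag)  // HELLO
```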

···

On Apr 6, 2016, at 9:51 PM, Dmitri Gribenko via swift-dev <swift-dev@swift.org> wrote:



(Dmitri Gribenko) #4

I think the root of the surprise here is that the compiler converts
[UInt8] into an unsafe pointer. This is appropriate when the callee
is a C API, but usually not appropriate when it is a Swift API. This
is not the first time when this implicit conversion causes surprise.
I think we should discuss scoping that conversion to only C and
Objective-C callees.

But I agree with you, we should have a similar operation that works on
arbitrary collections.

Dmitri
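
As a sketch of what such a collection-based operation might look like, the function below validates any sequence of UTF-8 code units with the standard library's `UTF8` codec. `validatedString(utf8:)` is a hypothetical name, not an existing or proposed API:

```swift
// Hypothetical helper, not a standard library API: decodes a sequence of
// UTF-8 code units and returns nil at the first invalid sequence.
// No terminator is needed because the sequence knows where it ends.
func validatedString<S: Sequence>(utf8 bytes: S) -> String? where S.Element == UInt8 {
    var decoder = UTF8()
    var iterator = bytes.makeIterator()
    var result = ""
    while true {
        switch decoder.decode(&iterator) {
        case .scalarValue(let scalar):
            result.unicodeScalars.append(scalar)
        case .emptyInput:
            return result   // reached the end without errors
        case .error:
            return nil      // invalid UTF-8 somewhere in the input
        }
    }
}

let goodBytes: [UInt8] = [104, 105]     // "hi"
let badBytes: [UInt8] = [0xC3, 0x28]    // lead byte followed by a non-continuation byte

let good = validatedString(utf8: goodBytes)
let bad = validatedString(utf8: badBytes)
```

Because the input is a Swift sequence rather than a pointer, no implicit array-to-pointer conversion is involved.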

···

On Wed, Apr 6, 2016 at 9:54 PM, Daniel Dunbar <daniel_dunbar@apple.com> wrote:

> Could we get a method that takes a [UInt8] directly and performs the same basic function?



(Daniel Dunbar) #5

>> Could we get a method that takes a [UInt8] directly and performs the same basic function?
>
> I think the root of the surprise here is that the compiler converts
> [UInt8] into an unsafe pointer. This is appropriate when the callee
> is a C API, but usually not appropriate when it is a Swift API. This
> is not the first time when this implicit conversion causes surprise.
> I think we should discuss scoping that conversion to only C and
> Objective-C callees.

+1 from me.

> But I agree with you, we should have a similar operation that works on
> arbitrary collections.

Cool, thanks.

- Daniel

···

On Apr 6, 2016, at 9:58 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:


(Drew Crawford) #6

Then it is inappropriately named. The name of the constructor is `validatingUTF8`, not `cString`.

···

On Apr 6, 2016, at 11:51 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:

> This operation converts a C string to a Swift string, so (2) is a non-starter.


(Dmitri Gribenko) #7

Maybe! If you can think of a better name, please start a thread on
swift-evolution, but please keep the implicit conversion issue
separate.

Dmitri

···

On Wed, Apr 6, 2016 at 10:03 PM, Drew Crawford <drew@sealedabstract.com> wrote:
