Swift currently maps the Int8 type onto the char type of the target platform. On targets where char is unsigned by default, Int8 ends up standing in for an unsigned 8-bit integer, which is a clear violation of the Principle of Least Astonishment. Furthermore, it becomes impossible to specify a genuinely signed 8-bit integer type on platforms with unsigned chars.
I'm probably misunderstanding you, but are you sure that's what is
happening? I can't imagine how the standard library would just
silently make Int8 unsigned on Linux ARM.
I think the best way to demonstrate this is through an example. Here is a sample Swift program:
import Foundation
print(NSNumber(char: Int8.min).shortValue)
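To spell out what this snippet probes (a sketch, not the Foundation implementation): Int8.min is the byte 0x80, and whether it comes back as -128 or 128 depends on whether that byte is treated as signed or unsigned somewhere along the way.
let byte = UInt8(bitPattern: Int8.min)  // 0x80
print(Int16(Int8(bitPattern: byte)))    // -128, the byte read as a signed char
print(Int16(byte))                      // 128, the byte read as an unsigned char; this matches the ARM output shown further down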
There is a lot happening in this snippet of code (including importing
two completely different implementations of Foundation, and the pure
Swift one not being affected by the Clang importer at all). Could you
provide AST dumps for both platforms for this code?
Of course. Here’s the AST on ARM:
wdillon@tegra-ubuntu:~$ swiftc -dump-ast example.swift
(source_file
...
And Darwin:
Falcon:~ wdillon$ xcrun -sdk macosx swiftc -dump-ast example.swift
(source_file
...
I want to point out that these are identical, as far as I can tell.
I agree. Then, the difference in behavior should be contained in the
NSNumber implementation. As far as this piece of code is concerned,
it correctly passes the value as Int8. Could you debug what's
happening in the corelibs Foundation, to find out why it is not
printing a negative number?
I want to be clear that this isn’t a problem specific to NSNumber. I chose that example because I wanted something that was trivial to check on your own, limited to Swift project code, and that demonstrates the issue. This behavior will occur in any case where a char is imported into Swift from C. Fixing NSNumber will address the issue in only that one place. Even if all of stdlib and CoreFoundation were modified to hide this problem, any user code that interfaces with C will still have issues and require fixes of its own.
I don’t think it’s reasonable to expect that the issue be known and addressed in literally thousands of places where chars from C APIs are present, especially as the issue is hidden from view by the nature of mapping char into Int8. An implementor would have to know that a given API returns char, that it’ll be imported as Int8, and that it might be an Int8 that was intended to be unsigned, then do the right thing.
In contrast, if C char is imported as CChar, it’s very clear what’s happening, and it leads the user toward a course of action that is more likely to be appropriate.
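For illustration only, a platform-sensitive declaration along those lines might read something like the sketch below; the UNSIGNED_CHAR_TARGET condition is invented for this sketch and is not something the standard library defines:
#if UNSIGNED_CHAR_TARGET        // hypothetical condition, purely illustrative
public typealias CChar = UInt8  // char is unsigned on this target
#else
public typealias CChar = Int8   // char is signed on this target
#endif
With something like that in place, any code that receives a CChar from C and wants a definite signedness has to convert explicitly, which keeps the question visible at the call site.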
I’ve created a GitHub project that demonstrates this problem without using Foundation or CoreFoundation at all. The code creates a small C-based object with three functions that return a char: one returns -1, one returns 1, and the last returns 255.
On signed-char platforms:
From Swift: Type: Int8
From Swift: Negative value: -1, positive value: 1, big positive value: -1
From clang: Negative value: -1, positive value: 1, big positive value: -1
On unsigned-char platforms:
From Swift: Type: Int8
From Swift: Negative value: -1, positive value: 1, big positive value: -1
From clang: Negative value: 255, positive value: 1, big positive value: 255
Code: https://github.com/hpux735/badCharExample.git
It’s clear that Swift is interpreting the bit pattern of the returned value as a signed 8-bit integer regardless of how char is defined on the target platform.
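As a sketch of that reinterpretation (illustrative only, not the code from the repository): the C function returns 255, i.e. the byte 0xFF, and once that byte is surfaced through Int8 the only value it can read back as is -1.
let returnedByC: UInt8 = 255                      // what the unsigned-char platform's C code produces
let seenFromSwift = Int8(bitPattern: returnedByC) // the same byte, viewed through Int8
print(seenFromSwift)                              // prints -1, matching the "From Swift" lines above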
As another exercise, you can tell Clang to use signed or unsigned chars, and the output does not change:
wdillon@tegra-ubuntu:~$ swiftc example.swift -Xcc -funsigned-char
wdillon@tegra-ubuntu:~$ ./example
128
wdillon@tegra-ubuntu:~$ swiftc example.swift -Xcc -fsigned-char
wdillon@tegra-ubuntu:~$ ./example
128
And it makes sense, since the program you provided does not compile
any C code. It is pure Swift (though it calls into C via corelibs
Foundation).
Yep, that’s right.
What about a proposal where we would always map 'char' to Int8,
regardless of C's idea of signedness?
In a very real sense this is exactly what is happening currently.
Sorry, I don't see that yet -- it is still unclear to me what is happening.
That’s ok. We’ll keep working on it until I’ve proven to everyone’s satisfaction that there really is a problem.
Given what you showed with corelibs Foundation, I agree there's a
problem. I'm just trying to understand how much of that behavior was
intended, if there are any bugs in the compiler (in implementing our
intended behavior), if there are any bugs in Foundation, and what
would the behavior be if we fixed those bugs. When we have that, we
can analyze our model (as if it were implemented as intended) and make
a judgement about whether it works, and whether it is a good one.
I believe that, based on the comments in CTypes.swift
/// This will be the same as either `CSignedChar` (in the common
/// case) or `CUnsignedChar`, depending on the platform.
public typealias CChar = Int8
that the dual meaning of Int8 is expected and intended; otherwise the authors of this comment and code (Ted and Jordan, respectively) didn’t understand the intended behavior, and I find that hard to believe.
For example, if it turns out that the issue above is due to a bug in
the C parts of CoreFoundation that assumes signed char on arm (because
of iOS, say), then there's nothing that a language change in Swift
could do.
Hopefully I’ve been able to demonstrate that CoreFoundation is not a party to this issue, per se. Really, any time char gets imported into Swift there is the possibility of unintended (and potentially very frustrating to diagnose) behavior.
Cheers,
- Will
···
On Feb 26, 2016, at 9:09 AM, Dmitri Gribenko <gribozavr@gmail.com> wrote:
On Fri, Feb 26, 2016 at 9:01 AM, William Dillon <william@housedillon.com> wrote:
On Feb 25, 2016, at 11:13 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:
On Thu, Feb 25, 2016 at 9:58 PM, William Dillon <william@housedillon.com> wrote: