[Draft] Target-specific CChar


(William Dillon) #1

Please see the gist (https://gist.github.com/hpux735/eafad78108ed42879690) for the most up-to-date drafts.

I appreciate any comments, concerns and questions!

Improve the portability of Swift with differently signed char.

Proposal: SE-004x
Author: William Dillon
Status: Draft
Review manager: TBD
Introduction

In C, the signedness of char is implementation-defined. A convention is set either by the platform, as on Windows, or by the architecture's ABI specification, as is typical on System V-derived systems. A subset of known platforms germane to this discussion, and their char signedness, is provided below.

| char      | ARM          | mips         | PPC          | PPC64        | i386       | x86_64     |
|-----------|--------------|--------------|--------------|--------------|------------|------------|
| Linux/ELF | unsigned [1] | unsigned [2] | unsigned [3] | unsigned [4] | signed [5] | signed [6] |
| Mach-O    | signed [7]   | N/A          | signed [7]   | signed [7]   | signed [7] | signed [7] |
| Windows   | signed [8]   | signed [8]   | signed [8]   | signed [8]   | signed [8] | signed [8] |
This is not a great problem in C, and indeed many C programmers aren't even aware of the issue. Part of the reason is that C silently converts between many similar types as necessary. Notably, even with -Wall, clang produces no warnings when converting between any pair of char, unsigned char, signed char, and int. Swift, in contrast, does not convert types without explicit direction from the programmer. As implemented, char is imported by Swift as Int8, regardless of whether the underlying platform's char is signed or unsigned. As every Apple platform (seemingly) uses signed char by convention, this was an appropriate choice. However, now that Swift is being ported to more and more platforms, it is important that we decide how to handle the alternate case.

The problem at hand may be most simply demonstrated by a small example. Consider a C API where a set of functions return values as char:

```c
char charNegFunction(void)    { return -1; }
char charBigPosFunction(void) { return 255; }
char charPosFunction(void)    { return 1; }
```

Then, if the API is used in C thusly:

```c
char negValue = charNegFunction();
char posValue = charPosFunction();
char bigValue = charBigPosFunction();
printf("From clang: Negative value: %d, positive value: %d, big positive value: %d\n",
       negValue, posValue, bigValue);
```

you get exactly what you would expect on signed char platforms:

    From clang: Negative value: -1, positive value: 1, big positive value: -1

and on unsigned char platforms:

    From clang: Negative value: 255, positive value: 1, big positive value: 255

In its current state, Swift behaves like C on signed char platforms:

    From Swift: Negative value: -1, positive value: 1, big positive value: -1

This code is available here (https://github.com/hpux735/badCharExample), if you would like to play with it yourself.
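The divergence above comes down to how the byte 0xFF is interpreted once it crosses into Swift. A minimal sketch (no C involved; the literal below stands in for the value charBigPosFunction() produces) of the reinterpretation Swift's importer effectively performs today on every platform:

```swift
// The bits a C function returning char 255 produced.
let rawBits: UInt8 = 255

// Swift presents every char as Int8 today, which amounts to a
// bit-pattern reinterpretation: 0xFF becomes -1.
let imported = Int8(bitPattern: rawBits)
print(imported) // -1, even on platforms where C code would have seen 255
```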

Motivation

The third stated focus area for Swift 3.0 is portability, to quote the evolution document:

Portability: Make Swift available on other platforms and ensure that one can write portable Swift code that works properly on all of those platforms.
As it stands, Swift’s indifference to the signedness of char while importing from C can be ignored in many cases. The consequences of inaction, however, leave the door open for extremely subtle and difficult-to-diagnose bugs any time a C API relies on char values greater than 127 on platforms with unsigned char; in this case the current import model certainly violates the Principle of Least Astonishment.

This is not an abstract problem that I want to have solved “just because.” This issue has been a recurrent theme, and has come up several times during code review. I’ve included a sampling of these to provide some context to the discussion:

Swift PR-1103
Swift Foundation PR-265
In these discussions we clearly struggle to adequately solve the issues at hand without the changes proposed here. Indeed, this proposal was suggested in Swift Foundation PR-265 by Joe Groff.

These changes should happen during a major release. Considering them for Swift 3 will let us move forward efficiently while confining any source incompatibilities to a transition where users expect them. Code that already works properly on each of these platforms is likely to continue working. Further, implementing this proposal will identify cases where a problem already exists but its symptoms have not yet been noticed.

Proposed solution

I propose that CChar be aliased to UInt8 on targets where char is unsigned, and to Int8 on targets where char is signed.

Detailed design

In principle this is a very small change to swift/stdlib/public/core/CTypes.swift:

```diff
 ///
 /// This will be the same as either `CSignedChar` (in the common
 /// case) or `CUnsignedChar`, depending on the platform.
+#if os(OSX) || os(iOS) || os(Windows) || arch(i386) || arch(x86_64)
 public typealias CChar = Int8
+#else
+public typealias CChar = UInt8
+#endif
```
Impact on existing code

Though the change itself is trivial, the impact on other parts of the project, including the standard library and Foundation, cannot be ignored. To get a handle on the scope of the required changes, I’ve performed this change on the Swift project, and I encourage any interested party to investigate: https://github.com/apple/swift/compare/master...hpux735:char This fork builds on both signed- and unsigned-char platforms. There is one test failure on signed char platforms and two test failures on unsigned char platforms, resulting from remaining assumptions about the signedness of char. They should be trivial to address by someone skilled with lit tests, and will be fixed prior to any pull request.

In general, code that you write will fail to compile if you assume that C APIs will continue to import char as Int8. Your choices are to interface with char using CChar, or to break the code out into separate cases. Other than one test, which deliberately breaks the char assumption for the purpose of generating an error, I have not seen a case that justifies conditional compilation directives over CChar. There are cases where it is necessary to cast from CChar to a concretely-signed type such as UInt8 or Int8, but those casts encourage you to consider your assumptions about the structure of the data you’re working with. Very often, code written in terms of CChar is portable and compiles cleanly on all platforms.
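To illustrate, here is a hypothetical helper written against CChar that would compile under either aliasing, because it relies only on behavior Int8 and UInt8 share (comparison with the zero terminator); the function name is mine, not from the proposal:

```swift
// Counts bytes up to the NUL terminator; nothing here assumes a signedness.
func cStringLength(_ p: UnsafePointer<CChar>) -> Int {
    var n = 0
    while p[n] != 0 { n += 1 }
    return n
}

// withCString exposes a String's UTF-8 contents as an UnsafePointer<CChar>.
let length = "hello".withCString { cStringLength($0) }
print(length) // 5
```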

Alternatives considered

The only real alternative is the status quo. Currently, Swift treats every char of unspecified signedness as signed. This works most of the time, but I think we can do better.

Footnotes

[7]: Proof by construction (is it signed by convention?):

```
$ cat test.c
char _char(char a) { return a; }
signed char _schar(signed char a) { return a; }
unsigned char _uchar(unsigned char a) { return a; }

$ clang -S -emit-llvm -target <arch>-unknown-{windows,darwin} test.c
```

and look for "signext" or "zeroext" in the @_char definition.

[8]: Windows char is signed by convention.


(Dmitri Gribenko) #2

It does violate the principle of least astonishment, but we should
acknowledge that the implementation-specific nature of C's char signedness
is making code *less* portable, not more -- because the same code can mean
different things on different platforms. Reflecting the same in Swift
makes Swift code less portable, too.

Dmitri

···

On Wed, Mar 2, 2016 at 9:56 AM, William Dillon via swift-evolution < swift-evolution@swift.org> wrote:


--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


(William Dillon) #3

It does violate the principle of least astonishment, but we should acknowledge that the implementation-specific nature of C's char signedness is making code *less* portable, not more -- because the same code can mean different things on different platforms. Reflecting the same in Swift makes Swift code less portable, too.

Dmitri
That is a fair point, and I agree for the most part. However, it is my intent and expectation that the use of CChar would be limited to the margins where C APIs are imported. Once values become part of Swift (and are used outside of the C interface) they should be cast into a pure Swift type (such as UInt8, Int8, Int, etc.).

- Will


(Jeremy Pereira) #4

I propose that the CChar be aliased to UInt8 on targets where char is unsigned, and Int8 on platforms where char is signed.

This will make Swift code _less_ portable, since you will have to write different Swift code depending on whether your C implementation of char is signed or unsigned.

Alternatives considered

The only real alternative is the status quo.

No it isn’t. As long as CChar remains an alias to one of the integer types, it would be better to make it an alias to UInt8. The reason is that char * is often a pointer to a sequence of UTF-8 bytes, and this interpretation is only going to become more common. String.UTF8View consists of a sequence of CodeUnits, and UTF8.CodeUnit is itself an alias of UInt8.

As others have said, it might be even better to have a new opaque type that’s 8 bits wide (I assume we are ignoring the fact that the C standard doesn’t define the width of a byte) but, in that case, I would argue that bitwise operations should be allowed on it.
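A sketch of the opaque type described here, under my own assumption that it wraps a UInt8 internally; the name CByte and its members are hypothetical:

```swift
// Hypothetical opaque byte type: no arithmetic, only bitwise operations
// and explicit, bit-preserving conversions to the concretely-signed types.
struct CByte {
    private let bits: UInt8
    init(bits: UInt8) { self.bits = bits }

    var asUInt8: UInt8 { return bits }
    var asInt8: Int8 { return Int8(bitPattern: bits) }

    // Bitwise operators are well-defined regardless of signedness.
    static func & (lhs: CByte, rhs: CByte) -> CByte { return CByte(bits: lhs.bits & rhs.bits) }
    static func | (lhs: CByte, rhs: CByte) -> CByte { return CByte(bits: lhs.bits | rhs.bits) }
    static func ^ (lhs: CByte, rhs: CByte) -> CByte { return CByte(bits: lhs.bits ^ rhs.bits) }
}

let masked = CByte(bits: 0b1100_1010) & CByte(bits: 0b0000_1111)
print(masked.asUInt8) // 10
```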

···

On 2 Mar 2016, at 18:56, William Dillon via swift-evolution <swift-evolution@swift.org> wrote:

Currently, Swift treats all unspecified chars as signed. This mostly works most of the time, but I think we can do better.


(Ben Rimmington) #5

Apart from `CChar`, are the other typealiases in "CTypes.swift" also not portable?

For example, `CLong` and `CUnsignedLong` with LLP64 platforms (e.g. Microsoft Windows)?

Should the Clang Importer use the same typealiases in generated interfaces, to encourage portability?

- public func labs(_: Int) -> Int
+ public func labs(_: CLong) -> CLong

-- Ben


(Ben Rimmington) #6

Related bugs:

[SR-466] Imported APIs taking chars should use CChar
<https://bugs.swift.org/browse/SR-466>

[SR-747] CChar("\n") returns nil
<https://bugs.swift.org/browse/SR-747>

-- Ben


(Michel Fortin) #7

As it stands, Swift’s indifference to the signedness of char while importing from C can be ignored in many cases. The consequences of inaction, however, leave the door open for extremely subtle and difficult-to-diagnose bugs any time a C API relies on char values greater than 127 on platforms with unsigned char; in this case the current import model certainly violates the Principle of Least Astonishment.

This is not an abstract problem that I want to have solved “just because.” This issue has been a recurrent theme, and has come up several times during code review. I’ve included a sampling of these to provide some context to the discussion:

  • Swift PR-1103
  • Swift Foundation PR-265
In these discussions we clearly struggle to adequately solve the issues at hand without the changes proposed here. Indeed, this proposal was suggested in Swift Foundation PR-265 by Joe Groff.

I don't want to downplay the issue, but I also wouldn't want the remedy to be worse than the problem it tries to solve.

One remedy is to have char map to Int8 or UInt8 depending on the platform, but this introduces C portability issues in Swift itself so it's not very good.

Another remedy is to have an "opaque" CChar type that you have to cast to Int8 or UInt8 manually. While this removes the portability issue, it introduces friction in Swift whenever you need to use a C API that uses char. It also introduces a risk that the manual conversion is done wrong.

Given the distribution of platforms using an unsigned char by default, I do wonder if there are libraries out there that actually depend on that. It seems to me that any C code depending on an unsigned char by default is already at risk of silently producing wrong results just by moving to a different CPU. In most circumstances, that'd be considered a bug in the C code.

So, my question is: are we contemplating complicating the language and introducing friction for everyone only for a theoretical interoperability problem that would never happen in practice? I would suggest that perhaps the best remedy would be to just translate char to Int8 all the time, including on those platform/architecture combos where it goes against the default C behavior. Just document somewhere that it is so, perhaps offer a flag so you can reverse the importer's behavior in some circumstances, and be done with it.

···

--
Michel Fortin
https://michelf.ca


(Dmitri Gribenko) #8

True, but how can you cast a CChar portably into UInt8 or Int8? Only
via the bitPattern initializer, because the regular initializer will
trap on values outside the 0..<128 range on either signed or unsigned
platforms.

Dmitri

···

On Wed, Mar 2, 2016 at 11:03 AM, William Dillon <william@housedillon.com> wrote:



(William Dillon) #9

True, but how can you cast a CChar portably into UInt8 or Int8? Only
via the bitPattern initializer, because the regular initializer will
trap on values outside of the 0..<128 range on signed platforms or
unsigned platforms.

Dmitri

Yes, that’s true. And I think that really gets to the crux of the issue. Right now, Swift is doing that (bitPattern) for you, and you don’t have a say in the matter. However, when you transition from CChar to either Int8 or UInt8, you probably have an awareness of the implications of your actions. You should know whether the important quality is the bit pattern or the numeric value. If it’s the latter, a trap on overflow might be an excellent diagnostic of failed assumptions; if it’s the former, use bitPattern.

- Will
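The distinction drawn here can be made concrete. A sketch, assuming a signed-char platform where CChar is Int8:

```swift
let c: Int8 = -1                   // the bits 0xFF, as imported from C today

// Bit pattern matters: preserve the bits, get 255.
let asBits = UInt8(bitPattern: c)
print(asBits) // 255

// Numeric value matters: UInt8(c) would instead check the value and trap
// here, which is exactly the "diagnostic of failed assumptions" described.
```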

···

On March 2, 2016 at 11:06:55 AM, Dmitri Gribenko (gribozavr@gmail.com) wrote:


(William Dillon) #10

I propose that the CChar be aliased to UInt8 on targets where char is unsigned, and Int8 on platforms where char is signed.

This will make Swift code _less_ portable, since you will have to write different Swift code depending on whether your C implementation of char is signed or unsigned.

I’m not sure I agree in general, but I can see why you think that way.

Alternatives considered

The only real alternative is the status quo.

No it isn’t. As long as CChar remains an alias to one of the integer types, it would be better to make it an alias to UInt8. The reason being that char * is often a pointer to a sequence of UTF-8 bytes and this interpretation is only going to get more common. String.UTF8View consists of a sequence of CodeUnits and UTF8.CodeUnit is, itself an alias to UInt8.

As others have said, it might be even better to have a new opaque type that’s 8 bits wide (I assume we are ignoring the fact that the C standard doesn’t define the width of a byte) but, in that case, I would argue that bitwise operations should be allowed on it.

I had a similar thought about bitwise operators. Please see the updated gist:

https://gist.github.com/hpux735/eafad78108ed42879690

- Will


(William Dillon) #11

Thanks so much, Ben!

I’ve added them to the gist: https://gist.github.com/hpux735/eafad78108ed42879690

- Will

···

On Mar 3, 2016, at 6:38 PM, Ben Rimmington <me@benrimmington.com> wrote:



(William Dillon) #12

Hi All,

I’ve been exploring the idea of importing C chars as RawByte (and type aliasing RawByte to CChar) in my prototype branch of Swift, and I would appreciate some insight from the hive mind. Specifically, I’m interested in opinions about where to “draw the line” regarding C strings.

Currently (in the master branch) it’s easy to convert between NSString, String, and C strings (often represented somewhat inconsistently as [CChar], [(U)Int8], or UnsafePointer<CChar>). If we are interested in importing C char as RawByte, we have to decide where C strings become UTF-8 (i.e., UInt8). I feel most comfortable drawing this line as close to C as possible, but I could see a reason for leaving it as CChar until it becomes proper UTF-8.

If every reference to a C string (for example, public init?(CString: UnsafePointer<CChar>…) is done with CChar, there can be a clean break, and an obvious place where Swift’s string handling begins and C string handling ends. It may discourage people from trying to do string processing outside the context of Unicode (kinda). It would still be possible to work with the characters of a C string, however, as CChar will have a small number of operators defined over it, and you could always cast it to UInt8 or Int8.
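The "clean break" might look like this in practice: C-side handling stays in terms of UnsafePointer<CChar> right up to the boundary, and a single cString initializer is where Swift's Unicode-aware string handling begins. (String(cString:) below stands in for whatever the initializer is ultimately named.)

```swift
// Everything before the conversion is CChar territory; the String
// initializer is the one sanctioned crossing point into Swift.
let swiftString = "café".withCString { (p: UnsafePointer<CChar>) -> String in
    return String(cString: p)
}
print(swiftString) // café
```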

I hope that this email opens up the discussion, and we can come up with a good idea that works for most people. I’m kinda spinning because I don’t have a firm idea of the trade-offs for each solution. Any consensus here would certainly help me finish the prototype.

Thanks!
- Will


(William Dillon) #13

As it stands, Swift’s indifference to the signedness of char while importing from C can be ignored in many cases. The consequences of inaction, however, leave the door open for extremely subtle and difficult-to-diagnose bugs any time a C API relies on char values greater than 127 on platforms with unsigned char; in this case the current import model certainly violates the Principle of Least Astonishment.

This is not an abstract problem that I want to have solved “just because.” This issue has been a recurrent theme, and has come up several times during code review. I’ve included a sampling of these to provide some context to the discussion:

  • Swift PR-1103
  • Swift Foundation PR-265
In these discussions we clearly struggle to adequately solve the issues at hand without the changes proposed here. Indeed, this proposal was suggested in Swift Foundation PR-265 by Joe Groff.

I don't want to downplay the issue, but I also wouldn't want the remedy to be worse than the problem it tries to solve.

Believe me when I say that any solution that is worse than the current state would be unacceptable to me.

One remedy is to have char map to Int8 or UInt8 depending on the platform, but this introduces C portability issues in Swift itself so it's not very good.

Another remedy is to have an "opaque" CChar type that you have to cast to Int8 or UInt8 manually. While this removes the portability issue, it introduces friction in Swift whenever you need to use a C API that uses char. It also introduces a risk that the manual conversion is done wrong.

I’ve updated the Gist; the second remedy that you mentioned is the current preferred solution. It’s true that the manual conversion could be done wrong, but please remember that the conversion is happening currently, it’s just done for you by Swift. The possibility of inappropriate conversion is still very real, but the user has no control over it.

Given the distribution of platforms using an unsigned char by default, I do wonder if there are libraries out there that actually depend on that. It seems to me that any C code depending on an unsigned char by default is already at risk of silently producing wrong results just by moving to a different CPU. In most circumstances, that'd be considered a bug in the C code.

That is not an unreasonable position.

So, my question is: are we contemplating complicating the language and introducing friction for everyone only for a theoretical interoperability problem that would never happen in practice? I would suggest that perhaps the best remedy for this would be to just translate char to Int8 all the time, including on those platforms-architecture combos where it goes against the default C behavior. Just document somewhere that it is so, perhaps offer a flag so you can reverse the importer's behavior in some circumstances, and be done with it.

Again, my intention is to not introduce undue friction. I’ve ported the standard library (and some of Foundation) for each of the candidate solutions, and the changes are extremely minor. In fact, in the vast majority of cases, it is enough to simply select the correct type when designing the method signatures and variables.

Thanks for taking the time to share your thoughts.
- Will

···

On Mar 4, 2016, at 5:52 PM, Michel Fortin <michel.fortin@michelf.ca> wrote:


(Joe Groff) #14

Could we treat CChar as a "signless" byte type, so that UInt8(cchar) and Int8(cchar) both just reinterpret the bit pattern?

-Joe
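A sketch of what Joe describes, with a hypothetical OpaqueCChar standing in for the signless type; both unlabelled initializers reinterpret the bits rather than checking the numeric value:

```swift
// Hypothetical signless byte: stores bits, carries no numeric interpretation.
struct OpaqueCChar { let bits: UInt8 }

extension UInt8 {
    // Reinterprets the bit pattern; never traps.
    init(_ c: OpaqueCChar) { self = c.bits }
}
extension Int8 {
    // Also reinterprets; 0xFF comes out as -1.
    init(_ c: OpaqueCChar) { self = Int8(bitPattern: c.bits) }
}

let c = OpaqueCChar(bits: 0xFF)
print(UInt8(c), Int8(c)) // 255 -1
```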

···

On Mar 2, 2016, at 11:06 AM, Dmitri Gribenko via swift-evolution <swift-evolution@swift.org> wrote:

On Wed, Mar 2, 2016 at 11:03 AM, William Dillon <william@housedillon.com> wrote:

It does violate the principle of least astonishment, but we should
acknowledge that the implementation-specific nature of C's char signedness
is making code *less* portable, not more -- because the same code can mean
different things on different platforms. Reflecting the same in Swift makes
Swift code less portable, too.

Dmitri

That is a fair point, and I agree for the most part. However, It is my
intent and expectation that the use of CChar would be limited to the margins
where C APIs are imported. Once values become a part of Swift (and used in
places outside of the C interface) they should have been cast into a pure
Swift type (such as UInt8, Int8, Int, etc).

True, but how can you cast a CChar portably into UInt8 or Int8? Only
via the bitPattern initializer, because the regular initializer will
trap on values outside of the 0..<128 range on signed platforms or
unsigned platforms.


(Dmitri Gribenko) #15

That is viable, but it opens a whole other can of worms:

- we would need CChar to be a separate type, and

- other unlabelled integer initializers trap when changing the numeric
value, so this one would be wildly inconsistent.

Dmitri

···

On Wed, Mar 2, 2016 at 11:10 AM, Joe Groff <jgroff@apple.com> wrote:



(Dmitri Gribenko) #16

... but if we don't provide any arithmetic operations on the CChar
type, and treat it as an opaque byte-sized character type that you can
only convert to UInt8/Int8, it might not be bad!

Dmitri

···

On Wed, Mar 2, 2016 at 11:13 AM, Dmitri Gribenko <gribozavr@gmail.com> wrote:



(William Dillon) #17

... but if we don't provide any arithmetic operations on the CChar
type, and treat it as an opaque byte-sized character type that you can
only convert to UInt8/Int8, it might not be bad!

Dmitri

I, too, like this solution. It would be moderately more work to transition to, but I think it’s better overall.

- Will

···

On March 2, 2016 at 11:14:26 AM, Dmitri Gribenko (gribozavr@gmail.com) wrote:

On Wed, Mar 2, 2016 at 11:13 AM, Dmitri Gribenko <gribozavr@gmail.com> wrote:

On Wed, Mar 2, 2016 at 11:10 AM, Joe Groff <jgroff@apple.com> wrote:

On Mar 2, 2016, at 11:06 AM, Dmitri Gribenko via swift-evolution <swift-evolution@swift.org> wrote:

On Wed, Mar 2, 2016 at 11:03 AM, William Dillon <william@housedillon.com> wrote:

It does violate the principle of least astonishment, but we should
acknowledge that the implementation-specific nature of C's char signedness
is making code *less* portable, not more -- because the same code can mean
different things on different platforms. Reflecting the same in Swift makes
Swift code less portable, too.

Dmitri

That is a fair point, and I agree for the most part. However, It is my
intent and expectation that the use of CChar would be limited to the margins
where C APIs are imported. Once values become a part of Swift (and used in
places outside of the C interface) they should have been cast into a pure
Swift type (such as UInt8, Int8, Int, etc).

True, but how can you cast a CChar portably into UInt8 or Int8? Only
via the bitPattern initializer, because the regular initializer will
trap on values outside the 0..<128 range (negative values on signed
platforms, values above 127 on unsigned ones).



(Joe Groff) #18

Yeah, it would make sense to me if CChar were opaque and you had to construct an Int8 or UInt8 from it to specify the arithmetic semantics you want.

-Joe
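To make the trapping concern concrete: with the existing integer APIs, only the `bitPattern` initializers reinterpret a byte without risking a runtime trap. A small illustration (not proposed API):

```swift
// The plain unlabelled initializers trap on values they can't represent;
// the bitPattern initializers reinterpret the byte instead.
let fromSigned: Int8 = -1                 // what `char` yields on a signed platform
let u = UInt8(bitPattern: fromSigned)     // 255, no trap
// UInt8(fromSigned) would trap: -1 is not representable in UInt8.

let fromUnsigned: UInt8 = 255             // what `char` yields on an unsigned platform
let s = Int8(bitPattern: fromUnsigned)    // -1, no trap
// Int8(fromUnsigned) would trap: 255 is not representable in Int8.

print(u, s)                               // 255 -1
```

This is why an opaque CChar whose conversions are defined to be bit-preserving sidesteps the portability trap entirely.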



(Brent Royal-Gordon) #19


Yeah, it would make sense to me if CChar were opaque and you had to construct an Int8 or UInt8 from it to specify the arithmetic semantics you want.

I know we're planning to remove `RawByte`, but might it make sense to give it a stay of execution, import `char` as `RawByte`, and add appropriate initializers to convert between it and `Int8`/`UInt8`?

···

--
Brent Royal-Gordon
Architechies


(William Dillon) #20

That’s not a bad idea. My only concern there is that it’s less obvious where RawByte came from (CChar is pretty self-explanatory). But I’m far from against it.
