[Pitch] Make `errno`-setting functions more usable from Swift


(Johannes Weiss) #1

Hey swift-evolution,

First of all apologies, this is not a full proposal yet, it's meant to kick off a discussion on how to resolve the issue.

# Make `errno`-setting functions more usable from Swift

## Introduction

This is a pitch to make [`errno`][1]-setting functions properly usable, as in having a guarantee to get the correct `errno` value on failure of a [system call][2]. Currently, functions which set `errno` are just exported in the Darwin/Glibc modules with (as far as I understand) no guaranteed correct way of handling errors as the correct `errno` value can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there relies on behaviour that isn't guaranteed.

## Motivation

In many Swift libraries that use the Darwin/Glibc modules there is code similar to:

/* import Darwin/Glibc */

let rv = some_system_call(some, parameters)
if rv < 0 {
   throw SomeError(errorCode: errno) /* <-- errno use */
}

That looks very innocent but please note that `errno` is used here. And `errno` is an interesting one as it's a thread-local variable which is written to by many functions. A thread-local variable is like a global variable except that setting it in one thread does not affect its value in any other thread. Pretty much all system calls and many library functions set `errno` if something went wrong.

The problem is that as far as I see (and Swift developers have confirmed), there is no guarantee that in between the call of `some_system_call` and the reading of `errno`, `errno` hasn't been overwritten by some other system call that has been call on the same thread.

To illustrate this further, let's consider this example

/* import Darwin/Glibc */
public class SomeClass {
  public let someValue: Int = 1
  deinit {
      /* call some failing syscall, for example */
      write(-1, nil, 0) /* should set errno to EBADF */
  }
}

public func foo() {
   let x = SomeClass()
   let rv = write(x.someValue, nil, 0)
   let errnoSave = errno
   if rv != 0 {
      throw SomeError(errorCode: errnoSave)
   }
}

as you see in function `foo`, the instance `x` of `SomeClass` isn't needed anymore as soon as `write` has been called. So (as far as I understand) there's no guarantee that ARC doesn't turn the above code into

let x = SomeClass()
let rv = write(x.someValue, nil, 42) /* should set errno to EFAULT */
/* ARC generated */ x.release()
let errnoSave = errno /* wrong errno value :( */
if rv != 0 {
  throw SomeError(errorCode: errnoSave)
}

And the ARC generated `x.release()` will cause `x` to be deallocated which will call the failing `write` in the `deinit` of `SomeClass`. So `errnoSave` might be `EBADF` instead of `EFAULT` depending on where ARC put the `x.release()` call.

What `errno` value we read will depend on the optimisation settings and the Swift compiler version. That's IMHO a big issue as it might make the lowest layers unstable with hard-to-debug issues.

## Proposed solution

I don't have a full story on how to actually resolve the issue but I see a few options:

### Option 1: always return errno

clang importer could be changed to make all `errno`-setting functions return a tuple of the actual return value and the `errno` value.

For example, currently write(2) is imported as:

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> Int

which could be changed to

 public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> (Int, Int32 /* for errno */)

Correct code to use write would then look like this:

let (bytesWritten, writeErrno) = write(fd, buf, len)
if bytesWritten >= 0 {
   /* everything's fine */
} else {
   throw POSIXError(code: writeErrno)
}

### Option 2: make them throw

The second option is to teach clang importer to make the functions throwing. So write(2) would be imported as

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) throws /* POSIXError */ -> Int

That would make these functions quite easy to use and would feel natural in Swift:

do {
   let bytesWritten = write(fd, buf, len)
} catch let e as POSIXError {
   /* handle error */
} catch {
   ...
}

### Discussion

The beauty of option 1 is simplicity. Clang importer would not need to know what exact values a system call returns on failure. Also very little additional code needs to be emitted for calling a system call. That seems to be the [way Go is going][3].

The downside of option 1 is that the API doesn't feel like idiomatic Swift. The returned `errno` value is only useful if the system call failed and is arbitrary in the case when it worked. (There is no guarantee that `errno` is set to `0` when a system call succeeds.)
Also there is a slight overhead in reading `errno` which would be paid for every `errno`-setting function, even if successful. Hence, option 2 looks nice as it brings these functions more in like with other Swift functions. However, as mentioned before, clang importer would need to learn what values are returned on success/failure for _every_ `errno`-setting function (and there's _many_ of them).

## Proposed Approach

Let's discuss what is a good solution and then I will volunteer to put together a full proposal.

## Source compatibility

it depends.

To retain source compatibility the Darwin/Glibc modules could also be left as is. The safe `errno` handling could then be implemented only in a new, unified module for Darwin/Glibc. There's already ongoing discussions/proposals about that on the list anyway. That new module could then be implemented in the spirit of options 1, 2, or some other solution. The benefits are guaranteed source compatibility for legacy applications and `errno` safety plus easier imports for new applications - win/win :slightly_smiling_face:.

## Effect on ABI stability

Will most likely be additive, so probably none.

## Effect on API resilience

see source compatibility.

## Alternatives considered

Do nothing and workaround `errno` capturing being very hard. I discussed this previously elsewhere and Joe Groff came up with the code below which should convince the optimiser not to insert any release calls at the wrong place or inline the function:

@inline(never)
func callWithErrno(_ fn: () -> Int) -> (result: Int, errno: Int) {
   var result: Int
   var savedErrno: Int
   withExtendedLifetime(fn) {
       result = fn()
       savedErrno = errno
   }
   return (result, savedErrno)
}

An example use of that is

let (rv, writeErrno) = callWithErrno {
   write(-1, nil, 0)
}

if rv < 0 {
   throw SomeError(errorCode: writeErrno)
}

This makes it possible to retrieve the correct `errno` value in Swift but I think there remain too many ways to do it wrongly. First and foremost that the compiler doesn't complain if the programmer forgets to use `callWithErrno`.

···

===

Let me know what you think!

[1]: https://en.wikipedia.org/wiki/Errno.h
[2]: https://en.wikipedia.org/wiki/System_call
[3]: https://golang.org/pkg/syscall/#Write

Many thanks,
Johannes


(Zachary Waldowski) #2

I could see the "swift_error" Clang attribute being extended with an
"errno" case (in the same vein as "nonzero_result" and "zero_result"),
which is then reflected into Swift via the importer. Although, having an
attribute for methods set errno seems like it would be useful for the
Clang Analyzer overall.

Cheers!
Zach

···

On Wed, Nov 2, 2016, at 09:12 AM, Johannes Weiß via swift-evolution wrote:

Hey swift-evolution,

First of all apologies, this is not a full proposal yet, it's meant to
kick off a discussion on how to resolve the issue.

# Make `errno`-setting functions more usable from Swift

## Introduction

This is a pitch to make [`errno`][1]-setting functions properly usable,
as in having a guarantee to get the correct `errno` value on failure of a
[system call][2]. Currently, functions which set `errno` are just
exported in the Darwin/Glibc modules with (as far as I understand) no
guaranteed correct way of handling errors as the correct `errno` value
can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there
relies on behaviour that isn't guaranteed.

## Motivation

In many Swift libraries that use the Darwin/Glibc modules there is code
similar to:

/* import Darwin/Glibc */

let rv = some_system_call(some, parameters)
if rv < 0 {
   throw SomeError(errorCode: errno) /* <-- errno use */
}

That looks very innocent but please note that `errno` is used here. And
`errno` is an interesting one as it's a thread-local variable which is
written to by many functions. A thread-local variable is like a global
variable except that setting it in one thread does not affect its value
in any other thread. Pretty much all system calls and many library
functions set `errno` if something went wrong.

The problem is that as far as I see (and Swift developers have
confirmed), there is no guarantee that in between the call of
`some_system_call` and the reading of `errno`, `errno` hasn't been
overwritten by some other system call that has been call on the same
thread.

To illustrate this further, let's consider this example

/* import Darwin/Glibc */
public class SomeClass {
  public let someValue: Int = 1
  deinit {
      /* call some failing syscall, for example */
      write(-1, nil, 0) /* should set errno to EBADF */
  }
}

public func foo() {
   let x = SomeClass()
   let rv = write(x.someValue, nil, 0)
   let errnoSave = errno
   if rv != 0 {
      throw SomeError(errorCode: errnoSave)
   }
}

as you see in function `foo`, the instance `x` of `SomeClass` isn't
needed anymore as soon as `write` has been called. So (as far as I
understand) there's no guarantee that ARC doesn't turn the above code
into

let x = SomeClass()
let rv = write(x.someValue, nil, 42) /* should set errno to EFAULT */
/* ARC generated */ x.release()
let errnoSave = errno /* wrong errno value :( */
if rv != 0 {
  throw SomeError(errorCode: errnoSave)
}

And the ARC generated `x.release()` will cause `x` to be deallocated
which will call the failing `write` in the `deinit` of `SomeClass`. So
`errnoSave` might be `EBADF` instead of `EFAULT` depending on where ARC
put the `x.release()` call.

What `errno` value we read will depend on the optimisation settings and
the Swift compiler version. That's IMHO a big issue as it might make the
lowest layers unstable with hard-to-debug issues.

## Proposed solution

I don't have a full story on how to actually resolve the issue but I see
a few options:

### Option 1: always return errno

clang importer could be changed to make all `errno`-setting functions
return a tuple of the actual return value and the `errno` value.

For example, currently write(2) is imported as:

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte:
Int) -> Int

which could be changed to

 public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte:
 Int) -> (Int, Int32 /* for errno */)

Correct code to use write would then look like this:

let (bytesWritten, writeErrno) = write(fd, buf, len)
if bytesWritten >= 0 {
   /* everything's fine */
} else {
   throw POSIXError(code: writeErrno)
}

### Option 2: make them throw

The second option is to teach clang importer to make the functions
throwing. So write(2) would be imported as

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte:
Int) throws /* POSIXError */ -> Int

That would make these functions quite easy to use and would feel natural
in Swift:

do {
   let bytesWritten = write(fd, buf, len)
} catch let e as POSIXError {
   /* handle error */
} catch {
   ...
}

### Discussion

The beauty of option 1 is simplicity. Clang importer would not need to
know what exact values a system call returns on failure. Also very little
additional code needs to be emitted for calling a system call. That seems
to be the [way Go is going][3].

The downside of option 1 is that the API doesn't feel like idiomatic
Swift. The returned `errno` value is only useful if the system call
failed and is arbitrary in the case when it worked. (There is no
guarantee that `errno` is set to `0` when a system call succeeds.)
Also there is a slight overhead in reading `errno` which would be paid
for every `errno`-setting function, even if successful. Hence, option 2
looks nice as it brings these functions more in like with other Swift
functions. However, as mentioned before, clang importer would need to
learn what values are returned on success/failure for _every_
`errno`-setting function (and there's _many_ of them).

## Proposed Approach

Let's discuss what is a good solution and then I will volunteer to put
together a full proposal.

## Source compatibility

it depends.

To retain source compatibility the Darwin/Glibc modules could also be
left as is. The safe `errno` handling could then be implemented only in a
new, unified module for Darwin/Glibc. There's already ongoing
discussions/proposals about that on the list anyway. That new module
could then be implemented in the spirit of options 1, 2, or some other
solution. The benefits are guaranteed source compatibility for legacy
applications and `errno` safety plus easier imports for new applications
- win/win :slightly_smiling_face:.

## Effect on ABI stability

Will most likely be additive, so probably none.

## Effect on API resilience

see source compatibility.

## Alternatives considered

Do nothing and workaround `errno` capturing being very hard. I discussed
this previously elsewhere and Joe Groff came up with the code below which
should convince the optimiser not to insert any release calls at the
wrong place or inline the function:

@inline(never)
func callWithErrno(_ fn: () -> Int) -> (result: Int, errno: Int) {
   var result: Int
   var savedErrno: Int
   withExtendedLifetime(fn) {
       result = fn()
       savedErrno = errno
   }
   return (result, savedErrno)
}

An example use of that is

let (rv, writeErrno) = callWithErrno {
   write(-1, nil, 0)
}

if rv < 0 {
   throw SomeError(errorCode: writeErrno)
}

This makes it possible to retrieve the correct `errno` value in Swift but
I think there remain too many ways to do it wrongly. First and foremost
that the compiler doesn't complain if the programmer forgets to use
`callWithErrno`.

===

Let me know what you think!

[1]: https://en.wikipedia.org/wiki/Errno.h
[2]: https://en.wikipedia.org/wiki/System_call
[3]: https://golang.org/pkg/syscall/#Write

Many thanks,
Johannes
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(John McCall) #3

Hey swift-evolution,

First of all apologies, this is not a full proposal yet, it's meant to kick off a discussion on how to resolve the issue.

# Make `errno`-setting functions more usable from Swift

## Introduction

This is a pitch to make [`errno`][1]-setting functions properly usable, as in having a guarantee to get the correct `errno` value on failure of a [system call][2]. Currently, functions which set `errno` are just exported in the Darwin/Glibc modules with (as far as I understand) no guaranteed correct way of handling errors as the correct `errno` value can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there relies on behaviour that isn't guaranteed.

Is this a general enough problem to be worth complicating the importer with vs., say, just manually wrapping the POSIX APIs to be throwing in an overlay? POSIX uses a lot of different conventions for reporting that an error occurred, and there's only a finite set of POSIX APIs. Are there "user" functions that wrap POSIX and carefully report errors with errno?

John.

···

On Nov 2, 2016, at 6:12 AM, Johannes Weiß via swift-evolution <swift-evolution@swift.org> wrote:

## Motivation

In many Swift libraries that use the Darwin/Glibc modules there is code similar to:

/* import Darwin/Glibc */

let rv = some_system_call(some, parameters)
if rv < 0 {
  throw SomeError(errorCode: errno) /* <-- errno use */
}

That looks very innocent but please note that `errno` is used here. And `errno` is an interesting one as it's a thread-local variable which is written to by many functions. A thread-local variable is like a global variable except that setting it in one thread does not affect its value in any other thread. Pretty much all system calls and many library functions set `errno` if something went wrong.

The problem is that as far as I see (and Swift developers have confirmed), there is no guarantee that in between the call of `some_system_call` and the reading of `errno`, `errno` hasn't been overwritten by some other system call that has been call on the same thread.

To illustrate this further, let's consider this example

/* import Darwin/Glibc */
public class SomeClass {
 public let someValue: Int = 1
 deinit {
     /* call some failing syscall, for example */
     write(-1, nil, 0) /* should set errno to EBADF */
 }
}

public func foo() {
  let x = SomeClass()
  let rv = write(x.someValue, nil, 0)
  let errnoSave = errno
  if rv != 0 {
     throw SomeError(errorCode: errnoSave)
  }
}

as you see in function `foo`, the instance `x` of `SomeClass` isn't needed anymore as soon as `write` has been called. So (as far as I understand) there's no guarantee that ARC doesn't turn the above code into

let x = SomeClass()
let rv = write(x.someValue, nil, 42) /* should set errno to EFAULT */
/* ARC generated */ x.release()
let errnoSave = errno /* wrong errno value :( */
if rv != 0 {
 throw SomeError(errorCode: errnoSave)
}

And the ARC generated `x.release()` will cause `x` to be deallocated which will call the failing `write` in the `deinit` of `SomeClass`. So `errnoSave` might be `EBADF` instead of `EFAULT` depending on where ARC put the `x.release()` call.

What `errno` value we read will depend on the optimisation settings and the Swift compiler version. That's IMHO a big issue as it might make the lowest layers unstable with hard-to-debug issues.

## Proposed solution

I don't have a full story on how to actually resolve the issue but I see a few options:

### Option 1: always return errno

clang importer could be changed to make all `errno`-setting functions return a tuple of the actual return value and the `errno` value.

For example, currently write(2) is imported as:

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> Int

which could be changed to

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> (Int, Int32 /* for errno */)

Correct code to use write would then look like this:

let (bytesWritten, writeErrno) = write(fd, buf, len)
if bytesWritten >= 0 {
  /* everything's fine */
} else {
  throw POSIXError(code: writeErrno)
}

### Option 2: make them throw

The second option is to teach clang importer to make the functions throwing. So write(2) would be imported as

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) throws /* POSIXError */ -> Int

That would make these functions quite easy to use and would feel natural in Swift:

do {
  let bytesWritten = write(fd, buf, len)
} catch let e as POSIXError {
  /* handle error */
} catch {
  ...
}

### Discussion

The beauty of option 1 is simplicity. Clang importer would not need to know what exact values a system call returns on failure. Also very little additional code needs to be emitted for calling a system call. That seems to be the [way Go is going][3].

The downside of option 1 is that the API doesn't feel like idiomatic Swift. The returned `errno` value is only useful if the system call failed and is arbitrary in the case when it worked. (There is no guarantee that `errno` is set to `0` when a system call succeeds.)
Also there is a slight overhead in reading `errno` which would be paid for every `errno`-setting function, even if successful. Hence, option 2 looks nice as it brings these functions more in like with other Swift functions. However, as mentioned before, clang importer would need to learn what values are returned on success/failure for _every_ `errno`-setting function (and there's _many_ of them).

## Proposed Approach

Let's discuss what is a good solution and then I will volunteer to put together a full proposal.

## Source compatibility

it depends.

To retain source compatibility the Darwin/Glibc modules could also be left as is. The safe `errno` handling could then be implemented only in a new, unified module for Darwin/Glibc. There's already ongoing discussions/proposals about that on the list anyway. That new module could then be implemented in the spirit of options 1, 2, or some other solution. The benefits are guaranteed source compatibility for legacy applications and `errno` safety plus easier imports for new applications - win/win :slightly_smiling_face:.

## Effect on ABI stability

Will most likely be additive, so probably none.

## Effect on API resilience

see source compatibility.

## Alternatives considered

Do nothing and workaround `errno` capturing being very hard. I discussed this previously elsewhere and Joe Groff came up with the code below which should convince the optimiser not to insert any release calls at the wrong place or inline the function:

@inline(never)
func callWithErrno(_ fn: () -> Int) -> (result: Int, errno: Int) {
  var result: Int
  var savedErrno: Int
  withExtendedLifetime(fn) {
      result = fn()
      savedErrno = errno
  }
  return (result, savedErrno)
}

An example use of that is

let (rv, writeErrno) = callWithErrno {
  write(-1, nil, 0)
}

if rv < 0 {
  throw SomeError(errorCode: writeErrno)
}

This makes it possible to retrieve the correct `errno` value in Swift but I think there remain too many ways to do it wrongly. First and foremost that the compiler doesn't complain if the programmer forgets to use `callWithErrno`.

===

Let me know what you think!

[1]: https://en.wikipedia.org/wiki/Errno.h
[2]: https://en.wikipedia.org/wiki/System_call
[3]: https://golang.org/pkg/syscall/#Write

Many thanks,
Johannes
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Johannes Weiss) #4

Hi,

I just realised, that the problem is slightly worse than I originally described. I believed that successful calls in between the actual call the programmer wanted to make and capturing `errno` are not a problem.

But POSIX seems to suggest [4] that "The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified." . The Linux man page [5] also mentions that "a function that succeeds is allowed to change errno."

To me this means that the issue is wider than just ARC. I think the problem extends to memory allocations on the heap. Failed memory allocations aren't a problem because they are terminal in Swift. However, _successful_ memory allocations might be a problem because the malloc(3) that the Swift compiler will use is absolutely free to set errno to 0 (or any other value in fact) indicating success. (Said that at least malloc doesn't change `errno` on the macOS or Linux I tested today, we probably shouldn't rely on that though.)

This makes it even more unpredictable to the programmer what a use of `errno` in Swift will return. IMHO it shouldn't be exported to Swift as its value is undefined almost(?) everywhere.

[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/errno.html
[5]: http://man7.org/linux/man-pages/man3/errno.3.html

All the best,
  Johannes

···

On 2 Nov 2016, at 1:12 pm, Johannes Weiß via swift-evolution <swift-evolution@swift.org> wrote:

Hey swift-evolution,

First of all apologies, this is not a full proposal yet, it's meant to kick off a discussion on how to resolve the issue.

# Make `errno`-setting functions more usable from Swift

## Introduction

This is a pitch to make [`errno`][1]-setting functions properly usable, as in having a guarantee to get the correct `errno` value on failure of a [system call][2]. Currently, functions which set `errno` are just exported in the Darwin/Glibc modules with (as far as I understand) no guaranteed correct way of handling errors as the correct `errno` value can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there relies on behaviour that isn't guaranteed.

## Motivation

In many Swift libraries that use the Darwin/Glibc modules there is code similar to:

/* import Darwin/Glibc */

let rv = some_system_call(some, parameters)
if rv < 0 {
  throw SomeError(errorCode: errno) /* <-- errno use */
}

That looks very innocent but please note that `errno` is used here. And `errno` is an interesting one as it's a thread-local variable which is written to by many functions. A thread-local variable is like a global variable except that setting it in one thread does not affect its value in any other thread. Pretty much all system calls and many library functions set `errno` if something went wrong.

The problem is that as far as I see (and Swift developers have confirmed), there is no guarantee that in between the call of `some_system_call` and the reading of `errno`, `errno` hasn't been overwritten by some other system call that has been call on the same thread.

To illustrate this further, let's consider this example

/* import Darwin/Glibc */
public class SomeClass {
 public let someValue: Int = 1
 deinit {
     /* call some failing syscall, for example */
     write(-1, nil, 0) /* should set errno to EBADF */
 }
}

public func foo() {
  let x = SomeClass()
  let rv = write(x.someValue, nil, 0)
  let errnoSave = errno
  if rv != 0 {
     throw SomeError(errorCode: errnoSave)
  }
}

as you see in function `foo`, the instance `x` of `SomeClass` isn't needed anymore as soon as `write` has been called. So (as far as I understand) there's no guarantee that ARC doesn't turn the above code into

let x = SomeClass()
let rv = write(x.someValue, nil, 42) /* should set errno to EFAULT */
/* ARC generated */ x.release()
let errnoSave = errno /* wrong errno value :( */
if rv != 0 {
 throw SomeError(errorCode: errnoSave)
}

And the ARC generated `x.release()` will cause `x` to be deallocated which will call the failing `write` in the `deinit` of `SomeClass`. So `errnoSave` might be `EBADF` instead of `EFAULT` depending on where ARC put the `x.release()` call.

What `errno` value we read will depend on the optimisation settings and the Swift compiler version. That's IMHO a big issue as it might make the lowest layers unstable with hard-to-debug issues.

## Proposed solution

I don't have a full story on how to actually resolve the issue but I see a few options:

### Option 1: always return errno

clang importer could be changed to make all `errno`-setting functions return a tuple of the actual return value and the `errno` value.

For example, currently write(2) is imported as:

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> Int

which could be changed to

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> (Int, Int32 /* for errno */)

Correct code to use write would then look like this:

let (bytesWritten, writeErrno) = write(fd, buf, len)
if bytesWritten >= 0 {
  /* everything's fine */
} else {
  throw POSIXError(code: writeErrno)
}

### Option 2: make them throw

The second option is to teach clang importer to make the functions throwing. So write(2) would be imported as

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) throws /* POSIXError */ -> Int

That would make these functions quite easy to use and would feel natural in Swift:

do {
  let bytesWritten = write(fd, buf, len)
} catch let e as POSIXError {
  /* handle error */
} catch {
  ...
}

### Discussion

The beauty of option 1 is simplicity. Clang importer would not need to know what exact values a system call returns on failure. Also very little additional code needs to be emitted for calling a system call. That seems to be the [way Go is going][3].

The downside of option 1 is that the API doesn't feel like idiomatic Swift. The returned `errno` value is only useful if the system call failed and is arbitrary in the case when it worked. (There is no guarantee that `errno` is set to `0` when a system call succeeds.)
Also there is a slight overhead in reading `errno` which would be paid for every `errno`-setting function, even if successful. Hence, option 2 looks nice as it brings these functions more in like with other Swift functions. However, as mentioned before, clang importer would need to learn what values are returned on success/failure for _every_ `errno`-setting function (and there's _many_ of them).

## Proposed Approach

Let's discuss what is a good solution and then I will volunteer to put together a full proposal.

## Source compatibility

it depends.

To retain source compatibility the Darwin/Glibc modules could also be left as is. The safe `errno` handling could then be implemented only in a new, unified module for Darwin/Glibc. There's already ongoing discussions/proposals about that on the list anyway. That new module could then be implemented in the spirit of options 1, 2, or some other solution. The benefits are guaranteed source compatibility for legacy applications and `errno` safety plus easier imports for new applications - win/win :slightly_smiling_face:.

## Effect on ABI stability

Will most likely be additive, so probably none.

## Effect on API resilience

see source compatibility.

## Alternatives considered

Do nothing and workaround `errno` capturing being very hard. I discussed this previously elsewhere and Joe Groff came up with the code below which should convince the optimiser not to insert any release calls at the wrong place or inline the function:

@inline(never)
func callWithErrno(_ fn: () -> Int) -> (result: Int, errno: Int) {
  var result: Int
  var savedErrno: Int
  withExtendedLifetime(fn) {
      result = fn()
      savedErrno = errno
  }
  return (result, savedErrno)
}

An example use of that is

let (rv, writeErrno) = callWithErrno {
  write(-1, nil, 0)
}

if rv < 0 {
  throw SomeError(errorCode: writeErrno)
}

This makes it possible to retrieve the correct `errno` value in Swift but I think there remain too many ways to do it wrongly. First and foremost that the compiler doesn't complain if the programmer forgets to use `callWithErrno`.

===

Let me know what you think!

[1]: https://en.wikipedia.org/wiki/Errno.h
[2]: https://en.wikipedia.org/wiki/System_call
[3]: https://golang.org/pkg/syscall/#Write

Many thanks,
Johannes
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Johannes Weiss) #5

Hi John,

[...]

This is a pitch to make [`errno`][1]-setting functions properly usable, as in having a guarantee to get the correct `errno` value on failure of a [system call][2]. Currently, functions which set `errno` are just exported in the Darwin/Glibc modules with (as far as I understand) no guaranteed correct way of handling errors as the correct `errno` value can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there relies on behaviour that isn't guaranteed.

Is this a general enough problem to be worth complicating the importer with vs., say, just manually wrapping the POSIX APIs to be throwing in an overlay? POSIX uses a lot of different conventions for reporting that an error occurred, and there's only a finite set of POSIX APIs. Are there "user" functions that wrap POSIX and carefully report errors with errno?

That's a very good point. I'm sure there are "user libraries" out there which set errno but I'd hope that it's a fairly limited set. So I think an overlay that ships with Swift and makes the underlying module unavailable would be fine by me. Something like the Dispatch module (which uses the "hidden" CDispatch module internally) sounds reasonable to me.

The reason I proposed to include that into the clang importer is because the Darwin/Glibc modules contain hundreds (I haven't counted) of functions (syscall and library functions) that set errno. And I assumed that if it can be done automatically we could leverage that.

[...]

Thanks,
  Johannes


(Goffredo Marocchi) #6

I hope we get an answer from the core team as it seems something they must have considered already. Sounds serious.

···

Sent from my iPhone

On 4 Nov 2016, at 08:12, Johannes Weiß via swift-evolution <swift-evolution@swift.org> wrote:

Hi,

I just realised, that the problem is slightly worse than I originally described. I believed that successful calls in between the actual call the programmer wanted to make and capturing `errno` are not a problem.

But POSIX seems to suggest [4] that "The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified." . The Linux man page [5] also mentions that "a function that succeeds is allowed to change errno."

To me this means that the issue is wider than just ARC. I think the problem extends to memory allocations on the heap. Failed memory allocations aren't a problem because they are terminal in Swift. However, _successful_ memory allocations might be a problem because the malloc(3) that the Swift compiler will use is absolutely free to set errno to 0 (or any other value in fact) indicating success. (Said that at least malloc doesn't change `errno` on the macOS or Linux I tested today, we probably shouldn't rely on that though.)

This makes it even more unpredictable to the programmer what a use of `errno` in Swift will return. IMHO it shouldn't be exported to Swift as its value is undefined almost(?) everywhere.

[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/errno.html
[5]: http://man7.org/linux/man-pages/man3/errno.3.html

All the best,
Johannes

On 2 Nov 2016, at 1:12 pm, Johannes Weiß via swift-evolution <swift-evolution@swift.org> wrote:

Hey swift-evolution,

First of all apologies, this is not a full proposal yet, it's meant to kick off a discussion on how to resolve the issue.

# Make `errno`-setting functions more usable from Swift

## Introduction

This is a pitch to make [`errno`][1]-setting functions properly usable, as in having a guarantee to get the correct `errno` value on failure of a [system call][2]. Currently, functions which set `errno` are just exported in the Darwin/Glibc modules with (as far as I understand) no guaranteed correct way of handling errors as the correct `errno` value can't be retrieved.
This means that much of the Swift code which uses Darwin/Glibc out there relies on behaviour that isn't guaranteed.

## Motivation

In many Swift libraries that use the Darwin/Glibc modules there is code similar to:

/* import Darwin/Glibc */

let rv = some_system_call(some, parameters)
if rv < 0 {
 throw SomeError(errorCode: errno) /* <-- errno use */
}

That looks very innocent but please note that `errno` is used here. And `errno` is an interesting one as it's a thread-local variable which is written to by many functions. A thread-local variable is like a global variable except that setting it in one thread does not affect its value in any other thread. Pretty much all system calls and many library functions set `errno` if something went wrong.

The problem is that as far as I see (and Swift developers have confirmed), there is no guarantee that in between the call of `some_system_call` and the reading of `errno`, `errno` hasn't been overwritten by some other system call that has been call on the same thread.

To illustrate this further, let's consider this example

/* import Darwin/Glibc */
public class SomeClass {
public let someValue: Int = 1
deinit {
    /* call some failing syscall, for example */
    write(-1, nil, 0) /* should set errno to EBADF */
}
}

public func foo() {
 let x = SomeClass()
 let rv = write(x.someValue, nil, 0)
 let errnoSave = errno
 if rv != 0 {
    throw SomeError(errorCode: errnoSave)
 }
}

as you see in function `foo`, the instance `x` of `SomeClass` isn't needed anymore as soon as `write` has been called. So (as far as I understand) there's no guarantee that ARC doesn't turn the above code into

let x = SomeClass()
let rv = write(x.someValue, nil, 42) /* should set errno to EFAULT */
/* ARC generated */ x.release()
let errnoSave = errno /* wrong errno value :( */
if rv != 0 {
throw SomeError(errorCode: errnoSave)
}

And the ARC generated `x.release()` will cause `x` to be deallocated which will call the failing `write` in the `deinit` of `SomeClass`. So `errnoSave` might be `EBADF` instead of `EFAULT` depending on where ARC put the `x.release()` call.

What `errno` value we read will depend on the optimisation settings and the Swift compiler version. That's IMHO a big issue as it might make the lowest layers unstable with hard-to-debug issues.

## Proposed solution

I don't have a full story on how to actually resolve the issue but I see a few options:

### Option 1: always return errno

clang importer could be changed to make all `errno`-setting functions return a tuple of the actual return value and the `errno` value.

For example, currently write(2) is imported as:

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> Int

which could be changed to

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) -> (Int, Int32 /* for errno */)

Correct code to use write would then look like this:

let (bytesWritten, writeErrno) = write(fd, buf, len)
if bytesWritten >= 0 {
 /* everything's fine */
} else {
 throw POSIXError(code: writeErrno)
}

### Option 2: make them throw

The second option is to teach clang importer to make the functions throwing. So write(2) would be imported as

public func write(_ __fd: Int32, _ __buf: UnsafeRawPointer!, _ __nbyte: Int) throws /* POSIXError */ -> Int

That would make these functions quite easy to use and would feel natural in Swift:

do {
 let bytesWritten = write(fd, buf, len)
} catch let e as POSIXError {
 /* handle error */
} catch {
 ...
}

### Discussion

The beauty of option 1 is simplicity. Clang importer would not need to know what exact values a system call returns on failure. Also very little additional code needs to be emitted for calling a system call. That seems to be the [way Go is going][3].

The downside of option 1 is that the API doesn't feel like idiomatic Swift. The returned `errno` value is only useful if the system call failed and is arbitrary in the case when it worked. (There is no guarantee that `errno` is set to `0` when a system call succeeds.)
Also there is a slight overhead in reading `errno` which would be paid for every `errno`-setting function, even if successful. Hence, option 2 looks nice as it brings these functions more in like with other Swift functions. However, as mentioned before, clang importer would need to learn what values are returned on success/failure for _every_ `errno`-setting function (and there's _many_ of them).

## Proposed Approach

Let's discuss what is a good solution and then I will volunteer to put together a full proposal.

## Source compatibility

it depends.

To retain source compatibility the Darwin/Glibc modules could also be left as is. The safe `errno` handling could then be implemented only in a new, unified module for Darwin/Glibc. There's already ongoing discussions/proposals about that on the list anyway. That new module could then be implemented in the spirit of options 1, 2, or some other solution. The benefits are guaranteed source compatibility for legacy applications and `errno` safety plus easier imports for new applications - win/win :slightly_smiling_face:.

## Effect on ABI stability

Will most likely be additive, so probably none.

## Effect on API resilience

see source compatibility.

## Alternatives considered

Do nothing and workaround `errno` capturing being very hard. I discussed this previously elsewhere and Joe Groff came up with the code below which should convince the optimiser not to insert any release calls at the wrong place or inline the function:

@inline(never)
func callWithErrno(_ fn: () -> Int) -> (result: Int, errno: Int) {
 var result: Int
 var savedErrno: Int
 withExtendedLifetime(fn) {
     result = fn()
     savedErrno = errno
 }
 return (result, savedErrno)
}

An example use of that is

let (rv, writeErrno) = callWithErrno {
 write(-1, nil, 0)
}

if rv < 0 {
 throw SomeError(errorCode: writeErrno)
}

This makes it possible to retrieve the correct `errno` value in Swift but I think there remain too many ways to do it wrongly. First and foremost that the compiler doesn't complain if the programmer forgets to use `callWithErrno`.

===

Let me know what you think!

[1]: https://en.wikipedia.org/wiki/Errno.h
[2]: https://en.wikipedia.org/wiki/System_call
[3]: https://golang.org/pkg/syscall/#Write

Many thanks,
Johannes
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Joe Groff) #7

Swift's implicit allocations don't directly use malloc/free. If there is a platform where a successful malloc dirties errno, we should avoid doing so in the Swift runtime's entry points.

-Joe

···

On Nov 4, 2016, at 1:12 AM, Johannes Weiß <johannesweiss@apple.com> wrote:

Hi,

I just realised, that the problem is slightly worse than I originally described. I believed that successful calls in between the actual call the programmer wanted to make and capturing `errno` are not a problem.

But POSIX seems to suggest [4] that "The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified." . The Linux man page [5] also mentions that "a function that succeeds is allowed to change errno."

To me this means that the issue is wider than just ARC. I think the problem extends to memory allocations on the heap. Failed memory allocations aren't a problem because they are terminal in Swift. However, _successful_ memory allocations might be a problem because the malloc(3) that the Swift compiler will use is absolutely free to set errno to 0 (or any other value in fact) indicating success. (Said that at least malloc doesn't change `errno` on the macOS or Linux I tested today, we probably shouldn't rely on that though.)

This makes it even more unpredictable to the programmer what a use of `errno` in Swift will return. IMHO it shouldn't be exported to Swift as its value is undefined almost(?) everywhere.

[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/errno.html
[5]: http://man7.org/linux/man-pages/man3/errno.3.html


(Johannes Weiss) #8

Hi Joe,

I just realised, that the problem is slightly worse than I originally described. I believed that successful calls in between the actual call the programmer wanted to make and capturing `errno` are not a problem.

But POSIX seems to suggest [4] that "The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified." . The Linux man page [5] also mentions that "a function that succeeds is allowed to change errno."

To me this means that the issue is wider than just ARC. I think the problem extends to memory allocations on the heap. Failed memory allocations aren't a problem because they are terminal in Swift. However, _successful_ memory allocations might be a problem because the malloc(3) that the Swift compiler will use is absolutely free to set errno to 0 (or any other value in fact) indicating success. (Said that at least malloc doesn't change `errno` on the macOS or Linux I tested today, we probably shouldn't rely on that though.)

This makes it even more unpredictable to the programmer what a use of `errno` in Swift will return. IMHO it shouldn't be exported to Swift as its value is undefined almost(?) everywhere.

[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/errno.html
[5]: http://man7.org/linux/man-pages/man3/errno.3.html

Swift's implicit allocations don't directly use malloc/free.

Ah right, that's good to know. But still, "explicit" allocation by creating an object does, right?

Like

class Foo {}
let x = Foo()

does allocate through _swift_allocObject_ -> swift_slowAlloc -> malloc .

If there is a platform where a successful malloc dirties errno, we should avoid doing so in the Swift runtime's entry points.

ok, agreed, should there be a platform where malloc that dirties `errno` it could be worked around by saving and restoring the `errno`.

Cheers,
  Johannes


(Joe Groff) #9

swift_slowAlloc does not need to be implemented by calling malloc.

-Joe

···

On Nov 4, 2016, at 9:12 AM, Johannes Weiß <johannesweiss@apple.com> wrote:

Hi Joe,

I just realised, that the problem is slightly worse than I originally described. I believed that successful calls in between the actual call the programmer wanted to make and capturing `errno` are not a problem.

But POSIX seems to suggest [4] that "The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified." . The Linux man page [5] also mentions that "a function that succeeds is allowed to change errno."

To me this means that the issue is wider than just ARC. I think the problem extends to memory allocations on the heap. Failed memory allocations aren't a problem because they are terminal in Swift. However, _successful_ memory allocations might be a problem because the malloc(3) that the Swift compiler will use is absolutely free to set errno to 0 (or any other value in fact) indicating success. (Said that at least malloc doesn't change `errno` on the macOS or Linux I tested today, we probably shouldn't rely on that though.)

This makes it even more unpredictable to the programmer what a use of `errno` in Swift will return. IMHO it shouldn't be exported to Swift as its value is undefined almost(?) everywhere.

[4]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/errno.html
[5]: http://man7.org/linux/man-pages/man3/errno.3.html

Swift's implicit allocations don't directly use malloc/free.

Ah right, that's good to know. But still, "explicit" allocation by creating an object does, right?

Like

class Foo {}
let x = Foo()

does allocate through _swift_allocObject_ -> swift_slowAlloc -> malloc .