Using va_list in C function with a Swift String raises EXC_BAD_ACCESS

While writing a Swift wrapper for a C wrapper of a C++ library, I've stumbled on some weird bugs regarding Swift's CVarArg. The C wrapper I already have uses variadic functions, which I converted to functions taking a va_list argument so they could be imported (since Swift cannot import C variadic functions). When arguments are passed to such a function from Swift, withVaList uses the private _cVarArgEncoding property of each type conforming to CVarArg to "encode" the values, which are then handed to the C function through a pointer. It seems, however, that this encoding is faulty for Swift Strings.

To demonstrate, I've created the following package:

Package.swift

// swift-tools-version:5.2

import PackageDescription

let package = Package(
    name: "CVarArgTest",
    products: [
        .executable(
            name: "CVarArgTest",
            targets: ["CVarArgTest"]),
    ],
    targets: [
        .target(
            name: "CLib"),
        .target(
            name: "CVarArgTest",
            dependencies: ["CLib"])
    ]
)

CLib

CTest.h

#ifndef CTest_h
#define CTest_h

#include <stdarg.h> /* declares va_list, used in the prototypes below */
#include <stdio.h>

/// Prints out the strings provided in args
/// @param num The number of strings in `args`
/// @param args A `va_list` of strings
void test_va_arg_str(int num, va_list args);

/// Prints out the integers provided in args
/// @param num The number of integers in `args`
/// @param args A `va_list` of integers
void test_va_arg_int(int num, va_list args);

/// Just prints the string
/// @param str The string
void test_str_print(const char * str);

#endif /* CTest_h */


CTest.c

#include "CTest.h"
#include <stdarg.h>

void test_va_arg_str(int num, va_list args)
{
    printf("Printing %i strings...\n", num);
    for (int i = 0; i < num; i++) {
        const char * str = va_arg(args, const char *);
        puts(str);
    }
}

void test_va_arg_int(int num, va_list args)
{
    printf("Printing %i integers...\n", num);
    for (int i = 0; i < num; i++) {
        int foo = va_arg(args, int);
        printf("%i\n", foo);
    }
}

void test_str_print(const char * str)
{
    puts(str);
}

main.swift

import Foundation
import CLib

// The literal String is perfectly bridged to the CChar pointer expected by the function
test_str_print("Hello, World!")

// Prints the integers as expected
let argsInt: [CVarArg] = [123, 456, 789]
withVaList(argsInt) { listPtr in
    test_va_arg_int(Int32(argsInt.count), listPtr)
}

// ERROR: Thread 1: EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
let argsStr: [CVarArg] = ["Test", "Testing", "The test"]
withVaList(argsStr) { listPtr in
    test_va_arg_str(Int32(argsStr.count), listPtr)
}

The package is available here as well.

As commented in the code above, printing a String passed directly to C, or printing a va_list containing Ints, works as expected. But when the va_list contains Strings and the C side reads them as const char *, there's an exception (EXC_BAD_ACCESS (code=EXC_I386_GPFLT)).

So, in short: did I mess up the C side of it or is Swift doing something wrong here? I've tested this in Xcode 11.5 and 12.0b2. If it's a bug, I'll be happy to report it.

(Disclaimer: I've posted the same question on SO but I'm guessing I'll have more luck here)

Swift Strings are not const char *, but structs that provide a variety of capabilities. They are more like C++ std::basic_strings, and then some. A single String works because there is an implicit conversion to a C const char * as part of the String bridge to C. An Array of Ints works because an Int can be implicitly converted to an int. I think passing an Array of Strings is going to require some kind of workaround so that you are passing an array of const char *, not an array of Strings.

I actually got an answer quicker than I thought on SO: interop - Is Swift's handling of CVarArg for String buggy? - Stack Overflow (I'll let NobodyNada post here if they want)
In short: String's conformance to CVarArg goes through NSString, as this was added a while back (August 2016, Make bridged String and collection types conform to CVarArg. · apple/swift@7535acc · GitHub). Originally this was intended for compatibility with other Obj-C functions and probably made sense back then.
The workaround is to get a C string pointer with func withCString<Result>(_ body: (UnsafePointer<CChar>) throws -> Result) rethrows -> Result and then pass that UnsafePointer<CChar> to withVaList().
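For a single string, that workaround might look roughly like the sketch below. The helper name formatViaCString and the 256-byte buffer are made up for illustration, and vsnprintf stands in for the package's test_va_arg_str so the snippet runs on its own; in the package, the inner closure would call test_va_arg_str(1, vaList) instead.

```swift
import Foundation

/// Hypothetical helper, not part of the package above: formats one Swift String
/// through a C "%s" specifier by putting a real C string into the va_list.
func formatViaCString(_ string: String) -> String {
    let capacity = 256  // arbitrary buffer size chosen for this sketch
    return string.withCString { cString -> String in
        // UnsafePointer<CChar> conforms to CVarArg and is encoded as a plain
        // pointer, so the C side reads a genuine `const char *`.
        withVaList([cString]) { vaList -> String in
            var buffer = [CChar](repeating: 0, count: capacity)
            _ = vsnprintf(&buffer, capacity, "%s", vaList)
            return String(cString: buffer)
        }
    }
}

print(formatViaCString("Test"))
```

The pointer is only valid inside the withCString closure, which is why the va_list is built and consumed there.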

This is very convoluted just to call C functions in Swift and IMO the implementation of String's conformance to CVarArg should be updated. Swift has changed tremendously since Swift 3.0 and I think the user's expectation would now be that Swift Strings work with C's va_list "out of the box", especially since they can be used directly with C functions with const char *.

Cross-posting my Stack Overflow answer:


This one's a bit tricky: your string is actually being bridged to an Objective-C NSString * rather than a C char *:

(lldb) p str
(const char *) $0 = 0x3cbe9f4c5d32b745 ""
(lldb) p (id)str
(NSTaggedPointerString *) $1 = 0x3cbe9f4c5d32b745 @"Test"

(If you're wondering why it's an NSTaggedPointerString rather than just an NSString, this article is a great read -- in short, the string is short enough to be stored directly in the bytes of the pointer variable rather than in an object on the heap.)

Looking at the source code for withVaList, we see that a type's va_list representation is determined by its implementation of the _cVarArgEncoding property of the CVarArg protocol. The standard library has some implementations of this protocol for some basic integer and pointer types, but there's nothing for String here. So who's converting our string to an NSString?

Searching around the Swift repo on GitHub, we find that Foundation is the culprit:

//===----------------------------------------------------------------------===//
// CVarArg for bridged types
//===----------------------------------------------------------------------===//
extension CVarArg where Self: _ObjectiveCBridgeable {
  /// Default implementation for bridgeable types.
  public var _cVarArgEncoding: [Int] {
    let object = self._bridgeToObjectiveC()
    _autorelease(object)
    return _encodeBitsAsWords(object)
  }
}

In plain English: any type that can be bridged to Objective-C is encoded as a vararg by converting it to an Objective-C object and encoding a pointer to that object. C varargs are not type-safe, so your test_va_arg_str just assumes the argument is a char * and passes the object pointer to puts, which crashes.

So is this a bug? I don't think so -- I suppose this behavior is probably intentional for compatibility with functions like NSLog that are more commonly used with Objective-C objects than C ones. However, it's certainly a surprising pitfall, and it's probably one of the reasons why Swift doesn't like to let you call C variadic functions.
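To illustrate the compatibility argument, here is a small sketch using Foundation's String(format:), which takes CVarArg arguments. It assumes the bridging described above: a Swift String is encoded as an object, which is what "%@" expects, whereas "%s" needs an explicit C string. The variable names are made up for the example.

```swift
import Foundation

// "%@" expects an object, which is what a bridged Swift String is encoded as.
let viaObject = String(format: "%@", "Test")

// "%s" expects a `const char *`, so the C string has to be supplied explicitly.
let viaCString = "Test".withCString { String(format: "%s", $0) }

print(viaObject)
print(viaCString)
```

Passing the String directly to "%s" would hit the same mismatch as test_va_arg_str: the callee would read an object pointer as if it were character data.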

You'll want to work around this by manually converting your strings to C-strings. This can get a bit ugly if you have an array of strings that you want to convert without making unnecessary copies, but here's a function that should be able to do it.

extension Collection where Element == String {
    /// Converts a collection of strings to an array of C strings, without copying.
    /// The pointers are only valid for the duration of `body`.
    func withCStrings<R>(_ body: ([UnsafePointer<CChar>]) throws -> R) rethrows -> R {
        return try withCStrings(head: [], body: body)
    }
    
    // Recursively call withCString on each of the strings.
    private func withCStrings<R>(head: [UnsafePointer<CChar>],
                                 body: ([UnsafePointer<CChar>]) throws -> R) rethrows -> R {
        if let next = self.first {
            // Get a C string, add it to the result array, and recurse on the remainder of the collection
            return try next.withCString { cString in
                var head = head
                head.append(cString)
                return try dropFirst().withCStrings(head: head, body: body)
            }
        } else {
            // Base case: no more strings; call the body closure with the array we've built
            return try body(head)
        }
    }
}

func withVaListOfCStrings<R>(_ args: [String], body: (CVaListPointer) -> R) -> R {
    return args.withCStrings { cStrings in
        withVaList(cStrings, body)
    }
}

let argsStr: [String] = ["Test", "Testing", "The test"]
withVaListOfCStrings(argsStr) { listPtr in
    test_va_arg_str(Int32(argsStr.count), listPtr)
}

// Output:
// Printing 3 strings...
// Test
// Testing
// The test
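
If the extra copies are acceptable, a simpler, non-recursive variant is possible. The sketch below is an alternative to the helper above, not a replacement for it: the name withVaListOfCopiedCStrings is hypothetical, it duplicates every string with strdup and frees the copies afterwards, and it uses vsnprintf instead of test_va_arg_str so that it runs without CLib.

```swift
import Foundation

/// Hypothetical alternative to the recursive helper: copies every string with
/// strdup, builds the va_list from the copies, and frees them afterwards.
func withVaListOfCopiedCStrings<R>(_ args: [String], _ body: (CVaListPointer) -> R) -> R {
    // strdup returns nil only when allocation fails; this sketch treats that as fatal.
    let copies: [UnsafeMutablePointer<CChar>] = args.map { strdup($0)! }
    defer { copies.forEach { free($0) } }
    return withVaList(copies, body)
}

let joined = withVaListOfCopiedCStrings(["Test", "Testing"]) { vaList -> String in
    var buffer = [CChar](repeating: 0, count: 64)  // arbitrary size for the sketch
    _ = vsnprintf(&buffer, 64, "%s, %s", vaList)
    return String(cString: buffer)
}
print(joined)
```

This trades one allocation per string for a flat control flow, which may be easier to reason about when the argument list is long.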