Inexplicable Behaviour Difference

Can someone confirm for me that I’m not losing my mind? It seems like the exact same code does two different things, and I cannot understand why. Does anyone else observe the same when you do this? (I’m on macOS 10.15.6 with Swift 5.2.4 at the moment.)

  1. Create an empty package:

    % mkdir Experiment
    % cd Experiment
    % swift package init --type executable
    
  2. Delete everything except the manifest and the two module sources, leaving only:

    Package.swift
    Sources/Experiment/main.swift
    Tests/ExperimentTests/ExperimentTests.swift
    
  3. Reduce the manifest to this:

    // swift-tools-version:5.2
    
    import PackageDescription
    
    let package = Package(
      name: "Experiment",
      targets: [
        .target(name: "Experiment", dependencies: []),
        .testTarget(name: "ExperimentTests", dependencies: []),
      ]
    )
    
  4. Overwrite main.swift with the following:

    import XCTest
    
    let string = "ש\u{5C1}\u{5B8}"
    let result = string.decomposedStringWithCompatibilityMapping
    
    let scalars = Array(result.unicodeScalars)
    let sorted = scalars.sorted(by: {
      $0.properties.canonicalCombiningClass < $1.properties.canonicalCombiningClass
    })
    assert(scalars == sorted, "\(scalars) ≠ \(sorted)")
    
  5. Strip the test case file down to this:

    import XCTest
    
    final class ExperimentTests: XCTestCase {
    
      func testReordering() {
      }
    }
    
  6. Copy and paste the contents of main.swift (except the import) verbatim into testReordering(). It should then look like this:

    import XCTest
    
    final class ExperimentTests: XCTestCase {
    
      func testReordering() {
    
        let string = "ש\u{5C1}\u{5B8}"
        let result = string.decomposedStringWithCompatibilityMapping
    
        let scalars = Array(result.unicodeScalars)
        let sorted = scalars.sorted(by: {
          $0.properties.canonicalCombiningClass < $1.properties.canonicalCombiningClass
        })
        assert(scalars == sorted, "\(scalars) ≠ \(sorted)")
      }
    }
    
  7. Try the executable variant:

    % swift run
    [3/3] Linking Experiment
    Assertion failed: ["\u{05E9}", "\u{05C1}", "\u{05B8}"] ≠ ["\u{05E9}", "\u{05B8}", "\u{05C1}"]: file [...]/Experiment/Sources/Experiment/main.swift, line 10
    zsh: illegal hardware instruction  swift run
    
  8. Try the test variant:

    % swift test
    [3/3] Linking ExperimentPackageTests
    Test Suite 'All tests' started at 2020-08-25 16:29:07.998
    Test Suite 'ExperimentPackageTests.xctest' started at 2020-08-25 16:29:07.999
    Test Suite 'ExperimentTests' started at 2020-08-25 16:29:07.999
    Test Case '-[ExperimentTests.ExperimentTests testReordering]' started.
    Test Case '-[ExperimentTests.ExperimentTests testReordering]' passed (0.088 seconds).
    Test Suite 'ExperimentTests' passed at 2020-08-25 16:29:08.087.
         Executed 1 test, with 0 failures (0 unexpected) in 0.088 (0.088) seconds
    Test Suite 'ExperimentPackageTests.xctest' passed at 2020-08-25 16:29:08.088.
         Executed 1 test, with 0 failures (0 unexpected) in 0.088 (0.089) seconds
    Test Suite 'All tests' passed at 2020-08-25 16:29:08.088.
         Executed 1 test, with 0 failures (0 unexpected) in 0.088 (0.090) seconds
    

How is it even possible that one traps and not the other?!? (That is what happens on your machine too, right? Or is the universe playing some sort of joke on me?)

My hunch was that this is locale-dependent and that XCTest initializes the locale differently than the plain command-line tool does. I tried printing the current locale to debug:

    print("locale: \(Locale.current)")

But adding that to main.swift made the error go away. Not being an expert on any of this, I'm not sure whether touching the current locale is expected to change behaviour like that.

Thank you. I can confirm that adding _ = Locale.current before everything else is a viable workaround, even in the vastly more complicated scenario that kickstarted this debugging session.
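For anyone following along, here is a minimal sketch of that workaround applied to the main.swift from step 4. It assumes the behaviour reported above (macOS 10.15.6, Swift 5.2.4); why touching the locale helps is not established in this thread, and `import Foundation` stands in for the original `import XCTest` purely to keep the sketch self-contained.

```swift
import Foundation

// Workaround reported in this thread: touch the current locale before
// anything else. The underlying cause is unknown at this point.
_ = Locale.current

let string = "ש\u{5C1}\u{5B8}"
let result = string.decomposedStringWithCompatibilityMapping

// Same check as in step 4: compare the scalars with a copy sorted by
// canonical combining class.
let scalars = Array(result.unicodeScalars)
let sorted = scalars.sorted(by: {
  $0.properties.canonicalCombiningClass < $1.properties.canonicalCombiningClass
})
print(scalars == sorted ? "canonically ordered" : "\(scalars) ≠ \(sorted)")
```

The sketch prints the comparison instead of asserting, so that the executable reports the outcome either way rather than trapping.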

There shouldn’t be anything locale‐sensitive about Unicode equivalence. But it wouldn’t surprise me if the implementation is coming from related areas of ICU and Foundation is neglecting to load something important. Now that I believe my eyes, I’ll file a bug.
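To illustrate why locale should not come into it: canonical ordering depends only on each scalar's canonical combining class, which the standard library exposes without Foundation. The following is a rough sketch of such a check; `isCanonicallyOrdered` is a hypothetical helper name made up for this example, not an existing API.

```swift
// Returns true if no adjacent pair of scalars violates canonical ordering:
// a scalar with a nonzero combining class must never directly follow a
// scalar with a larger combining class.
// (Hypothetical helper, written for illustration only.)
func isCanonicallyOrdered<S: Sequence>(_ scalars: S) -> Bool
  where S.Element == Unicode.Scalar {
  var previous: UInt8 = 0
  for scalar in scalars {
    let current = scalar.properties.canonicalCombiningClass.rawValue
    if current != 0 && previous > current { return false }
    previous = current
  }
  return true
}

// U+05C1 (shin dot) has combining class 24; U+05B8 (qamats) has 18.
let asTyped = "ש\u{5C1}\u{5B8}".unicodeScalars
let reordered = "ש\u{5B8}\u{5C1}".unicodeScalars

print(isCanonicallyOrdered(asTyped))   // false
print(isCanonicallyOrdered(reordered)) // true
```

Nothing in this check consults the locale, which is what makes the difference between `swift run` and `swift test` so surprising.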

It’s too bad it’s not possible to write a regression test...
