Compiling 4.2 on armv7 (and i686)

Knowing Darwin works here is useful info. Might give me another angle to see what might be different.

Maybe someone can tell me if this is a red herring or not. The compiler is choking on the SIL output for this function, where Base is _UnmanagedString<UInt16>:

extension ReversedCollection.Iterator: IteratorProtocol, Sequence {
  public typealias Element = Base.Element
  
  @inlinable
  @inline(__always)
  public mutating func next() -> Element? {
    guard _fastPath(_position != _base.startIndex) else { return nil }
    _base.formIndex(before: &_position)
    return _base[_position] // THIS IS WHERE THINGS GO BAD
  }
}

I've looked at the SIL output for the function, and it mostly makes sense, after reading up on SIL. However, here's where it goes south:

bb12:                                             // Preds: bb11                                                                
  %97 = struct_element_addr %20 : $*UInt16, #UInt16._value // user: %98                                                         
  %98 = load %97 : $*Builtin.Int16                // user: %99                                                                  
  %99 = struct $UInt16 (%98 : $Builtin.Int16)     // user: %100                                      
  %100 = enum $Optional<UInt16>, #Optional.some!enumelt.1, %99 : $UInt16 // user: %101                                          
  br bb16(%100 : $Optional<UInt16>)               // id: %101       

The specific piece that dies is the load into %98. The logic is weird, but after muddling through the Integers.swift.gyb file, I realize _value for all integers uses the signed built-in type, even if the Swift type isn't signed. So this looks somewhat wrong at first glance, but is actually correct. But what about how %20 is calculated?

  %18 = integer_literal $Builtin.Word, 4294967295 // user: %20                                                                  
  %19 = pointer_to_address %4 : $Builtin.RawPointer to [strict] $*UInt16 // user: %20                                           
  %20 = index_addr %19 : $*UInt16, %18 : $Builtin.Word // users: %97, %21      

This looks a bit more weird. I'm having a hard time drumming it back up, but I could have sworn I read that these literals are expected to be signed rather than unsigned. Yet here we have 0xFFFF_FFFF as a literal. So I suspect this may be the root problem: the compiler is attempting to offset an index by about 4 billion elements in the positive direction (a very bad day), versus 1 element in the negative direction (the expected behavior).
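
To make the two readings concrete, here's a small standalone Swift sketch (my own illustration, not anything the compiler itself does) of that same 32-bit pattern read as a signed offset, zero-extended, and sign-extended:

let pattern: UInt32 = 0xFFFF_FFFF

let asSignedOffset = Int32(bitPattern: pattern)  // -1: the intended "one element backwards"
let zeroExtended   = UInt64(pattern)             // 4294967295: what the folded literal becomes
let signExtended   = Int64(asSignedOffset)       // -1: what sign extension would preserve

print(asSignedOffset, zeroExtended, signExtended)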

Am I on the right track here? Or is my memory faulty and literals shouldn't be signed in this case?

The Builtin types correspond to LLVM's types, which are neither signed nor unsigned; it's the Swift struct type and the operations on it that impart those semantics. That pointer arithmetic indeed looks suspicious. Even though the operation isn't fundamentally signed, I'm not sure why we would walk one byte backwards there; @Michael_Ilseman might recognize that logic, though.

Considering this is a Swift.ReversedCollection<_UnmanagedString<UInt16>>.Iterator being worked with, the answer is that it works by decrementing the Index. In this case, because _UnmanagedString<UInt16>.Index is backed by UnsafePointer<UInt16>, it's doing the index decrement using index_addr. That all makes sense.

I was more concerned about whether the unsigned literal might cause issues if it were ever loaded as-is into, say, a 64-bit int on 32-bit systems rather than being sign-extended, and then used to try to work out what type is expected to be at the new address.

EDIT:

Okay, this is pretty interesting:

(gdb) p static_cast<IntegerLiteralInst *>(IAI->getIndex().Value)->getValue()
$58 = {static WORD_MAX = 18446744073709551615, U = {VAL = 4294967295, pVal = 0xffffffff}, BitWidth = 64}

When asked about the APInt for the literal, Swift reports it as 64 bits wide. So that makes this even more suspicious. It may not even be a bug in the StringGuts code, then; it just happened to expose this underlying issue. That would explain why nothing stood out in the diff.

It also looks like this is intentional, as the IntegerLiteralInst type stores the literal as llvm::APInt::WordType (which is always uint64_t) in a trailing object. It then creates the APInt from it, which explains the 64-bit width. This really looks like the value somehow got created from a 32-bit signed value somewhere and didn't get sign-extended before the instruction was constructed.

I just want to say, this is getting all sorts of weird. I've discovered a pattern in the SIL that's generating weird behaviors during constant folding beyond just this one case:

  public func deallocate() {
    Builtin.deallocRaw(_rawValue, (-1)._builtinWordValue, (-1)._builtinWordValue)
  }

This looks pretty benign, but the end result in the SIL gets weird:

// UnsafeMutablePointer.deallocate()
sil [serialized] @$SSp10deallocateyyF : $@convention(method) <Pointee> (UnsafeMutablePointer<Pointee>) -> () {
// %0                                             // users: %2, %1
bb0(%0 : $UnsafeMutablePointer<Pointee>):
  debug_value %0 : $UnsafeMutablePointer<Pointee>, let, name "self", argno 1 // id: %1
  %2 = struct_extract %0 : $UnsafeMutablePointer<Pointee>, #UnsafeMutablePointer._rawValue // user: %12
  %3 = integer_literal $Builtin.Int2048, -1       // user: %4
  %4 = builtin "s_to_s_checked_trunc_Int2048_Int32"(%3 : $Builtin.Int2048) : $(Builtin.Int32, Builtin.Int1) // users: %6, %5
  %5 = tuple_extract %4 : $(Builtin.Int32, Builtin.Int1), 0 // user: %7
  %6 = tuple_extract %4 : $(Builtin.Int32, Builtin.Int1), 1
  %7 = struct $Int (%5 : $Builtin.Int32)          // user: %8
  %8 = struct_extract %7 : $Int, #Int._value      // user: %9
  %9 = builtin "zextOrBitCast_Int32_Word"(%8 : $Builtin.Int32) : $Builtin.Word // user: %12
  %10 = integer_literal $Builtin.Int32, -1        // user: %11
  %11 = builtin "zextOrBitCast_Int32_Word"(%10 : $Builtin.Int32) : $Builtin.Word // user: %12
  %12 = builtin "deallocRaw"(%2 : $Builtin.RawPointer, %9 : $Builtin.Word, %11 : $Builtin.Word) : $()
  %13 = tuple ()                                  // user: %14
  return %13 : $()                                // id: %14
} // end sil function '$SSp10deallocateyyF'

This becomes:

// UnsafeMutablePointer.deallocate()
sil [serialized] @$SSp10deallocateyyF : $@convention(method) <Pointee> (UnsafeMutablePointer<Pointee>) -> () {
// %0                                             // users: %2, %1
bb0(%0 : $UnsafeMutablePointer<Pointee>):
  debug_value %0 : $UnsafeMutablePointer<Pointee>, let, name "self", argno 1 // id: %1
  %2 = struct_extract %0 : $UnsafeMutablePointer<Pointee>, #UnsafeMutablePointer._rawValue // user: %6
  %3 = integer_literal $Builtin.Int32, -1         // user: %4
  %4 = builtin "zextOrBitCast_Int32_Word"(%3 : $Builtin.Int32) : $Builtin.Word // user: %6
  %5 = integer_literal $Builtin.Word, 4294967295  // user: %6
  %6 = builtin "deallocRaw"(%2 : $Builtin.RawPointer, %4 : $Builtin.Word, %5 : $Builtin.Word) : $()
  %7 = tuple ()                                   // user: %8
  return %7 : $()                                 // id: %8
} // end sil function '$SSp10deallocateyyF'

That's... interesting. We start with an Int2048 literal (expected), which is then truncated using sign information to Int32 (why Int32? What decides this?), and that then winds up being thrown into the zextOrBitCast_Int32_Word built-in, which extends Int32 to Word, because Word is still considered 64 bits (it seems to be treated as a pointer type, and the maximum width of pointer types is 64 bits, no matter what your host/target is). That creates an interesting situation where we truncate down using sign-preserving means, but then extend back up using a means that doesn't preserve the sign. I'm not sure why it does this, or why the StringGuts change was what it took to start firing asserts. But because of constant folding, these become Builtin.Word literals with 0xFFFF_FFFF as the value (positive, instead of negative). When the optimizer tries to make sense of the loads, it chokes.
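
As a rough Swift-level re-enactment of that folding sequence (illustration only, not the optimizer's code): start from -1, truncate it to 32 bits with sign preserved, then widen it back to 64 bits with a zero extension rather than a sign extension.

let literal = -1                                       // stands in for the Int2048 literal
let truncated = Int32(truncatingIfNeeded: literal)     // sign-preserving trunc: still -1
let zextToWord = UInt64(UInt32(bitPattern: truncated)) // zextOrBitCast-style widen: 4294967295
let sextToWord = Int64(truncated)                      // sext would have kept -1

print(truncated, zextToWord, sextToWord)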

Now I'm wondering what the heck could be different on Darwin, unless this is specifically a host behavior rather than a target behavior. Are i386/arm32 on Darwin only ever built as targets from a 64-bit host? If that's the case, it would explain it, if for some reason the compiler emits Builtin.Int64 in these cases on 64-bit hosts.


The Int2048 is a byproduct of the _ExpressibleByBuiltinIntegerLiteral protocol, which tries to provide some abstraction between literals in the source language and the implementation so that literals can be user-extensible. The resulting SIL looks reasonable (though we're missing some constant folding that ought to be happening); Word would be 32-bit on a 32-bit target, so the manipulations here should end up with -1 getting passed to deallocRaw in the end.
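
As a user-level illustration of that extensibility (my own example; it doesn't touch the stdlib internals), any type can adopt the public ExpressibleByIntegerLiteral protocol and receive the literal through a standard integer type of its choosing:

struct Offset: ExpressibleByIntegerLiteral {
  var value: Int32
  // The compiler routes integer literals through this initializer.
  init(integerLiteral value: Int32) { self.value = value }
}

let o: Offset = -1   // o.value == -1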

Keep in mind the snippets aren't the full folding process; that would be very verbose. :slight_smile:

The literal stuff makes sense, except when a 32-bit host treats Builtin.Word as a 64-bit value and then tries to optimize address loads in a 32-bit address space. Then the sign of the original literal is suddenly important again, yes?

Also, what is expected if I took this SIL and used it with a 64-bit target? Word in that case is 64-bit, and again this value seems problematic.

Unfortunately, the things I'm getting explanations on are the things we already agree make sense. But hear me out here:

What I'm reading is that when SIL encounters a Builtin.Word, it treats it like a 64-bit value. Always. Word isn't fixed-width and follows the pointer size, so when the compiler generates an IntegerLiteralInst for that type, it asks for the maximum width (which is hard-coded at 64 bits for pointer-width semantics). I always get a 64-bit APInt, which means -1 becomes 4 billion under the zero-extension behavior that happens when folding the Int2048 literal into an Int32 literal and then folding the up-conversion to Builtin.Word.

It may all be a wash when it gets truncated while outputting LLVM IR, but the SIL optimizer, doing 64-bit math with this value, is hitting the error. That's why the Projection kind has no value: the memory location isn't remotely correct and is well out of bounds of anything the optimizer knows about when processing the function.

I suspect the StringGuts change exposes this bug precisely because it is a reversed collection that is using an UnsafePointer as the Index, causing SILGen to produce the address-indexing and load instructions which need to be checked by the RedundantLoadElim pass.

Again, because it's a 32-bit Word once you generate the IR for LLVM, unless the optimizer actually wants to check the address-indexing behavior, you wouldn't even know this is a problem. Unless I tried to build 64-bit targets on a 32-bit host, or the optimizer actually wanted to know what is at offset 4 billion * sizeof(UInt16).
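
For a sense of scale, some back-of-the-envelope arithmetic (mine, not the compiler's) on that offset:

// ~8.6 GB past the base pointer: double the entire 32-bit address space.
let bogusOffsetBytes = UInt64(0xFFFF_FFFF) * UInt64(MemoryLayout<UInt16>.stride)
print(bogusOffsetBytes)   // 8589934590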

I apologize if I come off as a bit harsh here, or rude. I'm just a little frustrated because it feels like I'm so close to the answer here, but I'm also literally brand new to this code base. I don't have enough experience to confirm my thinking is correct, and would appreciate any help walking through this logic, or guidance on how to build a test to confirm/disprove my hypothesis.

If I were more familiar with the code, I'd probably have tested my hypothesis with a fix by now, hence my frustration. :slight_smile:

No, it looks like you're right. I wonder how this works on Apple's 32-bit platforms.

This definitely seems suspect to me.

SIL doesn't know anything about the target configuration, so it treats Word as 64-bit conservatively. It is suspect that it's doing a zextOrBitCast_Int32_Word instead of a sextOrBitCast_Int32_Word. We could try fixing that, though I suspect in practice it still won't matter once we lower to LLVM IR.

@jrose, that's the code block I'm referring to, yes. And as a sheer hack, I set it to 32 on my local build just now. I was able to actually build the stdlib on 32-bit x86 Debian against the 2018-9-22 snapshot in master, so I think that lends a lot of evidence that this is the issue, and one of only two.

(The other is that the static assert you recently added for SILLocation fails on 32-bit; I disabled it just to see how far I could get on master overnight, out of a lack of anything else to try.)

@Joe_Groff the issue is that LLVM isn't the only one attempting to make sense of these literals; SIL is too. So whatever change is made needs to account for that. But I agree you can probably get away with leaving the emitted SIL alone so long as Builtin.Word has the correct size for the target platform.

Sure, we shouldn't rely on the stars aligning for things to work by accident. We should still fix the standard library to use sensible sign-extending operations here. Maybe we have problems like this elsewhere, but at least in the case you've highlighted, the only thing that appears to be using the value is the deallocRaw call, which should only care about the low 32 bits in practice on a 32-bit platform.
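
To spell out why it's a wash for deallocRaw on a 32-bit target (a quick illustration, not stdlib code): the low 32 bits are identical whether -1 was zero-extended or sign-extended to 64 bits first.

let zext: UInt64 = 0x0000_0000_FFFF_FFFF   // -1 zero-extended to 64 bits
let sext: UInt64 = 0xFFFF_FFFF_FFFF_FFFF   // -1 sign-extended to 64 bits

print(UInt32(truncatingIfNeeded: zext) == UInt32(truncatingIfNeeded: sext))   // true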

I'll just be clear: the case I highlighted around dealloc isn't the break, but it was a good example of the constant folding that led to the break I am hitting in next().

The SIL from that post further up is what is breaking after all the constant folding is done.


What would that fix be and where would it apply? This example is the default ReversedCollection operation that's calling formIndex(before:), which seems sensible:

  public mutating func next() -> Element? {
    guard _fastPath(_position != _base.startIndex) else { return nil }
    _base.formIndex(before: &_position)
    return _base[_position]
  }

_UnmanagedString's Index type is UnsafePointer<CodeUnit> (either UInt8 or UInt16), which I don't think breaks the indexing model:

extension _UnmanagedString : RandomAccessCollection {
  internal typealias Element = UTF16.CodeUnit
  // Note that the Index type can't be an integer offset because Collection
  // requires that SubSequence share indices with the original collection.
  // Therefore, we use pointers as the index type; however, we also provide
  // integer subscripts as a convenience, in a separate extension below.
  @usableFromInline
  internal typealias Index = UnsafePointer<CodeUnit>
...
  internal subscript(position: Index) -> UTF16.CodeUnit { ... }
...
  internal subscript(offset: Int) -> UTF16.CodeUnit { ... }
...

Since pointers and integers are distinct types, I wouldn't expect the Int-based convenience subscript to conflict with anything.

It's a RandomAccessCollection, deriving its index(before:) etc from the default:

  public func index(before i: Index) -> Index {
    let result = i.advanced(by: -1)
    // FIXME: swift-3-indexing-model: tests for the trap.
    _failEarlyRangeCheck(
      result, bounds: Range(uncheckedBounds: (startIndex, endIndex)))
    return result
  }

So in this case I would expect the UnsafePointer<UInt16> to be decrementing itself by its byte stride of 2, instead of an Int that decrements itself by 1. Is this not the case? Where is the issue, or am I missing something?
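
For what it's worth, a quick standalone check of that stride behavior (my own sketch, not the stdlib's code) agrees: advancing an UnsafePointer<UInt16> by -1 moves it back one element, i.e. two bytes.

let storage = UnsafeMutablePointer<UInt16>.allocate(capacity: 4)
defer { storage.deallocate() }

let i = UnsafePointer(storage) + 2        // some valid index into the buffer
let before = i.advanced(by: -1)           // back one element, not one byte

print(UnsafeRawPointer(i) - UnsafeRawPointer(before))   // 2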

Looking at @Kaiede's example above suggests that the problem may be in Int._builtinWordValue.

No worries, we appreciate that you're trying so hard to get to the bottom of this problem!

Here's the current _builtinWordValue implementation from the gyb template:
  public // @testable
  var _builtinWordValue: Builtin.Word {
% if BuiltinName == 'Int32':
    return Builtin.zextOrBitCast_Int32_Word(_value)
% elif BuiltinName == 'Int64':
    return Builtin.truncOrBitCast_Int64_Word(_value)
% end
  }

CC @moiseev. Should we universally zext or sext, or should this vary with the signedness of Self?

We ought to sext for signed types and zext for unsigned types.
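
For concreteness, here's a rough sketch of the kind of change being discussed, in the shape of the gyb snippet above (the signedness check is written with a hypothetical 'signed' variable; the real template may spell it differently):

  public // @testable
  var _builtinWordValue: Builtin.Word {
% if BuiltinName == 'Int32':
%   if signed:   # hypothetical per-type signedness flag for this sketch
    return Builtin.sextOrBitCast_Int32_Word(_value)
%   else:
    return Builtin.zextOrBitCast_Int32_Word(_value)
%   end
% elif BuiltinName == 'Int64':
    return Builtin.truncOrBitCast_Int64_Word(_value)
% end
  }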

I filed [SR-8870] _builtinWordValue should zext/sext based on signedness · Issue #51376 · apple/swift · GitHub


If I read the docs correctly, index_addr takes the type into account. So it is more like ptr[-1] in C, not ((byte *)ptr) - 1. Or, more concretely: %19[%18]. Makes sense to me.

That's certainly one place where it enforces unsigned extension of the type, yes. I'm not sure why it's emitting it in the case of formIndex(before:); I just know that it is. Trying to trace down all the hits where we use unsigned extension on Int32 to Word in GDB was painful, and I never actually caught it with my breakpoint due to the noise, unfortunately.

Later today, I'll take a peek to see if updating _builtinWordValue is enough to fix it, or if the break remains (i.e. there's another path to the builtin conversion).

EDIT:

@Joe_Groff I think you'll be happy to know that my test of modifying IntegerTypes.swift.gyb to use signed conversions works. Since that's a lot simpler than trying to make SIL see Builtin.Word as 32-bit (and probably less risky), and even works at the latest development snapshot on master, I'll put together the PR tonight for the change. It'll give me time to also start putting together the PR for building on x86 32-bit Linux hosts, too.
