The SILLocation::LocationKind enum and the ReturnKind case

hongm · October 30, 2018, 6:09pm

Hi compiler experts,

In the TensorFlow branch, we needed to synthesize some SIL code, so we used the SILLocation of an existing SILInstruction to create new ones (
swift/TFPartition.cpp at 9501d3c27d98dba1ad29aaf4f44af707b5e17d94 · apple/swift · GitHub).

In an edge case, that existing SILLocation could have its LocationKind set to ReturnKind (because that instruction, a BranchInst, is originally lowered from a return stmt as in this test case swift/crashers.swift at 9501d3c27d98dba1ad29aaf4f44af707b5e17d94 · apple/swift · GitHub), and function getLogpLex() later gets inlined into b118507040(), but the BranchInst still has SILLocation::LocationKind set to ReturnKind).

This later triggers a SILVerifier failure here (https://github.com/apple/swift/blob/80419c711dc9dcf895f81c75cc4902d69cd83be9/lib/SIL/SILVerifier.cpp#L853).

Our short-term fix is to make sure we use RegularKind as the SILLocation::LocationKind value when creating new SIL instructions (Fixed a SILVerifier failure, where a synthesized host inst for switch_case based control flow handling has its SILLocation set to ReturnKind in the LocationKind field by mhong · Pull Request #20141 · apple/swift · GitHub), but we wonder whether the fix could be incomplete and/or cause other issues (e.g. the fix includes forking SILLocation::setLocationKind(), and we don't understand why that function does a bitwise OR instead of overwriting the original location kind): Fixed a SILVerifier failure, where a synthesized host inst for switch_case based control flow handling has its SILLocation set to ReturnKind in the LocationKind field by mhong · Pull Request #20141 · apple/swift · GitHub).

We are also in general curious of what LocationKind does. Any thoughts and suggestions? Thanks.

hongm · November 1, 2018, 8:03pm

@Adrian_Prantl @Frederic_Riss, do you have any insight here or know who I should ask? Thanks!

Adrian_Prantl · November 1, 2018, 8:08pm

The SILLocation kind is used when generating SIL-level diagnostics. When lowering SILLocations into LLVM DILocations It also determines whether the beginning or the end of the source range of the location is used.

The main question you need to ask yourself is: Is the synthesized code expanding an existing SIL instruction? Is there any relation to source code that a user wrote? If the answer to both is no, you should be using a compiler generated location instead. Otherwise reusing the SILLocation of the instruction you are expanding is fine.

hongm · November 1, 2018, 8:32pm

Thanks for the context.

We are not expanding an existing SIL instruction, but we'd like to reuse the source location (file, line number) of an existing instruction, because the synthesized code is related to the code that user wrote.

One option is to create a new location via:

// let `loc` be an existing SIL location
SILLocation newLoc(loc.getSourceLoc(), RegularKind, loc.getOpaqueKind());

Does this look reasonable?

Adrian_Prantl · November 1, 2018, 8:34pm

Why aren't you just using SILBuilderWithLocation() ?

Adrian_Prantl · November 1, 2018, 8:38pm

because the synthesized code is related to the code that user wrote.

The synthesized code is related to a return statement? It still sounds like you are picking the wrong instruction to inherit the location from.

hongm · November 1, 2018, 8:45pm

Why aren't you just using SILBuilderWithLocation() ?

Where can I find SILBuilderWithLocation? The only reference I found is SILBuilder ergonomics for improved debug scope handling.

The synthesized code is related to a return statement?

The test case is:

@inline(never)
func foo() -> Int32? {
  return 0
}

func getLogpLex() {
  if let _ = foo() {
    _ = Tensor<Float>(1.0)
    return
  } else {
    // For the SILLocation associated with this return stmt/inst,
    // The LocationKind is ReturnKind.
    //
    // When we synthesize host code to send case id #1 to the accelerator, make
    // sure the host inst will NOT use ReturnKind as the LocationKind.
    return
  }
}

public func b118507040() {
  getLogpLex()
  _ = Tensor<Float>(1.0)
}

We first inline getLogpLex() into b118507040(), and then synthesize some code for b118507040(), at the location of the second return (the else body) of getLogpLex(). So after inlining, that SIL instruction is not a return inst any more, but its SILLocation::LocationKind is still set to ReturnKind.

I wonder if the inliner should change the LocationKind from ReturnKind to RegularKind in that case?

Adrian_Prantl · November 1, 2018, 9:54pm

Sorry, I conflated SILBuilderWithScope and DebugLocOverrideRAII. While convenient, neither actually helps with your problem.

Looking at your example, I'm still not convinced that using the return statement's location is the right thing to do; compiler-generated seems more appropriate, since the user didn't write any code to deal with the accelerator. But if you believe that this is indeed the location the user want's to see / break on, then manually creating a SILLocation that is a copy of the return location with normal location kind is how you would implement that.

hongm · November 1, 2018, 9:59pm

Thanks! Agree with your assessment. So we'll go with something like this:

// let `loc` be an existing SIL location, we create a new SILLocation but set LocationKind to regular kind
SILLocation newLoc(loc.getSourceLoc(), RegularKind, loc.getOpaqueKind());