[Question] Absolute paths in executable .swiftmodule files

Hello Swift-ers,

My name is Pepper Lebeck-Jobe, and I'm working on adding Swift support to
the Gradle Build Tool <https://gradle.org>. One feature that Gradle has
recently added is support for a Build Cache
<https://docs.gradle.org/current/userguide/build_cache.html>. This feature
enables fine-grained work avoidance by snapshotting/fingerprinting/hashing
all of the inputs of a particular task (unit of work) and writing the
output of the task to a cache. This cache can be shared:

   1. Between build invocations in a single workspace (think git working
   directory.)
   2. Between build invocations on the same machine run by the same user,
   even across multiple workspaces.
   3. Between build invocations across a whole development team (if a
   remote/centralized implementation of the build cache is deployed.)

However, for the sharing of task outputs to be useful, it must be possible
to reuse the outputs in any build which is running with the same exact
inputs.
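
To make that requirement concrete, here is a minimal sketch of how a content-based cache key might be derived. It is not Gradle's actual implementation: the names `cache_key` and `make_workspace` and the choice of SHA-256 are assumptions made for illustration. The key depends only on workspace-relative paths, file contents, and arguments, so two checkouts in different directories produce the same key. That is why an output which embeds an absolute path is risky to share.

```python
import hashlib
import os
import tempfile


def cache_key(workspace, relative_inputs, compiler_args):
    """Hash workspace-relative input paths, file contents, and arguments.

    Hypothetical helper: the absolute workspace location is deliberately
    left out of the hash so that the key is the same in every checkout.
    """
    digest = hashlib.sha256()
    for arg in compiler_args:
        digest.update(arg.encode("utf-8") + b"\0")
    for rel_path in sorted(relative_inputs):
        digest.update(rel_path.encode("utf-8") + b"\0")
        with open(os.path.join(workspace, rel_path), "rb") as source:
            digest.update(source.read())
        digest.update(b"\0")
    return digest.hexdigest()


def make_workspace(contents):
    """Create a throwaway workspace containing a single main.swift."""
    workspace = tempfile.mkdtemp()
    with open(os.path.join(workspace, "main.swift"), "w") as source:
        source.write(contents)
    return workspace


first = make_workspace('print("hello")\n')
second = make_workspace('print("hello")\n')
# Different directories, identical inputs: the two keys are equal.
print(cache_key(first, ["main.swift"], ["-Onone"]) ==
      cache_key(second, ["main.swift"], ["-Onone"]))
```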

This is where we are running into a little bit of trouble with our current
implementation of Swift support. We have noticed that when we build Swift
executables, the .swiftmodule file corresponding to the executable contains
some strings which are absolute paths. Using llvm-bcanalyzer -dump we were
able to see that these absolute paths are of two general types:

   1. The XCC options. Namely, the argument to `-working-directory`
   2. The SEARCH_PATH used for finding modules.

My limited understanding of these absolute paths in the .swiftmodule files
of executables is that they are used when debugging the executable. The
problem is, if we cache the .swiftmodule file and try to use it when
compiling the exact same executable on a different machine or even in a
different workspace on the same machine, we may break the end-user's
ability to debug the resulting executable. Note: We haven't actually tried
this yet to see whether it breaks. Let us know if our concerns are unfounded.
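
One way a build tool could guard against this is to check an output for the workspace's absolute path before storing it in the cache. The sketch below assumes the path is stored in the file as plain UTF-8 bytes, and `embeds_path` is a hypothetical helper, not a Gradle or swiftc API. The artifact here is a fake file written for the demonstration; a real .swiftmodule would be checked the same way.

```python
import os
import tempfile


def embeds_path(artifact_path, directory):
    """Return True if the artifact contains the directory's absolute path as raw bytes."""
    needle = os.path.abspath(directory).encode("utf-8")
    with open(artifact_path, "rb") as artifact:
        return needle in artifact.read()


# Demonstration with a fake artifact standing in for a real .swiftmodule.
workspace = tempfile.mkdtemp()
fake_module = os.path.join(workspace, "Conductor.swiftmodule")
with open(fake_module, "wb") as out:
    out.write(b"\x00header\x00" + os.path.abspath(workspace).encode("utf-8") + b"\x00trailer")

print(embeds_path(fake_module, workspace))
```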

Questions:

   - Is there a way to tell swiftc to use relative paths instead of
   absolute?
      - If not, would a pull request adding such an option be welcomed?
   - If we reuse the output (complete with incorrect absolute paths), what
   would the failure mode be? That is, which development-time or runtime use
   cases would not work?

Thanks,
Pepper Lebeck-Jobe

Further details:

Code which leads me to believe I should be able to use an empty
-working-directory argument in combination with a relative -I
<relative-search-path> argument to get the paths to be relative (although
it doesn't work):

https://github.com/apple/swift/blob/6607ff73ce7a039fe76e45e5870f21ce6b524f60/lib/Frontend/CompilerInvocation.cpp#L1178-L1179

Abbreviated output from llvm-bcanalyzer -dump

<CONTROL_BLOCK NumWords=61 BlockCodeSize=3>
    <MODULE_NAME abbrevid=4/> blob data = 'Conductor'
    <METADATA abbrevid=5 op0=0 op1=363 op2=3 op3=3/> blob data = '4.1(4.1)/Swift version 4.1-dev (LLVM f06eb26858, Clang 65adc04c91, Swift 8c65f6e785)'
    <TARGET abbrevid=6/> blob data = 'x86_64-unknown-linux'
    <OPTIONS_BLOCK NumWords=22 BlockCodeSize=3>
      <IS_SIB abbrevid=4 op0=0/>
      <IS_TESTABLE abbrevid=5/>
      <SDK_PATH abbrevid=6/> blob data = '/'
      <XCC abbrevid=7/> blob data = '-working-directory'
      <XCC abbrevid=7/> blob data = '/home/pepper/dev/gh/eljobe/Conductor'
    </OPTIONS_BLOCK>
  </CONTROL_BLOCK>
  <INPUT_BLOCK NumWords=45 BlockCodeSize=4>
    <SEARCH_PATH abbrevid=9 op0=0 op1=0/> blob data = '/home/pepper/dev/github.com/eljobe/Conductor/.build/x86_64-unknown-linux/debug'
    <IMPORTED_MODULE abbrevid=4 op0=0 op1=0/> blob data = 'Orchestra'
    <IMPORTED_MODULE abbrevid=4 op0=0 op1=0/> blob data = 'Swift'
    <IMPORTED_MODULE abbrevid=4 op0=0 op1=0/> blob data = 'SwiftOnoneSupport'
    <IMPORTED_MODULE abbrevid=4 op0=0 op1=0/> blob data = 'Violin'
  </INPUT_BLOCK>
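
The path-bearing records can be pulled out of a textual dump like this one mechanically. The following is a rough sketch rather than an official tool: it assumes the `llvm-bcanalyzer -dump` text has already been captured in the shape shown above, and the helper name `absolute_path_blobs` is made up for this example.

```python
import re

# Matches records such as:
#   <SEARCH_PATH abbrevid=9 op0=0 op1=0/> blob data = '...'
RECORD = re.compile(
    r"<(?P<name>[A-Z_]+)\b[^>]*/>\s*blob data\s*=\s*'(?P<blob>[^']*)'"
)


def absolute_path_blobs(dump_text):
    """Return (record name, blob) pairs whose blob looks like an absolute path."""
    return [
        (match.group("name"), match.group("blob"))
        for match in RECORD.finditer(dump_text)
        if match.group("blob").startswith("/")
    ]


SAMPLE = """
<MODULE_NAME abbrevid=4/> blob data = 'Conductor'
<IS_SIB abbrevid=4 op0=0/>
<SDK_PATH abbrevid=6/> blob data = '/'
<XCC abbrevid=7/> blob data = '-working-directory'
<XCC abbrevid=7/> blob data = '/home/pepper/dev/gh/eljobe/Conductor'
<IMPORTED_MODULE abbrevid=4 op0=0 op1=0/> blob data = 'Swift'
"""

for name, blob in absolute_path_blobs(SAMPLE):
    print(name, blob)
```

Note that the simple "starts with a slash" test also reports the SDK_PATH of '/', so a real check would need to decide which records actually matter for relocatability.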

Thanks for bringing this up, Pepper!

We have many of the same issues with our Swift support in Bazel
<https://github.com/bazelbuild/bazel>; since .swiftmodule files aren't
hermetic, we can't reliably cache them. The proliferation of import paths
in the modules is another major issue. As part of our recent work to start
building Bazel support for Swift protocol buffers
<https://github.com/apple/swift-protobuf>, we end up generating a
potentially large tree of modules, and having the transitive closure of a
module's dependencies' import paths in the .swiftmodule files causes those
files to grow significantly.

I've pointed another one of our engineers to this thread so that he can
provide more detail.

(I'm adding swift-dev@ onto this thread because I think it might be more
appropriate there and you might be more likely to get better answers; it's
not really a language evolution topic AFAICT.)

On Thu, Sep 14, 2017 at 1:48 AM Pepper Lebeck-Jobe via swift-evolution <swift-evolution@swift.org> wrote:

> Hello Swift-ers,
> [...]

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Tony,

Thanks for redirecting this conversation to the swift-dev mailing list. I
wasn't 100% sure I was pointing in the right direction. I'm moving
swift-evolution to the BCC (that way we won't keep bothering them.)

Thanks,
Pepper

On Thu, Sep 14, 2017 at 6:13 PM Tony Allevato <tony.allevato@gmail.com> wrote:

> Thanks for bringing this up, Pepper!
> [...]

Yep, Bazel has the same concerns with both debugging and caching cases.

I don't think there's any option to change this today; there are places in
the code that assume or construct absolute paths. I have filed
https://bugs.swift.org/browse/SR-5694 to track this.

In my experience, if your output locations do not match the ones stored in
the module, LLDB won't be able to load the modules you reference.
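
A build tool restoring a cached module could at least detect this situation up front. The following is only a sketch: it assumes the recorded search paths have already been extracted from the module (for example from `llvm-bcanalyzer -dump` output), and `stale_paths` is a hypothetical name. Whether a missing directory actually stops LLDB from loading a given module is not something this check can confirm.

```python
import os
import tempfile


def stale_paths(recorded_paths):
    """Return the recorded paths that are not directories on this machine."""
    return [path for path in recorded_paths if not os.path.isdir(path)]


# One directory that exists and one that does not.
existing = tempfile.mkdtemp()
recorded = [existing, os.path.join(existing, "no-such-build-dir")]
print(stale_paths(recorded))
```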

On Thu, Sep 14, 2017 at 10:15 AM Pepper Lebeck-Jobe via swift-evolution <swift-evolution@swift.org> wrote:

> Tony,
>
> Thanks for redirecting this conversation to the swift-dev mailing list.
> [...]