Metadata Representation

Hello,

The current layout of the Swift metadata for structure types, as emitted,
seems to be unrepresentable in PE/COFF (at least for x86_64). A partial
listing of the generated code follows the message for reference.

When building the standard library, LLVM encounters a relocation which
cannot be represented. Tracking down the relocation led to the type
metadata for SwiftNSOperatingSystemVersion. The metadata here is
_T0SC30_SwiftNSOperatingSystemVersionVN. At +32 bytes we find the Kind
(1), so this is struct metadata. Thus at Offset 1 (+40 bytes) we have the
nominal type descriptor reference. This is the relocation which we fail to
represent correctly. If I'm not mistaken, the field is supposed to be a
relative offset to the nominal type descriptor. However, the nominal type
descriptor is currently emitted in a different section (.rdata) from the
type metadata (.data). This cross-section relocation cannot be represented
in the file format.

My understanding is that the type metadata will be adjusted at load time
for the field offsets. Furthermore, my guess is that the relative offset
is used to encode the location in order to avoid a relocation against the
load address base. On Windows, base relocations are a given, so I'm not
sure whether there is a better approach to take. Two solutions immediately
spring to mind: move the nominal type descriptor into the (RW) data
segment, or adjust the ABI to use an absolute relocation which would be
rebased. Since the type metadata may be adjusted, we cannot emit it into
the RO data segment. Is there another solution that I am overlooking which
may be simpler or better?

Thanks!

        .section        .rdata,"dr"
...
        .globl  _T0SC30_SwiftNSOperatingSystemVersionVMn # @_T0SC30_SwiftNSOperatingSystemVersionVMn
        .p2align        3
_T0SC30_SwiftNSOperatingSystemVersionVMn:
        .long   .L__unnamed_1056-_T0SC30_SwiftNSOperatingSystemVersionVMn
        .long   3                       # 0x3
        .long   3                       # 0x3
        .long   (.L__unnamed_1057-_T0SC30_SwiftNSOperatingSystemVersionVMn)-12
        .long   (.Lget_field_types__SwiftNSOperatingSystemVersion-_T0SC30_SwiftNSOperatingSystemVersionVMn)-16
        .long   1                       # 0x1
        .long   0                       # 0x0
        .long   6                       # 0x6
        .long   0                       # 0x0
        .long   0                       # 0x0
        .long   0                       # 0x0

        .data
        .globl  _T0SC30_SwiftNSOperatingSystemVersionVN # @_T0SC30_SwiftNSOperatingSystemVersionVN
        .p2align        3
_T0SC30_SwiftNSOperatingSystemVersionVN:
        .quad   .L__unnamed_1056
        .quad   0
        .quad   0                       # 0x0
        .quad   _T0SC30_SwiftNSOperatingSystemVersionVWV
        .quad   1                       # 0x1
        .quad   (_T0SC30_SwiftNSOperatingSystemVersionVMn-_T0SC30_SwiftNSOperatingSystemVersionVN)-40
        .quad   0
        .quad   0                       # 0x0
        .quad   8                       # 0x8
        .quad   16                      # 0x10
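
For readers who want to check the arithmetic in the listing, here is a minimal sketch of how the word at +40 would be encoded and resolved. It only models the Kind word and the descriptor reference. The addresses and the helper names (encode_metadata, resolve_descriptor) are made up for illustration and are not part of the Swift runtime.

```python
import struct

KIND_OFFSET = 32         # "+32 bytes" in the message: the Kind word
DESCRIPTOR_OFFSET = 40   # "+40 bytes": the nominal type descriptor reference


def encode_metadata(metadata_addr, descriptor_addr):
    """Pack the first six 8-byte words of the struct metadata.

    The four leading words are zeroed placeholders; only Kind and the
    relative descriptor reference are modelled.
    """
    relative = (descriptor_addr - metadata_addr) - DESCRIPTOR_OFFSET
    return struct.pack("<4QQq", 0, 0, 0, 0, 1, relative)


def resolve_descriptor(blob, metadata_addr):
    """Return (kind, descriptor address) recovered from the packed words."""
    (kind,) = struct.unpack_from("<Q", blob, KIND_OFFSET)
    (relative,) = struct.unpack_from("<q", blob, DESCRIPTOR_OFFSET)
    # The offset is relative to the address of the field that stores it.
    return kind, metadata_addr + DESCRIPTOR_OFFSET + relative


# Hypothetical addresses: metadata in .data, descriptor in .rdata.
blob = encode_metadata(0x5000, 0x2000)
kind, descriptor = resolve_descriptor(blob, 0x5000)
print(kind, hex(descriptor))  # prints "1 0x2000"
```

The stored value depends only on the distance between the two symbols, which is why the assembler needs a relocation that can express a difference across sections.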
···

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.

-Joe
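
As a rough illustration of the two benefits mentioned above (size, and no fixup when the image is slid or rebased), the sketch below encodes the same reference both ways. The base addresses, RVAs, and function names are hypothetical and chosen only for the example.

```python
import struct


def absolute_reference(image_base, target_rva):
    """8-byte absolute pointer; its value depends on where the image loads."""
    return struct.pack("<Q", image_base + target_rva)


def relative_reference(field_rva, target_rva):
    """4-byte signed offset from the field to its target."""
    return struct.pack("<i", target_rva - field_rva)


def resolve_relative(image_base, field_rva, encoded):
    """Recover the target's runtime address from a relative reference."""
    (offset,) = struct.unpack("<i", encoded)
    return image_base + field_rva + offset


PREFERRED_BASE = 0x140000000      # hypothetical link-time base
REBASED = 0x7FF600000000          # hypothetical base chosen by the loader
FIELD_RVA, TARGET_RVA = 0x5028, 0x2000

rel = relative_reference(FIELD_RVA, TARGET_RVA)
print(len(rel), len(absolute_reference(PREFERRED_BASE, TARGET_RVA)))  # 4 8
print(hex(resolve_relative(REBASED, FIELD_RVA, rel)))  # 0x7ff600002000
```

The relative bytes are identical for every load address, while the absolute pointer has to be rewritten by the loader whenever the base changes.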

···

On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

Hello,

The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.

When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.

My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?

Yeah, I tracked down the relativePointer thing. There is a nice subtle
little warning that it is not fully portable :-). Would you happen to have
a pointer to where the adjustment for the absolute pointers on WoA is?

You are correct that the image should be contiguously mapped on Windows.
The idea of MachO as an intermediary is rather intriguing. Thinking
longer term, maybe we want to use that as a global solution? It would also
provide a nicer autolinking mechanism for ELF, which is the one target which
currently is missing this functionality. However, if I'm not mistaken, this
would require a MachO linker (and the only currently viable MachO linker
would be ld64). The MachO binary would then need to be converted into ELF
or COFF. This seems like it could take a while to implement, but it
would not really break ABI, so pushing that off to later may be wise.

I really hope that we can get the Windows build to the point where we can
actually have that be built regularly, as it seems that there is still
insufficient test coverage.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?

-Joe

Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is
translated later to PE/COFF as required by the UEFI specification.

There are only two relocation types which can be used for relative
displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section
relative (IMAGE_REL_*_SECREL), which is relative to the beginning of the
section. The latter is why I mentioned that moving them into the same
section could be a solution, as that would allow the relative distance to
be encoded. Unfortunately, the section relative relocation is relative to
the section within which the symbol resides.
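
To make the difference between the two relocation kinds concrete, here is a small sketch that models what each one would store for a symbol. The section RVAs, symbol offsets, and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    section: str   # name of the section containing the symbol
    offset: int    # offset of the symbol from the start of that section


# Hypothetical layout of the linked image (RVA of each section).
SECTION_RVA = {".rdata": 0x2000, ".data": 0x5000}


def addr32nb(sym):
    """Value an IMAGE_REL_*_ADDR32NB relocation yields: offset from __ImageBase."""
    return SECTION_RVA[sym.section] + sym.offset


def secrel(sym):
    """Value an IMAGE_REL_*_SECREL relocation yields: offset within the symbol's own section."""
    return sym.offset


descriptor = Symbol(".rdata", 0x120)
metadata_field = Symbol(".data", 0x48)
other_data = Symbol(".data", 0x100)

# Across sections, only the image-base-relative values give the real distance.
print(hex(addr32nb(descriptor) - addr32nb(metadata_field)))  # -0x2f28
print(hex(secrel(descriptor) - secrel(metadata_field)))      # 0xd8
```

Within one section the two differences agree, which is the reason moving both symbols into the same section would allow the distance to be encoded.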

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

Hm, well, image-base-relative would at least be relatively easy to interpret by out-of-process tools looking at the image on disk, and could conceivably still work in-process too, though you'd have to call GetModuleHandleEx to derive the image base for an arbitrary metadata structure mapped into memory. I wonder how bad that'd be in practice, since the relative-addressed things are generally not intended to be on any fast paths for compiler-generated code.

-Joe
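
A sketch of what resolving an image-base-relative reference in-process might involve follows. It is a simulation only: a hypothetical table of loaded modules stands in for the call to GetModuleHandleEx (presumably with GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS), and all addresses and names are invented.

```python
# Hypothetical (base, size) pairs for the images mapped into the process.
LOADED_MODULES = [
    (0x7FF600000000, 0x20000),
    (0x7FF700000000, 0x8000),
]


def find_image_base(address, modules=LOADED_MODULES):
    """Stand-in for an address-to-module lookup: return the containing image's base."""
    for base, size in modules:
        if base <= address < base + size:
            return base
    raise LookupError(f"no loaded image contains {address:#x}")


def resolve_image_relative(field_address, stored_rva, modules=LOADED_MODULES):
    """Resolve a reference stored as an offset from the image base (an RVA)."""
    return find_image_base(field_address, modules) + stored_rva


# A metadata field mapped inside the first image, holding the RVA 0x2000.
print(hex(resolve_image_relative(0x7FF600005028, 0x2000)))  # 0x7ff600002000
```

The extra cost compared with a position-relative offset is the lookup of the base; an out-of-process tool reading the file on disk would not need it, since the stored value is already an offset into the image.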

What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.

John.
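
The adjustment described above can be sketched as follows, assuming a 4-byte relative pointer and that IMAGE_REL_AMD64_REL32 measures from the byte following the relocated field. The addresses and function names are hypothetical.

```python
POINTER_SIZE = 4  # size in bytes of the 32-bit relative pointer field


def encode_start_relative(field_addr, target_addr):
    """Offset measured from the start of the field."""
    return target_addr - field_addr


def encode_rel32(field_addr, target_addr):
    """Offset measured from the end of the field, as a REL32-style relocation would compute it."""
    return target_addr - (field_addr + POINTER_SIZE)


def resolve_rel32(field_addr, stored):
    """Reader-side logic once offsets are end-relative."""
    return field_addr + POINTER_SIZE + stored


field, target = 0x5028, 0x2000
print(encode_start_relative(field, target) - encode_rel32(field, target))  # 4
print(hex(resolve_rel32(field, encode_rel32(field, target))))  # 0x2000
```

The two encodings differ by a constant equal to the field size, so the change would be confined to the relative-pointer encode and decode logic.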

IIRC, this came up when someone was trying to port Swift to Windows on
ARM as well, and they were able to conditionalize the code so that we used
absolute pointers on Windows/ARM, and we may have to do the same on Windows
in general. It may be somewhat more complicated on Win64 since we generally
assume that relative references can be 32-bit, whereas an absolute
reference will be 64-bit, so some formats may have to change layout to make
this work too. I believe Windows' executable loader still ultimately maps
the final PE image contiguously, so alternatively, you could conceivably
build a Swift toolchain that used ELF or Mach-O or some other format with
better support for PIC as the intermediate object format and still linked a
final PE executable. Using relative references should still be a win on
Windows both because of the size benefit of being 32-bit and the fact that
they don't need to be slid when running under ASLR or when a DLL needs to
be rebased.

Yeah, I tracked down the relativePointer thing. There is a nice subtle
little warning that it is not fully portable :-). Would you happen to have
a pointer to where the adjustment for the absolute pointers on WoA is?

You are correct that the image should be contiguously mapped on Windows.
The idea of MachO as an intermediary is rather intriguing. Thinking
longer term, maybe we want to use that as a global solution? It would also
provide a nicer autolinking mechanism for ELF, which is the one target that
is currently missing this functionality. However, if I'm not mistaken, this
would require a MachO linker (and the only currently viable MachO linker
would be ld64). The MachO binary would then need to be converted into ELF
or COFF. This seems like it could take a while to implement, but it
would not really break ABI, so pushing that off to later may be wise.

Intriguingly, LLVM does support `*-*-win32-macho` as a target triple
already, though I don't know what Mach-O to PE linker (if any) that's
intended to be used with. We implemented relative references using
current-position-relative offsets for Darwin and Linux both because that
still allows for a fairly convenient pointer-like C++ API for working with
relative offsets, and because the established toolchains on those platforms
already have to support PIC so had most of the relocations we needed to
make them work already; is there another base we could use for relative
offsets on Windows that would fit in the set of relocations supported by
standard COFF linkers?

Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary
is translated later to PE/COFF as required by the UEFI specification.

There are only two relocation types which can be used for relative
displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section
relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the
section. The latter is why I mentioned that moving them into the same
section could be a solution as that would allow the relative distance to be
encoded. Unfortunately, the section relative relocation is relative to the
section within which the symbol is.
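
To make the bases concrete, here is a minimal sketch in Python. The load address, section RVAs, and symbol addresses are invented values, and the helper names are hypothetical rather than taken from any real linker. It computes what each relocation kind would store for the same target and compares that with the self-relative distance the metadata field needs.

```python
# All addresses below are invented for illustration.
IMAGE_BASE = 0x180000000                           # hypothetical load address
SECTION_RVA = {".rdata": 0x1000, ".data": 0x3000}  # hypothetical section starts


def addr32nb(target_va):
    """Value stored by an image-base-relative relocation: the target's RVA."""
    return target_va - IMAGE_BASE


def secrel(target_va, target_section):
    """Value stored by a section-relative relocation: the offset of the
    target from the start of the section that contains the target."""
    return target_va - (IMAGE_BASE + SECTION_RVA[target_section])


descriptor_va = IMAGE_BASE + 0x1040   # nominal type descriptor, in .rdata
field_va = IMAGE_BASE + 0x3028        # referencing metadata field, in .data

# What a self-relative reference needs: target minus the field's own address.
self_relative = descriptor_va - field_va
```

Neither `addr32nb(descriptor_va)` nor `secrel(descriptor_va, ".rdata")` equals `self_relative`, because neither base involves the address of the referencing field.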

What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the
relative-pointer logic to store an offset from the end of the relative
pointer instead of the beginning, but it doesn't seem to have a section
requirement.

Hmm, is it possible to use RIP relative addressing in data? If so, yes,
that could work.
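
A minimal sketch of the change of base discussed here, assuming a 4-byte little-endian field. IMAGE_REL_AMD64_REL32 measures the displacement from the byte following the relocated field, so the reader has to add the field size back in. The function names and the flat byte-buffer "image" are hypothetical and are not the Swift runtime's API.

```python
import struct

FIELD_SIZE = 4  # a relative reference is stored as a signed 32-bit value


def encode_from_start(field_addr, target_addr):
    """Offset measured from the first byte of the field."""
    return target_addr - field_addr


def encode_from_end(field_addr, target_addr):
    """Offset measured from the byte just past the field, as REL32 does."""
    return target_addr - (field_addr + FIELD_SIZE)


def resolve_from_end(image, field_addr):
    """Read the signed 32-bit field and add it to the address just past it."""
    (offset,) = struct.unpack_from("<i", image, field_addr)
    return field_addr + FIELD_SIZE + offset


# Hypothetical flat image: a descriptor at 0x10 and a referencing field at 0x40.
image = bytearray(0x80)
descriptor_addr = 0x10
field_addr = 0x40
struct.pack_into("<i", image, field_addr,
                 encode_from_end(field_addr, descriptor_addr))
resolved = resolve_from_end(image, field_addr)
```

The two encodings differ by exactly `FIELD_SIZE`, so the only change on the reading side is the constant added to the field address.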

···

On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

John.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.

John.

···

On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:

At least MC doesn't seem to like it. Something like this, for example:

  .data
data:
  .long 0

  .section .rodata
rodata:
  .quad data(%rip)

This bails out due to the unexpected modifier. Now, theoretically, we could
support that modifier, but it does seem pretty odd.

Now, as it so happens, both PE and PE+ limit the file size to 4GiB. This
means that the relative difference is guaranteed to fit within 32 bits.
This is where things get really interesting!

We cannot generate the relocation because we are emitting the values at
pointer width. However, the value that we are emitting is a relative
offset, which we just determined to be limited to 32 bits in width. The
thing is, IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the
cross-sectionness, as you pointed out. So, rather than emitting a
pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow
that with the relative displacement (`.long <expr>`). This would be
representable in the PE/COFF model.

If I understand the layout correctly, the type metadata fields are supposed
to be pointer sized. I assume that we would like to maintain that across
the formats. It may be possible to alter the relative pointer emission to
emit a pair of longs instead for PE/COFF with a 64-bit pointer value.
Basically, we cannot truncate the relocation to an
IMAGE_REL_AMD64_REL32, but we could generate the appropriate relocation and
pad to the desired width.
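
A rough sketch of the pad-plus-displacement slot described above, modelled in Python on a flat byte buffer. It assumes a little-endian 8-byte slot with the pad first and the 32-bit displacement second, measured REL32-style from the byte after the displacement field. `emit_slot` and `resolve_slot` are hypothetical helpers, not compiler or runtime functions.

```python
import struct

PAD_SIZE = 4    # the leading `.long 0`
DISP_SIZE = 4   # the trailing `.long <expr>`


def emit_slot(image, slot_addr, target_addr):
    """Write a zero pad followed by a signed 32-bit displacement."""
    disp_addr = slot_addr + PAD_SIZE
    disp = target_addr - (disp_addr + DISP_SIZE)
    struct.pack_into("<Ii", image, slot_addr, 0, disp)


def resolve_slot(image, slot_addr):
    """Read only the 32-bit displacement and rebuild the target address."""
    disp_addr = slot_addr + PAD_SIZE
    (disp,) = struct.unpack_from("<i", image, disp_addr)
    return disp_addr + DISP_SIZE + disp


image = bytearray(0x100)
emit_slot(image, 0x40, 0x10)   # backward reference
emit_slot(image, 0x48, 0xC0)   # forward reference
```

The slot keeps its pointer-sized footprint (`PAD_SIZE + DISP_SIZE` is 8), while the reader touches only the 4-byte displacement.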

Are there any pitfalls that I should be aware of in trying to adjust the
emission to do this? The only downside that I can see is that the
emission would need to be target dependent (that is, check the output object
format and the target pointer width).

Thanks for the hint John! It seems that was spot on :-).

···

On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall@apple.com> wrote:

John.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.

In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.
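
For comparison, rebasing an absolute 64-bit pointer amounts to adding one delta per fixup. The sketch below assumes little-endian 64-bit slots and a hypothetical list of fixup offsets; it illustrates the idea only and is not the Windows loader's actual code.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def rebase(image, fixup_offsets, preferred_base, actual_base):
    """Add the load delta to every absolute 64-bit pointer listed as a fixup."""
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<Q", image, offset)
        struct.pack_into("<Q", image, offset, (value + delta) & MASK64)


# Hypothetical image with one absolute pointer stored at offset 8.
image = bytearray(16)
struct.pack_into("<Q", image, 8, 0x140001000)
rebase(image, [8], preferred_base=0x140000000, actual_base=0x7FF600000000)
(rebased,) = struct.unpack_from("<Q", image, 8)
```

No symbol lookup is involved; the pointed-to RVA (`0x1000` here) is unchanged and only the base moves.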

John.

···

On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:

Beware of sign extension.

If you have a 32-bit signed offset, and you pad it to 64 bits by prepending a 32-bit zero, the result is not a 64-bit signed offset. Negative offsets would become large positive offsets. You would also have to adjust all code that reads these offsets. Such code must read only 32 bits and sign-extend if necessary to get 64 bits.
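
The pitfall can be reproduced with a few lines of Python. This sketch models the numeric zero-extension described above (offset in the low 32 bits, zero in the high 32 bits of a little-endian 64-bit value); the offset value and the helper name `sign_extend_32` are illustrative assumptions.

```python
import struct


def sign_extend_32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return (low ^ 0x80000000) - 0x80000000


offset = -48  # a backward reference: the target sits below the field

# The 32-bit offset padded to 64 bits with a 32-bit zero in the upper half.
padded = struct.pack("<iI", offset, 0)

# Reading all 64 bits as one signed integer loses the sign.
(read_as_int64,) = struct.unpack("<q", padded)

# Reading only 32 bits and sign-extending recovers the offset.
(read_as_int32,) = struct.unpack_from("<i", padded, 0)
recovered = sign_extend_32(read_as_int64)
```

Here `read_as_int64` comes out as 4294967248, a large positive number, while `read_as_int32` and `recovered` are both -48.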

···

On Sep 22, 2017, at 5:39 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

--
Greg Parker <gparker@apple.com>, Runtime Wrangler

Hello,

The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.

When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.

My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?

IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.

Yeah, I tracked down the relativePointer thing. There is a nice subtle little warning that it is not fully portable :-). Would you happen to have a pointer to where the adjustment for the absolute pointers on WoA is?

You are correct that the image should be contiguously mapped on Windows. The idea of MachO as an intermediary is rather intriguing. Thinking longer term, maybe we want to use that as a global solution? It would also provide a nicer autolinking mechanism for ELF, which is the one target that is currently missing this functionality. However, if I'm not mistaken, this would require a MachO linker (and the only current viable MachO linker would be ld64). The MachO binary would then need to be converted into ELF or COFF. This seems like it could take a while to implement, but it would not really break ABI, so pushing that off to later may be wise.

Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?
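For readers following along, the "current-position-relative offset with a pointer-like C++ API" idea can be sketched roughly as below. This is a minimal illustration, not the runtime's actual type: `RelativeRef`, `setTarget`, and `RelativeRefDemo` are hypothetical names, and `setTarget` only stands in for the difference that the assembler and linker would normally compute.

```cpp
#include <cstdint>

// Hypothetical stand-in for a self-relative reference (not the Swift
// runtime's real class). The field stores "target address minus the
// address of this field" as a signed 32-bit value.
template <typename T>
struct RelativeRef {
  std::int32_t offset;

  // Pointer-like accessor: add the stored offset to our own address.
  const T *get() const {
    auto base = reinterpret_cast<std::intptr_t>(this);
    return reinterpret_cast<const T *>(base + offset);
  }
};

// Demo helper: computes what the toolchain would normally fill in.
// Assumes the target lies within the signed 32-bit range of the field.
template <typename T>
void setTarget(RelativeRef<T> &ref, const T *target) {
  ref.offset = static_cast<std::int32_t>(
      reinterpret_cast<std::intptr_t>(target) -
      reinterpret_cast<std::intptr_t>(&ref));
}

// Demo layout: a reference immediately followed by the value it refers to.
struct RelativeRefDemo {
  RelativeRef<int> ref;
  int value;
};
```

Because the stored value is a difference of two addresses inside the same image, it does not change when the image is loaded at a different base, which is the property described above.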

Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is translated later to PE/COFF as required by the UEFI specification.

There are only two relocation types which can be used for relative displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the section. The latter is why I mentioned that moving them into the same section could be a solution as that would allow the relative distance to be encoded. Unfortunately, the section relative relocation is relative to the section within which the symbol is.
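The arithmetic behind those two relocation kinds can be sketched as follows. The function names and all addresses are made up for illustration; only the definitions (image-base-relative, and relative to the start of the target's own section) are taken from the description above.

```cpp
#include <cstdint>

// IMAGE_REL_*_ADDR32NB: the target's address with the image base
// subtracted (an RVA).
constexpr std::uint32_t addr32nb(std::uint64_t target,
                                 std::uint64_t imageBase) {
  return static_cast<std::uint32_t>(target - imageBase);
}

// IMAGE_REL_*_SECREL: the target's offset from the start of the section
// that contains the target.
constexpr std::uint32_t secrel(std::uint64_t target,
                               std::uint64_t targetSectionStart) {
  return static_cast<std::uint32_t>(target - targetSectionStart);
}

// What a self-relative reference actually wants: target minus the
// address of the referring field.
constexpr std::int64_t selfRelative(std::uint64_t target,
                                    std::uint64_t field) {
  return static_cast<std::int64_t>(target) -
         static_cast<std::int64_t>(field);
}
```

With a hypothetical image base of 0x140000000, a descriptor at 0x140002010 in a read-only section starting at 0x140002000, and a metadata field at 0x140003028 in a data section, `addr32nb` yields 0x2010 and `secrel` yields 0x10, while the desired self-relative value is -0x1018. Neither relocation produces that value directly when the field and the target live in different sections.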

What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.
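The adjustment mentioned here can be sketched with plain integer arithmetic. This assumes the PE/COFF definition of IMAGE_REL_AMD64_REL32 as a 32-bit value relative to the byte following the relocated field; `rel32` and `resolveRel32` are hypothetical helper names and the addresses are invented.

```cpp
#include <cstdint>

// Value a REL32 relocation would store in a 4-byte field located at
// `field`: the distance from the END of the field to the target.
constexpr std::int32_t rel32(std::uint64_t target, std::uint64_t field) {
  return static_cast<std::int32_t>(static_cast<std::int64_t>(target) -
                                   static_cast<std::int64_t>(field + 4));
}

// Resolving such a value therefore starts from the end of the field,
// not from its beginning as a self-relative offset normally would.
constexpr std::uint64_t resolveRel32(std::uint64_t field,
                                     std::int32_t stored) {
  return field + 4 +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(stored));
}
```

For a field at 0x140003028 and a target at 0x140002010, the stored value is -0x101C rather than the -0x1018 that a beginning-of-field offset would hold, so the reader of the field has to add 4 to compensate.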

Hmm, is it possible to use RIP relative addressing in data? If so, yes, that could work.

There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.

At least MC doesn't seem to like it. Something like this, for example:

  .data
data:
  .long 0

  .section .rodata
rodata:
  .quad data(%rip)

Bails out due to the unexpected modifier. Now, theoretically, we could support that modifier, but it does seem pretty odd.

Now, as it so happens, both PE and PE+ limit the file size to 4GiB. This means that the relative difference is guaranteed to fit within 32 bits. This is where things get really interesting!

We cannot generate the relocation because we are emitting the values at pointer width. However, the value that we are emitting is a relative offset, which we just determined to be limited to 32 bits in width. The thing is, IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the cross-sectionness, as you pointed out. So, rather than emitting a pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow that with the relative displacement (`.long <expr>`). This would be representable in the PE/COFF model.

If I understand the layout correctly, the type metadata fields are supposed to be pointer sized. I assume that we would like to maintain that across the formats. It may be possible to alter the relative pointer emission to emit a pair of longs instead for PE/COFF with a 64-bit pointer value. Basically, we cannot truncate the relocation to an IMAGE_REL_AMD64_REL32, but we could generate the appropriate relocation and pad to the desired width.
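A rough picture of the padded slot, as seen from the reading side, might look like this. It is only a sketch under stated assumptions: `PaddedRelativeSlot`, `resolveSlot`, and `PaddedSlotDemo` are hypothetical names, the field order follows the `.long 0` then `.long <expr>` description above, and the displacement is assumed to be measured from the end of the 32-bit field, as a REL32-style value would be.

```cpp
#include <cstdint>

// Hypothetical pointer-sized slot for 64-bit PE/COFF: a zero pad
// followed by a 32-bit displacement.
struct PaddedRelativeSlot {
  std::int32_t pad;           // emitted as `.long 0`
  std::int32_t displacement;  // emitted as `.long <expr>`
};

static_assert(sizeof(PaddedRelativeSlot) == sizeof(std::uint64_t),
              "the slot keeps the pointer-sized metadata layout");

// Assumption: the displacement is relative to the byte following the
// displacement field.
inline const void *resolveSlot(const PaddedRelativeSlot &slot) {
  auto end = reinterpret_cast<std::intptr_t>(&slot.displacement) +
             static_cast<std::intptr_t>(sizeof(slot.displacement));
  return reinterpret_cast<const void *>(end + slot.displacement);
}

// Demo layout: the target sits directly after the slot, so a
// displacement of zero refers to it.
struct PaddedSlotDemo {
  PaddedRelativeSlot slot;
  int target;
};
```

The point of the sketch is only that the slot still occupies eight bytes, so the offsets of the following metadata fields stay the same, while the relocation itself applies to a 32-bit field.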

Are there any pitfalls that I should be aware of when trying to adjust the emission to do this? The only downside that I can see is that the emission would need to be target dependent (that is, it would have to check the output object format and the target pointer width).

Thanks for the hint John! It seems that was spot on :-).

Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.

I'm sorry, my mind wandered. "...making that (the value witness table pointer) a relative pointer before we reach ABI stability."

John.

···

On Sep 25, 2017, at 1:30 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.

John.

At one point we discussed the possibility of also making the value witness table pointer relative, which would allow concrete value type metadata to be fully read-only, and since code invoking a value witness is almost certainly going to have the base type metadata pointer live, probably not an undue burden on code size. It's a fair question though whether we'll ever get around to that analysis, and I think the nominal type descriptor reference is the only place we statically emit a pointer-sized rather than 32-bit relative offset, which has caused problems for ports to other platforms that only support 32-bit relative offsets.

-Joe

···

On Sep 24, 2017, at 10:30 PM, John McCall <rjmccall@apple.com> wrote:


In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.

That seems at odds with having standard VWTs in the runtime, though, or having a single-element struct share a VWT with its lone member.

Jordan

···

On Sep 25, 2017, at 09:24, Joe Groff via swift-dev <swift-dev@swift.org> wrote:


Hello,

The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.

When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.

My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?

IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.

Yeah, I tracked down the relativePointer thing. There is a nice subtle little warning that it is not fully portable :-). Would you happen to have a pointer to where the adjustment for the absolute pointers on WoA is?

You are correct that the image should be contiugously mapped on Windows. The idea of MachO as an intermediatary is rather intriguing. Thinking longer term, maybe we want to use that as a global solution? It would also provide a nicer autolinking mechanism for ELF which is the one target which currently is missing this functionality. However, if Im not mistaken, this would require a MachO linker (and the only current viable MachO linker would be ld64). The MachO binary would then need to be converted into ELF or COFF. This seems like it could take a while to implement though, but would not really break ABI, so pushing that off to later may be wise.

Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?

Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is translated later to PE/COFF as required by the UEFI specification.

There are only two relocation types which can be used for relative displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the section. The latter is why I mentioned that moving them into the same section could be a solution as that would allow the relative distance to be encoded. Unfortunately, the section relative relocation is relative to the section within which the symbol is.

What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.

Hmm, is it possible to use RIP relative addressing in data? If so, yes, that could work.

There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.

At least MC doesnt seem to like it. Something like this for example:

 .data
data:
 .long 0

 .section .rodata
rodata:
 .quad data(%rip)

Bails out due to the unexpected modifier. Now, theoretically, we could support that modififer, but it does seem pretty odd.

Now, as it so happens, both PE and PE+ have limitations on the file size at 4GiB. This means that we are guaranteed that the relative difference is guaranteed to fit within 32-bits. This is where things get really interesting!

We cannot generate the relocation because we are emitting the values at pointer width. However, the value that we are emitting is a relative offset, which we just determined to be limited to 32-bits in width. The thing is, the IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the cross-setionness as you pointed out. So, rather than emitting a pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow that with the relative displacement (`.long <expr>`). This would be representable in the PE/COFF model.

If I understand the layout correctly, the type metadata fields are supposed to be pointer sized. I assume that we would like to maintain that across the formats. It may be possible to alter the emission to change the relative pointer emission to emit a pair of longs instead for PE/COFF with a 64-bit pointer value. Basically, we cannot truncate the relocation to a IMAGE_REL_AMD64_REL32 but we could generate the appropriate relocation and pad to the desired width.
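As a sketch of the pad-plus-displacement idea, the following Python model builds an 8-byte slot from a 32-bit pad followed by a REL32-style displacement, and reads the target back from the displacement half only. The helper names and addresses are hypothetical, and the pad-first ordering simply follows the description above. It is not taken from the actual emission code.

```python
import struct

PAD = struct.pack("<I", 0)  # models the `.long 0` pad
DISP_SIZE = 4               # models the `.long <expr>` displacement


def emit_slot(slot_addr, target_addr):
    """Emit a pointer-sized slot: a 32-bit pad followed by a 32-bit displacement."""
    disp_field_addr = slot_addr + len(PAD)
    # REL32-style: relative to the end of the 32-bit displacement field.
    disp = target_addr - (disp_field_addr + DISP_SIZE)
    # The thread argues that the 4 GiB image limit keeps this within 32 bits;
    # struct.pack raises struct.error if the value does not fit.
    return PAD + struct.pack("<i", disp)


def read_slot(slot_addr, slot):
    """Recover the target by reading only the displacement half of the slot."""
    (disp,) = struct.unpack_from("<i", slot, len(PAD))
    return slot_addr + len(PAD) + DISP_SIZE + disp


slot = emit_slot(0x5028, 0x2040)  # made-up addresses
print(len(slot), hex(read_slot(0x5028, slot)))
```

In this layout the slot keeps its pointer width, but a reader cannot treat it as a single 64-bit integer. It has to know which half holds the displacement.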

Are there any pitfalls that I should be aware of trying to adjust the emission to do this? The only downside that I can see is that the emission would need to be target dependent (that is, check the output object format and the target pointer width).

Thanks for the hint John! It seems that was spot on :-).

Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.

In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.

At one point we discussed the possibility of also making the value witness table pointer relative, which would allow concrete value type metadata to be fully read-only, and since code invoking a value witness is almost certainly going to have the base type metadata pointer live, probably not an undue burden on code size.

Yes, that's true. It would make the base of the load (metadata + loaded-offset + immediate-offset), which I think would require an extra instruction even on x86, but maybe that's not so bad.
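To make the extra address arithmetic concrete, here is a toy Python model of the two lookups. Everything in it is an assumption for illustration: memory is a dictionary, the value witness table reference is placed at the metadata address itself for simplicity, and the relative offset is taken relative to the metadata address.

```python
POINTER_SIZE = 8

# Toy memory: address -> 64-bit value. All addresses are hypothetical.
memory = {}

VWT_ADDR = 0x3000
for i in range(3):
    memory[VWT_ADDR + i * POINTER_SIZE] = 0x9000 + i  # stand-ins for witness addresses

ABS_METADATA = 0x5000
memory[ABS_METADATA] = VWT_ADDR                 # absolute pointer to the table

REL_METADATA = 0x6000
memory[REL_METADATA] = VWT_ADDR - REL_METADATA  # offset relative to the metadata address


def load_witness_absolute(metadata, index):
    table = memory[metadata]                     # load 1: table address
    return memory[table + index * POINTER_SIZE]  # load 2: table + immediate offset


def load_witness_relative(metadata, index):
    offset = memory[metadata]                    # load 1: relative offset
    # load 2: metadata + loaded offset + immediate offset (the three-part base)
    return memory[metadata + offset + index * POINTER_SIZE]


print(hex(load_witness_absolute(ABS_METADATA, 2)))
print(hex(load_witness_relative(REL_METADATA, 2)))
```

Both paths reach the same witness. The relative form just needs the metadata address added into the base of the second load, which is the extra step being discussed.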

On the other hand, yes, it would not be possible to refer to prebuilt vwtables from the runtime, and it would need to be a 64-bit relative offset in order to handle dynamic instantiation correctly, which as you say is problematic on some platforms.

John.

···

On Sep 25, 2017, at 12:24 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 24, 2017, at 10:30 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall@apple.com> wrote:

On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff@apple.com> wrote:

On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

It's a fair question though whether we'll ever get around to that analysis, and I think the nominal type descriptor reference is the only place we statically emit a pointer-sized rather than 32-bit relative offset, which has caused problems for ports to other platforms that only support 32-bit relative offsets.

-Joe

Yeah, that would be another tradeoff we'd have to make. We could theoretically still share common concrete value witness implementations even if we need to copy the table (and John and Arnold have done work recently to make the table itself a lot smaller than it used to be).

-Joe

···

On Sep 25, 2017, at 10:28 AM, Jordan Rose <jordan_rose@apple.com> wrote:

On Sep 25, 2017, at 09:24, Joe Groff via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:

On Sep 24, 2017, at 10:30 PM, John McCall <rjmccall@apple.com <mailto:rjmccall@apple.com>> wrote:

On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd@compnerd.org <mailto:compnerd@compnerd.org>> wrote:

On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall@apple.com <mailto:rjmccall@apple.com>> wrote:

On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <compnerd@compnerd.org <mailto:compnerd@compnerd.org>> wrote:

On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall@apple.com <mailto:rjmccall@apple.com>> wrote:

On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:

On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <compnerd@compnerd.org <mailto:compnerd@compnerd.org>> wrote:

On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff@apple.com <mailto:jgroff@apple.com>> wrote:

On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:

Hello,

The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.

When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.

My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?

IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.

Yeah, I tracked down the relativePointer thing. There is a nice subtle little warning that it is not fully portable :-). Would you happen to have a pointer to where the adjustment for the absolute pointers on WoA is?

You are correct that the image should be contiguously mapped on Windows. The idea of MachO as an intermediary is rather intriguing. Thinking longer term, maybe we want to use that as a global solution? It would also provide a nicer autolinking mechanism for ELF which is the one target which currently is missing this functionality. However, if I'm not mistaken, this would require a MachO linker (and the only current viable MachO linker would be ld64). The MachO binary would then need to be converted into ELF or COFF. This seems like it could take a while to implement though, but would not really break ABI, so pushing that off to later may be wise.

Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?

Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is translated later to PE/COFF as required by the UEFI specification.

There are only two relocation types which can be used for relative displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the section. The latter is why I mentioned that moving them into the same section could be a solution as that would allow the relative distance to be encoded. Unfortunately, the section relative relocation is relative to the section within which the symbol is.

What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.

Hmm, is it possible to use RIP relative addressing in data? If so, yes, that could work.

There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.

At least MC doesn't seem to like it. Something like this for example:

 .data
data:
 .long 0

 .section .rodata
rodata:
 .quad data(%rip)

Bails out due to the unexpected modifier. Now, theoretically, we could support that modifier, but it does seem pretty odd.

Now, as it so happens, both PE and PE+ limit the file size to 4GiB. This means that the relative difference is guaranteed to fit within 32 bits. This is where things get really interesting!

We cannot generate the relocation because we are emitting the values at pointer width. However, the value that we are emitting is a relative offset, which we just determined to be limited to 32 bits in width. The thing is, IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the cross-sectionness, as you pointed out. So, rather than emitting a pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow that with the relative displacement (`.long <expr>`). This would be representable in the PE/COFF model.

If I understand the layout correctly, the type metadata fields are supposed to be pointer sized. I assume that we would like to maintain that across the formats. It may be possible to alter the emission to change the relative pointer emission to emit a pair of longs instead for PE/COFF with a 64-bit pointer value. Basically, we cannot truncate the relocation to a IMAGE_REL_AMD64_REL32 but we could generate the appropriate relocation and pad to the desired width.

Are there any pitfalls that I should be aware of trying to adjust the emission to do this? The only downside that I can see is that the emission would need to be target dependent (that is, check the output object format and the target pointer width).

Thanks for the hint John! It seems that was spot on :-).

Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.

In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.

At one point we discussed the possibility of also making the value witness table pointer relative, which would allow concrete value type metadata to be fully read-only, and since code invoking a value witness is almost certainly going to have the base type metadata pointer live, probably not an undue burden on code size. It's a fair question though whether we'll ever get around to that analysis, and I think the nominal type descriptor reference is the only place we statically emit a pointer-sized rather than 32-bit relative offset, which has caused problems for ports to other platforms that only support 32-bit relative offsets.

That seems at odds with having standard VWTs in the runtime, though, or having a single-element struct share a VWT with its lone member.

Sharing a VWT is, unfortunately, not generally possible because you need to pass the element metadata to the value witnesses, not the struct metadata. (We made exactly this mistake ourselves. :))

John.


>>>>>>> Hello,
>>>>>>>
>>>>>>> The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.
>>>>>>>
>>>>>>> When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.
>>>>>>>
>>>>>>> My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?
>>>>>>
>>>>>> IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.
>>>>>>
>>>>>> Yeah, I tracked down the relativePointer thing. There is a nice subtle little warning that it is not fully portable :-). Would you happen to have a pointer to where the adjustment for the absolute pointers on WoA is?
>>>>>>
>>>>>> You are correct that the image should be contiguously mapped on Windows. The idea of MachO as an intermediary is rather intriguing. Thinking longer term, maybe we want to use that as a global solution? It would also provide a nicer autolinking mechanism for ELF which is the one target which currently is missing this functionality. However, if I'm not mistaken, this would require a MachO linker (and the only current viable MachO linker would be ld64). The MachO binary would then need to be converted into ELF or COFF. This seems like it could take a while to implement though, but would not really break ABI, so pushing that off to later may be wise.
>>>>>
>>>>> Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?
>>>>>
>>>>> Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is translated later to PE/COFF as required by the UEFI specification.
>>>>>
>>>>> There are only two relocation types which can be used for relative displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the section. The latter is why I mentioned that moving them into the same section could be a solution as that would allow the relative distance to be encoded. Unfortunately, the section relative relocation is relative to the section within which the symbol is.
>>>>
>>>> What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.
>>>>
>>>> Hmm, is it possible to use RIP relative addressing in data? If so, yes, that could work.
>>>
>>> There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.
>>>
>>> At least MC doesn't seem to like it. Something like this for example:
>>>
>>> ```
>>> .data
>>> data:
>>> .long 0
>>>
>>> .section .rodata
>>> rodata:
>>> .quad data(%rip)
>>> ```
>>>
>>> Bails out due to the unexpected modifier. Now, theoretically, we could support that modifier, but it does seem pretty odd.
>>>
>>> Now, as it so happens, both PE and PE+ limit the file size to 4GiB. This means that the relative difference is guaranteed to fit within 32 bits. This is where things get really interesting!
>>>
>>> We cannot generate the relocation because we are emitting the values at pointer width. However, the value that we are emitting is a relative offset, which we just determined to be limited to 32 bits in width. The thing is, IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the cross-sectionness, as you pointed out. So, rather than emitting a pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow that with the relative displacement (`.long <expr>`). This would be representable in the PE/COFF model.
>>>
>>> If I understand the layout correctly, the type metadata fields are supposed to be pointer sized. I assume that we would like to maintain that across the formats. It may be possible to alter the emission to change the relative pointer emission to emit a pair of longs instead for PE/COFF with a 64-bit pointer value. Basically, we cannot truncate the relocation to a IMAGE_REL_AMD64_REL32 but we could generate the appropriate relocation and pad to the desired width.
>>>
>>> Are there any pitfalls that I should be aware of trying to adjust the emission to do this? The only downside that I can see is that the emission would need to be target dependent (that is, check the output object format and the target pointer width).
>>>
>>> Thanks for the hint John! It seems that was spot on :-).
>>
>> Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.
>>
>> In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.
>
> At one point we discussed the possibility of also making the value witness table pointer relative, which would allow concrete value type metadata to be fully read-only, and since code invoking a value witness is almost certainly going to have the base type metadata pointer live, probably not an undue burden on code size.

Yes, that's true. It would make the base of the load (metadata +
loaded-offset + immediate-offset), which I think would require an extra
instruction even on x86, but maybe that's not so bad.

On the other hand, yes, it would not be possible to refer to prebuilt
vwtables from the runtime, and it would need to be a 64-bit relative offset
in order to handle dynamic instantiation correctly, which as you say is
problematic on some platforms.

Hmm, I'm not sure I understand the desired approach. Would we want to
switch to a rebased pointer? Would this be for all of the metadata or just
the struct type? Are there no other instances of the same pattern?

···

On Mon, Sep 25, 2017 at 11:47 AM, John McCall <rjmccall@apple.com> wrote:

> On Sep 25, 2017, at 12:24 PM, Joe Groff <jgroff@apple.com> wrote:
>> On Sep 24, 2017, at 10:30 PM, John McCall <rjmccall@apple.com> wrote:
>>> On Sep 22, 2017, at 8:39 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
>>> On Thu, Sep 21, 2017 at 10:28 PM, John McCall <rjmccall@apple.com> wrote:
>>>> On Sep 21, 2017, at 10:10 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
>>>> On Thu, Sep 21, 2017 at 5:18 PM, John McCall <rjmccall@apple.com> wrote:
>>>>> On Sep 21, 2017, at 1:26 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
>>>>> On Thu, Sep 21, 2017 at 12:04 PM, Joe Groff <jgroff@apple.com> wrote:
>>>>>> On Sep 21, 2017, at 11:49 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
>>>>>> On Thu, Sep 21, 2017 at 10:53 AM, Joe Groff <jgroff@apple.com> wrote:
>>>>>>> On Sep 21, 2017, at 9:32 AM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

John.

> It's a fair question though whether we'll ever get around to that
analysis, and I think the nominal type descriptor reference is the only
place we statically emit a pointer-sized rather than 32-bit relative
offset, which has caused problems for ports to other platforms that only
support 32-bit relative offsets.

>
> -Joe

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

>>>>>>> Hello,
>>>>>>>
>>>>>>> The current layout for the swift metadata for structure types, as emitted, seems to be unrepresentable in PE/COFF (at least for x86_64). There is a partial listing of the generated code following the message for reference.
>>>>>>>
>>>>>>> When building the standard library, LLVM encounters a relocation which cannot be represented. Tracking down the relocation led to the type metadata for SwiftNSOperatingSystemVersion. The metadata here is _T0SC30_SwiftNSOperatingSystemVersionVN. At +32-bytes we find the Kind (1). So, this is a struct metadata type. Thus at Offset 1 (+40 bytes) we have the nominal type descriptor reference. This is the relocation which we fail to represent correctly. If I'm not mistaken, it seems that the field is supposed to be a relative offset to the nominal type descriptor. However, currently, the nominal type descriptor is emitted in a different section (.rodata) as opposed to the type descriptor (.data). This cross-section relocation cannot be represented in the file format.
>>>>>>>
>>>>>>> My understanding is that the type metadata will be adjusted during the load for the field offsets. Furthermore, my guess is that the relative offset is used to encode the location to avoid a relocation for the load address base. In the case of windows, the based relocations are a given, and I'm not sure if there is a better approach to be taken. There are a couple of solutions which immediately spring to mind: moving the nominal type descriptor into the (RW) data segment and the other is to adjust the ABI to use an absolute relocation which would be rebased. Given that the type metadata may be adjusted means that we cannot emit it into the RO data segment. Is there another solution that I am overlooking which may be simpler or better?
>>>>>>
>>>>>> IIRC, this came up when someone was trying to port Swift to Windows on ARM as well, and they were able to conditionalize the code so that we used absolute pointers on Windows/ARM, and we may have to do the same on Windows in general. It may be somewhat more complicated on Win64 since we generally assume that relative references can be 32-bit, whereas an absolute reference will be 64-bit, so some formats may have to change layout to make this work too. I believe Windows' executable loader still ultimately maps the final PE image contiguously, so alternatively, you could conceivably build a Swift toolchain that used ELF or Mach-O or some other format with better support for PIC as the intermediate object format and still linked a final PE executable. Using relative references should still be a win on Windows both because of the size benefit of being 32-bit and the fact that they don't need to be slid when running under ASLR or when a DLL needs to be rebased.
>>>>>>
>>>>>>
>>>>>> Yeah, I tracked down the relativePointer thing. There is a nice subtle little warning that it is not fully portable :-). Would you happen to have a pointer to where the adjustment for the absolute pointers on WoA is?
>>>>>>
>>>>>> You are correct that the image should be contiguously mapped on Windows. The idea of MachO as an intermediary is rather intriguing. Thinking longer term, maybe we want to use that as a global solution? It would also provide a nicer autolinking mechanism for ELF, which is the one target which currently is missing this functionality. However, if I'm not mistaken, this would require a MachO linker (and the only current viable MachO linker would be ld64). The MachO binary would then need to be converted into ELF or COFF. This seems like it could take a while to implement though, but would not really break ABI, so pushing that off to later may be wise.
>>>>>
>>>>> Intriguingly, LLVM does support `*-*-win32-macho` as a target triple already, though I don't know what Mach-O to PE linker (if any) that's intended to be used with. We implemented relative references using current-position-relative offsets for Darwin and Linux both because that still allows for a fairly convenient pointer-like C++ API for working with relative offsets, and because the established toolchains on those platforms already have to support PIC so had most of the relocations we needed to make them work already; is there another base we could use for relative offsets on Windows that would fit in the set of relocations supported by standard COFF linkers?
>>>>>
>>>>>
>>>>> Yes, the `-windows-macho` target is used for UEFI :-). The MachO binary is translated later to PE/COFF as required by the UEFI specification.
>>>>>
>>>>> There are only two relocation types which can be used for relative displacements: __ImageBase relative (IMAGE_REL_*_ADDR32NB) and section relative (IMAGE_REL_*_SECREL) which are relative to the beginning of the section. The latter is why I mentioned that moving them into the same section could be a solution as that would allow the relative distance to be encoded. Unfortunately, the section relative relocation is relative to the section within which the symbol is.
>>>>
>>>> What's wrong with IMAGE_REL_AMD64_REL32? We'd have to adjust the relative-pointer logic to store an offset from the end of the relative pointer instead of the beginning, but it doesn't seem to have a section requirement.
>>>>
>>>> Hmm, is it possible to use RIP relative addressing in data? If so, yes, that could work.
>>>
>>> There's no inherent reason, but I wouldn't put it past the linker to fall over and die. But it should at least be section-agnostic about the target, since this is likely to be used for all sorts of PC-relative addressing.
>>>
>>>
>>> At least MC doesn't seem to like it. Something like this, for example:
>>>
>>> ```
>>> .data
>>> data:
>>> .long 0
>>>
>>> .section .rodata
>>> rodata:
>>> .quad data(%rip)
>>> ```
>>>
>>> Bails out due to the unexpected modifier. Now, theoretically, we could support that modifier, but it does seem pretty odd.
>>>
>>> Now, as it so happens, both PE and PE+ have limitations on the file size at 4GiB. This means that the relative difference is guaranteed to fit within 32 bits. This is where things get really interesting!
>>>
>>> We cannot generate the relocation because we are emitting the values at pointer width. However, the value that we are emitting is a relative offset, which we just determined to be limited to 32 bits in width. The thing is, IMAGE_REL_AMD64_REL32 doesn't actually seem to care about the cross-sectionness, as you pointed out. So, rather than emitting a pointer-width value (`.quad`), we could emit a pad (`.long 0`) and follow that with the relative displacement (`.long <expr>`). This would be representable in the PE/COFF model.
>>>
>>> If I understand the layout correctly, the type metadata fields are supposed to be pointer sized. I assume that we would like to maintain that across the formats. It may be possible to alter the emission to change the relative pointer emission to emit a pair of longs instead for PE/COFF with a 64-bit pointer value. Basically, we cannot truncate the relocation to an IMAGE_REL_AMD64_REL32, but we could generate the appropriate relocation and pad to the desired width.
>>>
>>> Are there any pitfalls that I should be aware of trying to adjust the emission to do this? The only downside that I can see is that the emission would need to be target dependent (that is, check the output object format and the target pointer width).
>>>
>>> Thanks for the hint John! It seems that was spot on :-).
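
For reference, a minimal C++ sketch of the pad-plus-displacement idea in the message above. It assumes the slot is laid out as `.long 0` followed by `.long <expr>`, and that the 32-bit field holds the distance from the byte following the field to the target, which is how a REL32-style fixup is usually described. All names here (`PaddedRelativeSlot`, `encodeRel32`, `resolvePadded`, `ToySection`) are hypothetical and are not taken from the Swift runtime or LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pointer-sized slot: a 32-bit pad followed by a 32-bit
// displacement. Illustrative only; not the Swift runtime's actual layout.
struct PaddedRelativeSlot {
  std::uint32_t pad;          // corresponds to `.long 0`
  std::int32_t displacement;  // corresponds to `.long <expr>`
};
static_assert(sizeof(PaddedRelativeSlot) == 8, "slot must stay 8 bytes wide");

// Toy stand-in for two objects that end up close together in one image.
struct ToySection {
  int descriptor;
  PaddedRelativeSlot slot;
};

// Compute what a REL32-style fixup would store: the distance from the byte
// that follows the 4-byte field to the target (an assumption of this sketch).
inline std::int32_t encodeRel32(const void *field, const void *target) {
  std::intptr_t end = reinterpret_cast<std::intptr_t>(field) + 4;
  std::intptr_t delta = reinterpret_cast<std::intptr_t>(target) - end;
  assert(delta >= INT32_MIN && delta <= INT32_MAX);
  return static_cast<std::int32_t>(delta);
}

// Resolve the slot: read only the 32-bit displacement, sign-extend it, and
// add it to the address just past that field.
inline const void *resolvePadded(const PaddedRelativeSlot *slot) {
  std::intptr_t end =
      reinterpret_cast<std::intptr_t>(&slot->displacement) + 4;
  return reinterpret_cast<const void *>(end + slot->displacement);
}
```

One thing the sketch makes visible: with the pad emitted first, a little-endian reader that loads the whole 8-byte slot as a single integer would not see the displacement, so whatever reads the field would presumably have to read the 32-bit half explicitly and sign-extend it, as `resolvePadded` does.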
>>
>> Honestly, I don't know that there's a great reason for this pointer to be relative in the first place. The struct metadata will already have an absolute pointer to the value witness table which requires load-time relocation, so maybe we should just make this an absolute pointer, too, unless we're seriously considering making that a relative pointer before allocation.
>>
>> In practice this will just be a rebase, not a full relocation, so it should be relatively cheap.
>
> At one point we discussed the possibility of also making the value witness table pointer relative, which would allow concrete value type metadata to be fully read-only, and since code invoking a value witness is almost certainly going to have the base type metadata pointer live, probably not an undue burden on code size.

Yes, that's true. It would make the base of the load (metadata + loaded-offset + immediate-offset), which I think would require an extra instruction even on x86, but maybe that's not so bad.

On the other hand, yes, it would not be possible to refer to prebuilt vwtables from the runtime, and it would need to be a 64-bit relative offset in order to handle dynamic instantiation correctly, which as you say is problematic on some platforms.
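
To make the address computation above concrete, here is a rough C++ sketch contrasting an absolute value-witness-table pointer with a hypothetical relative one. The types (`ToyWitnessTable`, `AbsoluteMetadata`, `RelativeMetadata`) and the `loadWitness` helpers are invented for illustration and do not match the real metadata layout; the offset is pointer-width here, in line with the point that a 32-bit offset would not be enough for dynamically instantiated metadata.

```cpp
#include <cassert>
#include <cstdint>

using WitnessFn = int (*)();

// Two toy witnesses so the table has something to point at.
inline int toyWitnessA() { return 1; }
inline int toyWitnessB() { return 2; }

struct ToyWitnessTable {
  WitnessFn witnesses[2];
};

// Absolute form: the metadata stores a normal pointer to the table.
struct AbsoluteMetadata {
  const ToyWitnessTable *valueWitnesses;
};

// Hypothetical relative form: the metadata stores the table's distance from
// the metadata's own address, as a pointer-width signed offset.
struct RelativeMetadata {
  std::intptr_t valueWitnessOffset;
};

// address = loaded-pointer + immediate-offset
inline WitnessFn loadWitness(const AbsoluteMetadata *metadata, unsigned index) {
  assert(index < 2);
  return metadata->valueWitnesses->witnesses[index];
}

// address = metadata + loaded-offset + immediate-offset
inline WitnessFn loadWitness(const RelativeMetadata *metadata, unsigned index) {
  assert(index < 2);
  std::intptr_t base = reinterpret_cast<std::intptr_t>(metadata);
  const ToyWitnessTable *table = reinterpret_cast<const ToyWitnessTable *>(
      base + metadata->valueWitnessOffset);
  return table->witnesses[index];
}
```

The relative form needs the metadata address as an extra addend, which is the additional work mentioned above; whether that costs a whole instruction depends on the target's addressing modes.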

Hmm, I'm not sure I understand the desired approach. Would we want to switch to a rebased pointer?

That's what we're discussing. Switching to an absolute pointer (i.e. a normal pointer, which would need to be rebased) has proven to be generally more portable because many linkers do not support 64-bit relative pointers. Also, since this is adjacent to another absolute pointer, the benefits of a relative pointer seem pretty weak: it would eliminate a very small amount of work at load time and (probably) some binary-size overhead, but that's relatively minor compared to, say, whether the loader has to dirty any memory. Now, maybe we can avoid it being adjacent to another absolute pointer by making the vwtable relative, and that would have some significant upsides, but it would also have some significant drawbacks, and it's not clear that anybody actually wants to put any time into that investigation before we reach ABI stability.
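
As a rough illustration of the trade-off described above, the following C++ sketch models a toy image with one absolute field and one relative field that refer to the same target. The base address, the offsets, and every name (`ToyImage`, `linkImage`, `rebase`, and so on) are assumptions made up for the example; it models the arithmetic only, not any real loader.

```cpp
#include <cassert>
#include <cstdint>

// Assumed link-time base and image offsets; illustrative values only.
constexpr std::uint64_t kPreferredBase = 0x140000000ULL;
constexpr std::uint64_t kRelFieldOffset = 0x1008;
constexpr std::uint64_t kTargetOffset = 0x2000;

// Toy image with two references to the same target.
struct ToyImage {
  std::uint64_t absoluteField;  // preferred base + target offset
  std::int32_t relativeField;   // target offset - field offset
};

// What the linker would write, assuming the image loads at kPreferredBase.
inline ToyImage linkImage() {
  return {kPreferredBase + kTargetOffset,
          static_cast<std::int32_t>(kTargetOffset - kRelFieldOffset)};
}

// Rebase step: slide every absolute field by the load-address delta.
// Returns how many fields had to be written (memory that gets dirtied).
inline int rebase(ToyImage &image, std::uint64_t actualBase) {
  std::uint64_t slide = actualBase - kPreferredBase;
  if (slide == 0)
    return 0;
  image.absoluteField += slide;
  return 1;
}

// The absolute field is usable directly once it has been rebased.
inline std::uint64_t resolveAbsolute(const ToyImage &image) {
  return image.absoluteField;
}

// The relative field is resolved against its own run-time address and never
// needs to be rewritten by the loader.
inline std::uint64_t resolveRelative(const ToyImage &image,
                                     std::uint64_t actualBase) {
  return actualBase + kRelFieldOffset +
         static_cast<std::int64_t>(image.relativeField);
}
```

In this model the rebase touches exactly one field, and a page that already holds another absolute pointer is written either way, which is the reason given above for the relative form buying little here.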

I'm personally leaning towards saying that vwtables should just stay absolute, and thus that nominal-type-descriptor pointers should just become absolute to make things easier. I'm not worried about the binary-size impact; it's just a rebase, and Mach-O encodes rebases pretty efficiently. It's a little unfortunate for ELF, which has wastefully large loader encodings, but we could address that specifically if we felt the urge (or maybe just do ELF infrastructure work on more efficient encodings).

Would this be for all of the metadata or just the struct type?

Only structs, enums, and classes have nominal type metadata, and classes use an absolute pointer.

Are there no other instances of the same pattern?

At the very least, none of the other instances have the 64-bit problem. They're also just generally more likely to be internal to a section.

John.

···

On Sep 26, 2017, at 12:35 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote: