Changing ELF layout

Hello,

I'd like to propose that we change the locations that we use to store the
type metadata, protocol conformances, type references, reflection strings,
field metadata, and associated types.

I think that it is possible to simplify the design for the linker tables by
changing section names and relying on the linker to perform the work
necessary to generate the tables so that they can be walked later.

Switching sections would mean that we would lose interoperability with
previously built libraries. Given that there is ABI stability work going
on for at least the Darwin target, I figure that this would be the best
time to do this.

Would this be acceptable? Is compatibility something that we need to worry
about?

···

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

Compatibility is not something that we're currently promising. I think this is a fine time to be working on this problem.

It's not clear from your proposal whether you're just proposing changing sections or whether you're interested in more invasive changes to metadata emission. Can you be more specific?

John.

···

On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

Certainly.

Right now, we have two special object files which must be included in a
certain order to ensure that the sections that I mentioned earlier are
bounded and grouped. However, this is unnecessary. As long as the section
name is a valid C identifier, the linker will group and bound the sections
with special symbols that it will synthesize
(__{start,stop}_[SectionName]). This will allow us to replace the two-file
approach with a single-file approach. Furthermore, it will allow the file
to be injected anywhere (it drops the need for the files to appear in a
specific order). Finally, it simplifies the logic so that we can write the
entire thing in C rather than having to roll the begin/end content in
assembly.
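
A minimal sketch of the mechanism described above, assuming an ELF target and a GNU-compatible linker. The section name `demo_records`, the record type, and both helper functions are hypothetical names chosen for illustration; they are not the sections or code that the Swift runtime uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; the real metadata records look different. */
struct demo_record {
  int id;
};

/* Place an object in the section "demo_records". The name is a valid C
 * identifier, so the linker synthesizes __start_demo_records and
 * __stop_demo_records around the merged section contents. */
#define DEMO_RECORD __attribute__((__used__, __section__("demo_records")))

static const struct demo_record record_a DEMO_RECORD = { 1 };
static const struct demo_record record_b DEMO_RECORD = { 2 };
static const struct demo_record record_c DEMO_RECORD = { 3 };

/* Declared here, defined by the linker. */
extern const struct demo_record __start_demo_records[];
extern const struct demo_record __stop_demo_records[];

/* Number of records that ended up in the section. */
size_t demo_record_count(void) {
  assert(__stop_demo_records >= __start_demo_records);
  return (size_t)(__stop_demo_records - __start_demo_records);
}

/* Walk the table from the start marker to the stop marker. */
int demo_record_id_sum(void) {
  int sum = 0;
  const struct demo_record *it;
  for (it = __start_demo_records; it != __stop_demo_records; ++it)
    sum += it->id;
  return sum;
}
```

Nothing here depends on link order or on a separate begin/end object file, which is the simplification being proposed.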

I think that this would help reduce some of the complexity of the ELF
emission. In particular, it would mean that we would change the following:

.swift3_typeref -> swift3_type_references
.swift3_reflstr -> swift3_reflection_strings
.swift3_fieldmd -> swift3_field_metadata
.swift3_assocty -> swift3_associated_types

.swift2_protocol_conformances -> swift2_protocol_conformances
.swift2_type_metadata -> swift2_type_metadata

AFAIK, ELF does not impose section name limits, so I'm not sure if there is
anything to gain from the shortened names.
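
The leading dot is what matters in the renames above: a name such as `.swift3_typeref` cannot be written as a C identifier, so the linker has no `__start_`/`__stop_` symbol names to synthesize for it. A small sketch of that check follows; `is_c_identifier` is a hypothetical helper written for this example, not part of any toolchain.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when `name` could be written as a C identifier:
 * a letter or underscore followed by letters, digits, or underscores. */
bool is_c_identifier(const char *name) {
  const char *p;
  assert(name != NULL);
  if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
    return false; /* also rejects the empty string */
  for (p = name + 1; *p != '\0'; ++p) {
    if (!(isalnum((unsigned char)*p) || *p == '_'))
      return false;
  }
  return true;
}
```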

While we are changing these things around, it seems that it may be a good
idea to also change the PE/COFF emission to use section grouping (as
specified within the specification) so that we can have similar handling on
both the ELF and COFF sides.

In the case of PE/COFF, the only change would be appending the grouping
specifier ($B) to the existing names. The names are already within the
specification limits (COFF limits section names to 8 characters). Because
the linker orders grouped contributions by the text after the `$`, this
would allow begin/end markers to be constructed in neighbouring groups.

···

On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:

> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

John.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

Yes, if that's the case, that would be massively useful.

The runtime needs to be able to find these bounds in an arbitrary image, since there may be multiple images in the program containing Swift code. Are those symbols available dynamically even if they aren't used statically?

John.

···

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:
> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

John.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

It should be possible to preserve the symbols. In general, the linker is
free to dead strip unreferenced symbols, so having the object that needs to
be injected reference them is sufficient to ensure that they aren't dead
stripped. After that, they can be dynamically looked up even if they aren't
statically used.

I'll try to get to this change soon!

John.

John.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

···

On Mon, Sep 18, 2017 at 9:36 AM John McCall <rjmccall@apple.com> wrote:

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:

> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

The symbols would not only have to not be stripped but also be exported with protected or default visibility in order for dlsym to find them normally. Is it possible to get the linker to export these implicit symbols with protected visibility? If not, we might still need an asm stub to define visible symbols as aliases for the implicit section symbols. That stub could at least be order-independent and portable, which would be an improvement over what we have now.

-Joe

···

On Sep 18, 2017, at 3:31 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

On Mon, Sep 18, 2017 at 9:36 AM John McCall <rjmccall@apple.com> wrote:

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:

On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:
> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

The default is default visibility, which IMO is not really what we want.
Ideally, we want protected visibility. Now, this is possible too, but the
code for it is ever so slightly distasteful.

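__attribute__((__section__("section"))) const int i = 0;
/* Bounds that the linker synthesizes for the section named "section". */
__attribute__((__visibility__("protected"))) extern void *__start_section;
__attribute__((__visibility__("protected"))) extern void *__stop_section;
/* Use the symbols' addresses: the section size is the distance between them. */
__UINTPTR_TYPE__ get_section_size(void) {
  return (__UINTPTR_TYPE__)&__stop_section - (__UINTPTR_TYPE__)&__start_section;
}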

The thing is that we need the local function (in this case,
`get_section_size`) to both preserve the symbols as well as to ensure that
the symbols receive the proper visibility. In the worst case, we would
need a structure that is preserved to aid in keeping the reference to the
symbols.
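
The "worst case" mentioned above, a preserved structure that holds the references, might look roughly like the following. This is only a sketch under assumptions: the section name `demo_meta`, the `section_bounds` type, and `section_size` are hypothetical, and whether the linker honors the declared visibility is exactly the open question in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* One entry so that the hypothetical section "demo_meta" is not empty. */
__attribute__((__used__, __section__("demo_meta")))
static const int demo_entry = 0;

/* Linker-synthesized bounds, declared with the visibility we want. */
__attribute__((__visibility__("protected"))) extern const char __start_demo_meta[];
__attribute__((__visibility__("protected"))) extern const char __stop_demo_meta[];

struct section_bounds {
  const char *start;
  const char *stop;
};

/* __used__ keeps the structure, and hence its references to both
 * symbols, even when nothing else in the program mentions it. */
__attribute__((__used__))
const struct section_bounds demo_meta_bounds = {
  __start_demo_meta, __stop_demo_meta
};

/* Size in bytes of the region between the two markers. */
size_t section_size(const struct section_bounds *bounds) {
  assert(bounds->stop >= bounds->start);
  return (size_t)(bounds->stop - bounds->start);
}
```

Compared with a helper function such as `get_section_size`, the structure carries no code of its own; it exists only so that the object file references both symbols.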

-Joe

···

On Mon, Sep 18, 2017 at 4:07 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 18, 2017, at 3:31 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Mon, Sep 18, 2017 at 9:36 AM John McCall <rjmccall@apple.com> wrote:

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:

> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org

This would *declare* the symbols as being protected, but they're still defined externally. Does the linker reconcile the declared visibility with its implicit definition so that it picks up the visibility from the declaration?

If the symbols have the correct visibility, then they ought to be preserved regardless. Exported symbols ought to be considered used.

-Joe

···

On Sep 18, 2017, at 9:24 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Mon, Sep 18, 2017 at 4:07 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 18, 2017, at 3:31 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Mon, Sep 18, 2017 at 9:36 AM John McCall <rjmccall@apple.com> wrote:

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:
> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:

Right, the declaration is reconciled by the linker (at least by the BFD and
gold linkers) to give the symbol protected visibility. But it only does so
if the symbols are used. Simply declaring them is insufficient, and they
receive default visibility.

I think that there may be a bug in one of the linkers as the behavior is
different across the two. Preserving the symbol with a reference is the
distasteful bit, but since it is for working around the linker behavior, I
don't think that it would prevent a future cleanup if the linker is fixed.

-Joe

--

Saleem Abdulrasool
compnerd (at) compnerd (dot) org

···

On Tue, Sep 19, 2017 at 8:42 AM Joe Groff <jgroff@apple.com> wrote:

On Sep 18, 2017, at 9:24 PM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Mon, Sep 18, 2017 at 4:07 PM, Joe Groff <jgroff@apple.com> wrote:

On Sep 18, 2017, at 3:31 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote:
On Mon, Sep 18, 2017 at 9:36 AM John McCall <rjmccall@apple.com> wrote:

On Sep 17, 2017, at 10:15 AM, Saleem Abdulrasool <compnerd@compnerd.org> wrote:
On Sat, Sep 16, 2017 at 6:19 PM, John McCall <rjmccall@apple.com> wrote:

> On Sep 16, 2017, at 6:06 PM, Saleem Abdulrasool via swift-dev <swift-dev@swift.org> wrote: