ELF metadata reflection

elf
(Saleem Abdulrasool) #1

During the Windows test suite work, a fun little problem for ELFish targets was found. It seems that in-memory reflection doesn't really work in practice (outside of the tests). The particular issue that I am currently thinking about involves locating the section that contains the relevant metadata. Section names are stored in the shstrtab section, with shdr->sh_name containing an offset into that section. This is problematic since shstrtab is normally not mapped into the VA of the image. This means that we cannot find a section by name (there is no real equivalent to getsectbyname from dyld).
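
To make the dependency on shstrtab concrete, here is a minimal sketch of name resolution over synthetic data. It assumes a little-endian ELF64 section header layout; the `.my_metadata` name and the `section_name` helper are made up for illustration and are not anything the runtime defines.

```python
import struct

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize (64 bytes in total).
SHDR64 = struct.Struct("<IIQQQQIIQQ")

def section_name(shdr_bytes, shstrtab):
    """Resolve a section's name. This needs the raw contents of shstrtab."""
    sh_name = SHDR64.unpack(shdr_bytes)[0]   # byte offset into shstrtab
    end = shstrtab.index(b"\0", sh_name)     # names are NUL-terminated
    return shstrtab[sh_name:end].decode("ascii")

# Synthetic string table and a header whose sh_name points into it.
# ".my_metadata" is a hypothetical section name.
shstrtab = b"\0.text\0.my_metadata\0.shstrtab\0"
shdr = SHDR64.pack(shstrtab.index(b".my_metadata"), 1, 0x2, 0, 0, 0, 0, 0, 1, 0)

print(section_name(shdr, shstrtab))
```

Without the `shstrtab` bytes the header only yields a bare integer offset, which is why a lookup by name cannot work on the mapped image alone.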

My current thought is that we should use the section type here, as ELF reserves 0x80000000-0xffffffff (SHT_LOUSER and up) as an application-defined range for shdr->sh_type. A section with shdr->sh_type >= SHT_LOUSER is application specific, and shdr->sh_type & ~SHT_LOUSER should give us the program-specific bits that we can play with to enumerate the section types.
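
As a rough sketch of that classification: in the ELF gABI the SHT_LOUSER to SHT_HIUSER range applies to `sh_type`, so the check below works on the type field. The `user_section_kind` helper is hypothetical.

```python
SHT_PROGBITS = 0x1
SHT_LOUSER = 0x80000000   # start of the application-defined sh_type range
SHT_HIUSER = 0xFFFFFFFF   # end of the range per the ELF gABI

def user_section_kind(sh_type):
    """Return the application-specific bits of sh_type, or None when the
    type is outside the application-defined range."""
    if SHT_LOUSER <= sh_type <= SHT_HIUSER:
        return sh_type - SHT_LOUSER   # same as masking off the top bit
    return None

print(user_section_kind(SHT_PROGBITS))     # ordinary program data
print(user_section_kind(SHT_LOUSER | 3))   # application-defined kind
```

Since `sh_type` holds a single value, a section tagged this way is no longer SHT_PROGBITS, so tools that key off the type would need to know about the new values.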

Am I forgetting something about ELF loading/handling? Is this a reasonable approach? This should improve the reflection support on ELFish targets.

CC: @Joe_Groff @John_McCall @dcci @Slava_Pestov

(Joe Groff) #2

In-process, I don't think we use the section names at all; we go off of the MetadataSections record that the static constructor registers, and prior to that refactoring, we had used symbol names. It seems to me like we should try to be consistent with the runtime's behavior; maybe we can use a known symbol name to point to the MetadataSections constant info.

(Saleem Abdulrasool) #3

Right, I remember doing the refactoring that you are referring to wrt registration of the sections with the runtime.

However, it is possible to reflect upon a remote process (RemoteMirror) where the image has already been loaded from disk (and consider the disk image to be removed since, unlike on Windows, you can delete a file that is mapped on Unices). In such a case, you only have the in-memory image to read from. For ELF, shstrtab (section header strings, which contains the section names) and strtab (string table, which contains the symbol names) are then not mapped, so we cannot find a symbol manually (and this should work for static binaries, so we cannot use dlsym either). So, we need to parse the ELF metadata which is guaranteed to be mapped.
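
One way to reason about what "mapped" means here is to check whether a file range falls inside a PT_LOAD segment. The following is only a sketch over a synthetic little-endian ELF64 program header; `mapped_address` is a hypothetical helper, and the addresses are link-time values with no load bias applied.

```python
import struct

PT_LOAD = 1
# Elf64_Phdr, little-endian: p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align (56 bytes in total).
PHDR64 = struct.Struct("<IIQQQQQQ")

def mapped_address(phdrs, file_offset, size):
    """Return the virtual address of a file range if a single PT_LOAD
    segment covers it, otherwise None."""
    for raw in phdrs:
        p_type, _, p_offset, p_vaddr, _, p_filesz, _, _ = PHDR64.unpack(raw)
        if (p_type == PT_LOAD and p_offset <= file_offset
                and file_offset + size <= p_offset + p_filesz):
            return p_vaddr + (file_offset - p_offset)
    return None

# Synthetic image: one loadable segment covering file bytes [0, 0x1000).
phdrs = [PHDR64.pack(PT_LOAD, 0x5, 0, 0x400000, 0x400000, 0x1000, 0x1000, 0x1000)]

print(mapped_address(phdrs, 0x200, 0x40))    # inside the segment
print(mapped_address(phdrs, 0x1800, 0x20))   # past the segment, like a trailing string table
```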

(Joe Groff) #4

Is there a more direct way to encode the offset of the metadata sections data structure so that it gets mapped and doesn't need symbol lookup? If there's a way remote mirrors could get the table addresses it needs with less work, and without looking up symbols that could get stripped or mangled by various things, that seems generally good for robustness.

(Simon Evans) #5

@compnerd https://github.com/apple/swift/blob/master/stdlib/public/runtime/StaticBinaryELF.cpp has some ELF parsing; it was used for static executables on Linux to provide a dlsym()-type implementation. It needs to mmap the file, though, so it won't work if the file has been deleted.

(Saleem Abdulrasool) #6

Right, that is what I am trying to figure out. However, what I had not considered is that we really only need to access the metadata symbol itself. We could push that into a special section and mark that section with the SHT_LOUSER bit to indicate that it contains the metadata, since the section will be mapped but the name of the section is lost. The bit will identify the section, and because the content of the section is going to contain only a single instance of swift::MetadataSections, we know how to process the data. This does seem significantly better than what I had initially thought about. It should remain ABI compatible as well; the only caveat is that reflection will continue to be broken on older releases, which seems reasonable.
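
A sketch of that lookup, assuming the reader can get at the section header table bytes and that the headers are little-endian ELF64. The `SHT_SWIFT_METADATA` value and the `find_metadata_section` helper are hypothetical; no such constant has been defined anywhere.

```python
import struct

# Elf64_Shdr, little-endian (64 bytes); see the field order in the unpack below.
SHDR64 = struct.Struct("<IIQQQQIIQQ")
SHT_LOUSER = 0x80000000
SHT_SWIFT_METADATA = SHT_LOUSER | 0x1   # hypothetical marker value

def find_metadata_section(shdr_table):
    """Scan the section headers by type, never touching section names.
    Returns (sh_addr, sh_size) of the marked section, or None."""
    for off in range(0, len(shdr_table), SHDR64.size):
        (_name, sh_type, _flags, sh_addr, _offset, sh_size,
         _link, _info, _align, _entsize) = SHDR64.unpack_from(shdr_table, off)
        if sh_type == SHT_SWIFT_METADATA:
            return sh_addr, sh_size
    return None

# Synthetic table: the null header, a PROGBITS section, and the marked section.
table = b"".join([
    SHDR64.pack(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
    SHDR64.pack(1, 1, 0x6, 0x401000, 0x1000, 0x200, 0, 0, 16, 0),
    SHDR64.pack(0, SHT_SWIFT_METADATA, 0x2, 0x404000, 0x4000, 0x40, 0, 0, 8, 0),
])

print(find_metadata_section(table))
```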

(Saleem Abdulrasool) #7

It is certainly possible to reconstruct the data if you have the file and can reparse it. The attempt here is to process just the content which is in memory.

(Joe Groff) #8

There's no ABI to be compatible with on Linux yet. As a future proofing thing, though, MetadataSections should be easier to extend with new functionality if someone decides to support an ABI at some point.

(Saleem Abdulrasool) #9

Absolutely, that's why I added a version field as the first item in MetadataSections :)
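
To illustrate why a leading version field helps a reader of the raw bytes, here is a sketch with a made-up layout: a 64-bit version followed by two (base, length) pairs. This is not the real swift::MetadataSections layout, and `parse_sections_record` and its field names are hypothetical.

```python
import struct

def parse_sections_record(blob):
    """Read the version first, then decode only the layouts we understand."""
    (version,) = struct.unpack_from("<Q", blob, 0)
    if version != 1:
        # An unknown version is rejected rather than misread.
        raise ValueError("unsupported record version: %d" % version)
    # Hypothetical version 1 body: two (base, length) pairs.
    a_base, a_len, b_base, b_len = struct.unpack_from("<4Q", blob, 8)
    return {"version": version,
            "first": (a_base, a_len),
            "second": (b_base, b_len)}

record = struct.pack("<5Q", 1, 0x1000, 0x20, 0x2000, 0x40)
print(parse_sections_record(record))
```

A later layout can append fields under a new version number, and an older reader fails cleanly instead of interpreting bytes it does not understand.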