Adapting SwiftReflectionTest to work on Linux

@Adrian_Prantl @vedantk as a first step in enabling the reflect_UInt16 test, I'll need to adapt SwiftReflectionTest to Linux.

SwiftReflectionTest depends on MachO and Darwin. Right now, I'm thinking that to make this file work on Linux, I'll have to use conditional compilation blocks to import the correct modules, like so:

#if os(macOS)
import MachO
import Darwin
#elseif os(Linux)
import Glibc
// import Elf?
#endif

Is this more or less correct? If yes, is there an Elf module for Swift, or would I have to write one?

@augusto2112, you and I are both diving into this code for the first time, so please bear with me. CC'ing @Mike_Ash @bitjammer and @Joe_Groff for swift reflection expertise.

At a high level, it looks like SwiftReflectionTest reflects information about different kinds of 'instances'. The first example I see is about querying the stored properties ('reflecting') of a class ('an instance').

To do this, the harness relies on two workhorses: getReflectionInfoForImage and sendBytes. The latter should be portable (it's just writing some bytes to stdout), but the former is written in a fairly MachO/Darwin-centric way. So, your goal would be to port getReflectionInfoForImage.

The job of getReflectionInfoForImage is to extract the sections which pertain to swift reflection metadata from a loaded image. To do this on Darwin, SwiftReflectionTest uses getsectiondata and MachHeader (presumably pulled from the MachO module) to fill in the info for a single section, and _dyld_{image_count, get_image_name, get_image_header} to enumerate/visit the sections in a loaded image.

AFAIK, there isn't an ELF module for Swift. But the MachO/dyld APIs in use here may have ELF/ld.so equivalents. If so, you'll need to find these and wire them up. If not, we may have to look for alternative solutions.
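As a rough illustration of what an ld.so-side equivalent of the _dyld_image_count / _dyld_get_image_name loop could look like, here is a minimal C sketch built on glibc's dl_iterate_phdr. It is not taken from SwiftReflectionTest: the helper names visit_image and count_loaded_images are hypothetical, and it assumes a dynamically linked glibc process.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc only declares dl_iterate_phdr with this defined */
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback: prints one line per loaded image and counts it. */
static int visit_image(struct dl_phdr_info *info, size_t size, void *data) {
  (void)size;
  size_t *count = (size_t *)data;
  /* glibc reports the main executable with an empty name. */
  const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                         ? info->dlpi_name
                         : "<main executable>";
  printf("image %zu: %s (load bias 0x%lx, %u program headers)\n", *count, name,
         (unsigned long)info->dlpi_addr, (unsigned)info->dlpi_phnum);
  ++*count;
  return 0; /* a nonzero return would stop the iteration early */
}

/* Hypothetical helper: rough analog of _dyld_image_count(). */
static size_t count_loaded_images(void) {
  size_t count = 0;
  dl_iterate_phdr(visit_image, &count);
  assert(count >= 1); /* the main program is always reported */
  return count;
}
```

Calling count_loaded_images() from a test program prints the images the loader knows about. Whether this is usable from Swift through Glibc, and how it behaves in statically linked binaries, would still need to be checked.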

You may be able to find C++ implementations of the primitives you need. For example, lldb must have a way to obtain a list of loaded modules in an ELF process. I think this is (at least partially) implemented in the DynamicLoaderPOSIXDYLD class. And you can have a look at include/swift/Reflection/ReflectionContext.h to see how an ELF section can be parsed.

I’ve worked on the reflection infrastructure for ELF and COFF a fair amount in the past.

There’s no good way to obtain the ELF modules. If you statically link, you can not query the information from the loader. You basically rely on the auxiliary vector to get the ASLR slide and the base address and from there just map things manually.
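To make the auxiliary-vector approach concrete, here is a small C sketch that locates the main executable's program headers and derives its load bias (the ASLR slide) without asking the loader. It assumes glibc's getauxval (available since glibc 2.16); the type main_image_info and the function locate_main_image are hypothetical names for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>     /* ElfW() picks the 32- or 64-bit ELF types */
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h> /* getauxval */

/* Hypothetical result type for this sketch. */
struct main_image_info {
  const ElfW(Phdr) *phdrs; /* program header table as mapped in memory */
  size_t phnum;            /* number of entries in that table */
  uintptr_t load_bias;     /* runtime address minus link-time address */
};

static struct main_image_info locate_main_image(void) {
  struct main_image_info info = {NULL, 0, 0};
  info.phdrs = (const ElfW(Phdr) *)getauxval(AT_PHDR);
  info.phnum = (size_t)getauxval(AT_PHNUM);
  assert(info.phdrs != NULL && info.phnum > 0);

  /* PT_PHDR records the link-time address of the table itself, so the
     difference from its runtime address is the load bias. If there is no
     PT_PHDR entry, the bias is left at 0 in this sketch. */
  for (size_t i = 0; i < info.phnum; ++i) {
    if (info.phdrs[i].p_type == PT_PHDR) {
      info.load_bias =
          (uintptr_t)info.phdrs - (uintptr_t)info.phdrs[i].p_vaddr;
      break;
    }
  }
  return info;
}
```

From there, each PT_LOAD entry's p_vaddr plus load_bias gives the address at which that segment is mapped, which is the "map things manually" step mentioned above. This only covers the main executable; shared libraries are not reachable this way.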

The loader APIs aren’t that great. The best way would require pulling in a ton of code from LLVM (~250-500kLoC, yes, I’m serious, I’m doing something similar for the runtime right now).

CC: @alexshap

Ultimately, what swift-reflection-test wants is the locations of the reflection metadata inside the process. The loader may not be the best way to do that. We could perhaps provide some extra entry points in the Swift runtime that only build in debug builds, which could give swift-reflection-test access to the addresses of the reflection sections as the runtime sees them.

Right. This is related to the horrible workaround that @alexshap has tried to apply with the preserved note program header (PT_NOTE) that would provide the offset of the metadata as the section information is not guaranteed in ELF. It resulted in some other failures in the test IIRC as the tests are not entirely correct for ELF.

Thanks for your helpful comments! I think I'm starting to put together a picture of how ReflectionContext is tested. I'll try to summarize the discussion here (mostly to solidify my own understanding).

Summary: For Augusto's project, we want lldb to use NativeReflectionContext to get precise type information when debugging ELF binaries. The equivalent MachO support is tested in two ways: (1) independently of lldb (via the PipeMemoryReader set up in swift-reflection-test) and (2) with lldb (via a NativeReflectionContext set up to use LLDBMemoryReader). This thread is about making the first kind of test work with ELF binaries: it turns out this is hard to do because magic dyld APIs aren't available.

@Joe_Groff, you had a suggestion for how to fix this:

We could perhaps provide some extra entry points in the Swift runtime that only build in debug builds, which could give swift-reflection-test access to the addresses of the reflection sections as the runtime sees them.

Could you elaborate on what would be needed to make this work? Is the idea to record the start/end of the swift reflection sections in a PT_NOTE, like @compnerd described?

I'll throw an alternative half-baked idea out there: could we get swift-reflection-test to spawn the test process under lldb? It should be possible to write a lldb script that computes the result for sendReflectionInfos and injects it into a string inside of the test process.

CC'ing @dcci & @Frederic_Riss since they may have some opinions about this as well.

If it's possible to get it working under lldb's NativeReflectionContext instead of needing its own ad-hoc memory reader, that sounds like a promising direction to explore. What I had in mind was more just adding runtime functions that report the set of sections as registered with the runtime; although that's probably not API we want to ship in production runtimes, it would allow swift-reflection-test to use the cross-platform code for tracking where reflection metadata is inside the process that already exists in the runtime.

I think it's possible, I've sketched this here. The idea is to add another set of bidirectional pipes, like this:

swift-reflection-test <-> lldb-test <-> SwiftReflectionTest test process

For the most part, lldb-test would simply forward requests unaltered. If it gets REQUEST_REFLECTION_INFO or REQUEST_IMAGES from swift-reflection-test, it can attach-with-pid to the child to construct an lldb::Process and answer those queries. Ditto for REQUEST_STRING_LENGTH and REQUEST_READ_BYTES, I suppose: it could simply forward these, but it'd be a nice opportunity to increase code coverage for LLDBMemoryReader.
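The routing decision described above can be sketched in C as follows. The command strings are placeholders, not the real values used by swift-reflection-test, and the names example_route and route_request are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder command strings: the real protocol constants are defined by
   swift-reflection-test and are deliberately not reproduced here. */
#define EXAMPLE_REQUEST_REFLECTION_INFO "example-reflection-info\n"
#define EXAMPLE_REQUEST_IMAGES "example-images\n"

typedef enum {
  ROUTE_FORWARD_TO_CHILD,   /* pass the request through unaltered */
  ROUTE_ANSWER_VIA_DEBUGGER /* attach to the child and answer directly */
} example_route;

/* Decides how the hypothetical middle process treats one request line. */
static example_route route_request(const char *request) {
  static const char *const answered_via_debugger[] = {
      EXAMPLE_REQUEST_REFLECTION_INFO,
      EXAMPLE_REQUEST_IMAGES,
  };
  assert(request != NULL);
  size_t n = sizeof(answered_via_debugger) / sizeof(answered_via_debugger[0]);
  for (size_t i = 0; i < n; ++i) {
    if (strcmp(request, answered_via_debugger[i]) == 0)
      return ROUTE_ANSWER_VIA_DEBUGGER;
  }
  return ROUTE_FORWARD_TO_CHILD;
}
```

Adding the string-length and read-bytes requests to the table would route them through the debugger as well, which is the extra LLDBMemoryReader coverage mentioned above.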

Ah, got it. Yeah this has the advantage of not introducing a dependency on lldb to the swift validation tests.

If anyone has opinions/feedback on which path to pursue, please share.

Hi everyone!

For GSOC we decided to start by implementing @Joe_Groff's idea, and I have a couple of questions.

I started by writing a function that returns a pointer to swift::MetadataSections that is accessible to Swift:

// in ImageInspectionElf.cpp
const swift::MetadataSections *swift_getMetadataSection() {
  return registered;
}
// in Misc.swift, is this the correct place?
@_silgen_name("swift_getMetadataSection")
public func _getMetadataSection() -> UnsafeRawPointer?

On the Swift side, I also wrote a struct that mirrors MetadataSections (defined in ImageInspectionElf.h) so I have a typed structure to work with in Swift. I checked that all the fields match between Swift and C++, and they do, but I'd like to ask whether you think this is too fragile. If someone changes the fields of swift::MetadataSections, or even their order, the Swift version would break. So is this OK, and if not, what's the usual way to access a typed structure from C++ in Swift?

I also have a second question: should I modify SwiftReflectionTest.swift with a bunch of #if os(macOS) ... #elseif os(Linux) blocks, or should I write a second file that compiles on Linux (and maybe Windows) and implements the same API?

If you want to share a type layout between C++ and Swift code, it would be more robust to try to move the struct declaration into the SwiftShims header, so that both sides can import the type from a common C definition.
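A minimal sketch of what such a shared C definition might look like is shown below. The type and field names are illustrative only and are not the actual fields of swift::MetadataSections; example_get_metadata_sections is a hypothetical stand-in for the accessor discussed earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shim type: one address range for a metadata section. */
typedef struct {
  uintptr_t start;
  size_t length;
} ExampleSectionRange;

/* Hypothetical shared layout. Because both the C++ runtime and the Swift
   code would import this one declaration, neither side has to mirror the
   other by hand. */
typedef struct {
  uintptr_t version;
  ExampleSectionRange field_metadata;
  ExampleSectionRange reflection_strings;
} ExampleMetadataSections;

/* Compile-time check that the layout has no surprising padding. */
_Static_assert(offsetof(ExampleMetadataSections, field_metadata) ==
                   sizeof(uintptr_t),
               "unexpected padding before field_metadata");

/* Hypothetical accessor returning the registered sections. */
static const ExampleMetadataSections *example_get_metadata_sections(void) {
  static const ExampleMetadataSections registered = {1, {0, 0}, {0, 0}};
  return &registered;
}
```

With the declaration in a C header, the Swift importer would expose ExampleMetadataSections as a struct, so the function could return UnsafePointer<ExampleMetadataSections>? rather than an untyped raw pointer.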

How much difference is there between the macOS and Linux implementations?

If you want to share a type layout between C++ and Swift code, it would be more robust to try to move the struct declaration into the SwiftShims header, so that both sides can import the type from a common C definition.

Nice! I'll do that.

How much difference is there between the macOS and Linux implementations?

Hmm, I'm not 100% sure, but a lot of it seems to be platform-independent. I can start out keeping one file, and if it turns into a mess I'll split it up.

Sounds good. Another option might be to keep the cross-platform code in one file, and selectively include a platform-dependent file for the macOS/Linux-specific bits.

Hi @Joe_Groff, @vedantk and @Adrian_Prantl.

I have some questions:

  • getsegmentdata(const struct mach_header *mhp, const char *segname, unsigned long *size) returns a pointer to the segment and its size, given a Mach header and a segment name (the one I need is __TEXT). I might be missing something, but I don't see any existing functionality that gives me exactly this information. I think using the ReflectionContext class would be a good start. Would this be correct?
  • If yes, I'm having problems including ReflectionContext.h in any file under stdlib/public/runtime. When I do that, I get a bunch of namespace conflicts (example below). How do you usually do this?
/home/augusto/Developer/swift-project/swift/stdlib/include/llvm/ADT/Hashing.h:638:12: error: reference to 'llvm' is ambiguous
  return ::llvm::hashing::detail::hash_integer_value(
         ~~^
/home/augusto/Developer/swift-project/llvm-project/llvm/include/llvm/BinaryFormat/COFF.h:29:11: note: candidate found by name lookup is 'llvm'
namespace llvm {
          ^
/home/augusto/Developer/swift-project/swift/stdlib/include/llvm/ADT/Hashing.h:58:11: note: candidate found by name lookup is '__swift::__runtime::llvm'
namespace llvm {
          ^

The Reflection headers should not be included directly into the runtime. If there are data structures in there that you need, you might want to move them to a common parent component like ABI, and factor out the reflection-specific parts.

On Darwin, getsegmentdata ought to be sufficient to get at the related data structures, since each special section in a binary directly contains an array of records of the type appropriate to that section. What data are you trying to find in particular?

I need to implement the same functionality as this call: getsegmentdata(header, "__TEXT", &size), that is, returning the start and size of the .text section inside an ELF image.
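For reference, a rough ELF-side approximation of that call could look like the C sketch below, which returns the first executable PT_LOAD segment of the main program. This is an assumption-laden illustration rather than an exact equivalent: an ELF executable segment does not necessarily cover the same contents as a Mach-O __TEXT segment, and the names example_segment, find_executable_segment, and main_image_text_segment are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc only declares dl_iterate_phdr with this defined */
#endif
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where a segment is mapped and how large it is. */
struct example_segment {
  uintptr_t address;
  size_t size;
};

/* Callback: records the first executable PT_LOAD segment of one image. */
static int find_executable_segment(struct dl_phdr_info *info, size_t size,
                                   void *data) {
  (void)size;
  struct example_segment *out = (struct example_segment *)data;
  for (size_t i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type == PT_LOAD && (ph->p_flags & PF_X) != 0) {
      /* Runtime address = load bias + link-time virtual address. */
      out->address = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
      out->size = (size_t)ph->p_memsz;
      break;
    }
  }
  /* glibc visits the main program first; returning nonzero stops there. */
  return 1;
}

static struct example_segment main_image_text_segment(void) {
  struct example_segment segment = {0, 0};
  dl_iterate_phdr(find_executable_segment, &segment);
  return segment;
}
```

Dropping the early return and collecting one entry per image would give something closer to the per-image list that sendImages builds on Darwin.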

What is that information used for? I don't believe we specifically need the entire text section for anything. There may be another way to get to the more specific information that's based on that computation.

We send both the address and the size to the parent:

internal func getAddressInfoForImage(atIndex i: UInt32) ->
  (name: String, address: UnsafeMutablePointer<UInt8>?, size: UInt) {
  debugLog("BEGIN \(#function)"); defer { debugLog("END \(#function)") }
  let header = unsafeBitCast(_dyld_get_image_header(i),
    to: UnsafePointer<MachHeader>.self)
  let name = String(validatingUTF8: _dyld_get_image_name(i)!)!
  var size: UInt = 0
  let address = getsegmentdata(header, "__TEXT", &size)
  return (name, address, size)
}


internal func sendImages() {
  debugLog("BEGIN \(#function)"); defer { debugLog("END \(#function)") }
  let infos = (0..<_dyld_image_count()).map(getAddressInfoForImage)

  debugLog("\(infos.count) reflection info bundles.")
  precondition(infos.count >= 1)
  sendValue(infos.count)
  for (name, address, size) in infos {
    debugLog("Sending info for \(name)")
    sendValue(address)
    sendValue(size)
  }
}

This is done when the parent requests the images:

internal func reflect(instanceAddress: UInt, kind: InstanceKind) {
  while let command = readLine(strippingNewline: true) {
    switch command {
    // others
    case String(validatingUTF8: RequestImages)!:
      sendImages()

@Joe_Groff it seems I might not need it... The only call I found that requests the images is the following (in swift-reflection-test.c):

#if defined(__APPLE__) && defined(__MACH__)
      PipeMemoryReader_receiveImages(RC, &Pipe);
#else
      PipeMemoryReader_receiveReflectionInfo(RC, &Pipe);
#endif

So I might be ok not implementing the functionality in getAddressInfoForImage(atIndex i: UInt32). What do you think?

@augusto2112 I'm not sure how much test coverage there is for PipeMemoryReader_receiveReflectionInfo, as this doesn't run on Apple platforms. Regardless, it issues REQUEST_REFLECTION_INFO, which just requires getReflectionInfoForImage. Were you able to find a way to re-implement that routine using information from the runtime?

The Reflection library should not need the entire image, only the bits that are referenced by the ReflectionInfo struct. You might need to change swift-reflection-test's memory reader implementation around a bit to only grab those parts, but it should be possible.