Pitch: Introduce #module to get the current module name

Implementation: pending (will do if pitch is successful enough)

Motivation

Currently Swift has a number of "magic identifiers": #fileID, #file, #filePath, #function, #line, #column, and #dsohandle, which are extremely useful, for example for debugging and logging.
In log output, it's a common practice to output the filename (and sometimes the line number) where the log message was emitted from. In most cases, however, only the basename of a file is logged, i.e. if a log message originates from /Users/me/MyProject/Sources/MyModule/MySubfolder/BestFile.swift, then commonly only the "basename" BestFile.swift is logged because the full paths can become very long.

Logging the file name works quite well if all the logs originate from your own code because chances are you will recognise the file name and can easily find it. If however a log message originates from a package you pull in, then this may be a lot harder. Let's say it comes from Utilities.swift, which package might that be from? Potentially multiple.

This is to say that it would frequently be very useful to also know the emitting module, but Swift currently doesn't properly support that. SwiftLog currently uses a hack which tries to parse the module name from the full file path, which it then uses as the source parameter for each log message (it uses the last directory name :sob:). As an aside, here is why a source parameter is useful: in structured logging, where you want to preserve metadata across module boundaries, you typically pass a single Logger (which holds onto the metadata) through to the other libraries you call. When passing the Logger through, however, we cannot attach the "source" information to the Logger itself. Instead, the source information needs to be attached to every log message, and the current module name seems like a reasonable default.
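To make the weakness of the "last directory name" approach concrete, here is a minimal sketch. The helper `guessedModuleName(fromFilePath:)` is hypothetical and is not SwiftLog's actual code; it only mimics the idea of taking the directory that directly contains the file.

```swift
// Hypothetical helper, NOT SwiftLog's real implementation: guess the
// "module" as the name of the directory that directly contains the file.
func guessedModuleName(fromFilePath path: String) -> String? {
    let components = path.split(separator: "/")
    // We need at least one directory plus the file name.
    guard components.count >= 2 else { return nil }
    return String(components[components.count - 2])
}

let guess = guessedModuleName(
    fromFilePath: "/Users/me/MyProject/Sources/MyModule/MySubfolder/BestFile.swift")
print(guess ?? "<unknown>")  // prints "MySubfolder", not "MyModule"
```

As soon as a module has subfolders, the guess is the subfolder rather than the module, which is why a compiler-provided value would be preferable.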


ADDITION (posted after and thanks to @marco.masser's helpful comment)

Please note that since Swift 5.3, there's another, actually documented(!) workaround (in the Literal Expression section of the Swift Book) to get the current module name: string parsing on #fileID, whose first path component is guaranteed to be the module name:

To parse a #fileID expression, read the module name as the text before the first slash (/) and the filename as the text after the last slash. In the future, the string might contain multiple slashes, such as MyModule/some/disambiguation/MyFile.swift.

Whilst that works better than SwiftLog's hack, it's still a workaround which may incur an allocation and has to do string parsing every time the module name is needed.
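A minimal sketch of that documented parsing rule follows. The function names `moduleName(fromFileID:)`, `fileName(fromFileID:)`, and `currentModule(fileID:)` are made up for illustration; only #fileID and the slash rule come from the Swift Book.

```swift
// Module name: the text before the first slash.
func moduleName(fromFileID fileID: String) -> Substring {
    fileID.prefix(while: { $0 != "/" })
}

// File name: the text after the last slash (or the whole string if there is none).
func fileName(fromFileID fileID: String) -> Substring {
    guard let lastSlash = fileID.lastIndex(of: "/") else { return fileID[...] }
    return fileID[fileID.index(after: lastSlash)...]
}

// Hypothetical convenience: the default `#fileID` is filled in at the call site.
// Converting to String is where an allocation may happen for longer names.
func currentModule(fileID: String = #fileID) -> String {
    String(moduleName(fromFileID: fileID))
}

let example = "MyModule/some/disambiguation/MyFile.swift"
print(moduleName(fromFileID: example))  // prints "MyModule"
print(fileName(fromFileID: example))    // prints "MyFile.swift"
```

Note that the parsing runs on every call, which is the cost the paragraph above refers to.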


Apart from logging, there are numerous other places where it's useful to be able to get the current module name.

Proposed Solution

Much like #filePath and the others, introduce #module, which holds the current module name. So if a hypothetical module BestModule did print("I am '\(#module)'"), it would print I am 'BestModule'.

Detailed Design

Much like for #fileID and the others, both of the following uses would work:

let string: String = #module 
let staticString: StaticString = #module 

ADDITION (posted after the first few comments):

It's crucial that #module would work like #filePath and the others and when used as a default value for an argument gets evaluated in the caller's context.

Example: Let's assume we have two modules: App and Framework. In the Framework module we'd have:

public func printCallerModule(_ module: String = #module) {
    print(module)
}

And in App we do

import Framework

printCallerModule()

This should print App (and not Framework) because the #module gets evaluated in the caller's context (i.e. the App module).
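Since #module doesn't exist yet, the caller-context behaviour can only be demonstrated with the existing magic identifiers. The sketch below uses #function and #line within a single file; the function names are invented for the example, and the assumption is that #module would behave the same way across module boundaries.

```swift
// The defaults are filled in wherever the function is CALLED,
// not where it is declared.
func callerFunction(_ function: String = #function) -> String { function }
func callerLine(_ line: Int = #line) -> Int { line }

func someCaller() -> String {
    // No argument passed: `#function` is evaluated here, inside someCaller().
    callerFunction()
}

print(someCaller())  // prints "someCaller()"

// Both values on the next line come from the same source line.
let sameLine = (callerLine(), #line)
print(sameLine.0 == sameLine.1)  // prints "true"
```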


Source compatibility

This is an additive change and does not have any material effect on source compatibility.

Effect on ABI stability

This change introduces new conversions at compile-time only, and so would not impact ABI.

Effect on API resilience

This is not an API-level change and would not impact resilience.

Alternatives Considered

We could not add #module.

44 Likes

+ :100:, I’d love this so much...! Thanks for picking it up Johannes.

Mostly for the logging case but there can be many other applications I think, though mostly around the debugging/developer experience like this

6 Likes

Thank you for bringing this problem to our attention!

While I believe Swift needs something that says "current" module, I'd see it as an identifier that returns a reference to the current module, which could be used in a broader sense, e.g. to disambiguate types. I'm thinking about "ThisModule.Foo" vs "NamedModule.Foo" vs "Foo".

3 Likes

Can't you do this already? Or is the difference in run time vs compile time? If so, a downside of the proposed #module compile-time directive is that there's no way to get a name of some other module, while it still remains an option with this runtime approach:

enum TypeInCurrentModule {}
let moduleName = String(reflecting: TypeInCurrentModule.self)
  .split(separator: ".")[0]
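
For illustration, the same runtime approach wrapped as a function that also works for types from other modules. The name `moduleName(of:)` is made up for this sketch, and it assumes the reflected name starts with the module name followed by a dot, which may not hold for every kind of type (local or private types can have more unusual reflected names).

```swift
// Sketch: take the text before the first "." of the fully qualified type name.
func moduleName(of type: Any.Type) -> String {
    let qualified = String(reflecting: type)  // e.g. "Swift.Int"
    return String(qualified.prefix(while: { $0 != "." }))
}

print(moduleName(of: Int.self))  // prints "Swift"
```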
2 Likes

I see. I think there's room for both tbh! #module is important because the #magicIdentifiers are the one place where you can get information from your caller.

I.e., say we have two modules: App and Framework. If in the Framework module we have:

public func printCallerModule(_ module: String = #module) {
    print(module)
}

And in App we do

import Framework

printCallerModule()

Then it will print App because the #magicIdentifiers get evaluated in the caller's context. If we had a (hypothetical) Module.Type and did public func printCallerModule(_ module: String = "\(Module.Type)") then the above would print Framework which isn't what we need.

1 Like

Yes, there are various hacks like the one you propose, but I don't think we should rely on the String description of a type (or the file path). That said, the hack you suggest is better than SwiftLog's hack tbh, so I think I'll make a PR to switch (until we get #module)! LATER EDIT: my original response is actually incorrect; this hack doesn't work because we can't forward the caller's module name. Full correction here.

And yes, ideally this would be done at compile-time so that we can guarantee to do this without allocations (which is rather important I think).

2 Likes

Actually, I misspoke. Unfortunately, you can't get the desired behaviour of evaluating that in the caller's module, like you can with the #magicIDs. I think your suggestion suffers from the same problem as the ThisModule suggestion: you can't forward the caller's module as you could with public func printCallerModule(_ module: String = #module).

2 Likes

(I edited the pitch to incorporate the important functionality of being able to evaluate #module in the caller's context using a defaulted argument. Much like I clarified in the comments above. Thanks @krzyzanowskim & @Max_Desiatov whose suggestions made me aware that I forgot a crucial bit of information.)

3 Likes

I would also like this very much!

But I think your pitch should mention the existing documented workaround using #fileID. I’m currently using this to get the “current” Bundle of the caller without having to add a separate type to each and every module where this is called. This is officially documented in the Literal Expression section of the Swift Book:

To parse a #fileID expression, read the module name as the text before the first slash (/) and the filename as the text after the last slash. In the future, the string might contain multiple slashes, such as MyModule/some/disambiguation/MyFile.swift.

Essentially, it boils down to this:

extension Bundle {

    static func current(fileID: StaticString = #fileID) -> Bundle {
        let moduleName = String("\(fileID)".prefix(while: { $0 != "/" }))
        // Look for a Bundle whose `executableName` name is `moduleName`
        // ... plus some other logic to support resource bundles from Swift Packages.
    }
}
2 Likes

Thank you, indeed, added!

1 Like

This would be great. I wonder if there's a way here to also address the hole in the language around being able to make qualified references to the current module, without having to know the current module's name. Today, if you have:

struct Foo {
  struct Foo { /* xxx */ }
}

the only way to refer specifically to the outer Foo inside the body of the inner Foo is as <current module name>.Foo. Could #module resolve to a value that represents the current module, which can be stringified by "\(#module)", but can also be used for top-level qualified references as #module.Foo?
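
A small sketch of that shadowing, together with one workaround that is possible today without knowing the module name: a top-level typealias. The alias name `OuterFoo` and the `label` properties are hypothetical and only exist for this illustration.

```swift
struct Foo {
    static let label = "outer"

    struct Foo {
        static let label = "inner"

        // Inside the inner type, the bare name `Foo` finds the inner Foo.
        static func unqualified() -> String { Foo.label }

        // The alias is declared at file scope, where `Foo` still means the outer type.
        static func viaAlias() -> String { OuterFoo.label }
    }
}

// Hypothetical workaround: capture the outer type under a second name.
typealias OuterFoo = Foo

print(Foo.Foo.unqualified())  // prints "inner"
print(Foo.Foo.viaAlias())     // prints "outer"
```

The alias has to be added by hand for every shadowed name, which is why a generic spelling for "the current module" would be nicer.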

12 Likes

Right, I think this is also a problem we should fix but I think that these are two distinct problems for these reasons:

  • Currently, if you do public func foo(_ file: String = "\(#file)"), then the #file is evaluated in the callee's context, whereas with public func foo(_ file: String = #file) it is evaluated in the caller's context. Evaluating this in the caller's context is the desired behaviour. (And I don't think #module and #file etc. should behave differently.)
  • To me, #module.Foo does look a little odd and much like @krzyzanowskim proposed, we could have another pitch for something like CurrentModule.Foo (CurrentModule akin to Self).

Of course, we could also have #module be super polymorphic and be able to yield a String, a StaticString, or a Module :slight_smile:.

Another option might be an apostrophe suffix, when referring to the shadowed outer identifier:
Foo, Foo', Foo'', etc.

Just as a data point, Scala's solution is _root_, so one can refer to the outer Foo as _root_.Foo.

That said, I don't advocate for the "root" spelling, but some magic spelling for the "PackageRoot" or CurrentModule or actually the #module.Foo all look nice (don't mind the #module there at all tbh hm...) :slight_smile:

// There's also imports with aliases but let's not go there I think.

This sounds like useful functionality to have.

I don't like the proliferation of yet-another-pound-keyword that makes it hard to pass around contextual information. We're already forced to pass around file:line:function: triples in a whole bunch of different places and, frankly, it's annoying.

So instead of adding #module, can we add #context that becomes a compiler-generated struct value that has a .module field? (and a .file and a .line and …). Then we can add all sorts of cool things to it in the future as needs arise and evolve and not have to splat out more arguments everywhere to all our context-bearing functions.
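
To give a rough idea of what such a value could look like, here is a library-level approximation built from the magic identifiers that exist today. `SourceContext`, `here(...)`, and `log(_:context:)` are invented names, and the module is derived by parsing #fileID; a real #context would presumably be filled in by the compiler instead of by an explicit `.here()` call.

```swift
// Hypothetical stand-in for the value a compiler-provided #context might produce.
struct SourceContext {
    let fileID: StaticString
    let function: StaticString
    let line: UInt
    let column: UInt

    // Derived from #fileID: the text before the first slash.
    var module: String {
        String(fileID.description.prefix(while: { $0 != "/" }))
    }

    // The defaults are evaluated wherever `here()` is called.
    static func here(
        fileID: StaticString = #fileID,
        function: StaticString = #function,
        line: UInt = #line,
        column: UInt = #column
    ) -> SourceContext {
        SourceContext(fileID: fileID, function: function, line: line, column: column)
    }
}

// One context parameter instead of separate file/line/function parameters.
func log(_ message: String, context: SourceContext) {
    print("[\(context.module)] \(context.function):\(context.line) \(message)")
}

func doWork() {
    log("starting", context: .here())
}

doWork()
```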

28 Likes

This has been suggested in almost all of the last few pitches for new magic info keywords, can we please finally get it?

8 Likes

So this does sound sensible, however there's a slight problem I think. That struct would pack all context which would start out to be

  • fileID: 2 words as String (and potential ARC traffic) / 2 words + 1 byte (ie. 3 registers) as StaticString
  • file: 2 words as String (and potential ARC traffic) / 2 words + 1 byte (ie. 3 registers) as StaticString
  • filePath: 2 words as String (and potential ARC traffic) / 2 words + 1 byte (ie. 3 registers) as StaticString
  • function: 2 words as String (and potential ARC traffic) / 2 words + 1 byte (ie. 3 registers) as StaticString
  • line: 1 word
  • column: 1 word
  • dsohandle: 1 word
  • module: 2 words as String (and potential ARC traffic) / 2 words + 1 byte (ie. 3 registers) as StaticString

Summing that up, such a struct would be

  • if the strings are Strings: 13 words & code (that mostly wouldn't do any ARC but potentially can) for 5 ARC retain/release pairs
  • if the strings are StaticStrings: 13 words and 5 bytes (which with padding would probably be 18 words, and no ARC)

That's an extremely wide struct with (if modelled as String) quite a bit of code for ARC (that will, because of immortal strings, almost never do anything). So I think the problem could be that most people won't need all the information, so to avoid a pretty high cost for calling such functions, they wouldn't actually use #context and would instead use whatever combination of #file/#line/#module they need...
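
The sizes can be checked with MemoryLayout. The two structs below are hypothetical layouts for such a context value (five string-like fields plus line, column, and dsohandle); the exact numbers depend on the platform and on the field order, so treat the output as something to inspect rather than a fixed result.

```swift
// Hypothetical layout with String fields.
struct ContextWithStrings {
    var fileID: String
    var file: String
    var filePath: String
    var function: String
    var module: String
    var line: Int
    var column: Int
    var dsohandle: UnsafeRawPointer
}

// The same hypothetical layout with StaticString fields.
struct ContextWithStaticStrings {
    var fileID: StaticString
    var file: StaticString
    var filePath: StaticString
    var function: StaticString
    var module: StaticString
    var line: Int
    var column: Int
    var dsohandle: UnsafeRawPointer
}

let word = MemoryLayout<Int>.size
print("String size in bytes:", MemoryLayout<String>.size)
print("StaticString size/stride in bytes:",
      MemoryLayout<StaticString>.size, MemoryLayout<StaticString>.stride)
print("String-based context in words:", MemoryLayout<ContextWithStrings>.size / word)
print("StaticString-based context in words:",
      MemoryLayout<ContextWithStaticStrings>.size / word)
```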

Of course, we could build some clever data structure but it would still be rather wide and we can't just put it on the heap because that requires an allocation. Maybe we could put all the #context values that appear in the whole program in a special section of the binary but that also doesn't immediately seem like a great idea. Not sure, what do people think?

6 Likes

Given that we're talking about some compile-time magic anyway, is there a chance that something like #context.line could avoid the extra allocations and just put the line directly into the code? i.e. don't allocate the wide struct unless it's actually used by name, like a function that takes a context: CoolCodeContextStruct = #context parameter. In most cases, #context would then act more like a namespace than a proper type. To put it another way, it would be code that's evaluated at compile time, not run time.

3 Likes

Yes, this is what has been talked about before. Usually it's spelled #context(line) or something to separate it from property access, but either form works for me.

2 Likes

These are great points… but I'm also not sure they apply (and please let me know if I'm missing something).

I'm not sure that StaticContext (to give it a name) would need to be passed around like a regular struct, because like StaticString, the data would live in read-only sections of the binary itself. We'd be passing around a pointer to that location and the struct (if it's a struct; maybe it's a class? I don't know) would be reading things out from offsets at that location.

As @ZevEisenberg and @Jon_Shier point out, there's also potentially the opportunity for slimming down a particular StaticContext instance based on what's retrieved from it.

Like I said, I could totally be off-base here, but I'm expecting that static, compiler-produced data lives in immutable memory as a __TEXT or __DATA segment or whatever, and that any API that uses it would be referring to those portions, and not copying it around all over the place.
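
A very rough sketch of that idea: the context data lives in one shared, immutable place, and what gets passed around is a one-word handle. Everything here is hypothetical (`StaticContextRecord`, `contextTable`, `StaticContext`, and the sample records); a global array merely stands in for a read-only section of the binary, and an index stands in for the pointer.

```swift
// Hypothetical record the compiler would emit once per use site.
struct StaticContextRecord {
    let fileID: StaticString
    let function: StaticString
    let line: Int
}

// Stand-in for a read-only section of the binary (made-up sample data).
let contextTable: [StaticContextRecord] = [
    StaticContextRecord(fileID: "App/Main.swift", function: "run()", line: 12),
    StaticContextRecord(fileID: "Framework/Utilities.swift", function: "helper()", line: 40),
]

// One-word handle: only an index is passed around, fields are read on demand.
struct StaticContext {
    let index: Int

    var line: Int { contextTable[index].line }
    var function: String { contextTable[index].function.description }
    var module: String {
        String(contextTable[index].fileID.description.prefix(while: { $0 != "/" }))
    }
}

let context = StaticContext(index: 1)
print(context.module, context.function, context.line)  // prints "Framework helper() 40"
print(MemoryLayout<StaticContext>.size == MemoryLayout<Int>.size)  // prints "true"
```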

5 Likes