[Pitch] Adding metadata to SPM Package

Hi, per a suggestion from @ddunbar I'd like to pitch the idea of adding a .metadata category to the SPM Package definition. The motivation is to support the (eventual) expansion of SPM to more user-facing targets with the addition of SwiftUI, Xcode 11 support for SPM, etc. This follows common practice with other package managers such as gems, npm, and pod specs.

Further, I would like to pitch a Swifty version of this metadata. Something like the following would be ideal for our use cases:

let package = Package(
    name: "MySwiftProject",
    products: [
        .library(
            name: "MySwiftProject",
            targets: ["MySwiftProject"]),
    ],
    targets: [
        .target(
            name: "MySwiftProject",
            dependencies: [])
    ],
    metadata: [
        .description("SwiftUI Extensions to help center text automatically."),
        .license(type: "BSD-3-Clause"),
        .homepage(url: "https://github.com/jasonhawkins/SwiftUIExtensions"),
        .screenshots(urls: ["https://github.com/moflo/pic1.png", "https://github.com/moflo/pic2.png"]),
        .bugs(url: "https://github.com/moflo/SwiftUIExtensions", email: "bugs@moflo.me"),
        .keywords(["keyword1", "keyword2", "keyword3"])
    ]
)

Thank you!

I have no strong opinions about whether or not this belongs in the package manager itself. As an author of some tools that use this sort of information, I am already accustomed to a combination of a separate configuration file and familiar /// documentation comments in the package manifest.

However, if the manifest itself does start taking on descriptions intended for presentation to humans, then I really think it should support localization:

.descriptions([
    "en-US": "SwiftUI Extensions to help center text automatically.",
    "en-GB": "SwiftUI Extensions to help centre text automatically.",
    "de": "Erweiterungen zu SwiftUI, um Text automatisch zu zentrieren.",
    "fr": "Des extensions pour SwiftUI, afin de centrer du texte automatiquement."
]),
.homepage(urls: [
    "en-US": "https://swiftui-ext.com/en-US/",
    "en-GB": "https://swiftui-ext.com/en-GB/",
    "de": "https://swiftui-ext.com/de-DE/",
    "fr": "https://swiftui-ext.com/fr-FR/"
]),
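
If localized fields like these were adopted, a consumer such as an index would need some rule for choosing an entry when the exact language tag is missing. The following is only a minimal sketch of one possible fallback rule; `LocalizedText`, `defaultTag` and `resolve(for:)` are hypothetical names for illustration and are not part of SwiftPM.

```swift
import Foundation

// Hypothetical helper: neither `LocalizedText` nor `resolve(for:)` exists in SwiftPM.
struct LocalizedText {
    /// Maps language tags such as "en-GB" or "de" to the localized string.
    let translations: [String: String]
    /// Tag to fall back on when nothing else matches (an assumption of this sketch).
    let defaultTag: String

    /// Tries the exact tag first, then the bare language ("de-AT" -> "de"),
    /// then the alphabetically first entry sharing that language,
    /// and finally the default tag.
    func resolve(for tag: String) -> String? {
        if let exact = translations[tag] { return exact }
        let language = tag.split(separator: "-").first.map { String($0) } ?? tag
        if let base = translations[language] { return base }
        let sibling = translations.keys
            .filter { $0.hasPrefix(language + "-") }
            .sorted()
            .first
        if let sibling = sibling { return translations[sibling] }
        return translations[defaultTag]
    }
}

let descriptions = LocalizedText(
    translations: [
        "en-US": "SwiftUI Extensions to help center text automatically.",
        "en-GB": "SwiftUI Extensions to help centre text automatically.",
        "de": "Erweiterungen zu SwiftUI, um Text automatisch zu zentrieren.",
    ],
    defaultTag: "en-US"
)
```

With this rule, `descriptions.resolve(for: "de-AT")` falls back to the "de" entry, and a tag with no related entry at all, such as "ja", falls back to the "en-US" default. Picking the alphabetically first sibling is an arbitrary choice made only to keep the sketch deterministic.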

I am not sure if I like a typed version for metadata. It's not clear what fields should go there and how to evolve it. Most of these can be derived using the git host (GitHub/GitLab) if they have the relevant APIs. It feels like this metadata belongs in a package index rather than the manifest.

Sure, but there is precedent in all of the references I mentioned, both for typed metadata fields (e.g., CocoaPods' Ruby-based specs being used for search) and for somewhat obvious derived fields being curiously non-obvious (e.g., GitHub Packages pulling “homepage” info from a package.json that links to a GitLab public repo!).

How would the metadata get into such an index if it wasn't part of the manifest? A secondary manifest? That seems unnecessary. Is there any particular reason why you feel SPM shouldn't follow the rest of the industry here?

As for the actual proposal, we may want metadata to be a type with an initializer, rather than an array of properties, if only to remove the possibility of duplicate entries. The parameters could default to nil and be ordered alphabetically, and new entries could be added over time without breaking anything, I think. Once a full proposal is sketched out we can discuss the exact types of metadata and possible values.
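
To make the initializer idea concrete, here is a minimal sketch. `PackageMetadata` and its fields are hypothetical and chosen only for illustration; nothing like this exists in SwiftPM today.

```swift
// Hypothetical type: SwiftPM does not define `PackageMetadata`.
struct PackageMetadata {
    var bugs: String?
    var description: String?
    var homepage: String?
    var keywords: [String]
    var license: String?

    // Parameters are alphabetical and all have defaults, so a field can be
    // added later without breaking existing call sites.
    init(
        bugs: String? = nil,
        description: String? = nil,
        homepage: String? = nil,
        keywords: [String] = [],
        license: String? = nil
    ) {
        self.bugs = bugs
        self.description = description
        self.homepage = homepage
        self.keywords = keywords
        self.license = license
    }
}

// Only the fields that matter to the author need to be spelled out.
let metadata = PackageMetadata(
    description: "SwiftUI Extensions to help center text automatically.",
    license: "BSD-3-Clause"
)
```

Because each field is a labeled parameter, passing the same label twice is rejected by the compiler, which is the duplicate-entry protection an array of cases cannot give.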

Maybe? As @SDGGiesbrecht said, there are other considerations like localization. I don't like the idea of filling the manifest with properties that are not related to package configuration. This metadata is usually already present in the readme file of the package. Maybe the index could extract it from there?

That's not what I am saying. Taking inspiration from the industry is definitely good, but we also need to consider how and in what form these features make sense for SwiftPM.

I guess my question was, what makes SPM different from every other package manager which supports metadata like this? How are its considerations for what "makes sense" to support any different? Is there a design document that may outline such considerations?

I think it makes sense to have a discussion rather than just blindly copying features from other package managers, which is why we have an evolution process for SwiftPM. Sure, SwiftPM might end up putting this metadata in the manifest file, but exploring possible solutions isn't a bad thing.

I don't think I saw it mentioned, but Python also includes this metadata in a module's setup.py.


Members of the community are more than welcome to propose other ideas and we will gladly discuss them. I don't think anyone has said that discussing other ideas is bad or unwelcome, but we are curious about what those other ideas are and what makes them better than following the prior art established by many other languages.

Clearly people are comfortable and used to how other languages declare their metadata in each package's manifest. I don't think it counts as "blindly copying" when it is a very well established way of doing things across multiple languages.

The only counterargument made so far:

Both of the index tools I am familiar with (RubyGems and Python's pip) parse this information from their respective package manifests. Prior art is a very valid argument. Using the two most popular package indexes as motivation for a design seems reasonable to me.

Parsing an entire readme to pull out package metadata from a potentially infinite number of formats seems excessive (not to mention impossible) when it could be easily declared in the package manifest in a very organized and standard manner.

It feels natural to include all metadata in a single location as opposed to having two locations where metadata could be contained. Oh I want the dependencies of the package? I have to look in the Package.swift, but to get the version I have to look in Metadata.txt. Or should I redeclare all the dependencies in the metadata file as well?

This is another area where we could rely on prior art to determine all the relevant metadata that is permitted. I'm not sure how frequently the available metadata keys change. It seems like you start with a set of available keys and then almost never have to change it.

I wouldn't be opposed to having it declared in a dictionary format to prevent it from being too rigid, although having it strongly typed prevents simple typos from causing an index to fail to pick up your package's metadata.


@SDGGiesbrecht do you have any examples of how other languages do localization in their package metadata? All my experience with Python and Ruby packages has been with pure English manifests (even something like ActiveSupport in Ruby has its entire gemspec in English). Is this something that would have to be a requirement?

Good point @Ponyboy47 @SDGGiesbrecht about localization. And although I know it's well outside the scope of SPM, the new SwiftUI approach to string localization may be instructive here. Essentially, it assumes all static strings are localizable by default, with a simple Localizable.strings lookup file. So, for example:

metadata: [
  .description("SwiftUI Extensions to help center text automatically."),
  .license(type: "BSD-3-Clause"),

  .localizable(["jp":
    ["SwiftUI Extensions to help center text automatically.": "SwiftUI拡張機能、テキストを自動的に中央揃えにする。"]
  ]),
]

Or, perhaps just point to a dictionary file (PList/JSON) within the repo, for example:

  .localizable(["jp": "localizable.jp.json"])

Thanks.

I never used open source before I switched to Swift, so I don’t really know what other languages do. Swift is fairly unique in its active support of Unicode source, which allows you to code in Chinese, Arabic or Russian (and people do). For that reason I think it is much more important for Swift to handle localization than it is for the many languages out there that only permit ASCII in their identifiers anyway. You do not have to learn English to write Swift. (While my open source contributions are mostly in English, I have already needed to code entire projects in German and in Greek, albeit minus keywords like func.)

What I can do is show you how I handle multilingual metadata for Swift at the moment in the absence of an official model. Symbol documentation extends the general Swift model like this. That strategy is extended into the package manifest like this. And what’s left is declared in a separate file like this.

Right, I was just trying to say that always using the prior art should not be the only approach. It seems like we all agree on that.

SwiftPM could have a standardized way to define these in the readme, but I guess people wouldn't want to have restrictions on how their readme is structured.

Maybe a good way would be figuring out what kind of metadata is used in other communities and then starting with typed versions for the most common/useful subset while having a string-based escape hatch for further customization.
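
As a rough illustration of typed cases combined with a string-based escape hatch, consider the sketch below. `MetadataItem`, its cases, and `duplicateKeys(in:)` are hypothetical names invented for this example; they are not part of PackageDescription, and the "funding" key and its URL are placeholders.

```swift
// Hypothetical API: `MetadataItem` is not part of PackageDescription.
enum MetadataItem: Equatable {
    case description(String)
    case license(type: String)
    case keywords([String])
    /// String-based escape hatch for keys that have no typed case yet.
    case custom(key: String, value: String)

    /// Key under which a consumer such as an index would file this item.
    var key: String {
        switch self {
        case .description: return "description"
        case .license: return "license"
        case .keywords: return "keywords"
        case .custom(let key, _): return key
        }
    }
}

/// Returns the keys that occur more than once. An array-based `metadata:`
/// parameter cannot rule these out at compile time, so a tool would have
/// to check for them at run time.
func duplicateKeys(in items: [MetadataItem]) -> [String] {
    var counts: [String: Int] = [:]
    for item in items { counts[item.key, default: 0] += 1 }
    return counts.filter { $0.value > 1 }.keys.sorted()
}

let items: [MetadataItem] = [
    .description("SwiftUI Extensions to help center text automatically."),
    .license(type: "BSD-3-Clause"),
    .custom(key: "funding", value: "https://example.com/sponsor"),
    .license(type: "MIT"),
]
```

Here a misspelled case name fails to compile, while a misspelled `custom` key would pass silently, which is the trade-off between the typed and string-based parts discussed above.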


We could also consider separating the metadata from the Package initializer. This way everything is still in the manifest, but there is a clearer distinction between the package's configuration and its metadata. It also helps with the fact that metadata can get fairly verbose.

Something like:

let package = Package(...)

let metadata = Metadata(...)

What problem does metadata solve? Who will parse it, and how? Who will respect the keywords, licence and so on? For example, I'm sure a LICENSE file will be present in addition to the metadata: section inside Package.swift.

Unless it actually can do something useful, e.g. if it can be used to set up dependencies like in Python, I mostly find this metadata section useless.

Package.swift for projects that are more than a module declaration is already quite a noisy place that is hard to maintain, and I'd prefer cleaning it up (preferably even generating it) rather than adding other things to it.

That sounds like a reasonable idea to me. If certain "extra" metadata items ever become commonplace, they could be added to the set of typed metadata options.

This also sounds like a good idea to me, but I am curious about how things that may be considered both metadata and part of the manifest would be handled (dependencies, package name, supported Swift versions, etc). Will a consumer of the package metadata have to explicitly look at both the manifest information and the metadata to get everything? Or should things from the manifest be automatically included in the metadata?

My experience says that package indexes tend to be the primary consumer of metadata, and this metadata makes the job of building a package index feasible. Rather than parsing READMEs, licenses, and any other generic text files, the metadata to be displayed in the index can be easily written by devs and resolved by the package index using the SwiftPM library.

While there is no Swift Package Index today, having a standard way to declare and retrieve package metadata is a good stepping stone towards getting one. Without a solid foundation for setting and retrieving metadata, building a package index is an impossible task that would require parsing text files that could come in infinite forms or having developers manually submit all their package metadata to the package index with every update. If you think maintaining the Package.swift is hard, how would you like to have to manually submit every metadata change to a package index?

Stating that "here is the way we expect your package metadata to be in the readme" would be a plausible solution, but is more difficult to validate and enforce. A simple unnoticed typo means your metadata is not recognized, while having it in Package.swift would mean your project wouldn't even compile.

SPM resolves Package.swift by executing it and writing JSON from the Package.init method into a provided file descriptor, because there is no clear way to walk through not-yet-compiled Swift code. You can assume that a similar technique would be required by the index parser you're talking about in order to extract metadata from Package/Whatever.swift. That's too much complexity just to parse this, I guess. That's why I assume that an input that is machine-readable from the start, like JSON or YAML, would be more appropriate in every respect.
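
For a sense of what the consumer side could look like once the data is available as JSON (SwiftPM can already print the parsed manifest with `swift package dump-package`), here is a minimal sketch. The `metadata` object and its keys are assumptions made up for this example; SwiftPM's actual JSON output contains no such section, and `ManifestDump` and `parseManifest` are hypothetical names.

```swift
import Foundation

// Assumed shape: `metadata` is a hypothetical key; SwiftPM's real
// manifest JSON has no such section.
struct ManifestDump: Decodable {
    struct Metadata: Decodable {
        let description: String?
        let license: String?
        let keywords: [String]?
    }
    let name: String
    let metadata: Metadata?
}

// Sample input, hand-written for this sketch rather than produced by SwiftPM.
let json = """
{
  "name": "MySwiftProject",
  "metadata": {
    "description": "SwiftUI Extensions to help center text automatically.",
    "license": "BSD-3-Clause",
    "keywords": ["keyword1", "keyword2", "keyword3"]
  }
}
"""

/// Decodes the JSON text; keys not declared above are simply ignored.
func parseManifest(_ text: String) throws -> ManifestDump {
    return try JSONDecoder().decode(ManifestDump.self, from: Data(text.utf8))
}
```

An index written this way never has to execute Package.swift itself; it only needs whatever produced the JSON to have done so.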

If you think maintaining the Package.swift is hard, how would you like to have to manually submit every metadata change to a package index?

I don't see any difference between committing a change to Package.swift and committing a change to JSON or a bunch of files like LICENSE. Also, there will definitely be redundancy, as SCMs usually require a file such as LICENSE to be available in the repo root in order to present it in the UI, and now you would have this information duplicated inside Package.swift.

With SwiftPM's current design, maintaining the package file can already be difficult for library authors: you actually want to avoid adopting new things, as that forces you to maintain multiple versions of the file to support older SwiftPM versions. To the point, one is better off not using the new values added to SwiftVersion and instead using .version("5.1") to avoid having to maintain lots of revisions to express support. So with metadata going into the same files, it seems like we'd be asking library maintainers to support multiple versions of the package file to also deal with the metadata spec evolving over time. This then has two ripple effects: which package file does one trust for the metadata? And/or do those indexing systems have to duplicate the logic for fixing versioned manifests (names, tags, etc.) so they get that right also?
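
The workaround mentioned above looks roughly like the manifest fragment below. It is a sketch only: the tools version and the package and target names are placeholders, and it assumes the matching sources exist on disk.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "MySwiftProject",
    targets: [
        .target(
            name: "MySwiftProject",
            dependencies: [])
    ],
    // The string-based case avoids depending on an enum constant that
    // only newer versions of PackageDescription define.
    swiftLanguageVersions: [.version("5.1")]
)
```

The alternative, adopting a newer enum constant, typically means shipping an additional version-specific manifest next to Package.swift, which is the maintenance burden described above.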

There is also the issue of parsing the Swift file. Given that it already has issues with code execution, it seems like the safer thing would be for indexes to do something like @beefon suggests, and for tools to instead use the JSON file SwiftPM makes (or for SwiftPM to get a command to directly dump said JSON, or at least the metadata part). If the tools for this data are likely to end up using that JSON anyway, why not just cut to the chase and have the data outside the Package file, and leave the Package file just for how to build?

Thank you for this bit of information as I did not realize that. To me though, this just means that parsing Package.swift is going to be slower than simply parsing a machine readable input file (which I already assumed anyways). The infrastructure to read Package.swift is already there and this would be a relatively simple extension of that pre-existing machinery. One may argue that the performance for an index may be a necessity and that the difference is significant enough to justify building a new set of functionality that would parse metadata directly from a JSON or YAML file.

Perhaps it would be best to not include it in SwiftPM at all, but rather have it be declared by a Swift Package Index framework. Then the SPI would say "If you want to be recognized by our index, then create a metadata.json file that has any/all of the following supported keys..." This would be a perfectly valid decision to come from this discussion, but I think it further delays the possibility of getting a Swift Package Index.

Not that this would necessarily speed anything up either, but it at least gets us closer and makes it easier for someone to go create their own, because the infrastructure would be available with guaranteed support from the language. This sure beats making people go out and create their own standard and hope it takes off. Whatever comes from this discussion, I really just want there to be a supported way for package maintainers to declare metadata and for index builders to get at the metadata.

I would rather build the infrastructure into SwiftPM which already has much of the information that would be needed rather than creating a totally separate thing that requires duplicating both the functionality and the desired information.

If this is such a big issue, then why was SwiftPM built using a Swift file in the first place? There are already issues when things change in SwiftPM. There have been different versions for Swift 3, 4 and 5. This is already an issue today, and I don't think adding metadata would make it any worse. There is likely going to be a swift-tools-version: 6.0, and adding metadata during that same update wouldn't make it any worse either.

Metadata for a package index is fairly stable, in the sense that there is a certain set of items that are frequently used, and I honestly don't see this metadata section changing nearly as much as the actual Package information has in the past and potentially will in the future. Swift-tools-version: 7 is way more likely to change because of things in the Package manifest than because of the metadata.

While I do agree that supporting multiple SwiftPM versions is a nightmare, I think that is a separate issue to solve that is more about the issues with SwiftPM than with this pitch.

There have been posts on the forums in the past about the potential issues with code execution, etc. At the moment, the only thing we know for sure would be added with 6.0 is the new enum constant. If the only real change is metadata, then it would seem to be a shame if just adding metadata forced library authors to go through the complexity required to support yet another version of the manifest, just for a feature SwiftPM doesn't even use.

I guess my point is we're making this worse to support something SwiftPM isn't even going to directly use.

Here's one idea to potentially resolve this problem:

It'd be great if someone can write a full proposal for supporting multiple tools versions in a single manifest.