[Pitch] Adding metadata to SPM Package

I think it makes sense to have a discussion rather than just blindly copying features from other package managers, which is why we have an evolution process for SwiftPM. Sure, SwiftPM might end up putting this metadata in the manifest file, but exploring possible solutions isn't a bad thing.


I don't think I saw it mentioned, but Python also includes this metadata in a module's setup.py.


Members of the community are more than welcome to propose other ideas and we will gladly discuss them. I don't think anyone has said that discussing other ideas is bad or unwelcome; we are just more curious about what those other ideas are and what makes them better than following the prior art established by many other languages.

Clearly people are comfortable and used to how other languages declare their metadata in each package's manifest. I don't think it counts as "blindly copying" when it is a very well established way of doing things across multiple languages.

The only counterargument made so far:

Both the index tools I am familiar with (Rubygems and Python Pip) parse this information from their respective package manifests. Prior art is a very valid argument. Using the two most popular package indexes as motivation for a design seems reasonable to me.

Parsing an entire readme to pull out package metadata from a potentially infinite number of formats seems excessive (not to mention impossible) when it could be easily declared in the package manifest in a very organized and standard manner.

It feels natural to include all metadata in a single location as opposed to having two locations where metadata could be contained. Oh I want the dependencies of the package? I have to look in the Package.swift, but to get the version I have to look in Metadata.txt. Or should I redeclare all the dependencies in the metadata file as well?

This is another area where we could rely on prior art to determine all the relevant metadata that is permitted. I'm not sure how frequently the available metadata keys change, but it seems like something where you have a set of keys available and then almost never have to change it.

I wouldn't be opposed to having it declared in a dictionary format to prevent it from being too rigid, although having it strongly typed prevents simple typos from causing an index to fail to pick up your package's metadata.
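To make that trade-off concrete, here is a minimal sketch. The `MetadataKey` enum and its key names are hypothetical and not part of SwiftPM; the point is only that a misspelled string key is silently ignored, while a misspelled enum case would not compile.

```swift
// Hypothetical metadata keys; none of these names exist in SwiftPM.
enum MetadataKey: String {
    case description
    case license
    case keywords
}

// Dictionary style: a misspelled key compiles fine and is silently missed.
let loose: [String: String] = [
    "description": "Helpers for centering text.",
    "licence": "BSD-3-Clause",  // typo: an index would look up "license"
]
let looseLicense = loose["license"]  // nil, and nothing warned us

// Typed style: writing `.licence` here would be a compile-time error.
let typed: [MetadataKey: String] = [
    .description: "Helpers for centering text.",
    .license: "BSD-3-Clause",
]
let typedLicense = typed[.license]
```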


@SDGGiesbrecht do you have any examples of how other languages do localization in their package metadata? All my experience with Python and Ruby packages has been with pure English manifests (even things like ActiveSupport in Ruby have their entire gemspec in English). Is this something that would have to be a requirement?


Good point @Ponyboy47 @SDGGiesbrecht about localization. And although I know it's well outside the scope of SPM, the new SwiftUI approach of assuming string localizations may be instructive here. Essentially it assumes all static strings are localizable by default, with a simple Localizable.strings lookup file. So, for example:

metadata: [
  .description("SwiftUI Extensions to help center text automatically."),
  .license(type: "BSD-3-Clause"),

  .localizable(["jp":
    ["SwiftUI Extensions to help center text automatically.": "SwiftUI拡張機能、テキストを自動的に中央揃えにする。"]
  ]),
]

Or, perhaps just point to a dictionary file (PList/JSON) within the repo, for example:

  .localizable(["jp": "localizable.jp.json"])
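For illustration, the lookup an index would perform under the first variant might look like the sketch below. The `LocalizedMetadata` type and its member names are hypothetical; only the idea of keying translations by the default string comes from the example above.

```swift
// Hypothetical model: a default-language description plus per-language
// translation tables keyed by the default string.
struct LocalizedMetadata {
    let defaultDescription: String
    let translations: [String: [String: String]]

    // Returns the translation for `language`, falling back to the default
    // string when no table or no entry exists for that language.
    func localizedDescription(for language: String) -> String {
        return translations[language]?[defaultDescription] ?? defaultDescription
    }
}

let localized = LocalizedMetadata(
    defaultDescription: "SwiftUI Extensions to help center text automatically.",
    translations: [
        "jp": [
            "SwiftUI Extensions to help center text automatically.":
                "SwiftUI拡張機能、テキストを自動的に中央揃えにする。"
        ]
    ]
)
let japanese = localized.localizedDescription(for: "jp")
let german = localized.localizedDescription(for: "de")  // no table, default is used
```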

Thanks.

I never used open source before I switched to Swift, so I don’t really know what other languages do. Swift is fairly unique in its active support of Unicode source, which allows you to code in Chinese, Arabic or Russian, and people do. For that reason I think it is much more important for Swift to handle localization than it is for the many languages out there that only permit ASCII in their identifiers anyway. You do not have to learn English to write Swift. (While my open source contributions are mostly in English, I have already needed to code entire projects in German and in Greek, albeit minus keywords like func.)

What I can do is show you how I handle multilingual metadata for Swift at the moment in the absence of an official model. Symbol documentation extends the general Swift model like this. That strategy is extended into the package manifest like this. And what’s left is declared in a separate file like this.

Right, I was just trying to say that always using the prior art should not be the only approach. It seems like we all agree on that.

SwiftPM could have a standardized way to define these in the readme, but I guess people wouldn't want to have restrictions on how their readme is structured.

Maybe a good way would be figuring out what kind of metadata is used in other communities and then starting with typed versions for the most common/useful subset while having a string-based escape hatch for further customization.
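A rough sketch of what that could look like follows. The `MetadataItem` enum, its cases, and the example URL are all made up for illustration and are not SwiftPM API; the `custom` case plays the role of the string-based escape hatch.

```swift
// Hypothetical metadata model: typed cases for the common fields plus a
// string-based escape hatch for everything else.
enum MetadataItem: Equatable {
    case description(String)
    case license(String)
    case keywords([String])
    case custom(key: String, value: String)  // escape hatch
}

let items: [MetadataItem] = [
    .description("Helpers for centering text."),
    .license("BSD-3-Clause"),
    .custom(key: "funding", value: "https://example.com/sponsor"),
]

// A consumer can handle typed cases directly and collect the rest generically.
var customFields: [String: String] = [:]
for item in items {
    if case let .custom(key, value) = item {
        customFields[key] = value
    }
}
```

If a custom key later became commonplace, it could be promoted to a typed case without breaking packages that still use the string form.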


We could also consider separating the metadata from the Package initializer. This way everything is still in the manifest, but there is a clearer distinction between the package's configuration and its metadata. It also helps with the fact that metadata can get fairly verbose.

Something like:

let package = Package(...)

let metadata = Metadata(...)

What problem does metadata solve? Who will parse it, and how? Who will respect the keywords, licence and so on? For example, I'm sure a LICENSE file will be present in addition to the metadata: section inside Package.swift.

Unless it can actually do something useful, e.g. if it can be used to set up dependencies like in Python, I mostly find this metadata section useless.

For projects that are more than a module declaration, Package.swift is already quite a noisy place that is hard to maintain, and I'd prefer cleaning it up (preferably even generating it) rather than adding other things to it.


That sounds like a reasonable idea to me. If certain "extra" metadata items ever become commonplace they could be added to the set of typed metadata options.

This also sounds like a good idea to me, but I am curious about how things that may be considered both metadata and part of the manifest would be handled (dependencies, package name, supported swift versions, etc). Will a consumer of the package metadata have to explicitly look at both the manifest information and the metadata to get everything? Or should things from the manifest be automatically included in the metadata?
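One possible answer, sketched under the assumption that the tooling does the merging: the consumer only ever reads a single combined record. All three types below are hypothetical stand-ins, and SwiftPM's real manifest model is much richer.

```swift
// Hypothetical stand-ins for the manifest and the separate metadata value.
struct ManifestInfo {
    let name: String
    let dependencies: [String]
}

struct Metadata {
    let description: String
    let license: String
}

// Combined view: fields that already live in the manifest are copied over
// automatically, so nothing has to be declared twice.
struct IndexEntry {
    let name: String
    let dependencies: [String]
    let description: String
    let license: String

    init(manifest: ManifestInfo, metadata: Metadata) {
        name = manifest.name
        dependencies = manifest.dependencies
        description = metadata.description
        license = metadata.license
    }
}

let entry = IndexEntry(
    manifest: ManifestInfo(name: "CenterText", dependencies: ["swift-log"]),
    metadata: Metadata(description: "Helpers for centering text.", license: "BSD-3-Clause")
)
```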

My experience says that package indexes tend to be the primary consumer of metadata, and this metadata makes the job of building a package index feasible. Rather than parsing READMEs, licenses, and any other generic text files, the metadata to be displayed in the index can be easily written by devs and resolved by the package index using the SwiftPM library.

While there is no Swift Package Index today, having a standard way to declare and retrieve package metadata is a good stepping stone towards getting one. Without a solid foundation for setting and retrieving metadata, building a package index is an impossible task that would require parsing text files that could come in infinite forms or having developers manually submit all their package metadata to the package index with every update. If you think maintaining the Package.swift is hard, how would you like to have to manually submit every metadata change to a package index?

Stating that "here is the way we expect your package metadata to be in the readme" would be a plausible solution, but it is more difficult to validate and enforce. A simple unnoticed typo means your metadata is not recognized, while having it in Package.swift would mean your project wouldn't even compile.


SPM resolves Package.swift by executing it and writing JSON from Package.init method into a provided file descriptor, because there is no clear way to walk through not yet compiled swift code. Now you can also assume that some similar technique will be required by the index parser thing you're talking about in order to extract metadata information from Package/Whatever.swift. That's too much complexity just to parse this I guess. That's why I assume that an initially machine readable input like JSON or YAML will be more appropriate by all means.
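To show what consuming such JSON could look like on the index side, here is a small sketch using Foundation's JSONDecoder. The JSON shape, the "metadata" key, and the `DumpedPackage` type are assumptions made for this example; they are not SwiftPM's actual output format.

```swift
import Foundation

// A trimmed, illustrative JSON document. The shape is an assumption and
// the "metadata" key is hypothetical.
let json = """
{
  "name": "CenterText",
  "metadata": {
    "description": "Helpers for centering text.",
    "license": "BSD-3-Clause"
  }
}
"""

// Hypothetical model of the part an index would care about.
struct DumpedPackage: Decodable {
    struct Metadata: Decodable {
        let description: String
        let license: String?
    }
    let name: String
    let metadata: Metadata?
}

// Decoding needs no code execution, only a JSON parser.
let dumped = try! JSONDecoder().decode(DumpedPackage.self, from: Data(json.utf8))
```

The same decoding step would apply whether the JSON is produced by executing the manifest or is written by hand by the package author.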

If you think maintaining the Package.swift is hard, how would you like to have to manually submit every metadata change to a package index?

I don't see any difference between committing a change to the Package.swift file and committing a change to JSON or a bunch of files like LICENSE. Also, there will definitely be redundancy, as SCMs usually require e.g. a LICENSE file to be available in the repo root in order to present it in the UI, and now you will have this information duplicated inside Package.swift.


With SwiftPM's current design, maintaining the package file can already be difficult for library authors: you actually want to avoid adopting new things, as doing so forces you to maintain multiple versions of the file to support older SwiftPM versions. To the point, one is better off not using the new values added to SwiftVersion and instead using .version("5.1") to avoid having to maintain lots of revs to express support. So with metadata going into the same files, it seems like we'd be asking library maintainers to support multiple versions of the package file to also deal with the metadata spec evolving over time. This then has two ripple effects: which package file does one trust for the metadata? And/or do those indexing systems have to duplicate the logic for fixing versioned manifests (names, tags, etc.) so they get that right also?
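A sketch of that workaround, assuming a manifest pinned to an older tools version; the package and target names are placeholders. The string-based `.version(...)` form lets a single manifest name a language version without relying on an enum constant that only newer tools versions provide.

```swift
// swift-tools-version:4.2
import PackageDescription

let package = Package(
    name: "Example",  // placeholder name
    targets: [
        .target(name: "Example"),
    ],
    // String form instead of a newer enum constant, so that older tools
    // can still load this same manifest.
    swiftLanguageVersions: [.v4, .v4_2, .version("5")]
)
```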

There is also the issue of parsing the Swift file. Given that this already has issues with code execution, it seems like the safer thing would be for indexes to do something like @beefon suggests, and for things to instead use the JSON file SwiftPM makes (or for SwiftPM to get a command to directly dump said JSON, or at least the metadata part). If the tools for this data are likely to end up using that JSON instead, why not just cut to the chase and have the data outside the Package file, and leave the Package file just for how to build?

Thank you for this bit of information as I did not realize that. To me though, this just means that parsing Package.swift is going to be slower than simply parsing a machine readable input file (which I already assumed anyways). The infrastructure to read Package.swift is already there and this would be a relatively simple extension of that pre-existing machinery. One may argue that the performance for an index may be a necessity and that the difference is significant enough to justify building a new set of functionality that would parse metadata directly from a JSON or YAML file.

Perhaps it would be best to not include it in SwiftPM at all but rather have it be declared by a Swift Package Index framework. Then the SPI would say "If you want to be recognized by our index, then create a metadata.json file that has any/all of the following supported keys..." This would be a perfectly valid decision to come from this discussion, but I think it further delays the possibility of getting a Swift Package Index.
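If an index did go the metadata.json route, it could still catch typos by checking keys against its published list. A minimal sketch, where the supported key set, the `validate` function, and the error type are all hypothetical:

```swift
import Foundation

// Hypothetical key set an index might publish; not an existing standard.
let supportedKeys: Set<String> = ["description", "license", "keywords", "homepage"]

enum MetadataFileError: Error {
    case notAnObject
}

// Splits a metadata.json payload into recognised fields and unknown keys,
// so a typo such as "licence" can be reported instead of silently ignored.
func validate(_ data: Data) throws -> (known: [String: Any], unknown: [String]) {
    let object = try JSONSerialization.jsonObject(with: data)
    guard let fields = object as? [String: Any] else {
        throw MetadataFileError.notAnObject
    }
    let known = fields.filter { supportedKeys.contains($0.key) }
    let unknown = fields.keys.filter { !supportedKeys.contains($0) }.sorted()
    return (known, unknown)
}

let payload = Data(#"{"description": "Helpers for centering text.", "licence": "MIT"}"#.utf8)
let result = try! validate(payload)
```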

Not that this would necessarily speed anything up either, but it would at least get us closer and make it easier for someone to go create their own, because the infrastructure would be available with guaranteed support from the language. This sure beats making people go out and create their own standard and hope it takes off. Whatever comes from this discussion, I really just want there to be a supported way for package maintainers to declare metadata and for index builders to get at it.

I would rather build the infrastructure into SwiftPM which already has much of the information that would be needed rather than creating a totally separate thing that requires duplicating both the functionality and the desired information.

If this is such a big issue then why was SwiftPM built using a swift file in the first place? There are already issues when things change in SwiftPM. There have been different versions for swift 3, 4 and 5. This is already an issue today and I don't think putting metadata would make it any worse. There is likely going to be a swift-tools-version: 6.0 and just also adding metadata during that update wouldn't make it any worse.

Metadata for a package index is fairly stable in the sense that there are a certain set of items that are frequently used and I honestly don't see this metadata section changing nearly as much as the actual Package information has in the past and potentially will in the future. Swift-tools-version: 7 is way more likely to change because of things in the Package manifest than the metadata.

While I do agree that supporting multiple SwiftPM versions is a nightmare, I think that is a separate issue to solve that is more about the issues with SwiftPM than with this pitch.

There have been posts on the forums in the past about the potential issues with code execution, etc. At the moment the only thing we know for sure would be added with 6.0 is the new enum constant. If the only real change is metadata, then it would seem to be a shame if just adding metadata forced library authors to go through the complexity required to support yet another version of the manifest just for a feature SwiftPM doesn't even use.

I guess my point is we're making this worse to support something SwiftPM isn't even going to directly use.

Here's one idea to potentially resolve this problem:

It'd be great if someone can write a full proposal for supporting multiple tools versions in a single manifest.


I think this issue is being conflated. While swift is evolving so drastically between versions most library authors are just dropping support for previous versions when they update to a new version and I think this will still be the case until module stability becomes commonplace. "Most" is just my opinion as I haven't done any real investigation and have just seen several popular SwiftPM packages that have been around since at least Swift 3 that in the past and even today have always only supported the latest version of swift and as such have also only had a single Package.swift.

From this observation, I personally think it is not as big a deal as it seems. I recognize that it will definitely affect some people, but the underlying issue is not with this pitch.

This does hold weight and is why I said earlier:

I'm going to keep arguing for keeping it in a single location where other metadata could be reused because I view the benefits of (probably) most people reusing metadata from the manifest as outweighing the cost of having the functionality in SwiftPM, but I do see the value in having a separate metadata file instead. I'll honestly be happy with whatever is decided, as long as something is decided.

As a co-maintainer of SwiftProtobuf, our policy has always been to support back one full Swift version (so as of 5.0, we support back to 4.0 toolchains). If a library is doing networking/parsing of foreign data/etc. on client devices, then you can't always assume your customers will be able to update to the latest/greatest at any time. You might have a security fix and you need to ensure adoption of your update is as easy as possible, so either you support old versions and/or you also commit to porting fixes to older branches. For our library, it has been easier to ensure we support older versions so clients can update on their own schedules.


I know that this would affect some people, but regardless of whether or not the metadata were to be added in swift-tools-version: 6.0, you would still have to support a new version of SwiftPM after the update. Adding the ability to set metadata in the Package.swift does not change this. Therefore this is not directly an issue with this pitch per se, but with SwiftPM. Or are you saying that you would be opposed to any and all updates/changes to SwiftPM, period? (At least ones requiring a new version of SwiftPM that you have to support.)

As @Aciid pointed out, there is one potential idea to resolve this issue which would mitigate many of the issues with maintaining multiple Package manifests (it would at least all go into a single file instead of several).

If there were a solution in SwiftPM which resolved the issue of maintaining separate Package.swift files for multiple SwiftPM versions, would that alleviate the majority of your concerns with placing the metadata in the Package.swift?

I don't follow, all we'll have to add is .version("6"). No need for a versioned package file.

Yes, if it truly resolved the need for it. There would likely be a time period where both are needed, but eventually that will wind down. If that system does happen, I hope we can retire/remove the old versioning methods; if they are still needed, then that implies the new thing still isn't complete, and the concerns are still valid.


My bad. I had assumed by "swift toolchains" you meant that you were needing to support multiple versions of the SwiftPM Package.swift manifest.

Yea, no, we can support versions by just listing the supported versions when there aren't other changes required. The problem cases have historically been where things are added/changed in how one has to express things in the file, and that leads to the pain points with multiple versions. That's where my fears are with adding something new that has to be expressed in the file (and might take some future changes to finish getting right).

That is basically what I already did, though I was just aiming for documentation generation, not for a package index. My strategy already loads descriptions, a license and several project links, as well as the package manifest itself. If you want to experiment with it, all you would have to do is depend on Workspace, stick it in edit mode, and expose the WSProject module as a product to get access to these two methods (1, 2) and their lighter variants. It isn’t fast (especially until SE‐0226 is implemented), but it can already do what you want. If you are aiming to put together a package index, I am certainly open to exposing and adjusting the API if it can be helpful toward your goal.