Swift Package Registry Service

Got it. I think it's worth further discussion. Package registry support is a major change, so any decision should be made carefully.

Yeah, I don't know how adopting this approach without any other optimizations would compare. I think the performance characteristics depend on the implementation of the dependency resolver. In practice it could be much slower, or it could be a wash — like I said before, we really won't know until we benchmark it.

@Aciid discussed some alternative strategies earlier in the thread, and I think those are worth considering. Because the specific affordance in the registry API would depend on the decision we make there, I wanted to keep that separate from the main specification.

I don't know of any current functionality like this in Swift Package Manager, but it could be added as part of supporting package registries. The specification discusses the possibility of adding removal reasons and security advisories for packages, either of which could be communicated to the user when they run swift package update.


One of the motivations for adopting a package registry is to mitigate this effect. You could delete the original source code repository without affecting the availability of existing packages.

I hope this wasn't covered yet; I can't possibly read the whole thread :-) But the thing that caught my eye is that I think you are inventing a new header here. If so, I think at a minimum it should be X-Accept-Version until it is in an official RFC.

Also, HTTP infrastructure (gateways, proxies, etc.) is, AFAIK, not required to (and often doesn't) preserve arbitrary headers. This became a little less of an issue with HTTPS, but it still exists (because backends terminate SSL at the front and use caching proxies and such behind it), especially on mobile networks.

I think the safer way to deal with versioning is the common approach of putting the version in the API URL, or maybe even better, in the content payload.

2 Likes

Nah, BCP 178 suggests deprecating that practice. With that said, it would be sensible to consider the guidance in RFC 7231 § 8.3.1 (which incidentally also suggests not prefixing the header with X-).

I agree that misbehaving intermediaries do not preserve arbitrary headers, but RFC 7230 § 3.2.1 does require them to:

A proxy MUST forward unrecognized header fields unless the field-name is listed in the Connection header field (Section 6.1) or the proxy is specifically configured to block, or otherwise transform, such fields.

It is definitely more common to version elsewhere though, and that will be more resilient to poorly-implemented HTTP infrastructure. Whether that's a good enough reason to change the behaviour is not cut-and-dried, in my view: I can see compelling arguments either way.

3 Likes

What are the arguments against the more resilient option (or for the less-resilient option)?

Setting the compatibility bar as low as possible sounds like it should be an (implicit) design goal. You can imagine there might be Swift developers in developing countries, or even developed countries, TBH, who have no choice but to use less-than-excellent infrastructure.

Fair enough. A MIME type parameter might be another option. Something like Accept/Content-Type: application/json+swiftregistry-v1 or application/json...;version=1.

(though my favorite is versioning the actual content, i.e. XML with a proper versioning attribute, if even necessary. Oops, we do JSON, sigh ;-) )

The argument against is that the other options are broadly just a bit harder to implement. Putting things in the URL is fine, but it leads to multiple URLs for the same resource, which is suboptimal (particularly for caching). Putting things in the payload is better, but it can make parsing a bit more annoying: you cannot fully parse the payload until at least a partial parse has told you what you actually want.

So the question becomes: should we not do the ideal thing because we're worried about breakage? I think the worry is largely hypothetical: I am not aware of intermediaries that are behaving this way today. But I'm aware that it's a risk.

Yup, and this has the advantage that IANA is pretty liberal with MIME type registrations. This also behaves really well with caches (everything just automatically works, as it's accepted that Accept/Content-Type are fields that affect caching).

1 Like

This also behaves really well with caches (everything just automatically works, as it's accepted that Accept/Content-Type are fields that affect caching).

The headers that affect caching need to be listed in the Vary response header, which should probably be mentioned if there ends up being a spec.
(not that actual infrastructure necessarily cares about things like RFCs and such).
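To make this concrete, a header-versioned response would need to list that header in Vary so shared caches key on it. This is only an illustrative sketch; the header names here are examples, not taken from any spec:

```
HTTP/1.1 200 OK
Content-Type: application/json
Content-Version: v1
Vary: Accept-Version, Accept
```

Without that Vary line, a shared cache could serve a v1 body to a client that asked for a different version.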

As @lukasa correctly points out, the X- prefix is deprecated by BCP 178 / RFC 6648, and according to RFC 7230 unrecognized headers should be preserved by proxies. Accept-Version and Content-Version are the most common spellings I've seen, and I didn't find any examples of these causing problems for other web APIs.

For this proposal, we considered four possible versioning strategies:

  • Path: http://api.example.com/v1
  • Subdomain: http://api.v1.example.com
  • Custom MIME type in Accept header: Accept: application/vnd.example.v1+json
  • Custom request header: Accept-Version: v1

Of these, we felt that the custom request header was the solution most likely to be implemented correctly by both clients and servers and least likely to cause problems.

For example, Accept: application/json+swiftregistry-v1 would introduce nontrivial complexity for both clients and servers, as networking libraries tend not to provide built-in handling of custom media types. I worry that this would impair content negotiation for single endpoints like GET {/namespace*}/{package}/{version} that can respond with multiple different media types.
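To illustrate the negotiation concern (the URL and media types below are just examples, not from the spec): with plain media types, a single URL can serve different representations through Accept alone, which any HTTP library handles out of the box:

```
GET /mona/LinkedList/1.1.1 HTTP/1.1
Host: registry.example.com
Accept: application/zip
Accept-Version: v1
```

Changing Accept: application/zip to Accept: application/json would select a metadata representation of the same resource. Folding the API version into the media type would instead require every client and server to build and parse custom Accept values.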

I'm not aware of any actual compatibility issues with the Accept-Version / Content-Version or other vendor-specific headers, and am much less concerned about any hypothetical networking infrastructure than the potential for implementation mistakes on the client and server.

Quoting from related thread:

This leads me to wonder whether we should reduce the core APIs to just "list versions" and "get version" and leave it up to individual registries to decide if/how they want to handle delete and publish. For instance, an implementation may want to require a release tag for publish. As for delete, I understand that it's optional, but maybe we should just exclude it to avoid confusion.

In other words, I am suggesting that the proposal shouldn't define how a registry adds package versions to its catalog. The registry's main purpose is to serve packages; it learns about the packages it has processed (by extracting metadata from files, for example) so it can provide information about them as well.

5 Likes

I support @yim_lee's suggestion of reducing the scope of the formal spec to only indexing and listing, as that is the part that will mostly affect SwiftPM.

We can leave the rest to further evolution once we have some experience in managing the lists, as there is more than one completely different approach to doing so.

Efficiency: Git allows you to create a minimal shallow clone of a specific remote branch or tag with a history truncated to one commit.

For example:

git clone -b 0.4.1 --depth 1 --no-tags --recurse-submodules --shallow-submodules https://github.com/apple/swift-argument-parser.git

See git help clone for more info.
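As a quick illustration of that one-commit truncation, here is a local experiment with a throwaway repository (it assumes git is on your PATH; all paths are temporary and the commit contents are invented):

```shell
set -eu
# Build a throwaway repository with three commits.
repo=$(mktemp -d)
git -C "$repo" init -q
for msg in one two three; do
  git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "$msg"
done
# Clone via file:// so --depth is honored (plain local paths ignore it).
shallow=$(mktemp -d)/clone
git clone -q --depth 1 --no-tags "file://$repo" "$shallow"
git -C "$shallow" rev-list --count HEAD   # prints 1: history truncated to one commit
```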

While a shallow clone can produce better results than a full clone, it's considered to be an expensive operation on the server. The GitHub Blog recently published an excellent article about various ways to speed up cloning. Here's what it had to say about shallow clones with respect to other options:

There are three ways to reduce clone sizes for repositories hosted by GitHub.

  • git clone --filter=blob:none <url> creates a blobless clone. These clones download all reachable commits and trees while fetching blobs on-demand. These clones are best for developers and build environments that span multiple builds.
  • git clone --filter=tree:0 <url> creates a treeless clone. These clones download all reachable commits while fetching trees and blobs on-demand. These clones are best for build environments where the repository will be deleted after a single build, but you still need access to commit history.
  • git clone --depth=1 <url> creates a shallow clone. These clones truncate the commit history to reduce the clone size. This creates some unexpected behavior issues, limiting which Git commands are possible. These clones also put undue stress on later fetches, so they are strongly discouraged for developer use. They are helpful for some build environments where the repository will be deleted after a single build.

Separate from this proposal, I'd be interested for Swift Package Manager to try blobless clones and see how that affects overall performance.

Also, for what it's worth: In an earlier post, I profiled shallow clones against curl && unzip and found HTTP to be faster. In most cases, the registry interface should result in less network traffic and a smaller amount of data on disk.

3 Likes

I don't think creating a shallow clone is "considered to be an expensive operation on the server". In that article the author (Derrick Stolee) says "calculating a shallow fetch is computationally more expensive". I imagine that's what you're referring to, and I didn't know that, so thanks for informing me :-). But by "shallow fetch" I think Stolee means fetching new commits into an existing shallow clone, not creating a shallow clone. I think creating a shallow clone (as opposed to another type of clone) is, for both the client and the server, the most efficient way to create a clone.

But the Swift Package Manager (SwiftPM) may occasionally need to fetch new commits (when updating one of your project's dependencies). (And as Stolee suggests, and you seem to agree, running git fetch from a blobless clone would probably be more efficient than running git fetch from a shallow clone.) So I think, as you suggest, having SwiftPM try blobless clones makes sense. Does SwiftPM currently do full clones?

Sorry, I didn't read all the replies in this thread and missed your post elaborating on speed and efficiency. Thanks for pointing that out. Nice. I think adding the --no-tags flag to the shallow clone command may make it a tiny bit faster.

Also, neither the full nor the shallow clone command downloads submodules, and GitHub archives don't include submodules either, so your benchmark may not be completely accurate for packages that have submodules, such as PromiseKit. To also include submodules, you could add the --recurse-submodules flag to both clone commands and the --shallow-submodules flag to the shallow clone command. On the archive side, you could create a release on GitHub and attach a custom ZIP file, built by running a command like find . -name .git | xargs rm -rf from within the shallow clone and then compressing the result.

(I just realized that GitHub archives include other files and directories that seem to only be used by Git or GitHub, such as .github, .gitignore, .gitmodules, .tidelift.yml, and .travis.yml. Maybe GitHub should remove those too before generating their archives.)
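As a sketch of that archive-preparation step, here is a self-contained demo with a throwaway directory standing in for a shallow clone (the directory layout and file names are invented for illustration):

```shell
set -eu
# Throwaway directory standing in for a shallow clone with a submodule.
work=$(mktemp -d)
mkdir -p "$work/clone/.git" "$work/clone/Dependencies/Sub/.git"
touch "$work/clone/Package.swift" "$work/clone/.git/config"
# Remove every .git entry, as in the command above, then archive what's left.
(cd "$work/clone" && find . -name .git | xargs rm -rf)
tar -czf "$work/package.tar.gz" -C "$work" clone
tar -tzf "$work/package.tar.gz"   # the listing contains no .git paths
```

The same idea works with zip in place of tar; the point is only that the Git metadata is stripped before the archive is produced.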

While curl && unzip may be the fastest download method, I imagine it'd be faster to update a blobless clone than to redownload everything. I guess the most efficient solution would be to use curl && unzip to download a dependency and something like rsync when it needs to be updated.

You probably already know about this, but I just found out about the Swift Package Registry. Anyway, it'd be great if GitHub Packages supported Swift packages. I guess y'all are still working on it. Thank you for your contributions and best wishes. :-)

1 Like

Hi,

Is this proposal implemented in Swift 5.7 as the proposal page mentions? It looks like the attached implementation PR is actually closed, not merged.

Thank you!

I suspect that this proposal has been abandoned.

Yes, you can see the repo was archived in February. It is not clear to me why the implementation was halted and why no alternative solution has emerged.

There’s this:

1 Like

So just to clear some things up: the changes to SwiftPM for it to be able to talk to a package registry are available in 5.7. The original proposal was accepted, and there has been recent work in SwiftPM to expose this in 5.7 (e.g. expose swift-package-registry as top level tool by tomerd · Pull Request #5679 · apple/swift-package-manager · GitHub).

The other side of it is an actual registry. I have no idea what the status (if any) of the GitHub registry is, but as @hassila pointed out, there's an implementation for Artifactory, which is primarily aimed at enterprises that want to put packages behind a firewall. I'm sure others will spin up as well.

3 Likes

@0xTim that is correct. SwiftPM 5.7 includes an implementation of the client side for the package registry per SE-0292 and can work with servers such as the one offered by JFrog's Artifactory. The PR you linked to makes configuring SwiftPM to use a registry easier, but it is not required; one can edit the configuration file manually as well.

3 Likes