Package Registry Service - Publish Endpoint

Hi everyone,

Early in the process of pitching the package registry service, we decided to reduce scope to only the endpoints necessary for resolving package dependencies through a registry. Now that SE-0292 is accepted, I'm excited to revisit this functionality.

Below is our pitch for adding a new publishing endpoint to the package registry specification, which would provide a standard way for new package releases to be added to a registry. The first half of the post describes the feature in the structure of a Swift Evolution proposal. The second half contains an update to the registry specification, with more precise details about what this means for implementors.

Thanks to @yim_lee and @mmarston for providing some excellent feedback on an earlier draft of this proposal. I look forward to getting even more feedback from the community ahead of our formal proposal.


Package Registry Service - Publish Endpoint

Introduction

The package registry service defines endpoints for fetching packages.

This proposal extends the existing package registry specification with endpoints for publishing a package release.

Motivation

A package registry is responsible for determining which package releases are made available to a consumer.

Currently, the availability of a package release is determined by an out-of-band process. For example, a registry may consult an index of public Swift packages and make releases available for each tag with a valid version number.

Having a standard endpoint for publishing a new release to a package registry would empower maintainers to distribute their software and promote interoperability across service providers.

Proposed solution

We propose to add the following endpoint to the registry specification:

| Method | Path                      | Description              |
| ------ | ------------------------- | ------------------------ |
| POST   | /{scope}/{name}/{version} | Create a package release |

The goal of this proposal is to provide enough definition to ensure a secure, robust mechanism for distributing software while allowing individual registries enough flexibility in their governance and operation. For instance, support for this endpoint would be optional, so package registries may elect not to allow packages to be published. And because there's an expectation of durability — that is, package releases aren't removed after they're published — registries make the ultimate determination of what is made available.

Detailed design

This proposal amends the registry specification with a new, optional endpoint that a registry may implement to support the publication of packages through the web service interface. To understand what the feature does and how it works, consider the following use case:

A maintainer of an open-source Swift package (mona.LinkedList) creates a new release (version 1.1.1), and wants to submit it to a registry (packages.example.com) for distribution.

First, they run the swift package archive-source subcommand to generate a Zip file (LinkedList-1.1.1.zip) of their package.

$ swift package archive-source

Next, they upload their release to a package registry by making the following request:

$ curl -X POST --netrc \
     -H "Accept: application/vnd.swift.registry.v1+json" \
     -F source-archive="@LinkedList-1.1.1.zip" \
     "https://packages.example.com/mona/LinkedList/1.1.1"

The registry can respond to this request synchronously or asynchronously. This allows the server an opportunity to perform any necessary analysis and processing to ensure software quality and update its data stores.

After receiving and processing this request, the registry can make mona.LinkedList at version 1.1.1 available by including it in the response to GET /mona/LinkedList.

$ curl -X GET -H "Accept: application/vnd.swift.registry.v1+json" \
         "https://packages.example.com/mona/LinkedList" \
         | jq ".[] | keys"

[ "1.0.0", "1.1.0", "1.1.1" ]

The next time a developer with a package that depends on mona.LinkedList resolves the dependencies of that package, Swift Package Manager would see 1.1.1, and may attempt to update to this new version. If the version is selected, the client would download the source archive for this release by sending the request GET /mona/LinkedList/1.1.1.zip.

Security

Although this proposal has no direct impact on Swift Package Manager, it's important to consider the security implications of introducing a publishing endpoint to the Swift package ecosystem. To do this, we employ the STRIDE mnemonic below:

Spoofing

An attacker could attempt to impersonate a package maintainer in order to publish a new release containing malicious code.

Because the likelihood and potential impact of such an attack are high, registry service providers should take all necessary precautions. The registry specification recommends the use of multi-factor authentication for all requests to publish a package release.

Additional countermeasures like rate-limiting suspicious requests and analyzing uploaded source archives can also help mitigate the risk of this kind of attack.

An attacker could also attempt to trick users into downloading malicious code by publishing a package with an identifier similar to a legitimate one (for example, 4pple.swift-nio, which looks like apple.swift-nio). A registry can mitigate typosquatting attacks like this by comparing the similarity of a new submission to existing package names with a string metric like Damerau–Levenshtein distance.
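
As a rough illustration of that check, here is a minimal sketch in Python. It uses the optimal string alignment variant of Damerau–Levenshtein distance (adjacent transpositions count as one edit). The function names and the threshold of 1 are assumptions made for this example, not part of the proposal.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similar_identifiers(candidate, existing, max_distance=1):
    """Return existing identifiers that are suspiciously close to candidate.

    The threshold is a hypothetical policy choice; a registry would tune it.
    """
    c = candidate.lower()
    return [
        e for e in existing
        if e.lower() != c and osa_distance(c, e.lower()) <= max_distance
    ]
```

With this sketch, `similar_identifiers("4pple.swift-nio", ["apple.swift-nio", "mona.LinkedList"])` flags only `apple.swift-nio`. A real registry would likely combine a metric like this with other signals, such as visually confusable characters.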

Tampering

An attacker could maliciously tamper with a generated source archive in an attempt to exploit a known vulnerability like Zip Slip, or a common software weakness like susceptibility to a Zip bomb.

Registry services should take care to identify and protect against these kinds of attacks in their implementations of source archive decompression.
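
To make this concrete, the following Python sketch inspects an uploaded archive before anything is extracted. The limits and the function name are hypothetical, and the checks are not exhaustive; in particular, the sizes read here are the ones declared in the archive, so a server would also want to enforce limits while actually decompressing.

```python
import io
import posixpath
import zipfile
from typing import List

MAX_TOTAL_BYTES = 100 * 1024 * 1024  # hypothetical limit on uncompressed size
MAX_ENTRIES = 10_000                 # hypothetical limit on number of entries


def inspect_source_archive(data: bytes) -> List[str]:
    """Return a list of problems found; an empty list means none detected."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a valid Zip archive"]

    problems = []
    with archive:
        infos = archive.infolist()
        if len(infos) > MAX_ENTRIES:
            problems.append("too many entries")
        # Zip bomb heuristic: total of the declared uncompressed sizes.
        if sum(info.file_size for info in infos) > MAX_TOTAL_BYTES:
            problems.append("declared uncompressed size exceeds limit")
        # Zip Slip: entries that would escape the extraction directory.
        for info in infos:
            name = info.filename.replace("\\", "/")
            normalized = posixpath.normpath(name)
            if (name.startswith("/") or normalized == ".."
                    or normalized.startswith("../")):
                problems.append(f"unsafe path: {info.filename}")
    return problems
```

A server could map a non-empty result to the 422 (Unprocessable Entity) response described in the specification additions.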

To further improve the security of package submissions, a registry could restrict publishing to trusted clients, for which a chain of custody can be established. (This is effectively the "pull" model described under Alternatives considered.)

Repudiation

A dispute could arise between a package maintainer and a registry about the content or existence of a package release.

This proposal doesn't specifically provide a mechanism for resolving such a dispute. However, the design supports a variety of possible solutions. For example, a software bill of materials and the use of digital signatures can both provide non-repudiation guarantees about the provenance of package release artifacts.

Information disclosure

A user could inadvertently expose credentials when uploading a source archive for a package release.

This threat isn't substantially different from that of leaking credentials in source code with version management software, so similar strategies can be employed here. For example, registry services can help minimize this risk by rejecting any submissions that contain sensitive information.

Denial of service

An attacker could upload large payloads in an attempt to reduce the availability of a registry.

This kind of attack is typical for any web service with an endpoint for uploading resources. A registry can mitigate this threat using defensive coding practices like performing authentication checks before processing request bodies, limiting the maximum allowed size of a message payload, and routing requests through a reverse proxy or load balancer.
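
One way to picture the first two practices is a check that runs on the request headers alone, before any of the body is read. This is only a sketch: the size limit, the function name, and the assumption that header names arrive in canonical case are all made up for the example.

```python
MAX_BODY_BYTES = 50 * 1024 * 1024  # hypothetical upload limit


def precheck_upload(headers: dict, is_authenticated: bool) -> int:
    """Decide from headers alone whether to read the request body.

    Returns 100 (Continue) when the body may be read, otherwise the
    status code of the error response to send.
    """
    # Authenticate first, so anonymous clients can't make us buffer data.
    if not is_authenticated:
        return 401
    raw_length = headers.get("Content-Length")
    if raw_length is None:
        return 411  # Length Required
    if not raw_length.isdigit():
        return 400  # Bad Request
    if int(raw_length) > MAX_BODY_BYTES:
        return 413  # Payload Too Large
    return 100
```

This pairs naturally with the Expect: 100-continue handshake in the specification additions, since the client waits for the server's verdict before sending the archive.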

Escalation of privilege

It's desirable for a registry to have information about the content of a release submitted for publishing, such as the package's supported platforms, products, and dependencies. Swift package manifest files are executable code and must be evaluated by the Swift toolchain to determine this information. An attacker could construct a malicious Package.swift file containing system calls in an attempt to perform remote code execution.

Registry services should take care to evaluate package manifest files in an unprivileged container to mitigate the risk of evaluating untrusted code.

Impact on existing packages

This feature provides a mechanism for package maintainers and registries to migrate existing packages from the current URL-based system to the new registry scheme.

The specific strategy for rolling out this functionality is something to be determined by each registry operator before adopting this feature.

Alternatives considered

Endpoint for scope registration

This proposal sets no policies for how package scopes are registered or verified.

Endpoint for publishing with "pull" model

Many package managers and artifact repository services follow what we describe as a "push" model of publication: When a maintainer wants to release a new version of their software, they produce a build locally and push the resulting artifact to a server.

For example, a developer can distribute their Ruby library by building a .gem archive and pushing it to a server like RubyGems.org.

$ gem build octokit.gemspec 
$ gem push octokit-4.20.0.gem

This model has the benefit of operational simplicity and flexibility. For example, maintainers have an opportunity to digitally sign artifacts before uploading them to the server.

Alternatively, a system might incorporate build automation techniques like continuous integration (CI) and continuous delivery (CD) into what we describe as a "pull" model: When a maintainer wants to release a new version of their software, their sole responsibility is to notify the registry; the server does all the work of downloading the source code and packaging it up for distribution.

For example, in addition to supporting the "push" model, Docker Hub can automatically build images from source code and push the built image to a repository.

This model can provide strong guarantees about reproducibility, quality assurance, and software traceability.

Initial drafts for this proposal included separate endpoints for publishing with the "pull" and "push" models, with a preference for the former and its stronger guarantees of traceability. However, we determined that while these models provide a useful framework for understanding software distribution models, they are both accommodated by a single endpoint; a "pull" is equivalent to a "push" where the client and server are a single entity.

Future directions

Swift Package Manager subcommand for publishing

Swift Package Manager could be updated to add a new swift package publish subcommand that provides a more convenient interface for publishing packages to a registry. For example, it could automatically read the configuration in .swiftpm/config/registries.json to determine the correct registry endpoint, or read the user's .netrc file to authenticate the request.

The command could also subsume swift package archive-source and perform additional tasks before uploading, such as generating a software bill of materials or signing the source archive.

This feature wasn't included in the proposal because it's unnecessary for the core publishing functionality. We are also concerned that this command could bloat the command-line interface and undermine the benefits of publishing within a CI/CD system. However, if the community finds this to be a useful feature, we'd be happy to include it in an amendment to our proposal.

Mechanism for syndicating publishing activity

A registry could syndicate new releases through an activity stream or RSS feed. This functionality could be used as an information source by package indexes or to provide federation across different registries.

Transparency logs

Similar to a syndication feed, each new package release could be added to an append-only log like Trillian or sigstore.


Package Registry Specification (Additions)

4.6. Create a package release

A client MAY send a POST request for a URI matching the expression /{scope}/{name}/{version} to publish a release of a package. A client MUST provide a body encoded as multipart form data with the following sections:

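| Key            | Content-Type     | Description                               | Requirement Level |
| -------------- | ---------------- | ----------------------------------------- | ----------------- |
| source-archive | application/zip  | The source archive of the package.        | REQUIRED          |
| metadata       | application/json | Additional information about the release. | OPTIONAL          |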

A client MUST set a Content-Type header with the value multipart/form-data, and a Content-Length header with the total size of the body in bytes. A client SHOULD set the Accept header with the value application/vnd.swift.registry.v1+json.

POST /mona/LinkedList/1.1.1 HTTP/1.1
Host: packages.example.com
Accept: application/vnd.swift.registry.v1+json
Content-Type: multipart/form-data;boundary="boundary"
Content-Length: 336
Expect: 100-continue

--boundary
Content-Disposition: form-data; name="source-archive"
Content-Type: application/zip
Content-Length: 32
Content-Transfer-Encoding: base64

gHUFBgAAAAAAAAAAAAAAAAAAAAAAAA==

--boundary
Content-Disposition: form-data; name="metadata"
Content-Type: application/json
Content-Transfer-Encoding: quoted-printable
Content-Length: 3

{ }

--boundary--
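
For implementors who want to see how such a body might be assembled, here is a minimal Python sketch using only the standard library. The function name is hypothetical. It mirrors the example above by base64-encoding the archive, but as a simplification it sends the metadata JSON as plain UTF-8 rather than quoted-printable.

```python
import base64
import json
import uuid
from typing import Optional, Tuple


def build_publish_body(archive: bytes,
                       metadata: Optional[dict] = None) -> Tuple[bytes, dict]:
    """Build a multipart/form-data body and matching request headers."""
    boundary = uuid.uuid4().hex  # random, so it won't collide with content
    crlf = b"\r\n"
    body = bytearray()

    def add_part(name: str, part_headers: dict, payload: bytes) -> None:
        body.extend(b"--" + boundary.encode() + crlf)
        body.extend(
            f'Content-Disposition: form-data; name="{name}"'.encode() + crlf)
        for key, value in part_headers.items():
            body.extend(f"{key}: {value}".encode() + crlf)
        body.extend(crlf + payload + crlf)

    encoded = base64.b64encode(archive)
    add_part("source-archive", {
        "Content-Type": "application/zip",
        "Content-Length": str(len(encoded)),
        "Content-Transfer-Encoding": "base64",
    }, encoded)

    if metadata is not None:  # the metadata section is OPTIONAL
        document = json.dumps(metadata).encode("utf-8")
        add_part("metadata", {
            "Content-Type": "application/json",
            "Content-Length": str(len(document)),
        }, document)

    # The closing delimiter carries a trailing "--".
    body.extend(b"--" + boundary.encode() + b"--" + crlf)

    headers = {
        "Accept": "application/vnd.swift.registry.v1+json",
        "Content-Type": f'multipart/form-data; boundary="{boundary}"',
        "Content-Length": str(len(body)),
    }
    return bytes(body), headers
```

Computing Content-Length from the finished body, rather than by hand, keeps the header in step with whatever sections end up being included.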

A server SHOULD require a client to perform authentication for any requests to create a package release. Use of multi-factor authentication is RECOMMENDED.

A client MAY publish releases in any order. For example, if a package has existing 1.0.0 and 2.0.0 releases, a client MAY publish a new 1.0.1 or 1.1.0 release.

Once a release has been published, any resources associated with that release, including its source archive, MUST NOT change.

If a release already exists for a package at the specified version, the server SHOULD respond with a status code of 409 (Conflict).

HTTP/1.1 409 Conflict
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en

{
   "detail": "a release with version 1.0.0 already exists"
}

It is RECOMMENDED that a server institute policies for publishing new releases of a package after a scope is transferred to a new owner. For example, a server could require that the next release of an existing package be published with a new major version, or only after 45 days have passed since the transfer.

If the client provides an Expect header, a server SHOULD check that the request can succeed before responding with a status code of 100 (Continue). A server that doesn't support expectations SHOULD respond with a status code of 417 (Expectation Failed). In response, a client MAY remove the Expect header and retry the request.

HTTP/1.1 417 Expectation Failed
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en

{
   "detail": "expectations aren't supported"
}

Support for this endpoint is OPTIONAL. A server SHOULD indicate that publishing isn't supported by responding with a status code of 405 (Method Not Allowed).

HTTP/1.1 405 Method Not Allowed
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en

{
   "detail": "publishing isn't supported"
}

A server MAY respond either synchronously or asynchronously. For more information, see 4.6.3.

4.6.1. Source archive

A client MUST include a multipart section named source-archive containing the source archive for a release. A client SHOULD set a Content-Type header with the value application/zip and a Content-Length header with the size of the Zip archive in bytes.

--boundary
Content-Disposition: form-data; name="source-archive"
Content-Type: application/zip
Content-Length: 32
Content-Transfer-Encoding: base64

gHUFBgAAAAAAAAAAAAAAAAAAAAAAAA==

A client SHOULD use the swift package archive-source tool to create a source archive for the release.

A server MAY analyze a package to assess its viability, perform security testing, or otherwise evaluate software quality. A server MAY refuse to publish a package release for any reason by responding with a status code of 422 (Unprocessable Entity).

HTTP/1.1 422 Unprocessable Entity
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en

{
   "detail": "package doesn't contain a valid manifest (Package.swift) file"
}

A server SHOULD use the swift package compute-checksum tool to compute the checksum that's provided in response to a client's subsequent request to download the source archive for the release.
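
For a server that can't shell out to the Swift toolchain, an equivalent digest could be computed directly. The sketch below assumes that swift package compute-checksum produces a hex-encoded SHA-256 digest of the file's bytes; that assumption should be verified against the toolchain in use. The function name is hypothetical.

```python
import hashlib


def source_archive_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of the file at path.

    Reads in chunks so large archives aren't loaded into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because a published release MUST NOT change, the server can compute this once at publication time and store it alongside the archive.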

4.6.2. Package release metadata

A client MAY include a multipart section named metadata containing additional information about the release. A client SHOULD set a Content-Type header with the value application/json and a Content-Length header with the size of the JSON document in bytes. It is RECOMMENDED that package release metadata be represented in JSON-LD according to a structured data standard, as discussed in 4.2.1.

--boundary
Content-Disposition: form-data; name="metadata"
Content-Type: application/json
Content-Length: 620
Content-Transfer-Encoding: quoted-printable

{
  "@context": ["http://schema.org/"],
  "@type": "SoftwareSourceCode",
  "name": "LinkedList",
  "description": "One thing links to another.",
  "keywords": ["data-structure", "collection"],
  "version": "1.1.1",
  "codeRepository": "https://github.com/mona/LinkedList",
  "license": "https://www.apache.org/licenses/LICENSE-2.0",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "Swift",
    "url": "https://swift.org"
  },
  "author": {
      "@type": "Person",
      "@id": "https://github.com/mona",
      "givenName": "Mona",
      "middleName": "Lisa",
      "familyName": "Octocat"
  }
}

If a client doesn't provide a metadata section, a server MAY populate the metadata for a release. A client MAY request that a server not populate metadata automatically by sending an empty JSON object ({}) as the content of the metadata section.

If a client provides an invalid JSON document, the server SHOULD respond with a status code of 422 (Unprocessable Entity) or 413 (Payload Too Large) and MAY communicate validation error details in the response body.

HTTP/1.1 422 Unprocessable Entity
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en

{
   "detail": "invalid JSON provided for release metadata"
}
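
A server-side check along these lines might look like the following Python sketch. The size limit, the function name, and the extra rule that the document must be a JSON object are assumptions for illustration; the specification text above only distinguishes invalid documents from oversized ones.

```python
import json

MAX_METADATA_BYTES = 64 * 1024  # hypothetical limit


def validate_metadata(raw: bytes):
    """Return (status, detail) for a rejected section, or (None, None)."""
    if len(raw) > MAX_METADATA_BYTES:
        return 413, "release metadata is too large"
    try:
        document = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers both invalid UTF-8 and invalid JSON
        return 422, "invalid JSON provided for release metadata"
    if not isinstance(document, dict):
        return 422, "release metadata must be a JSON object"
    return None, None
```

The detail string would go into the application/problem+json response body, as in the example above.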

4.6.3. Synchronous and asynchronous publication

A server MAY respond to a request to publish a new package release either synchronously or asynchronously.

A client MAY indicate their preference for asynchronous processing with a Prefer header field containing the token respond-async and an optional wait preference, as described by RFC 7240.

POST /mona/LinkedList/1.1.1 HTTP/1.1
Host: packages.example.com
Accept: application/vnd.swift.registry.v1
Prefer: respond-async, wait=300
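
On the server side, reading this preference could be as simple as the Python sketch below. It handles only the respond-async and wait preferences from RFC 7240 and ignores everything else, including any parameters after a semicolon; the function name is hypothetical.

```python
def parse_prefer(value: str):
    """Return (respond_async, wait_seconds) from a Prefer header value.

    wait_seconds is None when no valid wait preference is present.
    """
    respond_async = False
    wait_seconds = None
    for item in value.split(","):
        preference = item.split(";")[0]  # drop any preference parameters
        token, _, argument = preference.strip().partition("=")
        token = token.strip().lower()
        argument = argument.strip().strip('"')
        if token == "respond-async":
            respond_async = True
        elif token == "wait" and argument.isdigit():
            wait_seconds = int(argument)
    return respond_async, wait_seconds
```

For the request above, `parse_prefer("respond-async, wait=300")` yields `(True, 300)`. The server remains free to ignore the preference, since the choice between synchronous and asynchronous processing is its own.
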

4.6.3.1. Synchronous publication

If processing is done synchronously, the server MUST respond with a status code of 201 (Created) to indicate that the package release was published. This response SHOULD also contain a Location header with a URL to the new release.

HTTP/1.1 201 Created
Content-Version: 1
Location: https://packages.example.com/mona/LinkedList/1.1.1

A client MAY set a timeout to guarantee a timely response to each request.

4.6.3.2. Asynchronous publication

If processing is done asynchronously, the server MUST respond with a status code of 202 (Accepted) to acknowledge that the request is being processed. This response MUST contain a Location header with a URL that the client can poll for progress updates and SHOULD contain a Retry-After header with an estimate of when processing is expected to finish. A server MAY locate the status resource endpoint at a URI of its choosing. However, the use of a non-sequential, randomly-generated identifier is RECOMMENDED.

HTTP/1.1 202 Accepted
Content-Version: 1
Location: https://packages.example.com/submissions/90D8CC77-A576-47AE-A531-D6402C4E33BC
Retry-After: 120
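
A sketch of how a server might produce such a response, using a version 4 (random) UUID as the non-sequential identifier. The /submissions/ path matches the example above but is otherwise the server's choice, and the function name is hypothetical.

```python
import uuid


def accepted_response(base_url: str, estimate_seconds: int):
    """Return (status, headers) acknowledging an asynchronous publish."""
    # uuid4 is randomly generated, so identifiers can't be guessed
    # by counting up from a known submission.
    submission_id = str(uuid.uuid4()).upper()
    headers = {
        "Content-Version": "1",
        "Location": f"{base_url.rstrip('/')}/submissions/{submission_id}",
        "Retry-After": str(estimate_seconds),
    }
    return 202, headers
```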

A client MAY send a GET request to the location provided by the server in response to a publish request to see the current status of that process.

GET /submissions/90D8CC77-A576-47AE-A531-D6402C4E33BC HTTP/1.1
Host: packages.example.com
Accept: application/vnd.swift.registry.v1

If the asynchronous publish request is still processing, the server SHOULD respond with a status code of 204 (No Content) and a Retry-After header with an estimate of when processing is expected to finish.

HTTP/1.1 204 No Content
Content-Version: 1
Retry-After: 120

If the asynchronous publish request is finished processing successfully, the server SHOULD respond with a status code of 301 (Moved Permanently) and a Location header with a URL to the package release.

HTTP/1.1 301 Moved Permanently
Content-Version: 1
Location: https://packages.example.com/mona/LinkedList/1.1.1

If the asynchronous publish request failed, the server SHOULD respond with a status code of 205 (Reset Content).

HTTP/1.1 205 Reset Content
Content-Version: 1
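
From the client's point of view, the three status responses above could be interpreted roughly as follows. This Python sketch follows the status codes as written in this draft; the function name and the fallback delay are assumptions, and Retry-After values given as HTTP dates are not handled.

```python
DEFAULT_RETRY_SECONDS = 30  # hypothetical fallback when Retry-After is absent


def interpret_status_response(status: int, headers: dict):
    """Map a status-resource response to (state, detail).

    state is "pending" (detail: seconds to wait before polling again),
    "published" (detail: URL of the release), or "failed" (detail: None).
    """
    if status == 204:
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            return "pending", int(retry_after)
        return "pending", DEFAULT_RETRY_SECONDS
    if status == 301:
        return "published", headers.get("Location")
    return "failed", None
```

A polling client would call this after each GET, sleep for the returned delay while the state is "pending", and stop on either of the other two states.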

A client MAY send a DELETE request to the location provided by the server in response to a publish request to cancel that process.

If a request to publish a new package release fails, a server MUST communicate that failure in the same way, whether it is sending an immediate response or responding to a client polling for status.

If a client makes a request to publish a package release to a server that is asynchronously processing a request to publish that release, the server MUST respond with a status code of 409 (Conflict).

HTTP/1.1 409 Conflict
Content-Version: 1
Content-Type: application/problem+json
Content-Language: en
Location: https://packages.example.com/submissions/90D8CC77-A576-47AE-A531-D6402C4E33BC

{
   "detail": "already processing a request to publish this package version"
}

If a client makes a request to publish a package release to a server that finished processing a failed request to publish that release, the server SHOULD try publishing that release again. A server MAY refuse to fulfill a subsequent request to publish a package release by responding with a status code of 409 (Conflict).
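
Taken together, the rules for repeated publish requests amount to a small decision table. The Python sketch below is one reading of them for an asynchronous server; the state names, the function name, and the allow_retry policy flag are all hypothetical. A synchronous server would answer 201 (Created) where this returns 202.

```python
from typing import Optional


def respond_to_publish(state: Optional[str],
                       status_url: str,
                       allow_retry: bool = True):
    """Return (status, headers) for a publish request.

    state describes any earlier attempt for the same release:
    None, "published", "processing", or "failed".
    """
    if state == "published":
        # Published releases are immutable.
        return 409, {}
    if state == "processing":
        # Point the client at the submission already in flight.
        return 409, {"Location": status_url}
    if state == "failed" and not allow_retry:
        # A server MAY refuse to retry a failed submission.
        return 409, {}
    # New release, or a retry after failure: start processing.
    return 202, {"Location": status_url}
```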

Overall, +1. Seems like a reasonable addition.

What about metadata? It seems unfortunate that I’d need to publish a new release to add a keyword or correct a typo in the description. Can we designate some metadata as immutable (name, author, license, etc), and less-important keys allowed to be mutable?

If this is determined by the server, what is the purpose of this? It seems that all clients will need to account for asynchronous publishing anyway.

It would be nice if the server had a way to send more information about which tasks are pending - for instance, it may be running CI jobs that I’ve configured in some way, and those may take a long time. Rather than have the client stuck at an opaque “Processing…” screen, it would be better if the server could choose to communicate which jobs are going on. This would be optional, and servers could choose not to provide that information.

I see 2 options for this: either a JSON document with some list of tasks to be displayed and a queued/in-progress/complete flag, or a (possibly multi-line) string, for the server to fill as it likes.

As above. Would be nice to allow an optional message here.

Do another release that reflects the information you want then. Releases are cheap; and they reflect specifically an artifact and information about it for a specific point in time.

It'd be pretty bad if someone could publish a release with a permissive license, such as Apache, and then, once the project is adopted, go in and change old artifacts to suddenly be GPLv3, for example :wink: Even if projects had pinned the specific version they knew was okay license-wise, they'd suddenly be in trouble. (Setting aside whether such an after-the-fact license change to an already released artifact is legally possible: it's fine for a new release to have a different license, but pulling the rug out from under projects like this is pretty nasty and should not be supported.)

I don’t think the license field of the metadata is authoritative: what matters is the license contained in the source archive, which I agree should be immutable. Essentially all (or perhaps even all) open-source licenses have a section which says:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software

The source archive clearly is “the software” - it’s clearly a separate piece of data in the publishing endpoint, it’s what the user downloads, and users might not even ever download the metadata. If the license in the source archive and metadata are different (which can happen even without retroactive changes to the metadata), it seems likely that the one in the source archive would win.

(Usual caveat about not being a lawyer, of course)

Thanks for taking time to review our proposal, @Karl. I appreciate your insight and feedback.

I agree with @ktoso that updates to metadata should be accommodated by creating a new release.

Setting aside the question of what constitutes a software license, an organization may consider other details when deciding whether to add it as a dependency. Allowing a maintainer to change a package's metadata at any time could create confusion and uncertainty about whether an external package satisfies the legal or regulatory requirements after it's added to a product.

For example, if I were working on an iOS app in the healthcare space, it'd be really inconvenient if a 3rd-party package I was using decided to change its description to "This package violates HIPAA" as a joke.

A client can support asynchronous publishing at different levels. A simple client could see a 201 or 202 and say "Alright, good enough!", and put the onus on the user to follow up. Whereas a more sophisticated client could continue to poll for status.

Even if it's ultimately the server's decision to respond sync or async, I wanted to provide some way for the client to indicate a preference. And the respond-async from RFC 7240 was the best I could find from existing standards.

Ideally, the preference would be inverted, to prefer-sync, but a draft proposal to add that to the IANA HTTP Parameters registry appears not to have gotten traction...

That's a great point. What do you think about updating this to instead have the server respond with 202 (Accepted) and include whatever additional information in the response body?

Agreed. I think a client error (4xx) would be more appropriate here anyway.

I would second this, immutability is the main property of a release.

There's plenty of metadata already in the source code archive itself, e.g. we wouldn't want to offer a way to fix a typo in a doc comment without publishing a new release, so it makes sense for that to apply also to out-of-band metadata.

I don't really have an opinion about the specific HTTP response codes that are returned; I trust your judgement if you think that's the appropriate response.

I just think there should be a body. Nobody likes staring at a screen with no information, unsure if anything is really going on. On GitHub, it's common to set up actions which run and check every commit, and it's reasonable to think people might want that to happen on publishing too.

So when I publish a package, either through the SwiftPM CLI or some kind of Xcode interface, I would not want to see:

swift package publish <...>

Processing...

For 5/10/15 minutes. Rather, I would prefer to see something like:

swift package publish <...>

Stage 2: Running configured actions

✅ Swift package tests (macOS)
✅ Swift package tests (Linux)
🔄 Swift package tests (Windows) (Running)
🕜 Generate documentation (Waiting)

ℹ️ More information available at: https://....

Where that set of jobs and "more info" link is returned by the server at each async status request.

Similarly, when publishing fails, I would like to see which of these jobs failed, optionally with a link to get more information, if the server supports it:

swift package publish <...>

Stage 2: Running configured actions

✅ Swift package tests (macOS)
✅ Swift package tests (Linux)
❌ Swift package tests (Windows) (Failed)
⬜️ Generate documentation (Canceled)

❌ Publishing failed. More information available at: https://....

Again, this would all be optional. If you have a very basic package repository which doesn't run any checks or isn't able to report their status, you can return nothing and get the basic "Processing..." text.
