OpenAPIKit

mattpolzin · January 14, 2020, 8:07am

Pitch

Include OpenAPIKit in the official and recommended SSWG projects.

Motivation

OpenAPI is a broadly used specification for writing API documentation. OpenAPI documents can be used to generate interactive documentation, automate testing, generate code, or just provide a solid source of truth and a contract between a client and a server.

As linked above, a lot of great tooling already exists around the OpenAPI specification. The aforementioned code generator even supports Swift with improvements being actively discussed.

OpenAPIKit fits into the existing ecosystem as a relatively low level library, with the intention of supporting other libraries and tools on top of it. It currently captures nearly all of the specification in Swift Codable types. Thanks to Swift's type system, OpenAPIKit validates OpenAPI documentation simply by decoding it and it guarantees that OpenAPI documentation it encodes meets the spec as well.
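
To illustrate the "validate by decoding" idea, here is a minimal sketch. `MiniOperation` and `MiniResponse` are hypothetical, heavily simplified stand-ins invented for this example; they are not OpenAPIKit's actual types.

```swift
import Foundation

// Hypothetical stand-ins for OpenAPI types (not OpenAPIKit's real API).
struct MiniResponse: Codable {
    let description: String
}

struct MiniOperation: Codable {
    let summary: String?
    // Non-optional, so decoding fails when "responses" is absent.
    let responses: [String: MiniResponse]
}

/// Returns true when the JSON decodes into a MiniOperation.
func isValidOperation(_ json: String) -> Bool {
    let data = Data(json.utf8)
    return (try? JSONDecoder().decode(MiniOperation.self, from: data)) != nil
}

let valid = #"{"summary": "Get all items", "responses": {"200": {"description": "All items"}}}"#
let invalid = #"{"summary": "Get one item"}"#

print(isValidOperation(valid))   // true
print(isValidOperation(invalid)) // false
```

Because the requirement lives in the type, the same property holds in the other direction: a value of `MiniOperation` cannot be constructed, and therefore cannot be encoded, without `responses`.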

In short, OpenAPIKit is a foundation for any Swift code that aims to read or write OpenAPI documentation. My hope is that this spec implementation saves time for others interested in writing tooling or frameworks with a higher level of abstraction.

Project Status

The OpenAPIKit library currently implements approximately 90% of the OpenAPI specification with approximately 95% test coverage. This includes substantial support for OpenAPI schemas, which are themselves close relatives of the very comprehensive JSON Schema specification. [EDIT 2/9/2020] ~95% completion with 98% test coverage.

Next Steps

The plan is to prioritize the following.

99% Spec Implementation

The first order of business is completing the implementation of the spec. To be perfectly honest, there are a few small things with particularly large time commitments attached that I will likely leave for later, perhaps needing to be motivated by request.

Decoding Error Legibility

This is a new addition, thanks to the comments from @lassejansen below.

The error output from failed attempts at decoding OpenAPI documents is currently often a bit of a mess. There is a lot that can be done, much of it without too much work, to improve that situation. Seeing as how reading JSON/YAML representations of OpenAPI documentation is a primary focus of this library, good error output should be a primary focus as well.

Canonical API Information

OpenAPI allows for an author to document the same API in numerous different ways. This flexibility can save time and offer convenience when authoring, but as someone consuming the documentation in order to transform it or analyze it somehow it can be cumbersome.

Easy examples of this flexibility include (1) the ability to define available servers at the top level of the document but also refine or add servers in the Path Items object and (2) the ability to define parameters in the Path Items object but also add them in the Operations object. OpenAPIKit should provide easy answers to questions like "what are all of the parameters for a given endpoint?" or "what is the full list of servers used by this API?"
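
As a rough sketch of what answering the parameters question might involve, the snippet below merges path-level and operation-level parameters. The types `Parameter`, `Operation`, and `PathItem` and the function `effectiveParameters(of:in:)` are hypothetical and simplified for illustration; they are not OpenAPIKit's API. The sketch assumes the spec's rule that an operation-level parameter overrides a path-level one with the same name and location.

```swift
// Hypothetical simplified types (not OpenAPIKit's real API).
struct Parameter: Equatable {
    let name: String
    let location: String   // "path", "query", "header", or "cookie"
    let required: Bool
}

struct Operation {
    var parameters: [Parameter] = []
}

struct PathItem {
    var parameters: [Parameter] = []
    var get: Operation? = nil
}

/// Combines path-level and operation-level parameters.
/// An operation-level parameter replaces a path-level one that has
/// the same name and location; all others are appended.
func effectiveParameters(of operation: Operation, in pathItem: PathItem) -> [Parameter] {
    var result = pathItem.parameters
    for parameter in operation.parameters {
        if let index = result.firstIndex(where: {
            $0.name == parameter.name && $0.location == parameter.location
        }) {
            result[index] = parameter
        } else {
            result.append(parameter)
        }
    }
    return result
}

let listItems = Operation(parameters: [
    Parameter(name: "limit", location: "query", required: true),
    Parameter(name: "verbose", location: "query", required: false)
])
let items = PathItem(
    parameters: [
        Parameter(name: "X-Trace", location: "header", required: false),
        Parameter(name: "limit", location: "query", required: false)
    ],
    get: listItems
)

let merged = effectiveParameters(of: listItems, in: items)
print(merged.map { "\($0.location):\($0.name)" })
// ["header:X-Trace", "query:limit", "query:verbose"]
```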

Protocols to facilitate generating OpenAPI

I have begun to home in on a set of protocols (and conformances for fundamental and standard library Swift types) to facilitate generation of OpenAPI schemas from arbitrary Swift code. In addition, OpenAPIKit already provides a method for generating OpenAPI schemas from arbitrary Swift types using reflection (this is the foundation on which response schemas are built for the Vapor example use-case below).

Example Uses

Neither of the first two examples are trivial as-is, but they would have been drastically larger undertakings without a Swift implementation of the OpenAPI spec on top of which to build.

Vapor API documentation generation

For those interested, I've created a proof of concept library and app using Vapor that takes advantage of OpenAPIKit to generate OpenAPI documentation from the same route code that serves up responses (granted, to truly take advantage of OpenAPI I needed to introduce some type information to the routes that Vapor does not require out-of-box).

JSON:API schema generation

Another example use-case of OpenAPIKit is this library -- it takes JSON:API types and generates OpenAPI schemas for them.

Personal side note: Perhaps ironically, I am a much bigger proponent of writing OpenAPI documentation first and then creating (or generating) endpoints that meet the spec. However, for small projects especially, it can be incredibly valuable to generate API documentation from the code instead of the other way around.

Writing OpenAPI documentation

Handwriting OpenAPI documentation may sound laborious, but I am actually not the least bit opposed to doing so in the right context -- in fact, I have written NodeJS tooling in the past to facilitate easy, repeatable, standardized OpenAPI documentation as part of contract driven API development. It's actually quite nice to write OpenAPI documentation using OpenAPIKit -- the declarative structure is reminiscent of YAML but you get type safety, declared constants without $refs, and reusability a la Swift.

Scripting

This is a late addition (1/27/2020).
I threw together an example of using OpenAPIKit in a scripting environment to create tooling to help facilitate writing OpenAPI documentation. This is the kind of script I have written in NodeJS in the past to allow my team to create consistent APIs with consistent documentation. In addition to being a simple example script, keep in mind the intention in this context is not to fully formulate the OpenAPI documentation within Swift -- this kind of script populates a YAML file with templates that the user of the script would then go and fill out.

lassejansen · January 16, 2020, 10:38am

I think it's a great idea to have a separate library that handles parsing, representing and exporting OpenAPI files. Looks like you already covered a large part of the spec!

I'm not sure though if it's a good idea to use Codable for parsing and generating files. My main concerns are (1) error messages when parsing yaml and json files, and (2) losing dictionary order of the source files.

(1) Error handling

OpenAPI files can get quite large and it's easy to make mistakes when editing them. In my opinion it's important to have meaningful error messages that contain the line number of the file where the error occurred. I'm not sure if that's possible when using JSONDecoder or YAMLDecoder + Codable.

Consider this minimal example:

openapi: 3.0.0
info:
  title: API
  version: 1.0.0
paths:
  /all-items:
    summary: Get all items
    get:
      responses:
        "200":
          description: All items
  /one-item:
    get:
      summary: Get one item

The error is that the second path (/one-item) must contain at least one response. The error at the moment looks like this (formatted for readability):

Swift.DecodingError.dataCorrupted(
  Swift.DecodingError.Context(
    codingPath: [],
    debugDescription: "The given data was not valid YAML.",
    underlyingError: Optional(Poly failed to decode any of its types at: "paths//one-item"

      JSONReference<Components, PathItem> could not be decoded because:
      keyNotFound(
        CodingKeys(
          stringValue: "$ref",
          intValue: nil
        ),
        Swift.DecodingError.Context(
          codingPath: [
            CodingKeys(stringValue: "paths", intValue: nil),
            _DictionaryCodingKey(stringValue: "/one-item", intValue: nil)
          ],
          debugDescription: "No value associated with key CodingKeys(stringValue: \"$ref\", intValue: nil) (\"$ref\").",
          underlyingError: nil
        )
      )

      PathItem could not be decoded because:
      keyNotFound(
        CodingKeys(stringValue: "responses", intValue: nil),
        Swift.DecodingError.Context(codingPath: [
            CodingKeys(stringValue: "paths", intValue: nil),
            _DictionaryCodingKey(stringValue: "/one-item", intValue: nil),
            CodingKeys(stringValue: "get", intValue: nil)
          ],
          debugDescription: "No value associated with key CodingKeys(stringValue: \"responses\", intValue: nil) (\"responses\").",
          underlyingError: nil
        )
      )
    )
  )
)

I'm sure the error message can be improved by evaluating the underlying errors, but I don't know if it's possible to add line numbers with this approach.

(2) Dictionary order

Personally I think it's important to preserve the order of the dictionaries in the source files. Consider the paths object. It often contains dozens of paths and operations and people tend to create a "semantic" order in the source file (e.g. first register, then login, then list all items, then create an item, then get an item, ...). Especially when generating documentation it's very helpful if this order is preserved. SwaggerUI will do this, and the Ruby library that I've used in the past preserves the order, too.

Another case is modifying OpenAPI files programmatically. If the order and formatting isn't preserved, a git diff will show large blocks of removed and added lines, even if only one item was changed by the program.

To be able to do this we would need event-driven JSON and YAML parsers, I think, and an intermediate representation of the OpenAPI document that stores line numbers and formatting and uses ordered dictionaries.

What do you think?

mattpolzin · January 16, 2020, 4:33pm

Thank you, these are both incredibly useful points. In fact, they are both things I’ve noted (to myself) as shortcomings during various experimentation in the past but then forgot to bring up in my OP for this thread! Well, that’s what peer review is all about!

Error Handling
Now that you bring it up, I think “improving error legibility” deserves explicit mention in the “next steps” for the library. I recently underwent an effort to improve legibility of errors in a different library of mine and was thrilled with the results. However, I have not focused on line numbers in the past. I like the idea but am unsure off the top of my head how easy it would be to get there. I believe that even without line numbers the error output can be made easy enough to understand that someone at least familiar with OpenAPI nomenclature would quickly spot the problem.

For example, I’d consider the following human readable error message pretty darn good and totally within reach with a little more work: “responses key is missing for the GET operation under /one-item”
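
One way such a message could be derived from a `DecodingError` is sketched below. This is only an illustration under stated assumptions: the `Sketch…` types and the `describe(_:)` helper are hypothetical and are not part of OpenAPIKit; only `DecodingError`, `JSONDecoder`, and `CodingKey.stringValue` are real standard library or Foundation API.

```swift
import Foundation

// Hypothetical, simplified stand-ins for OpenAPI types (not OpenAPIKit's API).
struct SketchResponse: Decodable { let description: String }
struct SketchOperation: Decodable {
    let summary: String?
    let responses: [String: SketchResponse]   // required, so absence is an error
}
struct SketchPathItem: Decodable { let get: SketchOperation? }
struct SketchDocument: Decodable { let paths: [String: SketchPathItem] }

/// Turns a missing-key DecodingError into a one-line description.
/// Any other error falls back to its default description.
func describe(_ error: Error) -> String {
    guard let decodingError = error as? DecodingError,
          case let .keyNotFound(key, context) = decodingError else {
        return "Decoding failed: \(error)"
    }
    // The coding path lists the keys leading to the container that lacked the key.
    let path = context.codingPath.map { $0.stringValue }.joined(separator: " > ")
    let location = path.isEmpty ? "the document root" : path
    return "\(key.stringValue) key is missing under \(location)"
}

let json = #"{"paths": {"/one-item": {"get": {"summary": "Get one item"}}}}"#
var message = "decoded without error"
do {
    _ = try JSONDecoder().decode(SketchDocument.self, from: Data(json.utf8))
} catch {
    message = describe(error)
}
print(message)
```

The resulting message names the missing `responses` key together with the coding path that leads to the `/one-item` GET operation, which is most of what the hand-written message above conveys.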

Ordering
This actually should be readily solvable so I will create myself a GitHub issue and tackle it sooner than later. The trick will be using ordered dictionaries instead of dictionaries and neither the dictionary literal syntax (writing dictionary literals in Swift) nor the decoding process should stand in the way. [EDIT] This resulted in https://github.com/mattpolzin/OpenAPIKit/pull/8.
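
For readers unfamiliar with the idea, a minimal ordered dictionary can be sketched as follows. `MiniOrderedDictionary` is a hypothetical type written for this illustration and is not the implementation adopted in the pull request; it only shows that insertion order and dictionary literal syntax can coexist.

```swift
/// A minimal ordered dictionary: keys keep their insertion order.
/// Hypothetical sketch, not the type used by OpenAPIKit.
struct MiniOrderedDictionary<Key: Hashable, Value>: ExpressibleByDictionaryLiteral {
    private(set) var keys: [Key] = []
    private var storage: [Key: Value] = [:]

    init() {}

    // Dictionary literals arrive as an ordered list of pairs,
    // so the source order is available here.
    init(dictionaryLiteral elements: (Key, Value)...) {
        for (key, value) in elements {
            self[key] = value
        }
    }

    subscript(key: Key) -> Value? {
        get { storage[key] }
        set {
            if let newValue = newValue {
                if storage.updateValue(newValue, forKey: key) == nil {
                    keys.append(key)   // first time this key is seen
                }
            } else if storage.removeValue(forKey: key) != nil {
                keys.removeAll { $0 == key }
            }
        }
    }
}

let paths: MiniOrderedDictionary<String, String> = [
    "/register": "Create an account",
    "/login": "Log in",
    "/items": "List all items"
]
print(paths.keys) // ["/register", "/login", "/items"]
```

A production version would also need `Codable` conformance that reads and writes keys in order, which in turn depends on the decoder preserving source order.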

mackoj · January 16, 2020, 4:38pm

Thanks for kickstarting this. I think it's important to have a Swift building block for working with OpenAPI.

I use an OpenAPI code generator (SwagGen) at work regularly.

Splitting it into multiple parts is the way to go.

Which other parts of it do you think should be part of an SSWG project?

parser: what you are building
semantic: what you get is valid and proper OpenAPI
code generator: Template(with parser data) -> Swift
documentation generator: Template(with parser data) -> website
default template: Client, Server

How do you plan to test your parser (maybe these files could help)?

Did you investigate existing tooling?

mattpolzin · January 17, 2020, 12:15am

Good question.

Naturally I think the parser should be written in Swift and is a good candidate for the SSWG since we are discussing that here, but that's less a focus on parsing for me personally and more about having an easy-to-work-with syntax tree for OpenAPI that can be interfaced with directly from other Swift code. Being able to read/write (i.e. decode/encode, parse/generate) a JSON or YAML representation of OpenAPI natively in Swift is more of a means to that end. Then, another thing that "just falls out of" an implementation based on Codable Swift types is the guarantee that if you can write Swift for a particular OpenAPI document then it will produce a valid OpenAPI document when encoded as JSON/YAML.

Code generation I think is a natural eventual horizon (for building on top of something like OpenAPIKit, not building into OpenAPIKit). This is not because the existing code generation tools built around OpenAPI aren't good enough, but because I think Swift code generation over time could benefit from leaning more and more heavily on things like SwiftSyntax.

Documentation UI generation seems less critical to me (as a native Swift project). Projects like Swagger-UI and Redoc and a few others I've looked into are doing a really good job of this already so as long as your OpenAPI documentation can eventually be represented as JSON/YAML, these tools strike me as a good fit for generating a user interface to the documentation. That said, I'm not about to tell someone not to write a native Swift powered UI. That sounds very cool, just less essential.

These are great. Thanks for digging up some good sources. So far I have tested my library against a couple of specs (one of them is my own, the other is proprietary to the company I work for) but having a corpus of specs to test against would be beneficial and I will not be surprised if a spec found under one of your links either exposes a bug in my current implementation or else proves to have a bug my current implementation finds. I'll work on a CI step that parses some of those specs and maybe asserts some things against the results in the near future!

I did look into some of what was out there. Most things I found did not take advantage of Codable or did not support OpenAPI 3 or were not a native Swift solution.

Kitura's OpenAPI support was for OpenAPI 2 (i.e. Swagger), if I recall correctly, but their Kitura-OpenAPI was inspiring as I began thinking about my VaporOpenAPI library (currently just a prototype/showcase, but I plan to develop it further in the future).

yonaskolb/SwagGen is a fantastic example of the sort of library I hope could benefit from SSWG adoption of a library like OpenAPIKit. I see that SwagGen gives attribution to a parsing library called SwaggerParser that appears to have had a similar goal to OpenAPIKit but did not make it past OpenAPI 2 support. The current OpenAPI 3 support in SwagGen appears to read JSON but not write it (after all, it doesn't need to write it out in order to generate code) and because it is not based on Codable it lacks YAML support without converting the YAML to JSON (in the case of OpenAPI, not a problem, just an inconvenience). Given the right implementation of a library like the one I am proposing here (emphasis on right intended to imply that my library is not necessarily the clear choice), SwagGen and others like it could skip over the substantial consideration of how to import the spec and just worry about the end goal (in this case generating code). I would be very interested to get feedback on my library's implementation from the creator of SwagGen.

openapi-generator, vapor-server-codegen, and swagger-codegen are awesome projects. I am really grateful users of those libraries have added support for Swift, but as projects not written in Swift, I think they do fall tangential to my motivation for OpenAPIKit. The fact that good code generation tools exist is one of the reasons I don't think there needs to be a native Swift code generation option immediately (but projects like SwagGen above are still very exciting to me).

Therein lies my thinking that more immediate examples of uses of OpenAPIKit might be OpenAPI generation (opposite direction from code generation), direct authoring of specs in Swift, or ingestion of the spec by Swift code to power whatever other behavior -- maybe a Swift app or Swift-powered server could accept an OpenAPI spec and automate some API tests or take the information from the OpenAPI spec and plug it into an integration config format native to that app (i.e. create an internal API binding by importing OpenAPI documentation instead of requiring manual setup). Then again, the existence of yonaskolb/SwagGen perhaps indicates native Swift implementations of OpenAPI code generators could be a more immediate desire than I had realized.

lassejansen · January 20, 2020, 8:54am

For example, I’d consider the following human readable error message pretty darn good and totally within reach with a little more work: “ responses key is missing for the GET operation under /one-item ”

Sounds great!

This actually should be readily solvable so I will create myself a GitHub issue and tackle it sooner than later. The trick will be using ordered dictionaries instead of dictionaries and neither the dictionary literal syntax (writing dictionary literals in Swift) nor the decoding process should stand in the way. [EDIT] This resulted in https://github.com/mattpolzin/OpenAPIKit/pull/8.

Very cool!

This is sadly not the case for the Apple JSONDecoder as a side effect of it using JSONSerialization under the hood and that in turn using NSDictionary to back its keyed container.

This is something I ran into before, and a reason why I suggested not to use Codable. But you are right, that's an issue of JSONDecoder and JSONEncoder, not Codable itself.

mattpolzin · January 20, 2020, 5:57pm

I have not tried it yet, but one example of an alternative to the Foundation JSON decoder that does retain ordering is omochi/FineJSON. I wish it was more battle tested or at least had more unit test coverage but the underlying parser does appear to have better test coverage.

mattpolzin · January 27, 2020, 8:11am

I added the following new example use-case to the original post as well.

Scripting

I threw together an example of using OpenAPIKit in a scripting environment to create tooling to help facilitate writing OpenAPI documentation. This is the kind of script I have written in NodeJS in the past to allow my team to create consistent APIs with consistent documentation. In addition to being a simple example script, keep in mind the intention in this context is not to fully form the OpenAPI documentation within Swift -- this kind of script populates a YAML file with templates that the user of the script would then go and fill out.

mattpolzin · February 20, 2020, 3:30pm

I hesitate to keep updating this thread, but since the WG may not have discussed this pitch yet, I'll keep adding wood to the fire.

Diffing

The company I work for recently needed to produce a list of changes for our API across two arbitrary versions. I tried the two most readily googleable open source diffing tools but one of them crashed and the other one hung indefinitely when given our OpenAPI documentation. Thanks to OpenAPIKit, I was able to write an OpenAPIDiff library and openapi-diff executable entirely in Swift. It's not a finished or polished product, but it served us well in a pinch and I would not have been able to solve the problem as quickly had I needed to use a language I was less familiar with.

OpenAPIKit Status Update

The library has seen a few more additions reflected in the Project Status on the GitHub page and I continue to fill in test coverage.

I took @mackoj's suggestion and began adding a compatibility testing suite which already resulted in several good developments:

The Google Books API and TomTom Search API are both parsed in the compatibility suite.
In order to successfully parse them, I needed to add missing support for untyped JSON Schema and arbitrary JSON Schema format strings and fix a bug with parsing OpenAPI Schema examples
The Jira OpenAPI documentation actually did not pass validation against OpenAPIKit, although the failure is quite nit-picky and points to a future desire to be able to fine-tune what fails vs. just producing a parsing warning. The Jira OpenAPI documentation specifies a Server with an empty-string URL, which does not meet my interpretation of the spec for "a valid URL" (even a relative URL would at least contain "/", which incidentally is the default anyway if you entirely omit the array of servers).

tomerd · February 20, 2020, 5:29pm

please do keep updating the thread! OpenAPI is an important topic for the SSWG and we are discussing it pretty regularly

mattpolzin · February 29, 2020, 2:02am

I've been working on improving error output of the decoding process. Thanks again to @lassejansen for pushing for these improvements.

Now you can wrap DecodingErrors coming out of OpenAPIKit types with OpenAPI.Error(from:) to get easy access to human readable descriptions and coding paths.

There's lots of room for improvement, but following are a few examples of the human readable output.

Response header with both `content` and `schema`

openapi: "3.0.0"
info:
    title: test
    version: 1.0
paths:
    /hello/world:
        get:
            responses:
                '200':
                    description: hello
                    content: {}
                    headers:
                        hi:
                            schema:
                                type: string
                            content:
                                application/json:
                                    schema:
                                        type: string

Description:

Found neither a $ref nor a Header in .headers.hi for the status code '200' response of the GET endpoint under /hello/world.

Header could not be decoded because:
Inconsistency encountered when parsing Header: A single path parameter must specify one but not both content and schema.

Coding Path String: .paths['/hello/world'].get.responses.200.headers.hi

Security Scheme that has not been added to the Components Object

openapi: 3.0.0
info:
    title: test
    version: 1.0
paths: {}
components: {}
security:
    - missing: []

Description:

Inconsistency encountered when parsing security in the root Document object: Each key found in a Security Requirement dictionary must refer to a Security Scheme present in the Components dictionary.

Coding Path String: .security

JSON Schema `type` that is a Hash instead of a String

openapi: "3.0.0"
info:
    title: test
    version: 1.0
paths:
    /hello/world:
        get:
            requestBody:
                content:
                    application/json:
                        schema:
                            type:
                                hi: there
            responses: {}

Description:

Found neither a $ref nor a JSONSchema in .content['application/json'].schema for the request body of the GET endpoint under /hello/world.

JSONSchema could not be decoded because:
Expected type value to be parsable as Scalar but it was not.

Coding Path String: .paths['/hello/world'].get.requestBody.content['application/json'].schema

tanner0101 · March 4, 2020, 10:22pm

I finally had time to dig deeper into this project and it's awesome. The OpenAPI specification is quite large and having battle tested structs for parsing and serializing the specification will make building OpenAPI compatible tools a lot easier.

We discussed this during the SSWG meeting today and we all agree this should move forward to a proposal.

There are a few points of feedback I collected during my review and I'm interested to see what you think:

(1) OpenAPIKit currently pulls in quite a few dependencies. This will make accepting the package more difficult since we want to make sure this package plus everything it pulls in adhere to our standards.
(2) Swift generally recommends using standard library types wherever possible. I noticed OrderedDictionary is used a lot. Is this necessary to adhere to the OpenAPI spec or could we get away with using a Swift dictionary? This point is kind of related to the previous one, too, since that would mean one less dependency to worry about.
(3) Finally, I noticed this package attempts to support reflection. I think getting reflection right is less straightforward and people may have different opinions on how to do this. Perhaps that would be better as a separate package? I think this may also be related to the first point since it seems like the AnyCodable package is used for reflection. I'm not certain though.

Note that this feedback doesn't necessarily need to be addressed before proposing. We can continue this conversation on the proposal. I just want to give you an idea of my thoughts after reviewing.

Thanks, @mattpolzin !

mattpolzin · March 5, 2020, 2:30am

First of all, thank you (@tanner0101) for taking on the task of digging into OpenAPIKit and presenting your findings to the WG and thank you (to the WG) for the discussion and support.

Second of all, thank you for the feedback. I will reply to your feedback right away, but also continue to mull on it as I begin to write a proposal and make some tweaks to address the more readily addressable points.

This makes sense and I expected this would come up during review. I can definitely make this burden a little lighter before submitting the proposal (I'll elaborate in response to your second and third point).

I introduced OrderedDictionary in response to the observation (credit to @lassejansen) that decoding with OpenAPIKit and then re-encoding results in undefined ordering of JSON/YAML hashes when using the Swift standard library's Dictionary type. This does not break adherence with the OpenAPI spec, but it does cause problems in two situations with which the output of OpenAPIKit is likely to be faced:

Plaintext diffs such as those performed by GitHub will more often than not pick up on changes to JSON/YAML produced by OpenAPIKit that are simply hash ordering differences and it is not desirable to add to the cognitive load of a reviewer by forcing them to confront such inconsequential differences. Worse, these differences might result in harder merge conflicts to resolve.
The ordering of things such as OpenAPI path items should reasonably be expected to be stable in the output of a UI such as SwaggerUI or Redoc. If OpenAPIKit produces the documentation used by such UIs (or even just read over by humans), it should be able to provide stable ordering.

I could see making OpenAPIKit generic over the dictionary type thus eliminating the dependency and allowing the consumer to decide whether or not ordering is important, but that feels like a lot of extra work for that flexibility unless the community and WG decide to push for it.

This I definitely concede to be both targeted at a very specific use of the library and additionally "finicky." I think your suggestion to move it to a different library is good. Doing so would immediately remove the need for the Sampleable dependency. It is quite a bit trickier to remove the need for AnyCodable, though. AnyCodable is used in a number of places to encode/decode OpenAPI examples (which can be anything from a String to an arbitrary nesting of Dictionary, Array, etc.). One option here could be to bring that support into this library instead of depending on AnyCodable -- I have noticed this is fairly common and it seems reasonable because nothing should generally need to change about a stable well-tested implementation of those features.

Thanks again for your time and thoughts! I welcome continued discussion of this feedback here or we can wait until I have time to spin up a proposal thread.

[EDIT] @tanner0101 Do test-only dependencies have any relevance to the discussion? Both Yams and FineJSON are only used by test targets.

johannesweiss · March 5, 2020, 2:21pm

CC @millenomi & @Tony_Parker : Does JSONEncoder support encoding into deterministic JSON? I understand that Dictionary is (for very good reasons -- security) not stable but we could (as opt-in) still provide a deterministic JSON representation (for example by sorting the keys or so).

clayellis · March 5, 2020, 5:08pm

I know at one point there was a pitch to allow customizing the underlying serializer by passing JSONSerialization.WritingOptions (in this case: .sortedKeys) but I can't seem to find it anywhere.

If I remember correctly, though, there was pushback on solving this issue that way because that would expose implementation details of the underlying serializer.

mattpolzin · March 5, 2020, 5:14pm

JSONEncoder does currently support key sorting via the outputFormatting property. I use this feature to produce consistent JSON in stringy tests in OpenAPIKit (see TestHelpers.swift in the mattpolzin/OpenAPIKit repository on GitHub).
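
A small sketch of that option follows. The `metadata` dictionary is made-up example data; `outputFormatting` and `.sortedKeys` are the Foundation API mentioned above.

```swift
import Foundation

let encoder = JSONEncoder()
// Ask the encoder to emit object keys in sorted order so that
// repeated runs produce byte-identical output.
encoder.outputFormatting = [.sortedKeys]

// Swift's Dictionary has no stable iteration order of its own.
let metadata = ["version": "1.0.0", "title": "API", "license": "MIT"]

let data = try! encoder.encode(metadata)
let jsonString = String(decoding: data, as: UTF8.self)
print(jsonString)
// {"license":"MIT","title":"API","version":"1.0.0"}
```

Note that this yields a deterministic order, not the author's original order, so it helps with diffs and tests but does not preserve a hand-chosen ordering of paths.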

clayellis · March 5, 2020, 5:27pm

What the, since when? TIL...

Jon_Shier · March 5, 2020, 5:50pm

I mean, it doesn’t really help if source order is important. Lack of an OrderedDictionary hits Alamofire as well, as we now offer our own URL encoded form Encoder and there are (bad) APIs that require parameters in some particular order. Not being able to express such a requirement in a spec is bad as well.

Tony_Parker · March 5, 2020, 6:03pm

Since these releases

/// Produce JSON with dictionary keys sorted in lexicographic order.
@available(macOS 10.13, iOS 11.0, watchOS 4.0, tvOS 11.0, *)
public static let sortedKeys    = OutputFormatting(rawValue: 1 << 1)

clayellis · March 5, 2020, 6:05pm

Right, I must've just missed that completely, and it sounds like I wasn't the only one.