True. This might be the best option of them all yet (fewer drawbacks than the other two). Thank you for bringing it forward @pyrtsa. However, it still seems subpar that the user has to wrap the input in an enum before passing it on, especially when they don't have to do that when using `JSONDecoder` or `PropertyListDecoder`.
Personally, I’d design it so that it conforms to `TopLevelDecoder` with the `Data` input, to mirror the Foundation decoders. Leave the file URL input as a separate method, with an explicit argument label to denote that the input will in fact be processed by streaming.
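A sketch of that shape (the decoder type here is hypothetical, and the method bodies are stand-ins, not a real streaming implementation):

```swift
import Foundation

struct StreamingJSONDecoder {
    // TopLevelDecoder-style entry point, mirroring the Foundation decoders.
    func decode<T: Decodable>(_ type: T.Type, from data: Data) throws -> T {
        try JSONDecoder().decode(type, from: data) // stand-in body
    }

    // Separate file-URL entry point; the label makes the streaming explicit.
    func decode<T: Decodable>(_ type: T.Type, streamingFrom fileURL: URL) throws -> T {
        // A real implementation would read the file incrementally;
        // this stand-in loads it eagerly.
        try decode(type, from: Data(contentsOf: fileURL))
    }
}
```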
I hope that if `TopLevelEncoder` and `TopLevelDecoder` are ever added to the standard library, it doesn't break non-Apple platforms that use open-source libraries like OpenCombine, which define their own versions of `TopLevelEncoder` and `TopLevelDecoder`.
I think it'll be the same as it was with adding the `Result` type to the standard library. IIRC, it was solved by always preferring types from modules other than the standard library in case of ambiguity.
Before these get `@frozen`, I’d like to make a request: add the following requirement to both `TopLevelEncoder` and `TopLevelDecoder`:
```swift
var userInfo: [CodingUserInfoKey: Any] { get set }
```
`Decoder` already requires it to be accessible, and all of the implementations I’ve seen already satisfy it. It’s more or less essential, and it would make writing generic functions much easier.
I just had to do this instead, and I never want to do it again:

```swift
Mirror(reflecting: decoder) // some TopLevelDecoder
    .children
    .first { $0.label == "userInfo" }?
    .value as? [CodingUserInfoKey: Any]
```
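For contrast, here is roughly what generic code could look like with such a requirement (a sketch; the protocol name and key string below are hypothetical, not part of the pitch):

```swift
import Foundation

// Hypothetical: a top-level decoder protocol with the requested
// userInfo requirement.
protocol UserInfoProvidingDecoder {
    var userInfo: [CodingUserInfoKey: Any] { get set }
}

// Generic code can then write to userInfo directly; no Mirror needed.
func setContext<D: UserInfoProvidingDecoder>(_ context: Any, on decoder: inout D) {
    decoder.userInfo[CodingUserInfoKey(rawValue: "context")!] = context
}
```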
If the next step is writing a proposal to move the protocols into the SL proper, I’d be glad to help draft it (I’ll probably have questions, hah). It’d be my first shot at an SE pitch.
er, sorry, my bad: just saw Daniel’s previous mention of drafting. I can take a crack at the implementation.
Could you give an example of why you want this so we can work it into the proposal?
Like many developers, I want to decode `NSManagedObject` subclasses asynchronously. Since I need to do it often, I decided to write a Combine operator that functions identically to the existing `decode(type:decoder:)` operator, except that it operates on the private queue of an `NSManagedObjectContext` if one exists in the given decoder with a key of `context`.
While I could just require the context to be provided at the call site, I think it would be more reasonable to require `userInfo`. After all, `Decoder` already requires it, and that requirement is useless if there is never an opportunity to add anything.
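A rough sketch of that use case, assuming the requested `userInfo` requirement; the operator name, the `context` key, and the overall shape are illustrative, not the actual implementation described above:

```swift
import Combine
import CoreData
import Foundation

extension Publisher where Output == Data, Failure == Error {
    // Hypothetical variant of decode(type:decoder:) that hops onto the
    // managed object context's private queue when one is present in userInfo.
    func decodeManaged<T: Decodable>(_ type: T.Type, decoder: JSONDecoder) -> AnyPublisher<T, Error> {
        flatMap { data -> AnyPublisher<T, Error> in
            let key = CodingUserInfoKey(rawValue: "context")!
            guard let context = decoder.userInfo[key] as? NSManagedObjectContext else {
                // No context: decode inline, as the stock operator would.
                return Result { try decoder.decode(type, from: data) }
                    .publisher
                    .eraseToAnyPublisher()
            }
            return Future { promise in
                context.perform { // decode on the context's private queue
                    promise(Result { try decoder.decode(type, from: data) })
                }
            }
            .eraseToAnyPublisher()
        }
        .eraseToAnyPublisher()
    }
}
```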
@jasdev and I have drafted up a proposal for adding these protocols.
We have explicitly left out the `userInfo` part discussed, because there were cases of `TopLevelEncoder`/`TopLevelDecoder` conformances we found on GitHub that didn't have a `userInfo` property, and we want to allow as many libraries as possible to conform their types. We'll take further advice on this, however.
@jasdev has added the protocols to his fork of swift where you can compare the changes.
TopLevelEncoder and TopLevelDecoder Protocols
- Proposal: SE-NNNN
- Authors: Daniel Tull, Jasdev Singh
- Review Manager:
- Status:
- Implementation: WIP branch
Introduction
This proposal introduces `TopLevelEncoder` and `TopLevelDecoder` protocols to the Standard Library (currently located in Combine), which are useful for representation-agnostic encoding and decoding of data.
Swift Evolution pitch thread: Move Combine’s TopLevelEncoder and TopLevelDecoder protocols into the standard library
Motivation
The following can apply to both decoding and encoding, but to prevent repetition we will focus on decoding.
A function may want to be able to decode data, but not know the implementation details of the specific encoding used. Consider an open-source networking package with a type to wrap the details of a network request.
```swift
struct Resource<Value> {
    let request: URLRequest
    let transform: (Data, URLResponse) throws -> Value
}
```
It may want to include an initializer for decoding a decodable type using a decoder, but be agnostic to the format of the data, so it also specifies a protocol. The package also defines conformances for the decoders in Foundation, `JSONDecoder` and `PropertyListDecoder`.
```swift
protocol Decoder {
    func decode<T>(_ type: T.Type, from: Data) throws -> T where T: Decodable
}

extension JSONDecoder: Decoder {}
extension PropertyListDecoder: Decoder {}

extension Resource where Value: Decodable {
    init<D: Decoder>(request: URLRequest, value: Value.Type, decoder: D) {
        self.init(request: request) { data, _ in
            try decoder.decode(Value.self, from: data)
        }
    }
}
```
This is fine if the caller wishes to use it for JSON or property list formatted data. However, they may have data defined as YAML and thus choose to use a package that provides a `YAMLDecoder`.
```swift
class YAMLDecoder {

    init() {}

    func decode<T>(_ type: T.Type, from: Data) throws -> T where T: Decodable {
        // Implementation of YAML decoder.
    }
}
```
For `YAMLDecoder` to conform to `Decoder`, it would have to import the networking package, which isn’t great because users of the YAML package may not necessarily wish to do so.
Users of both packages have to conform to a protocol they don’t control and further, possible changes to the library might break existing conformances.
And lastly, the Combine team relayed their intent on these two protocols living in Swift proper:
Proposed solution
Introduce new `TopLevelDecoder` and `TopLevelEncoder` protocols:
```swift
/// A type that defines methods for decoding.
public protocol TopLevelDecoder {

    /// The type this decoder accepts.
    associatedtype Input

    /// Decodes an instance of the indicated type.
    func decode<T>(_ type: T.Type, from: Self.Input) throws -> T where T: Decodable
}

/// A type that defines methods for encoding.
public protocol TopLevelEncoder {

    /// The type this encoder produces.
    associatedtype Output

    /// Encodes an instance of the indicated type.
    func encode<T>(_ value: T) throws -> Self.Output where T: Encodable
}
```
These protocols can be adopted by packages defining new decoders or encoders and those wanting the functionality described above.
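For example, a library could write a generic helper against the shared protocol. The sketch below uses Combine's existing `TopLevelDecoder` (to which `JSONDecoder` already conforms on Apple platforms); `decodeValue` is an illustrative name, not part of the proposal:

```swift
import Combine // where TopLevelDecoder currently lives
import Foundation

// A generic function over any decoder whose input is Data.
func decodeValue<T: Decodable, D: TopLevelDecoder>(
    _ type: T.Type, from data: Data, using decoder: D
) throws -> T where D.Input == Data {
    try decoder.decode(type, from: data)
}

// JSONDecoder already conforms to Combine's TopLevelDecoder:
let numbers = try decodeValue([Int].self, from: Data("[1, 2, 3]".utf8), using: JSONDecoder())
```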
Detailed design
The `JSONDecoder` and `PropertyListDecoder` types in Foundation should be made to conform to the `TopLevelDecoder` protocol.
Likewise, the `JSONEncoder` and `PropertyListEncoder` types in Foundation should be made to conform to the `TopLevelEncoder` protocol.
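Since the method signatures on the Foundation coders already match the protocol requirements, these conformances should reduce to empty extension declarations, roughly:

```swift
// Sketch: the existing decode/encode methods already satisfy the
// requirements, so no members need to be added.
extension JSONDecoder: TopLevelDecoder {}
extension PropertyListDecoder: TopLevelDecoder {}
extension JSONEncoder: TopLevelEncoder {}
extension PropertyListEncoder: TopLevelEncoder {}
```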
Source compatibility
This is a purely additive change. We can similarly lean on the shadowing work from SE-0235 that allowed `Result` to be added to the Standard Library.
Effect on ABI stability
This is a purely additive change.
Effect on API resilience
This has no impact on API resilience which is not already captured by other language features.
Alternatives considered
None.
Hi @danielctull, thanks for kicking off proposal work here
I've been looking into this for a while, and while I did not yet have the time to hash out all the specifics, allow me to add some more context on what will need to be done here, as there are a few more concerns than it seems at first:
ABI implications of "moving" a type
Since the symbols already exist in an Apple framework (Combine) and we'd want Combine to be able to use those "new" (moved) `TopLevelEncoder`/`TopLevelDecoder` types as well, we have to take some care around this, and the change is not really just additive. First, we'll need to utilize a recent compiler feature[1] that @Xi_Ge developed, and mark the "new" types using `@_originallyDefinedIn`.
At the same time, we'll need to coordinate with Combine to have it re-export the stdlib's version of those types/symbols, since "old apps" compiled against a version of Combine which had those types defined in itself would still be looking there for them. Instead, we'd want those apps to find the types that are now part of the stdlib (and there the originally-defined-in annotation will ensure that the symbols match what the applications are looking for).
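For illustration, the annotation on the moved type would look something like this (the version numbers here are placeholders, not the real availability):

```swift
// Sketch only: versions are placeholders. @_originallyDefinedIn tells the
// linker the type used to live in Combine, so old binaries still find it.
@available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 6.0, *)
@_originallyDefinedIn(module: "Combine", macOS 99.0, iOS 99.0, tvOS 99.0, watchOS 99.0)
public protocol TopLevelDecoder {
    associatedtype Input
    func decode<T>(_ type: T.Type, from: Self.Input) throws -> T where T: Decodable
}
```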
Re-considering `userInfo`:
If I remember my prior digging into this correctly, I quickly arrived at the conclusion that including `userInfo` is necessary for many real scenarios, so we should perhaps re-visit this (I'll give it a look next week on my end).
I hope to get time soon to look into the specifics of coordinating this dance between libraries and teams.
[1] TBDGen/IRGen: generate $ld$hide$os symbols for decls marked with @_originallyDefinedIn #28691
I think the reason that `userInfo` was left out is clear: Combine doesn’t need it.
So long as `Decoder` and `Encoder` require `userInfo`, `TopLevelDecoder` and `TopLevelEncoder` should too. On a related note, did you find any implementations that didn’t initialize empty `userInfo` dictionaries? I don’t see how it would be possible to conform to the existing protocols without doing so.
You’re correct: the implementations that didn’t have a `userInfo` property included it in their initialisers, so I imagine it wouldn’t be hard for those implementations to adopt this requirement.
I'm confused: what does "top level" refer to in `TopLevelEncoder` and `TopLevelDecoder`? SE-0167 mentions:
> It should be noted here that `JSONEncoder` and `JSONDecoder` do not themselves conform to `Encoder` and `Decoder`... This is because `JSONEncoder` and `JSONDecoder` must present a different top-level API than they would at intermediate levels.
But I don't really understand why (there is an internal `__JSONEncoder`/`__JSONDecoder` that conforms to `Encoder` and `Decoder`).
Does it mean it only encodes/decodes complete JSON, i.e. what the root top-level object is supposed to be?
So the status quo and intended difference is between:

1. "the thing which has `func container<Key>(keyedBy:)`-and-friends defined on it", in other words what's used when implementing `Codable` conformances. Those are `Decoder`/`Encoder`.
2. "the thing that people can only call `encode()`/`decode()` on" at the "outer"/"top level" layer in applications etc., where they want to use the coding infrastructure to encode/decode a type. This is not the same as 1. because, one can argue, it does not necessarily make sense to expose `container` and other functions on that API. These are the types and use case discussed here, `TopLevelEncoder`/`TopLevelDecoder`.

The naming here isn't the most intuitive, I agree somewhat.
So the goal was to not expose the "used by coder implementations" functions to people who only need to invoke "encode/decode my type". The former use case has types in the stdlib, `Encoder`/`Decoder`; the latter (today) does not, which caused Combine (and others) to create such a type, because it's indeed quite needed to abstract over accepting various coders in libraries.
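A small example of the two layers side by side (a sketch; the `Point` type is made up for illustration):

```swift
import Foundation

struct Point: Decodable {
    var x, y: Double

    // Layer 1: Decoder, used *inside* a Codable conformance,
    // where containers make sense.
    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        x = try container.decode(Double.self)
        y = try container.decode(Double.self)
    }
}

// Layer 2: the "top level" call site; callers never see containers.
let point = try JSONDecoder().decode(Point.self, from: Data("[1.5, 2.5]".utf8))
```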
In a way, yes. The "top level" means looking at the data from the very root of the structure. `Encoder` and `Decoder` are protocols that make the most sense when you're looking to decode values inside the tree, but that level of specificity doesn't make for great API when you're not the one doing the format-specific decoding. Most API consumers just care about taking their JSON `Data` and getting values out of the whole data blob, not about extracting data with containers.
For a little more background, see my post in Are custom Encoders/Decoders supported? - #7 by itaiferber (and the linked post inside)
@ktoso's summary is a good way to put it, too.
If you imagine the data structure being coded as a tree drawn with its root at the top and its leaves at the bottom, then `Encoder` and `Decoder` are used at every level of the tree so that each node can add its children to the tree. `TopLevelEncoder` and `TopLevelDecoder`, on the other hand, are used to encode or decode the "top level" (i.e. the root) of that tree.
I'm not sure I love the name, but on the other hand, I'm having trouble coming up with a better one.
Didn’t you just come up with one? If everyone calls it "the root", perhaps `Root` would be a better name than `TopLevel`.