Add userInfo protocols to standard library

I don't understand what your example is trying to illustrate here. What useful task is accomplished by extracting an arbitrary user info key from values of arbitrary type?


Examples 1 and 2 are just examples of how to use a protocol in Swift. I’m not seeing any actual functionality that is enabled by this.

Moreover, there is heavy use of generic constraints, without any explanation as to how they can be satisfied. For example:

When I construct my object (say, an NSNotification or NSError or something of that sort), how am I supposed to know what UserInfoKey type your processing object expects?

The name “user info” is a hint that the dictionary contains additional library or application-specific data. Your design would force every type with a user-info dictionary to also take library or application-specific generic parameters (and thus to make that additional data part of its own type - for example, NSNotification<UIKeyboardEventUserInfoKeys>). That defeats the point of the user info dictionary, and is better modelled as a stored property.

This is shown again in the next example:

You can’t just go and instantiate some unknown type with a raw value you pulled out of a hat.

What should become clear from this is that user-info dictionaries are intentionally loosely-typed. A String->Any dictionary really is the best representation for them, and this protocol doesn’t really serve much purpose.
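As a rough illustration of what "loosely typed" means in practice, here is a minimal sketch. `Event` and the key names are hypothetical stand-ins, not types from any framework; the point is only that the producer and consumer agree on a key name and an expected value type, and nothing else.

```swift
// Hypothetical stand-in for a type that carries a user-info dictionary.
struct Event {
    let name: String
    var userInfo: [String: Any] = [:]
}

// The producer stores whatever extra data it likes under string keys.
let event = Event(name: "keyboardWillShow", userInfo: ["height": 216.0, "label": "primary"])

// The consumer casts on the way out and must handle absence or a type mismatch.
let height = event.userInfo["height"] as? Double ?? 0
let duration = event.userInfo["duration"] as? Double  // key is not present
let wrongType = event.userInfo["label"] as? Int       // value is a String, not an Int
```

Neither `Event` nor its consumers need a generic parameter for this to work, which is the property the post above is defending.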


If you think the protocol should require [String: Any], fine.

Also, what “processing object”? This is for putting something in a type and getting it out later. Potentially one of multiple types, especially if you are using different frameworks depending on availability.

The specific impetus for pitching these protocols is the proposal to move Combine’s TopLevelDecoder and TopLevelEncoder protocols to the standard library.

The protocols introduced by Combine don’t require userInfo dictionaries, even though they are required by Decoder and Encoder. Why? Because having a userInfo dictionary isn’t actually relevant to serialization. Any type might arguably benefit from having such a property, but it would be ridiculous to add that requirement to all of them.

It is widely agreed that the most “Swifty” design patterns make heavy use of composition, especially protocol composition: good protocols require only what is directly relevant to their intended meaning, and inherit from other protocols as necessary.

If Decoder requires userInfo to be a concrete type, then it becomes mutually exclusive with other protocols that require a different type. While that is now set in stone, it would be better for future protocols to avoid such requirements.

That’s the point of my examples: keeping protocols separate allows for more flexibility, especially when those writing the protocols are not the same people implementing them. And in the case of userInfo, that’s more or less the entire point.

Actually it is. See the original Codable proposal for the rationale:

CodingUserInfoKey is a bit of a useless type, though. It just wraps a string - I guess we probably should have just used String directly. It has already been acknowledged as a mistake: Why is CodingUserInfoKey's rawValue initializer failable? - #2 by itaiferber


Because Combine’s TopLevelDecoder doesn’t require userInfo, and no protocol exists that only requires userInfo, it is more or less impossible to write a generic method that interacts with the userInfo dictionary of a type conforming to TopLevelDecoder. This is in spite of the fact that Decoder requires it, so every TopLevelDecoder must at some point supply such a dictionary.

I ran into this problem while trying to write a publisher that operated on an NSManagedObjectContext stored in a TopLevelDecoder.

import Combine

import class CoreData.NSManagedObjectContext

extension Publisher {
  /// Decodes the output from upstream using a specified decoder.
  ///
  /// If `decoder` has an `NSManagedObjectContext` inside a `userInfo` dictionary with a `CodingUserInfoKey` of `context`, decoding will be performed on that context's private queue.
  func performDecode<Item: Decodable, Coder: TopLevelDecoder>(type: Item.Type, decoder: Coder)
    -> Publishers.FlatMap<Future<Item, Error>, Publishers.MapError<Self, Error>>
  where Self.Output == Coder.Input {
    mapError { $0 as Error }.flatMap { (element: Coder.Input) in
      Future { promise in
        let decode = { promise(.init { try decoder.decode(type, from: element) }) }
        // Reflection is the only way to reach `userInfo` here, since `TopLevelDecoder` does not require it.
        guard
          let child = Mirror(reflecting: decoder).children.first(where: { $0.label == "userInfo" }),
          let userInfo = child.value as? [CodingUserInfoKey: Any],
          let key = CodingUserInfoKey(rawValue: "context"),
          let context = userInfo[key] as? NSManagedObjectContext
        else { return decode() }
        return context.perform(decode)
      }
    }
  }
}

Needless to say, this is not ideal. My initial solution was to ask that userInfo be required by TopLevelDecoder, since it is required by Decoder. However, I’ve since concluded that the Combine team had the right idea when they excluded it: why require something you don’t need?

Let’s see how I could write this if the protocols I am proposing existed:

import Combine

import class CoreData.NSManagedObjectContext

extension Publisher {
  /// Decodes the output from upstream using a specified decoder.
  ///
  /// If `decoder` has an `NSManagedObjectContext` inside its `userInfo` dictionary with a key of `context`, decoding will be performed on that context's private queue.
  func performDecode<Item: Decodable, Coder: TopLevelDecoder & UserInfoProviding>(
    type: Item.Type, decoder: Coder
  ) -> Publishers.FlatMap<Future<Item, Error>, Publishers.MapError<Self, Error>>
  where
    Self.Output == Coder.Input, Coder.UserInfoKey: RawRepresentable,
    Coder.UserInfoKey.RawValue: ExpressibleByStringLiteral
  {
    mapError { $0 as Error }.flatMap { (element: Coder.Input) in
      Future { promise in
        let decode = { promise(.init { try decoder.decode(type, from: element) }) }
        if let context = Coder.UserInfoKey(rawValue: "context").flatMap({
          decoder.userInfo[$0] as? NSManagedObjectContext
        }) {
          return context.perform(decode)
        } else {
          return decode()
        }
      }
    }
  }
}

That is much more resilient, and allows callees as much flexibility as possible. I don’t think I can provide a more concrete example than that.
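For readers who want to try the constraint pattern without Combine or Core Data, here is a minimal sketch. The shape of `UserInfoProviding` is an assumption inferred from the constraints used in the example above (an associated `UserInfoKey` type and a `userInfo` dictionary); `DemoKey`, `DemoCoder`, and `value(named:as:in:)` are hypothetical names for illustration only.

```swift
// Assumed shape of the pitched protocol, inferred from how it is used above.
protocol UserInfoProviding {
    associatedtype UserInfoKey: Hashable
    var userInfo: [UserInfoKey: Any] { get }
}

// Hypothetical string-backed key type, similar in spirit to CodingUserInfoKey.
struct DemoKey: RawRepresentable, Hashable {
    let rawValue: String
    init(rawValue: String) { self.rawValue = rawValue }
}

// Hypothetical conforming type; UserInfoKey is inferred as DemoKey.
struct DemoCoder: UserInfoProviding {
    var userInfo: [DemoKey: Any] = [:]
}

// Generic lookup that works for any provider whose keys are string-backed.
func value<Provider: UserInfoProviding, Value>(
    named name: String, as type: Value.Type, in provider: Provider
) -> Value?
where Provider.UserInfoKey: RawRepresentable, Provider.UserInfoKey.RawValue == String {
    Provider.UserInfoKey(rawValue: name).flatMap { provider.userInfo[$0] as? Value }
}

let coder = DemoCoder(userInfo: [DemoKey(rawValue: "context"): "main-queue"])
let found = value(named: "context", as: String.self, in: coder)
let missing = value(named: "other", as: String.self, in: coder)
```

The function never names a concrete key type; it only relies on the keys being constructible from a string, which mirrors the `where` clause in the publisher above.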

The original proposal explains that exposing a userInfo dictionary may be useful, but by including it in the protocol itself they are requiring it.

In my opinion, it would have been better to move that requirement into a different protocol, so implementations could conditionally access userInfo dictionaries if they existed. There’s no point in guaranteeing that every Decoder and Encoder has a userInfo property of a certain type: there’s no guarantee that it would have what you are trying to access anyway.

Of course, that wouldn’t be possible right now: such conditional access would require generalized existentials, unless the userInfo protocols only used concrete types.
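To make the "concrete types" alternative tangible, here is a minimal sketch. `StringKeyedUserInfoProviding` is a hypothetical protocol, not part of the pitch; because it has no associated types, a value can be conditionally cast to it at run time.

```swift
// Hypothetical protocol that uses only concrete types.
protocol StringKeyedUserInfoProviding {
    var userInfo: [String: Any] { get }
}

// One type that has no userInfo at all, and one that provides it.
struct PlainCoder {}
struct ContextualCoder: StringKeyedUserInfoProviding {
    var userInfo: [String: Any]
}

// Accepts any value and reads userInfo only if the value happens to provide it.
func contextName<T>(of value: T) -> String? {
    guard let provider = value as? StringKeyedUserInfoProviding else { return nil }
    return provider.userInfo["context"] as? String
}

let fromPlain = contextName(of: PlainCoder())
let fromContextual = contextName(of: ContextualCoder(userInfo: ["context": "background"]))
```

The trade-off is that every conforming type is pinned to `[String: Any]`, which is exactly the kind of fixed requirement the pitch is trying to avoid.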

Sure - because some decoding algorithms require it, and if it wasn’t required, we’d need more protocols for model types to declare things like DecodableButRequiresUserInfo; meaning you couldn’t decode it with a regular Decoder but would instead require a DecoderThatProvidesUserInfo. That’s just horrible - regardless of whether it’s a protocol composition or a refinement.

Yeah I would agree that it should be on TopLevelDecoder for the same reason that it’s on the standard Decoder.

It is impossible for a decoding algorithm to require that certain keys in userInfo contain values of certain types. All of them already need to handle the possibility that a value isn’t there. I don’t see how this would be any different.

Just to give a little bit of background on userInfo and its inclusion in the protocol types. The Codable API surface is a meeting point for 3 (potentially different) actors:

  1. A Codable-conforming type being encoded/decoded
  2. An Encoder/Decoder actually encoding/decoding the type (1)
  3. A top-level actor triggering the encoding/decoding from a top-level view of the encoder/decoder (2)

Each of these actors has some amount of say in the encoding/decoding process, which you can think of as a relationship between two of the actors:

  • The actual Encodable/Decodable conformance forms the contract between (1) and (2), where (1) offers behavior through its conformance, and (2) can offer different behavior via specific overrides
  • Encoding/decoding strategies form the contract between (2) and (3), where (2) offers a default implementation of encoding and decoding, and (3) can request different behavior via specific overrides
  • The userInfo dictionary forms the contract between (1) and (3), where (1) offers behavior through its conformance to the protocols, and (3) can request different behavior via specific overrides, assuming there is shared knowledge between (1) and (3) about how to handle those cases

The point of requiring the userInfo dictionary is to ensure that the relationship between (1) and (3) is always possible, even if it isn't necessary. You're right that this is less often used than the other two methods for asserting control, but short of global state, there's no other way for (1) and (3) to communicate with one another except through (2). We included it in the protocol to offer guarantees to (1) and (3), not to augment (2). Codable-conforming types and actors triggering encoding/decoding far outnumber Encoders and Decoders, so rather than exploding the protocol hierarchy even more, we wanted to offer that guarantee from the get-go.

You're also correct that a Codable-conforming type shouldn't require a specific value inside the userInfo dictionary because it can be missing, but this is a debatable part of library design — I think it's just an unclear implicit contract, and it's better to provide a sensible default.
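The relationship between (1) and (3) described above can be sketched with `JSONDecoder`. `Greeting` and the `"shout"` key are hypothetical; the `userInfo` properties on `JSONDecoder` and `Decoder` are the existing API. The type falls back to a default when the key is absent, as suggested above.

```swift
import Foundation

// Actor (1): a Decodable type that consults userInfo but does not require it.
struct Greeting: Decodable {
    // Shared knowledge between (1) and (3): the key and the expected value type.
    static let shoutKey = CodingUserInfoKey(rawValue: "shout")!

    let text: String

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(String.self)
        // A missing or mistyped value falls back to a sensible default.
        let shout = decoder.userInfo[Greeting.shoutKey] as? Bool ?? false
        text = shout ? raw.uppercased() : raw
    }
}

let json = Data("[\"hello\"]".utf8)

// Actor (3) with no userInfo configured: the default behavior applies.
let plain = try JSONDecoder().decode([Greeting].self, from: json)

// Actor (3) requesting different behavior from (1), passed through actor (2).
let loudDecoder = JSONDecoder()
loudDecoder.userInfo[Greeting.shoutKey] = true
let shouted = try loudDecoder.decode([Greeting].self, from: json)
```

Here the decoder itself never inspects the key; it only carries the dictionary from the call site to the conforming type.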


With this, I think it would be relatively reasonable to want to add userInfo dictionaries to TopLevelEncoder/TopLevelDecoder specifically, but I don't know if I would go so far as to suggest expanding this requirement into its own protocol — given source- and ABI-compatibility requirements, API has to clear a very high bar for inclusion, and I don't think I'd personally advocate for this.

I think this suggestion has merit as one of the more reasonable ways to expose userInfo generically (the alternatives including adding a requirement to TopLevelEncoder/TopLevelDecoder with dummy default implementations to maintain source- and ABI-compatibility), but I'm not sure I would personally consider that quite sufficient for introducing API which will need to be effectively supported forever.
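The "dummy default implementation" alternative mentioned above might look roughly like the following at the source level. `DemoTopLevelDecoder` is a hypothetical stand-in, not Combine's `TopLevelDecoder`, and this sketch makes no claim about how ABI compatibility would actually be handled.

```swift
// Hypothetical stand-in for a protocol that gains a userInfo requirement later.
protocol DemoTopLevelDecoder {
    func decode(_ input: [UInt8]) -> String
    // The newly added requirement.
    var userInfo: [CodingUserInfoKey: Any] { get }
}

// Default implementation so that existing conformers keep compiling unchanged.
extension DemoTopLevelDecoder {
    var userInfo: [CodingUserInfoKey: Any] { [:] }
}

// An "old" conformer that knows nothing about userInfo.
struct LegacyDecoder: DemoTopLevelDecoder {
    func decode(_ input: [UInt8]) -> String { String(decoding: input, as: UTF8.self) }
}

// A "new" conformer that supplies its own storage.
struct ContextualDecoder: DemoTopLevelDecoder {
    var userInfo: [CodingUserInfoKey: Any] = [:]
    func decode(_ input: [UInt8]) -> String { String(decoding: input, as: UTF8.self) }
}

// Generic code can now read userInfo from any conformer.
func hasContext<D: DemoTopLevelDecoder>(_ decoder: D) -> Bool {
    CodingUserInfoKey(rawValue: "context").map { decoder.userInfo[$0] != nil } ?? false
}

let legacy = LegacyDecoder()
var contextual = ContextualDecoder()
contextual.userInfo[CodingUserInfoKey(rawValue: "context")!] = "main"
```

The cost is that generic code cannot tell a conformer with an intentionally empty dictionary from one that simply inherited the default.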


As an aside, CodingUserInfoKey itself is useful as a place to hang predefined string constants without polluting String, like other String-RawRepresentable types found throughout Foundation and many other frameworks. The mistake is not CodingUserInfoKey but that its initializer is incorrectly failable.
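A minimal sketch of that pattern, with a hypothetical `managedObjectContext` constant and a placeholder string standing in for a real context object:

```swift
extension CodingUserInfoKey {
    // The force-unwrap is needed only because init?(rawValue:) is failable.
    static let managedObjectContext = CodingUserInfoKey(rawValue: "context")!
}

// Call sites use the constant instead of repeating a string literal.
var userInfo: [CodingUserInfoKey: Any] = [:]
userInfo[.managedObjectContext] = "placeholder context"
let stored = userInfo[.managedObjectContext] as? String
```

The constant lives in the `CodingUserInfoKey` namespace, so it shows up in completion for user-info subscripts without adding anything to `String`.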


In practice, to avoid a breaking change, Decoder and Encoder would have to inherit UserInfoProviding, and provide concrete values for the associated types. However, future protocols could benefit from keeping userInfo separate.

As for TopLevelDecoder and TopLevelEncoder, a separate protocol is the only way a generic method could access userInfo without adding requirements that aren’t in the Combine version.

That relationship is only possible with shared knowledge, as you said. That knowledge consists of which keys to use and which values to expect. If a concrete userInfo property isn’t required, (1) also needs to know the type of the key and whether a userInfo dictionary exists at all: but (1) already had to know that, since it has to have the keys in the first place.

The point is somewhat moot for Decoder and Encoder now, but I’d like to propose this ahead of any future standard library types that might need this.

Alternatively, someone could add it to Foundation, which could then extend Decoder and Encoder along with its many relevant types. They use it most, after all. I can’t write pitches for Foundation, though, so here we are.

I agree with @itaiferber.

Since there's a concrete use case involving TopLevelEncoder and TopLevelDecoder, it makes sense to consider adding these requirements to those protocols. As Itai has explained, it was already decided not to explode the protocol hierarchy when it came to Encoder and Decoder. There is no reason why we cannot add requirements to a new standard library protocol that we sink from Combine.

Meanwhile, there is no facility in Swift for changing Decoder and Encoder to refine another protocol anyway (we say protocols refine other protocols, not inherit, by the way), so that is not even a possibility to begin with.

I was going with the nomenclature used in the Language Guide. I’ve heard it both ways.

Is that a breaking change even if the requirements are unchanged? If so, then fair enough I suppose. I thought this would be non-breaking:

public protocol Decoder: UserInfoProviding {
  var codingPath: [CodingKey] { get }
  var userInfo: [CodingUserInfoKey: Any] { get }
  func container<Key>(keyedBy type: Key.Type) throws -> KeyedDecodingContainer<Key>
  func unkeyedContainer() throws -> UnkeyedDecodingContainer
  func singleValueContainer() throws -> SingleValueDecodingContainer
}
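To experiment with the source-level shape of that idea without touching the real `Decoder`, here is a minimal sketch. `DemoDecoder` is a hypothetical stand-in, the shape of `UserInfoProviding` is an assumption, and the `where` clause pinning the key type is one possible spelling. It only shows that a concrete `userInfo` property can satisfy the refined protocol; it says nothing about ABI or about existing uses of `Decoder` as a type.

```swift
// Assumed shape of the pitched protocol.
protocol UserInfoProviding {
    associatedtype UserInfoKey: Hashable
    var userInfo: [UserInfoKey: Any] { get }
}

// Hypothetical stand-in for Decoder that refines UserInfoProviding
// and pins the key type to CodingUserInfoKey.
protocol DemoDecoder: UserInfoProviding where UserInfoKey == CodingUserInfoKey {
    var codingPath: [String] { get }
}

// A conformer only has to declare the same concrete property it would have anyway.
struct ConcreteDecoder: DemoDecoder {
    var codingPath: [String] = []
    var userInfo: [CodingUserInfoKey: Any] = [:]
}

// Generic code constrained to the refined protocol sees concrete keys.
func contextValue<D: DemoDecoder>(in decoder: D) -> Any? {
    CodingUserInfoKey(rawValue: "context").flatMap { decoder.userInfo[$0] }
}

var demo = ConcreteDecoder()
demo.userInfo[CodingUserInfoKey(rawValue: "context")!] = 42
let demoContext = contextValue(in: demo) as? Int
```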

From the language guide:

Protocol Inheritance

A protocol can inherit one or more other protocols and can add further requirements on top of the requirements it inherits.

From the language reference:

protocol protocol name: inherited protocols {
protocol member declarations
}

[…]

Protocol types can inherit from any number of other protocols. When a protocol type inherits from other protocols, the set of requirements from those other protocols are aggregated, and any type that inherits from the current protocol must conform to all those requirements. For an example of how to use protocol inheritance, see Protocol Inheritance.

Fair enough! I hadn't seen that usage in the language guide. Thanks for pointing that out.

Yes.

I still think that the bar for pitching protocols should not be usefulness within the Standard Library. If that’s the criterion, then we’re going to keep needing to sink conspicuously-absent protocols like TopLevelDecoder and Identifiable from Apple frameworks.

Behavior that the Standard Library implicitly encourages or requires should be described by protocols for the sake of interoperability. People should be able to write generic methods without waiting for someone at Apple to need to. Those protocols do not necessarily need to exist in the Standard Library, but they need to be supported everywhere Swift is. That might mean the Core Libraries, it might mean some other project.

Decoder and Encoder use userInfo dictionaries. That’s a common design pattern across the community. Ergo, it is worth considering protocols that encapsulate that. I thought I would start the discussion here.

The bar isn't necessarily "usefulness within the Standard Library"—the question, as @xwu posted in the first reply to this thread, is:

The mere fact that a property is common amongst many different types does not inherently mean that it deserves to be a protocol. The goal of a pitch for a new protocol should be to demonstrate how that protocol would be used in a generic context.

@itaiferber wrote an excellent justification for the presence of a userInfo property on Encoder and Decoder, but it's heavily tied to the specific details of those protocols and the Codable API in general.

This proposal strikes me as highly similar to the discussion around the DefaultInitializable protocol, which would have had one requirement: init(). Ultimately, it fell short on the same grounds, as no one could present a compelling use for the protocol aside from noticing "hey, a lot of types offer init()".

Your posts at the beginning where you were writing functions with signatures like func example<T: UserInfoProviding>(_ input: T) were on the right track for justifying this protocol, but the bodies of those functions need to do more than just access an arbitrary value from the userInfo dictionary. The algorithm should strive to do something useful which couldn't be done with existing types.

My personal opinion aligns with many others here. It is not at all apparent that the fact "this type provides some dictionary called 'userInfo'" is helpful when divorced from all other context about that type.


I was thinking this would mainly be used with protocol composition. Knowing that a type provides some dictionary called userInfo and conforms to other specified protocols is useful.