Improving JSONDecoder/Encoder Performance for large apps

k.benua · August 27, 2025, 10:20am

JSONDecoder/Encoder Performance Problem

Introduction

swift_conformsToProtocolMaybeInstantiateSuperclasses is slow because it traverses all protocol conformance descriptors in the whole app the first time it is called for a given (class/enum/struct, protocol) pair.

EmergeTools has a great article about the poor performance of swift_conformsToProtocolMaybeInstantiateSuperclasses: https://www.emergetools.com/blog/posts/SwiftProtocolConformance

In short, the more protocol conformances your app has, the slower swift_conformsToProtocolMaybeInstantiateSuperclasses becomes. Our app has more than 150k protocol conformances. The count can be measured with this bash one-liner:

otool -l path/to/your/binary | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }'

We take the size of the __swift5_proto section and divide it by 4, because the section stores one 4-byte integer offset per conformance.
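The same division can be sketched in Swift. The helper name conformanceCount(sectionSizeHex:) is an assumption made for illustration, and the sample value 0x44ac0 is not measured output: it is simply 70320 * 4 written in hexadecimal, chosen to match the count reported for the benchmark binary in the "Checking __swift5_proto size" section.

```swift
/// Hypothetical helper: converts the hexadecimal `size` field that `otool -l`
/// prints for the `__swift5_proto` section into a number of conformance records.
func conformanceCount(sectionSizeHex: String) -> Int? {
    // Drop an optional "0x" prefix before parsing the hexadecimal digits.
    let digits = sectionSizeHex.hasPrefix("0x")
        ? String(sectionSizeHex.dropFirst(2))
        : sectionSizeHex
    guard let sizeInBytes = Int(digits, radix: 16) else { return nil }
    // Each record in the section is a 4-byte offset.
    return sizeInBytes / 4
}

// 0x44ac0 bytes = 281280 bytes, i.e. 70320 four-byte records.
print(conformanceCount(sectionSizeHex: "0x44ac0") ?? -1)
```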

When swift_conformsToProtocol is called

In short, there are 3 ways to trigger this function:

T.self is SomeProtocol.Type
as?/as!/as (in switch statement) SomeProtocol
Generic types with type-generic-constraints
- swift_conformsToProtocol is triggered because the type metadata contains a GenericParameterVector, and the GenericParameterVector has to contain a protocol witness table for each protocol that a generic parameter is constrained to.
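A minimal sketch of the three patterns follows. All names (Marker, Plain, Tagged, Holder, isMarked, asMarker) are hypothetical and only illustrate the shape of the code. Whether a particular occurrence reaches the runtime lookup depends on how the compiler emits it, so treat this as an illustration of the patterns the post lists, not as a guaranteed reproduction.

```swift
protocol Marker {}
struct Plain {}
struct Tagged: Marker {}

// 1. Metatype check: asks whether T conforms to Marker.
func isMarked<T>(_ type: T.Type) -> Bool {
    return T.self is Marker.Type
}

// 2. Dynamic cast of a value to a protocol type.
func asMarker(_ value: Any) -> Marker? {
    return value as? Marker
}

// 3. Generic type whose parameter is constrained to a protocol. Its metadata
//    stores a witness table for `K: Marker`; per the post, referencing the
//    type with a concrete argument can trigger the conformance lookup.
struct Holder<K: Marker> {
    var key: K
}

let holder = Holder(key: Tagged())
print(isMarked(Tagged.self), isMarked(Plain.self), asMarker(holder.key) != nil)
```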

JSONDecoder Performance Flaws

unwrap function

The first place in JSONDecoder where swift_conformsToProtocolMaybeInstantiateSuperclasses is used is the unwrap function:

func unwrap<T: Decodable>(_ mapValue: JSONMap.Value, as type: T.Type, for codingPathNode: _CodingPathNode, _ additionalKey: (some CodingKey)? = nil) throws -> T {
    ...
    if T.self is _JSONStringDictionaryDecodableMarker.Type {
        return try self.unwrapDictionary(from: mapValue, as: type, for: codingPathNode, additionalKey)
    }
    ...
}

KeyedDecodingContainer

KeyedDecodingContainer has a generic constraint, K: CodingKey. It is the second place where swift_conformsToProtocol gets called.

JSONDecoder swift_conformsToProtocol Performance Impact

swift_conformsToProtocol consumes at least 84% of all JSONDecoder.decode time in our app startup scenario.

JSONEncoder Performance Flaws

wrapGeneric function

The first place in JSONEncoder where swift_conformsToProtocolMaybeInstantiateSuperclasses is used is the wrapGeneric function:

func wrapGeneric<T: Encodable>(_ value: T, for additionalKey: (some CodingKey)? = _CodingKey?.none) throws -> JSONEncoderValue? {
    ...
    else if let encodable = value as? _JSONStringDictionaryEncodableMarker {
        return try self.wrap(encodable as! [String:Encodable], for: additionalKey)
    } else if let array = value as? _JSONDirectArrayEncodable {
        ...
    }
    ...
}

KeyedEncodingContainer

KeyedEncodingContainer has a generic constraint, K: CodingKey. It is the second place where swift_conformsToProtocol gets called.

JSONEncoder swift_conformsToProtocol Performance Impact

swift_conformsToProtocol consumes at least 84% of all JSONEncoder.encode time in our app startup scenario.

Proposed Optimizations

First, the optimizations that break neither ABI nor API:

#1 JSONDecoder unwrap optimization

_JSONStringDictionaryDecodableMarker is used to exempt String-keyed dictionaries from key conversion. So if no key conversion is configured, we can skip this slow check:

switch options.keyDecodingStrategy {
case .useDefaultKeys:
    break
case .convertFromSnakeCase, .custom:
    if T.self is _JSONStringDictionaryDecodableMarker.Type {
        return try unwrapDictionary(...)
    }
}

return try self.with(value: mapValue, path: codingPathNode.appending(additionalKey)) {
    try type.init(from: self)
}

instead of

if T.self is _JSONStringDictionaryDecodableMarker.Type {
    return try self.unwrapDictionary(from: mapValue, as: type, for: codingPathNode, additionalKey)
}

return try self.with(value: mapValue, path: codingPathNode.appending(additionalKey)) {
    try type.init(from: self)
}

So this optimization only takes effect for the .useDefaultKeys strategy.
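The gating idea can be tried in isolation with a self-contained sketch. KeyStrategy, StringDictionaryMarker and needsDictionaryPath are hypothetical stand-ins for the decoder's internals, not Foundation API; the point is only that the conformance check is never evaluated on the default-keys path.

```swift
// Hypothetical stand-ins for the decoder's internals, for illustration only.
enum KeyStrategy { case useDefaultKeys, convertFromSnakeCase }
protocol StringDictionaryMarker {}
extension Dictionary: StringDictionaryMarker where Key == String {}

/// Returns true when the dictionary-specific path must be taken.
/// The conformance check runs only if key conversion is enabled.
func needsDictionaryPath<T>(_ type: T.Type, strategy: KeyStrategy) -> Bool {
    switch strategy {
    case .useDefaultKeys:
        // Keys are used verbatim, so String-keyed dictionaries need no
        // special handling and the conformance check is skipped entirely.
        return false
    case .convertFromSnakeCase:
        return T.self is StringDictionaryMarker.Type
    }
}

print(needsDictionaryPath([String: Int].self, strategy: .useDefaultKeys))
print(needsDictionaryPath([String: Int].self, strategy: .convertFromSnakeCase))
```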

#2 JSONEncoder wrapGeneric optimization

There are two ways to approach optimizing this function:

If we believe that the as? _JSONDirectArrayEncodable check does more good than harm to performance (in our app and in this benchmark it does more harm), then we optimize only the _JSONStringDictionaryEncodableMarker check, the same way we did for JSONDecoder and _JSONStringDictionaryDecodableMarker.
If not, it's better to remove the as? _JSONDirectArrayEncodable check altogether.

Here is the optimized _JSONStringDictionaryEncodableMarker check:

switch options.keyEncodingStrategy {
case .useDefaultKeys:
    break
case .convertToSnakeCase, .custom:
    if let encodable = value as? _JSONStringDictionaryEncodableMarker {
        return try wrap(encodable as! [String: Encodable], for: additionalKey)
    }
}

So this optimization only takes effect for the .useDefaultKeys strategy.

Optimizations #1 and #2 are implemented in the FastCoders library.

#3 Possibly ABI/API breaking optimizations

Here we try to solve the performance issue caused by the generic constraints on KeyedDecodingContainer and KeyedEncodingContainer.

The problem is not the call to the KeyedDecodingContainer or KeyedEncodingContainer initializer; it is the mere reference to the type with a concrete generic argument.

For example, take this code:

import Foundation

struct A: Codable {
    let a: Int
}

The SIL of its init(from: Decoder) throws method contains a line like:

%5 = alloc_stack [lexical] [var_decl] $KeyedDecodingContainer<A.CodingKeys>, scope 22

And the corresponding IR is:

  %4 = call ptr @__swift_instantiateConcreteTypeFromMangledName(ptr @"demangling cache variable for type metadata for Swift.KeyedDecodingContainer<output.A.(CodingKeys in _60494E8B9C642A7C4A26F3A3B6CECEB9)>") #2, !dbg !194

Internally, __swift_instantiateConcreteTypeFromMangledName triggers swift_conformsToProtocol in this scenario.

So merely mentioning KeyedDecodingContainer with the concrete type A.CodingKeys is enough to pay the cost.

func encode(to: Encoder) throws has the same flaw.

There are two possible ways to tackle this:

Change the KeyedDecodingContainer and KeyedEncodingContainer type signatures to avoid generic constraints (not implemented in this repository)
Use the same CodingKey type, for example String, in the auto-generated Codable/Decodable/Encodable conformance code.

#3.1 Changing type signature

The trick is to remove the K: CodingKey generic constraint from the type declaration and move it to an extension. Then the GenericParameterVector no longer needs to contain a protocol witness table, and there is no swift_conformsToProtocol call when the generic type is mentioned or instantiated.

Before:

public struct KeyedDecodingContainer<K: CodingKey> :
  KeyedDecodingContainerProtocol
{
  public typealias Key = K

  /// The container for the concrete decoder.
  internal var _box: _KeyedDecodingContainerBase

  /// Creates a new instance with the given container.
  ///
  /// - parameter container: The container to hold.
  public init<Container: KeyedDecodingContainerProtocol>(
    _ container: Container
  ) where Container.Key == Key {
    _box = _KeyedDecodingContainerBox(container)
  }

  /// The path of coding keys taken to get to this point in decoding.
  public var codingPath: [any CodingKey] {
    return _box.codingPath
  }

  // continue to conform to KeyedDecodingContainerProtocol protocol
  ...
}

After:

public struct KeyedDecodingContainer<K>
{
  /// The container for the concrete decoder.
  internal var _box: _KeyedDecodingContainerBase

  /// Creates a new instance with the given container.
  ///
  /// - parameter container: The container to hold.
  public init<Container: KeyedDecodingContainerProtocol>(
    _ container: Container
  ) where Container.Key == Key {
    _box = _KeyedDecodingContainerBox(container)
  }
}

extension KeyedDecodingContainer: KeyedDecodingContainerProtocol where K: CodingKey {
  public typealias Key = K

  /// The path of coding keys taken to get to this point in decoding.
  public var codingPath: [any CodingKey] {
    return _box.codingPath
  }

  // continue to conform to KeyedDecodingContainerProtocol protocol
  ...
}

The same trick can be applied to KeyedEncodingContainer.
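The pattern can be shown on a toy type. Key, UserKey, ConstrainedBox and UnconstrainedBox are hypothetical names and have nothing to do with the standard library declarations; the sketch only shows where the constraint lives before and after the change.

```swift
protocol Key { var name: String { get } }
struct UserKey: Key { let name: String }

// Before: the constraint is part of the type declaration, so the type's
// generic signature carries a `K: Key` requirement.
struct ConstrainedBox<K: Key> {
    var keys: [K] = []
}

// After: the type itself is unconstrained ...
struct UnconstrainedBox<K> {
    var keys: [K] = []
}

// ... and the API that needs the protocol lives in a constrained extension.
extension UnconstrainedBox where K: Key {
    var names: [String] { keys.map { $0.name } }
}

let box = UnconstrainedBox(keys: [UserKey(name: "id")])
print(box.names)

// Trade-off: the unconstrained type now accepts any argument, which is one
// reason the change is visible in the API.
let odd = UnconstrainedBox<Int>()
print(odd.keys.isEmpty)
```

Note that UnconstrainedBox<Int> type-checks while ConstrainedBox<Int> would not, so the set of valid types changes even though the constrained API stays the same.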

Note: although _KeyedDecodingContainerBox has a generic constraint, it seems we do not need to rewrite it, because of the way it gets called:

public init<Container: KeyedDecodingContainerProtocol>(
    _ container: Container
) where Container.Key == Key {
    _box = _KeyedDecodingContainerBox(container)
}

In this scenario, the IR references the protocol witness table of Container's conformance to KeyedDecodingContainerProtocol:

define protected swiftcc ptr @"output.KeyedDecodingContainerV2.init<A where A == A1.Key, A1: Swift.KeyedDecodingContainerProtocol>(A1) -> output.KeyedDecodingContainerV2<A>"(ptr noalias %0, ptr %K, ptr %Container, ptr %Container.KeyedDecodingContainerProtocol) #0 !dbg !84

and there is no __swift_instantiateConcreteTypeFromMangledName call.

#3.2 Use String as CodingKey

Why this would be faster:

swift_conformsToProtocol is slow only the first time it is called for each (class/enum/struct, protocol) pair.
If we use String as the CodingKey, swift_conformsToProtocol is always called with the same pair: String and CodingKey.
So only the first call is slow. All subsequent calls are much faster, because swift_conformsToProtocol caches results in a ConcurrentReadableHashMap.
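The effect of that caching can be illustrated with a toy model. This is not the runtime's data structure or algorithm: ToyConformanceTable, CacheKey and the string-based type names are assumptions made purely to count how many full scans each approach would cause.

```swift
/// Toy model of a conformance cache; not the runtime's actual implementation.
struct CacheKey: Hashable {
    let type: String
    let proto: String
}

final class ToyConformanceTable {
    private let records: [CacheKey]            // stands in for __swift5_proto
    private var cache: [CacheKey: Bool] = [:]
    private(set) var fullScans = 0

    init(records: [CacheKey]) { self.records = records }

    func conforms(_ type: String, to proto: String) -> Bool {
        let key = CacheKey(type: type, proto: proto)
        if let hit = cache[key] { return hit } // fast path: cached answer
        fullScans += 1                         // slow path: linear scan
        let result = records.contains(key)
        cache[key] = result
        return result
    }
}

// Shared key type: every model asks about the same (String, CodingKey) pair.
let sharedKeyTable = ToyConformanceTable(
    records: [CacheKey(type: "String", proto: "CodingKey")])
for _ in 0..<1_000 { _ = sharedKeyTable.conforms("String", to: "CodingKey") }

// Per-model key types: every model asks about a different pair.
let perTypeRecords = (0..<1_000).map {
    CacheKey(type: "Model\($0).CodingKeys", proto: "CodingKey")
}
let perTypeTable = ToyConformanceTable(records: perTypeRecords)
for i in 0..<1_000 { _ = perTypeTable.conforms("Model\(i).CodingKeys", to: "CodingKey") }

print(sharedKeyTable.fullScans, perTypeTable.fullScans)
```

In this model the shared key causes a single full scan, while per-model keys cause one scan per model, which is the asymmetry the post relies on.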

How String can conform to CodingKey

extension String: CodingKey {    
  public init?(stringValue: String) { 
    self = stringValue 
  }    
  public init?(intValue: Int) { nil }    
  public var intValue: Int? { nil }    
  public var stringValue: String { 
    self 
  }
}

How this can be implemented

We can introduce an experimental flag. When the flag is enabled, the compiler does not auto-generate enum CodingKeys for the struct/enum and uses raw String values as coding keys in init(from: Decoder) throws and encode(to: Encoder) throws.
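Written by hand, such code could look roughly like the sketch below, using struct A from the earlier example. This is an assumption about the shape of the output, not the output of any existing compiler flag. The String: CodingKey extension is repeated so the snippet is self-contained, and recent compilers are expected to warn that it is a retroactive conformance.

```swift
import Foundation

// Same conformance as in the post, repeated to keep the sketch self-contained.
extension String: CodingKey {
    public init?(stringValue: String) { self = stringValue }
    public init?(intValue: Int) { return nil }
    public var intValue: Int? { return nil }
    public var stringValue: String { return self }
}

struct A: Codable {
    let a: Int

    init(a: Int) { self.a = a }

    // Hand-written equivalent of the synthesized initializer, keyed by String
    // instead of a per-type CodingKeys enum.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: String.self)
        a = try container.decode(Int.self, forKey: "a")
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: String.self)
        try container.encode(a, forKey: "a")
    }
}

let decoded = try JSONDecoder().decode(A.self, from: Data(#"{"a": 1}"#.utf8))
let encoded = String(decoding: try JSONEncoder().encode(decoded), as: UTF8.self)
print(decoded.a, encoded)
```

The cost is that keys become plain strings, so a misspelled key is no longer caught by the compiler.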

Additional advantages

Each auto-generated enum CodingKeys adds 5 protocol conformance descriptors (see godbolt: https://godbolt.org/z/z3rE1b5xs):

CodingKey
Hashable
Equatable
CustomDebugStringConvertible
CustomStringConvertible

Also, each CodingKeys enum adds around 1.8 KB to app size (measured on the same 10k Codable structures):

codable-benchmark-package-no-coding-keys - where String is used as CodingKey but there are CodingKeys to match __swift5_proto section size
- 49 MB
codable-benchmark-package-no-coding-keys-measure-size - where String is used as CodingKey and there are no CodingKeys
- 31.1 MB
The 17.9 MB difference spread over 10k models gives around 1.8 KB per CodingKeys enum.

So if a shared CodingKey is implemented, we could:

Reduce application size
Improve overall application performance, because a smaller __swift5_proto section makes swift_conformsToProtocol faster.
- codable-benchmark-package-no-coding-keys has 70321 protocol conformance descriptors
- codable-benchmark-package-no-coding-keys-measure-size has only 20321 protocol conformance descriptors
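As a quick consistency check, the quoted figures can be plugged into a few lines of Swift. The variable names are illustrative, and the conversion assumes 1 MB = 1000 KB; using 1024 still gives about 1.8 KB.

```swift
// Figures quoted in the post.
let models = 10_000.0
let sizeWithCodingKeysMB = 49.0
let sizeWithoutCodingKeysMB = 31.1
let descriptorsWith = 70_321
let descriptorsWithout = 20_321

// Size attributable to one CodingKeys enum, assuming 1 MB = 1000 KB.
let kbPerCodingKeys = (sizeWithCodingKeysMB - sizeWithoutCodingKeysMB) * 1000 / models

// Conformance descriptors attributable to one CodingKeys enum.
let descriptorsPerCodingKeys = (descriptorsWith - descriptorsWithout) / Int(models)

print(kbPerCodingKeys, descriptorsPerCodingKeys)
```

The descriptor difference is 50000 over 10k models, which agrees with the 5 conformances per CodingKeys enum listed above.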

Optimizations results

Measurements in our app

In our app we applied only the JSONDecoder.unwrap and JSONEncoder.wrapGeneric optimizations, without using String as CodingKey.

We measured the durations of all JSONDecoder.decode and JSONEncoder.encode calls and added them together.

We have 80k measurements from different devices: ~40k with the optimized JSONDecoder and JSONEncoder, and ~40k with the standard JSONDecoder and JSONEncoder with duration logging.

quantile	0.1	0.25	0.5	0.75	0.9
standard JSONDecoder	198 ms	282 ms	422 ms	667 ms	1017 ms
optimized JSONDecoder	100 ms	133 ms	200 ms	322 ms	528 ms
Difference	↑49.5%	↑52.8%	↑52.6%	↑51.7%	↑48.1%

And for JSONEncoder:

quantile	0.1	0.25	0.5	0.75	0.9
standard JSONEncoder	59 ms	94 ms	159 ms	289 ms	547 ms
optimized JSONEncoder	14 ms	30 ms	73 ms	135 ms	220 ms
Difference	↑76%	↑68%	↑54%	↑53.2%	↑59.8%

In short, the optimized JSONDecoder is about twice as fast as the standard JSONDecoder, and the optimized JSONEncoder is at least twice as fast as the standard JSONEncoder.
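The Difference rows appear to report the relative reduction in duration. That reading is an assumption, although the quoted values are consistent with it. A small sketch with hypothetical helper names shows the calculation and the corresponding speed-up factor for the median values:

```swift
/// Relative reduction in duration, in percent, rounded to one decimal place.
func reductionPercent(from baseline: Double, to optimized: Double) -> Double {
    let percent = (baseline - optimized) / baseline * 100
    return (percent * 10).rounded() / 10
}

/// Speed-up factor: how many times faster the optimized run is.
func speedup(from baseline: Double, to optimized: Double) -> Double {
    return baseline / optimized
}

// Median (0.5 quantile) durations in milliseconds, taken from the tables above.
print(reductionPercent(from: 422, to: 200), speedup(from: 422, to: 200)) // decoder
print(reductionPercent(from: 159, to: 73), speedup(from: 159, to: 73))   // encoder
```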

My benchmark measurements

I've implemented my own benchmark for JSONDecoder/Encoder: JSONDecoderEncoderBenchmarks (https://github.com/ChrisBenua/JSONDecoderEncoderBenchmarks), illustrating high overhead in JSONDecoder/Encoder and the Codable implementation.

JSONDecoder

In this benchmark I've measured performance in 4 variations:

standard JSONDecoder
standard JSONDecoder + String as CodingKey
optimized JSONDecoder
optimized JSONDecoder + String as CodingKey

quantile	0.25	0.5	0.75
standard JSONDecoder	5.81 s	5.826 s	5.86 s
standard JSONDecoder + String as CodingKey	3.24 s (↑44%)	3.26 s (↑44%)	3.29 s (↑43.9%)
optimized JSONDecoder	2.64 s (↑55%)	2.65 s (↑55%)	2.66 s (↑54.6%)
optimized JSONDecoder + String as CodingKey	0.113 s (↑98%)	0.114 s (↑98%)	0.116 s (↑98%)

JSONEncoder

In this benchmark I've measured performance in 4 variations:

standard JSONEncoder
standard JSONEncoder + String as CodingKey
optimized JSONEncoder
optimized JSONEncoder + String as CodingKey

quantile	0.25	0.5	0.75
standard JSONEncoder	8.06 s	8.08 s	8.12 s
standard JSONEncoder + String as CodingKey	5.49 s (↑32%)	5.52 s (↑32%)	5.55 s (↑32%)
optimized JSONEncoder	2.67 s (↑67%)	2.68 s (↑67%)	2.69 s (↑67%)
optimized JSONEncoder + String as CodingKey	0.148 s (↑98.1%)	0.149 s (↑98.2%)	0.151 s (↑98.1%)

My benchmark illustrates how much the Swift runtime slows down JSONDecoder and JSONEncoder.

Apple Benchmark

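The swift-foundation repository has some JSONDecoder/Encoder benchmarking logic: JSONBenchmark.swift (https://github.com/swiftlang/swift-foundation/blob/4e013668a999a01b9cca29473a2c687e707f23cd/Benchmarks/Benchmarks/JSON/JSONBenchmark.swift#L81).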

Apple Benchmark Flaws

It decodes/encodes the same models 1 billion times without relaunching the app
- This hides all swift_conformsToProtocol overhead, because swift_conformsToProtocol is slow only on the first iteration.
- Small binary size and a small __swift5_proto section

My benchmark

Structure

The FastCoders library contains optimized implementations of JSONDecoder/JSONEncoder
RegularModels contains 10k Codable models with the standard Codable implementation. These 10k Codable models can be semantically split into 2.5k groups of 4.
StringCodingKeyModels contains the same 10k Codable models with a manually implemented Codable conformance that uses String as CodingKey
codable-benchmark-package - target that measures the duration of 2.5k decodings and encodings of RegularModels
codable-benchmark-package-no-coding-keys - target that measures the duration of 2.5k decodings and encodings of StringCodingKeyModels
codable-benchmark-package and codable-benchmark-package-no-coding-keys use the A1_Hierarchy.json file for decoding. Its size is only 319 bytes.

Notes:

To make the size of __swift5_proto in codable-benchmark-package-no-coding-keys match the size of __swift5_proto in codable-benchmark-package, I've generated a CodingKeys enum in each model, but it is not used in encode(to: Encoder) or init(from: Decoder).

Building

Use ./build.sh to build and strip codable-benchmark-package and codable-benchmark-package-no-coding-keys.

Checking __swift5_proto size

To get the number of protocol conformance descriptors in a binary, use this script:

otool -l .build/arm64-apple-macosx/release/codable-benchmark-package | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }' outputs 70320.
otool -l .build/arm64-apple-macosx/release/codable-benchmark-package-no-coding-keys | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }' outputs 70321.
So with respect to swift_conformsToProtocol performance, both binaries are very similar.

Running

codable-benchmark-package and codable-benchmark-package-no-coding-keys have 4 modes:
- decode - measures decoding using the standard JSONDecoder
- decode_new - measures decoding using the optimized JSONDecoder
- encode - measures encoding using the standard JSONEncoder
- encode_new - measures encoding using the optimized JSONEncoder

I've used the run_bench.py script to run each binary in each mode. It measures each binary and each mode 100 times, so it takes a while to run. You can easily adjust the number of repetitions in run_bench.py.

k.benua · August 27, 2025, 11:31am

I've also created an issue in the swift-foundation repository:

github.com/swiftlang/swift-foundation

Boosting JSONDecoder/Encoder performance for large apps

opened 11:21AM - 27 Aug 25 UTC

ChrisBenua

## Table of Contents 1. JSONDecoder/Encoder Performance Problem 2. JSONDecoder… Performance Flaws 3. Proposed Optimizations 4. Optimizations Results 5. Apple Benchmark Overview 6. Apple Benchmark Flaws 7. My Benchmark ## JSONDecoder/Encoder Performance Problem ### Introduction `swift_conformsToProtocolMaybeInstantiateSuperclasses` method is slow, because it traverses all protocol-conformance-descriptors in whole app when gets called first time for pair (class/enum/struct, protocol). EmergeTools have great [article](https://www.emergetools.com/blog/posts/SwiftProtocolConformance) about poor performance of `swift_conformsToProtocolMaybeInstantiateSuperclasses`. Briefly, the more protocol-conformance your app has, the slower is `swift_conformsToProtocolMaybeInstantiateSuperclasses`. Our app has more than 150k of protocol conformances. It can be easily measured using this bash one-liner. `otool -l path/to/your/binary | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }'` We take size of `__swift5_proto` section and divide it by 4 (4-byte integer offsets are stored here). #### When `swift_conformsToProtocol` is called In short, there are 3 ways to trigger this method: * `T.self is SomeProtocol.Type` * `as?/as!/as (in switch statement) SomeProtocol` * Generic-classes with type-generic-constraints * `swift_conformsToProtocol` is triggered because class metadata contains GenericParameterVector. And GenericParameterVector has to contain protocol-witness-tables for each protocol that generic parameter conforms. ### JSONDecoder Performance Flaws #### unwrap function The first place in `JSONDecoder` where `swift_conformsToProtocolMaybeInstantiateSuperclasses` is used is `unwrap` function ```swift func unwrap<T: Decodable>(_ mapValue: JSONMap.Value, as type: T.Type, for codingPathNode: _CodingPathNode, _ additionalKey: (some CodingKey)? = nil) throws -> T { ... 
if T.self is _JSONStringDictionaryDecodableMarker.Type { return try self.unwrapDictionary(from: mapValue, as: type, for: codingPathNode, additionalKey) } ... } ``` #### KeyedDecodingContainer [`KeyedDecodingContainer`](https://developer.apple.com/documentation/swift/keyeddecodingcontainer) has type-generic-constraint: `K: CodingKey`. It is the second place where `swift_conformsToProtocol` gets called. #### JSONDecoder swift_conformsToProtocol Performance Impact ![JSONDecoder](https://global.discourse-cdn.com/swift/optimized/3X/e/9/e91bfcc92932e52df65b822276b5f977dfd5e82a_2_1380x318.jpeg) `swift_conformsToProtocol` consumes at least 84% of all `JSONDecoder.decode` time in our app startup scenario. ### JSONEncoder Performance Flaws #### wrapGeneric function The first place in `JSONEncoder` where `swift_conformsToProtocolMaybeInstantiateSuperclasses` is used is `wrapGeneric` function ```swift func wrapGeneric<T: Encodable>(_ value: T, for additionalKey: (some CodingKey)? = _CodingKey?.none) throws -> JSONEncoderValue? { ... else if let encodable = value as? _JSONStringDictionaryEncodableMarker { return try self.wrap(encodable as! [String:Encodable], for: additionalKey) } else if let array = value as? _JSONDirectArrayEncodable { ... } ... } ``` #### KeyedEncodingContainer [`KeyedEncodingContainer`](https://developer.apple.com/documentation/swift/keyedencodingcontainer) has type-generic-constraint: `K: CodingKey`. It is the second place where `swift_conformsToProtocol` gets called. #### JSONEncoder swift_conformsToProtocol Performance Impact ![JSONEncoder](https://global.discourse-cdn.com/swift/original/3X/3/7/3734b798980365837ac4052292cc36e233207d74.jpeg). `swift_conformsToProtocol` consumes at least 84% of all `JSONEncoder.encode` time in out app startup scenario. 
## Proposed Optimizations Firstly ABI/API break-free optimizations will be covered: ### №1 JSONDecoder unwrap optimization `_JSONStringDictionaryDecodableMarker` is used to make `String`-keyed Dictionaries exempt from key conversion. So if there is no key-conversion we can skip this slow check: ```swift switch options.keyDecodingStrategy { case .useDefaultKeys: break case .convertFromSnakeCase, .custom: if T.self is _JSONStringDictionaryDecodableMarker.Type { return try unwrapDictionary(...) } } return try self.with(value: mapValue, path: codingPathNode.appending(additionalKey)) { try type.init(from: self) } ``` instead of ```swift if T.self is _JSONStringDictionaryDecodableMarker.Type { return try self.unwrapDictionary(from: mapValue, as: type, for: codingPathNode, additionalKey) } return try self.with(value: mapValue, path: codingPathNode.appending(additionalKey)) { try type.init(from: self) } ``` So this optimization is suitable only for `.useDefaultKeys` strategy. ### №2 JSONEncoder wrapGeneric optimization There are two ways to attempt optimization of this function. * If we believe that `as? _JSONDirectArrayEncodable` deals more benefit than harm to performance (at least in our app and in this benchmark it does more harm), then we will optimize only `_JSONStringDictionaryEncodableMarker` check the same way we did it for `JSONDecoder` and `_JSONStringDictionaryDecodableMarker` * If not it's better to remove `as? _JSONDirectArrayEncodable` check at all Here is `_JSONStringDictionaryEncodableMarker` check optimization: ```swift switch options.keyEncodingStrategy { case .useDefaultKeys: break case .convertToSnakeCase, .custom: if let encodable = value as? _JSONStringDictionaryEncodableMarker { return try wrap(encodable as! [String: Encodable], for: additionalKey) } } ``` So this optimization is suitable only for `.useDefaultKeys` strategy. Optimization №1 and №2 are implemented in `FastCoders` library. 
### №3 Possibly ABI/API breaking optimizations So here we will try to solve performance issue with `KeyedDecodingContainer` and `KeyedEncodingContainer` type-generic-constraints. The problem is not about calling `KeyedDecodingContainer` or `KeyedEncodingContainer` init, it is about referencing type with specified generic-type: For example, take this code: ```swift import Foundation struct A: Codable { let a: Int } ``` Its `init(from: Decoder) throws` method SIL has line like ```SIL %5 = alloc_stack [lexical] [var_decl] $KeyedDecodingContainer<A.CodingKeys>, scope 22 ``` And its IR is: ```IR %4 = call ptr @__swift_instantiateConcreteTypeFromMangledName(ptr @"demangling cache variable for type metadata for Swift.KeyedDecodingContainer<output.A.(CodingKeys in _60494E8B9C642A7C4A26F3A3B6CECEB9)>") #2, !dbg !194 ``` Internally `__swift_instantiateConcreteTypeFromMangledName` triggers `swift_conformsToProtocol` in this scenario. So we mention type `KeyedDecodingContainer` with specific type `A.CodingKeys`. `func encode(to: Encoder) throws` has the same flaw. There are two possible ways to tackle them: * Change `KeyedDecodingContainer` and `KeyedEncodingContainer` type signature to avoid type generic constraints (wasn't implemented in this repository) * Use the same `CodingKey` in `Codable/Decodable/Encodable` conformance auto-generated code. For example, `String`. #### №3.1 Changing type signature So the trick is to get rid of `K: CodingKey` type-generic-constraint in type-declaration and move it to `extension`. So there will be no need for GenericParameterVector to contain protocol-witness-table and there will be no `swift_conformsToProtocol` call when generic-type is mentioned or instantiated. Before: ```swift public struct KeyedDecodingContainer<K: CodingKey> : KeyedDecodingContainerProtocol { public typealias Key = K /// The container for the concrete decoder. internal var _box: _KeyedDecodingContainerBase /// Creates a new instance with the given container. 
/// /// - parameter container: The container to hold. public init<Container: KeyedDecodingContainerProtocol>( _ container: Container ) where Container.Key == Key { _box = _KeyedDecodingContainerBox(container) } /// The path of coding keys taken to get to this point in decoding. public var codingPath: [any CodingKey] { return _box.codingPath } // continue to conform to KeyedDecodingContainerProtocol protocol ... } ``` After: ```swift public struct KeyedDecodingContainer<K> { /// The container for the concrete decoder. internal var _box: _KeyedDecodingContainerBase /// Creates a new instance with the given container. /// /// - parameter container: The container to hold. public init<Container: KeyedDecodingContainerProtocol>( _ container: Container ) where Container.Key == Key { _box = _KeyedDecodingContainerBox(container) } } extension KeyedDecodingContainer: KeyedDecodingContainerProtocol where K: CodingKey { public typealias Key = K /// The path of coding keys taken to get to this point in decoding. public var codingPath: [any CodingKey] { return _box.codingPath } // continue to conform to KeyedDecodingContainerProtocol protocol ... } ``` Same trick can be applied to `KeyedEncodingContainer`. 
Note: despite `_KeyedDecodingContainerBox` has type-generic-constraint it seems like we can avoid rewriting code to avoid it because of the way it gets called: ```swift public init<Container: KeyedDecodingContainerProtocol>( _ container: Container ) where Container.Key == Key { _box = _KeyedDecodingContainerBox(container) } ``` In this scenario, in IR-code there is reference to protocol-witness-table of `Container` implementing `KeyedDecodingContainerProtocol`: ```ir define protected swiftcc ptr @"output.KeyedDecodingContainerV2.init<A where A == A1.Key, A1: Swift.KeyedDecodingContainerProtocol>(A1) -> output.KeyedDecodingContainerV2<A>"(ptr noalias %0, ptr %K, ptr %Container, ptr %Container.KeyedDecodingContainerProtocol) #0 !dbg !84 ``` and there is no `__swift_instantiateConcreteTypeFromMangledName` call. #### №3.2 Use String as CodingKey Why this would be faster: * `swift_conformsToProtocol` works slowly only when it gets called for the first time for each (class/enum/struct, protocol) pair. * So if we will use `String` as `CodingKey`, `swift_conformsToProtocol` will be called with the same types: `String` and `CodingKey` * And only first call will be slow. All subsequent calls are going to be much-much faster, because `ConcurrentReadableHashMap` is used for caching in `swift_conformsToProtocol`. ##### How String can conform CodingKey ```swift extension String: CodingKey { public init?(stringValue: String) { self = stringValue } public init?(intValue: Int) { nil } public var intValue: Int? { nil } public var stringValue: String { self } } ``` ##### How this can be implemented We can introduce experimental flag. When flag is enabled, we don't auto-generate `enum CodingKeys` for our `struct/enum` and use raw `String` as `CodingKeys` in `init(from: Decoder) throws` and `encode(to: Encoder) throws`. ##### Additional advantages Each auto-generated `enum CodingKeys` adds 5 protocol-conformance-descriptors. 
[godbolt](https://godbolt.org/z/z3rE1b5xs): * `CodingKey` * `Hashable` * `Equatable` * `CustomDebugStringConvertible` * `CustomStringConvertible` Also, it each `CodingKey` adds around 1.8 kb to app size (measured on the same 10k `Codable` structures): * codable-benchmark-package-no-coding-keys - where `String` is used as `CodingKey` but there are `CodingKeys` to match `__swift5_proto` section size * 49 mb * codable-benchmark-package-no-coding-keys-measure-size - where `String` is used as `CodingKey` and there are no `CodingKeys` * 31.1 mb * So each `CodingKey` adds around 1.8 kb to application binary size. So if shared `CodingKey` is implemented we could: * Optimize application size * Optimize overall application performance due to boosting `swift_conformsToProtocol` method by `__swift5_proto` section size reduction. * codable-benchmark-package-no-coding-keys has 70321 protocol conformance descriptos * codable-benchmark-package-no-coding-keys-measure-size has only 20321 protocol conformance descriptos ### Optimizations results #### Measurements in our app In our app we applied only `JSONDecoder.unwrap` and `JSONEncoder.wrapGeneric` optimizations without using `String` as `CodingKeys`. We've measured all `JSONDecoder.decode` and `JSONEncoder.encode` durations and added them together. We have 80k measurements from different devices. \~40k with optimized `JSONDecoder` and `JSONEncoder` and \~40k with standard `JSONDecoder` and `JSONEncoder` with duration logging. 
| quantile | 0.1 | 0.25 | 0.5 | 0.75 | 0.9 | |----|----|----|----|----|----| | standard JSONDecoder | 198 ms | 282 ms | 422 ms | 667 ms | 1017 ms | | optimized JSONDecoder | 100 ms | 133 ms | 200 ms | 322 ms | 528 ms | | Difference | ↑49.5% | ↑52.8% | ↑52.6% | ↑51.7% | ↑48.1% | And for `JSONEncoder`: | quantile | 0.1 | 0.25 | 0.5 | 0.75 | 0.9 | |----|----|----|----|----|----| | standard JSONEncoder | 59 ms | 94 ms | 159 ms | 289 ms | 547 ms | | optimized JSONEncoder | 14 ms | 30 ms | 73 ms | 135 ms | 220 ms | | Difference | ↑76% | ↑68% | ↑54% | ↑53.2% | ↑59.8% | Briefly, new `JSONDecoder` became as twice as fast as standard `JSONDecoder` and `JSONEncoder` is at least twice as fast as standard `JSONEncoder`. #### My benchmark measurements I've implemented my own benchmark for JSONDecoder/Encoder: https://github.com/ChrisBenua/JSONDecoderEncoderBenchmarks?tab=readme-ov-file#proposed-optimizations ##### JSONDecoder In this benchmark I've measured performance in 4 variations: * standard `JSONDecoder` * standard `JSONDecoder` + `String` as `CodingKey` * optimized `JSONDecoder` * optimized `JSONDecoder` + `String` as `CodingKey` | quantile | 0.25 | 0.5 | 0.75 | |----|----|----|----| | standard JSONDecoder | 5.81 s | 5.826 s | 5.86 s | | standard JSONDecoder + String as CodingKey | 3.24 s (↑44%) | 3.26 s (↑44%) | 3.29 s (↑43.9%) | | optimized JSONDecoder | 2.64 s (↑55%) | 2.65 s (↑55%) | 2.66 s (↑54.6%) | | optimized JSONDecoder + String as CodingKey | 0.113 s (↑98%) | 0.114 s (↑98%) | 0.116 s (↑98%) | ##### JSONEncoder In this benchmark I've measured performance in 4 variations: * standard `JSONEncoder` * standard `JSONEncoder` + `String` as `CodingKey` * optimized `JSONEncoder` * optimized `JSONEncoder` + `String` as `CodingKey` | quantile | 0.25 | 0.5 | 0.75 | |----|----|----|----| | standard JSONEncoder | 8.06 s | 8.08 s | 8.12 s | | standard JSONEncoder + String as CodingKey | 5.49 s (↑32%) | 5.52 s (↑32%) | 5.55 s (↑32%) | | optimized JSONEncoder | 2.67 s (↑67%) | 
2.68 s (↑67%) | 2.69 s (↑67%) | | optimized JSONEncoder + String as CodingKey | 0.148 s (↑98.1%) | 0.149 s (↑98.2%) | 0.151 s (↑98.1%) | My benchmark illustrates how big Swift Runtime slows down `JSONDecoder` and `JSONEncoder`. ## Apple Benchmark Swift-foundation repository has some JSONDecoder/Encoder benchmarking logic: [JSONBenchmark.swift](https://github.com/swiftlang/swift-foundation/blob/4e013668a999a01b9cca29473a2c687e707f23cd/Benchmarks/Benchmarks/JSON/JSONBenchmark.swift#L81). ### Apple Benchmark Flaws * It decodes/encode the same models for 1 bln times without relaunching app * This way all `swift_conformsToProtocol` overhead is disguised, because `swift_conformsToProtocol` is slow only on first iteration. * Small binary size and small `__swift5_proto` section ## My benchmark ### Structure * Library `FastCoders` contains optimized realizations of `JSONDecoder`/`JSONEncoder` * `RegularModels` contains 10k Codable models with standard Codable implementation. These 10k Codable models can be semantically splitted to 2.5k groups of 4. * `StringCodingKeyModels` contains same 10k Codable models with manually implemented `Codable` with `String` as `CodingKey` * `codable-benchmark-package` - target where 2.5k decodings and encodings of `RegularModels` duration is measured * `codable-benchmark-package-no-coding-keys` - target where 2.5k decodings and encodings of `StringCodingKeyModels` duration is measured. * `codable-benchmark-package` and `codable-benchmark-package-no-coding-keys` use `A1_Hierarchy.json` file for decoding. Its size is only 319 bytes. Notes: * To match size of `__swift5_proto` in `codable-benchmark-package-no-coding-keys` match size of `__swift5_proto` in `codable-benchmark-package` I've generated CodingKeys enum in each class but it is not used in `encode(to: Encoder)` or `decode(from: Decoder)`. ### Building Use `./build.sh` for building and stripping `codable-benchmark-package` and `codable-benchmark-package-no-coding-key`. 
### Checking \__swift5_proto size

To get the number of protocol-conformance-descriptors in a binary, use this script:

* `otool -l .build/arm64-apple-macosx/release/codable-benchmark-package | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }'` outputs 70320.
* `otool -l .build/arm64-apple-macosx/release/codable-benchmark-package-no-coding-keys | grep '__swift5_proto$' -A 5 | grep 'size' | awk '/size/ { hex = $2; sub("0x", "", hex); print int("0x" hex)/4 + 0 }'` outputs 70321.
* So with respect to `swift_conformsToProtocol` performance, the two binaries are practically identical.

### Running

* `codable-benchmark-package` and `codable-benchmark-package-no-coding-keys` have 4 modes:
  * `decode` - measures decoding using the standard `JSONDecoder`
  * `decode_new` - measures decoding using the optimized `JSONDecoder`
  * `encode` - measures encoding using the standard `JSONEncoder`
  * `encode_new` - measures encoding using the optimized `JSONEncoder`

I've used the `run_bench.py` script to run each binary in each mode. It measures each binary and each mode 100 times, so it takes a while to run. You can easily adjust the number of repetitions in `run_bench.py`.
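
The key property of this setup is that every measurement starts a fresh process, so the first-call cost of `swift_conformsToProtocol` is paid each time. A minimal Swift sketch of such a runner is shown below. It is not the `run_bench.py` script itself: the helper names (`quantile`, `improvementPercent`, `timeLaunch`, `benchmark`) are hypothetical, and it assumes the mode is passed as the single command-line argument and that wall-clock time of the whole process is an acceptable measure.

```swift
import Foundation

/// Nearest-rank quantile of a non-empty sample, with q in 0...1.
func quantile(_ samples: [Double], _ q: Double) -> Double {
    precondition(!samples.isEmpty && (0...1).contains(q))
    let sorted = samples.sorted()
    let rank = Int((q * Double(sorted.count)).rounded(.up))
    return sorted[max(rank, 1) - 1]
}

/// Relative improvement in percent, computed as (baseline - optimized) / baseline.
func improvementPercent(baseline: Double, optimized: Double) -> Double {
    (baseline - optimized) / baseline * 100
}

/// Launches `binary` in a fresh process with one mode argument (assumption)
/// and returns the wall-clock duration in seconds.
func timeLaunch(binary: URL, mode: String) throws -> Double {
    let process = Process()
    process.executableURL = binary
    process.arguments = [mode]
    let start = Date()
    try process.run()
    process.waitUntilExit()
    return Date().timeIntervalSince(start)
}

/// Repeats the launch `runs` times so that no run benefits from
/// conformance lookups cached by a previous run.
func benchmark(binary: URL, mode: String, runs: Int) throws -> [Double] {
    try (0..<runs).map { _ in try timeLaunch(binary: binary, mode: mode) }
}
```

The "↑" columns in the tables above follow the same formula as `improvementPercent`: for example, 198 ms versus 100 ms gives roughly 49.5%.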

kperryua · August 27, 2025, 6:42pm

Hello!

Thank you for these amazing contributions. I’ll definitely respond to your PR as I’m always eager to see additional performance optimizations in this space.

I think optimizations #1 and #2 will be pretty straightforward to take. I’m a little reticent about both #3.1 and #3.2 however. At this stage, I don’t think there’s any way to reconcile the ABI compatibility break caused by #3.1.

RE: #3.2, I’m also quite sympathetic to the hidden (writable!) DATA region cost incurred by the 5 protocol conformances created for each CodingKey type. I would hope that the language could be more conservative with that resource for something so common. However, A) adding a public conformance of CodingKey to String could conflict with other such conformances in existing code (despite the encouragement NOT to do so), and B) for structural types, this defeats one of the main purposes of the CodingKey to begin with, which is making it harder to write incorrect code through the use of strong typing.

As I’m sure you know, the compiler allows you to do piecemeal usage of the synthesized Codable implementation. In particular, you can write custom init(from:) and/or encode(to:) implementations that reference the synthesized CodingKeys type. If String is used as the CodingKey, then the compiler will accept any String value in custom implementations, opening the door for potential mistakes. Perhaps this approach could be used if the compiler detects that it is synthesizing the entire implementation, in which case the strong typing is irrelevant. I’d also be more amenable if there was some way to opt-in per Codable instead of per-module as your flag approach implies. Or maybe there’s some way we could convince the compiler to treat a synthesized CodingKey, String-RawRepresentable type as completely transparent, where the type essentially disappears entirely at runtime and Strings are used directly instead (hand-waving quite a bit here).
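
To illustrate the typing concern, here is a small sketch. `Profile`, `LooseProfile`, and `StringKey` are hypothetical names, and the `CodingKeys` enum is written out by hand to mirror what the compiler would synthesize. With the enum, a misspelled key is a compile error; with a string-backed key, the same typo compiles and only fails at decode time.

```swift
import Foundation

struct Profile: Decodable {
    let name: String

    // Hand-written here for clarity; mirrors a synthesized CodingKeys enum.
    enum CodingKeys: String, CodingKey { case name }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Writing `.nmae` here would be rejected by the compiler.
        name = try container.decode(String.self, forKey: .name)
    }
}

// Hypothetical string-backed key type, used only for this illustration.
struct StringKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }
    init(_ key: String) { stringValue = key }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

struct LooseProfile: Decodable {
    let name: String

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: StringKey.self)
        // The typo "nmae" compiles; decoding throws DecodingError.keyNotFound.
        name = try container.decode(String.self, forKey: StringKey("nmae"))
    }
}
```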

Jon_Shier · August 27, 2025, 6:49pm

You could also look at ZippyJSON (michaeleisel/ZippyJSON on GitHub, "A much faster version of JSONDecoder") to see if there are any other optimizations you can take, or even if you want to fork it and apply all your enhancements directly. Modern JSONDecoder is about as fast, but I wonder if there's a way to combine the newer Decoder implementation of JSONDecoder with the SIMD string parsing of ZippyJSONDecoder for an even faster implementation. No encoder support though.

k.benua · August 27, 2025, 7:16pm

Hello, Kevin!

Thank you for your response! I'm glad to hear that my small performance research is valuable!

Sure, #3.1 and #3.2 are quite risky changes. You're absolutely right about adding CodingKey conformance to String! Maybe we can introduce some struct like AnyCodingKey?

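// A single shared, String-backed key type instead of one enum per model.
struct AnyCodingKey: CodingKey {
    let stringValue: String
    // Integer keys are not supported by this key type.
    var intValue: Int? { nil }

    init?(stringValue: String) {
        self.stringValue = stringValue
    }

    init?(intValue: Int) {
        // Always fails: there is no integer representation.
        return nil
    }
}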

Thanks for pointing out the case where only part of the Codable conformance is generated automatically. In that case we should undoubtedly stick with enum-like CodingKeys. But if the whole implementation is autogenerated by the compiler, we can use AnyCodingKey or String.
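
For the fully hand-written case, a shared key like the AnyCodingKey above might be used as sketched here. The `User` model and the non-failable `init(_:)` convenience initializer are illustrative additions, not part of any existing API.

```swift
import Foundation

struct AnyCodingKey: CodingKey {
    let stringValue: String
    var intValue: Int? { nil }

    // Convenience initializer (an addition for this sketch) to avoid force-unwrapping.
    init(_ key: String) { stringValue = key }
    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
}

struct User: Codable, Equatable {
    var name: String
    var age: Int

    init(name: String, age: Int) {
        self.name = name
        self.age = age
    }

    // Both directions use the one shared key type, so no per-model
    // CodingKeys enum is referenced by the coding logic.
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: AnyCodingKey.self)
        name = try container.decode(String.self, forKey: AnyCodingKey("name"))
        age = try container.decode(Int.self, forKey: AnyCodingKey("age"))
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: AnyCodingKey.self)
        try container.encode(name, forKey: AnyCodingKey("name"))
        try container.encode(age, forKey: AnyCodingKey("age"))
    }
}
```

The trade-off is the one discussed above: key names become plain strings, so the compiler no longer catches a misspelled key.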

I see, there are lots of obstacles to implementing the third optimization. Maybe I should write a Swift Macro for this, if there is no other way?

kperryua · August 27, 2025, 7:27pm

If we were to go with this kind of approach, yeah, I’d absolutely suggest a shared String-backed CodingKey like this.

It’s a possible approach if the entire implementation is autogenerated, potentially even without a flag, but…

… ultimately I hesitate to make any large changes to the compiler synthesis side of Codable. Exciting new things in this space are coming down the pike that specifically target the issues identified and tackled here (and more!), which will diminish the value of big changes in present-day Codable.

k.benua · August 27, 2025, 7:52pm

Hello, Jon!

Thanks for mentioning ZippyJSONDecoder!

I've taken a quick look into ZippyJSONDecoder, and here is what I've found:

ZippyJSONDecoder is certainly faster at parsing, but it still suffers from the same Swift Runtime overhead when using casts and type-generic-constraints.

There is a check for protocol conformance

And it also uses KeyedDecodingContainer (because we must use it when we subclass JSONDecoder)

So in terms of Swift Runtime overhead, the standard JSONDecoder and ZippyJSONDecoder are pretty similar.

If @michaeleisel still supports this repo, I can slightly improve ZippyJSONDecoder performance

nocchijiang · August 29, 2025, 5:43am

Are you aware that dyld has been generating a Swift type/protocol conformance cache for apps starting from iOS 16 (and equivalent versions on other Apple platforms), which greatly reduces the cost of conformsToProtocol calls?

k.benua · August 29, 2025, 10:03am

Sure, I've read the EmergeTools article about that: "How iOS 16 makes your app launch faster" (Emerge Tools Blog).

But my team and I conducted an A/B test in production. 98% of our users already use iOS 16 or newer. And we still get a massive improvement in JSONDecoder/JSONEncoder speed.

Here are my thoughts on why this optimization does not work here.

Let's see how _JSONStringDictionaryDecodableMarker is declared:

private protocol _JSONStringDictionaryDecodableMarker {
    static var elementType: Decodable.Type { get }
}

extension Dictionary : _JSONStringDictionaryDecodableMarker where Key == String, Value: Decodable {
    static var elementType: Decodable.Type { return Value.self }
}

It adds exactly one protocol conformance descriptor to the binary. So when we check whether [String: T] where T: Decodable conforms to _JSONStringDictionaryDecodableMarker, we are trying to find the protocol conformance descriptor of [String: T] to _JSONStringDictionaryDecodableMarker, but the dyld cache contains the protocol conformance descriptor of [String: Decodable] conforming to _JSONStringDictionaryDecodableMarker. And this check fails (see dyld/dyld/DyldAPIs.cpp at c8a445f88f9fc1713db34674e79b00e30723e79d in apple-oss-distributions/dyld on GitHub).

I could be wrong, but I guess that we don't use the dyld cache in this case at all.
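
A standalone sketch of this kind of check is shown below. `StringDictionaryMarker` and `isStringKeyedDecodableDictionary` are stand-in names for illustration; the real marker protocol is private to Foundation. The point is that the protocol and its single conditional conformance are declared once, while the runtime check is performed per concrete type, such as [String: Int] or [String: [String]].

```swift
// Stand-in for Foundation's private marker protocol (hypothetical name).
protocol StringDictionaryMarker {
    static var elementType: Decodable.Type { get }
}

// One conditional conformance covers every [String: Value] with a Decodable Value.
extension Dictionary: StringDictionaryMarker where Key == String, Value: Decodable {
    static var elementType: Decodable.Type { Value.self }
}

// The same shape of check as in `unwrap`: a dynamic metatype cast,
// which the runtime resolves through a protocol conformance lookup.
func isStringKeyedDecodableDictionary<T>(_ type: T.Type) -> Bool {
    T.self is StringDictionaryMarker.Type
}
```

Each distinct `T` passed to the function is a separate (type, protocol) pair from the runtime's point of view, including the types for which the answer is "does not conform".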

Jon_Shier · September 5, 2025, 3:26pm

Just saw another new JSONDecoder announced: ReerJSON (reers/ReerJSON on GitHub, "A faster version of JSONDecoder based on yyjson"), which seems to be even faster than ZippyJSON (which is still faster than Foundation).

k.benua · September 5, 2025, 6:08pm

And this implementation suffers from the same problem as Foundation.JSONDecoder and ZippyJSONDecoder.

There is a similar check that triggers swift_conformsToProtocol (see ReerJSON/Sources/ReerJSON/JSONDecoderImpl.swift at main in reers/ReerJSON on GitHub).

I'll create another PR here.

Thanks for spotting new JSONDecoder implementation!

michaeleisel · September 8, 2025, 9:28pm

FYI, I’ve made an experimental library designed to speed up protocol conformance checking when an object does not conform to a protocol: FastCast (michaeleisel/FastCast on GitHub, "Fast protocol casting and conformance checking for Swift").

It does this by caching the result of protocol conformance checks (whether the check returned true or not) from previous runs of the same app version.
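
The general idea can be sketched as follows. This is not FastCast's actual implementation or API: `ConformanceCache`, its string keys, and its text serialization are assumptions made purely for illustration of caching both positive and negative results and restoring them in a later run.

```swift
protocol Marker {}
extension Int: Marker {}

// Generic helper so the check is performed dynamically at runtime.
func isMarker<T>(_ type: T.Type) -> Bool { T.self is Marker.Type }

struct ConformanceCache {
    private(set) var results: [String: Bool] = [:]
    private(set) var slowChecks = 0

    init() {}

    /// Restores results saved by `serialized()` in a previous run.
    init(restoring text: String) {
        for line in text.split(separator: "\n") {
            let parts = line.split(separator: "=")
            guard parts.count == 2 else { continue }
            results[String(parts[0])] = parts[1] == "1"
        }
    }

    /// Returns the cached answer for `key`, running `slowCheck` only on a miss.
    /// Negative answers are cached as well as positive ones.
    mutating func conforms(key: String, slowCheck: () -> Bool) -> Bool {
        if let cached = results[key] { return cached }
        slowChecks += 1
        let result = slowCheck()
        results[key] = result
        return result
    }

    /// One "key=0/1" line per entry; a caller would write this to disk,
    /// tagged with the app version, and discard it when the version changes.
    func serialized() -> String {
        results.map { "\($0.key)=\($0.value ? 1 : 0)" }.sorted().joined(separator: "\n")
    }
}
```

A real implementation also needs a cheap, stable way to derive the key for a type, which this sketch leaves to the caller.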

For instance, when it is used for the _JSONStringDictionaryDecodableMarker conformance check, which @k.benua sped up for the case where the keys are not being modified, we can see a significant speedup in his benchmarking library for the case where the keys are being modified.