Advice needed on how to design a public protocol

I have a library for encoding and decoding types in a private binary format. The library is very simple and is meant to be fast.
Focusing only on the encoding (the issue is the same for decoding), it follows the classic scheme in which a type adopts the protocol:

public protocol BEncodable {
	func encode(to encoder: inout some BEncoder) throws
}

and receives a BEncoder instance:

public protocol BEncoder {
	var userVersion: UInt32 { get }
	var userData: Any? { get }
	mutating func encode<Value> (_ value: Value ) throws where Value : BEncodable
}

to encode its fields, like in this example from my package:

extension CGSize : BEncodable {
	public func encode(to encoder: inout some BEncoder) throws {
		try encoder.encode( width )
		try encoder.encode( height )
	}
}

The type that implements the BEncoder protocol is a struct called BinaryIOEncoder (a class makes the code slower by 10-20%, but this is not the subject of the question) and contains the functions necessary to encode the "primitive" types, which in this case are Bool, all integer types, Float, Double, String, Data.

And so a type adopting BEncodable encodes its fields, which in turn encode their fields, and so on, until a primitive type is reached, say UInt16:

 extension UInt16: BEncodable {
	 public func encode(to encoder: inout some BEncoder) throws {
		/* ... */
	 }
 }

The encode(...) method of UInt16 therefore needs to call the primitive function in the BinaryIOEncoder struct which adopts the BEncoder protocol and effectively writes the value in the data buffer.

So I have two possibilities.
The first is to include in the BEncoder protocol the BinaryIOEncoder methods for encoding primitive types, which, once they become part of a public protocol, also become public:

public protocol BEncoder {
	var userVersion: UInt32 { get }
	var userData: Any? { get }
	mutating func encode<Value> (_ value: Value ) throws where Value : BEncodable
  
  // Everything that follows should be an internal detail:
  mutating func encodeBool( _ value:Bool ) throws
	
	mutating func encodeUInt8( _ value:UInt8 ) throws
	mutating func encodeUInt16( _ value:UInt16 ) throws
	mutating func encodeUInt32( _ value:UInt32 ) throws
	mutating func encodeUInt64( _ value:UInt64 ) throws
	mutating func encodeUInt( _ value:UInt ) throws

	mutating func encodeInt8( _ value:Int8 ) throws
	mutating func encodeInt16( _ value:Int16 ) throws
	mutating func encodeInt32( _ value:Int32 ) throws
	mutating func encodeInt64( _ value:Int64 ) throws
	mutating func encodeInt( _ value:Int ) throws

	mutating func encodeFloat( _ value:Float ) throws
	mutating func encodeDouble( _ value:Double ) throws
	
	mutating func encodeString<T>( _ value:T ) throws
	where T:StringProtocol
	
	mutating func encodeData<T>( _ value:T ) throws
	where T:MutableDataProtocol

}

and then:

 extension UInt16: BEncodable {
	 public func encode(to encoder: inout some BEncoder) throws {
		try encoder.encodeUInt16( self )
	 }
 }

But I don't really want to make all these methods public, since they are an implementation detail. So the second possibility is to leave the BEncoder protocol unchanged and write this sort of horrendous "unsafe cast" for each primitive type:

extension UInt16: BEncodable {
	public func encode(to encoder: inout some BEncoder) throws {
		try withUnsafeMutablePointer(to: &encoder) {
			try $0.withMemoryRebound(to: BinaryIOEncoder.self, capacity: 1) {
				try $0.pointee.encodeUInt16( self )
			}
		}
	}
}

I don't even know how reliable it is.
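
A minimal sketch of how the rebinding could at least be guarded, assuming the only concrete encoder is the library's own. `Sink`, `ByteSink`, `put` and `write` are hypothetical stand-ins for BEncoder, BinaryIOEncoder and its primitive methods, not names from the package. The idea is that rebinding memory to the type it already has should be harmless, so checking the generic parameter against the concrete type first turns a silent misuse into an immediate trap:

```swift
// Hypothetical stand-ins for BEncoder and BinaryIOEncoder.
protocol Sink {}

struct ByteSink: Sink {
    private(set) var bytes: [UInt8] = []
    mutating func put(_ byte: UInt8) { bytes.append(byte) }
}

/// Reaches a method of the concrete type that the protocol does not expose.
func write<S: Sink>(_ byte: UInt8, to sink: inout S) {
    // The rebinding below is only meaningful when S really is ByteSink,
    // so trap early for any other conforming type.
    precondition(S.self == ByteSink.self, "unsupported Sink implementation")
    withUnsafeMutablePointer(to: &sink) { pointer in
        pointer.withMemoryRebound(to: ByteSink.self, capacity: 1) { concrete in
            concrete.pointee.put(byte)
        }
    }
}

var sink = ByteSink()
write(0x2A, to: &sink)
print(sink.bytes)   // prints [42]
```

This does not remove the unsafe code; it only makes the assumption behind it explicit and checked at run time.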

Am I missing a third, fourth, …, nth possibility? How can I design this public interface and work around the fact that I can't declare internal methods in a public protocol?

Thank you in advance for any suggestion.


P.S. I know I can get rid of BEncoder and pass BinaryIOEncoder directly as the encode(to:...) parameter:

public protocol BEncodable {
	func encode(to encoder: inout BinaryIOEncoder) throws
}

I can then set the visibility however I want.

This was my first design but there are reasons why I would like to avoid doing it.


Here's one idea (note: I haven't actually tried this):

  1. Declare your primitive encoding methods in the protocol, but give them a name that begins with an underscore, e.g. _encodeUInt16. The methods are still public, but by convention, clients are supposed to treat underscored APIs as implementation details. Many IDEs (e.g. Xcode) don't show these APIs in code completion, and I believe DocC doesn't show them in the generated documentation (for better or for worse).

  2. Give these methods an additional parameter for a value of a type that only you can construct. This prevents external clients from calling these functions.

Example:

public struct PreventExternalCallers {
   internal init() {}
}

public protocol BEncoder {
    // Public API
    …

    // "Internal" API
    mutating func encodeUInt16(_ value: UInt16, preventExternalCallers: PreventExternalCallers) throws
}

Not sure if this is what you had in mind. If you're also concerned about clients implementing your "private" methods (as part of implementing the protocol, if that's something you expect clients to do), you could also give the methods a publicly unconstructable return type.
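
To make the token idea concrete, here is a single-file sketch. All names (`InternalToken`, `TokenEncoder`, `TokenEncodable`, `ByteEncoder`) and the little-endian byte order are assumptions made for illustration. In a real package the `internal init` only does its job across a module boundary, which a single file cannot demonstrate:

```swift
// Hypothetical names throughout.
public struct InternalToken {
    internal init() {}   // only code inside the defining module can create one
}

public protocol TokenEncoder {
    mutating func encode<Value: TokenEncodable>(_ value: Value) throws
    // Public by necessity, but uncallable without an InternalToken.
    mutating func _encodeUInt16(_ value: UInt16, token: InternalToken) throws
}

public protocol TokenEncodable {
    func encode(to encoder: inout some TokenEncoder) throws
}

public struct ByteEncoder: TokenEncoder {
    public private(set) var bytes: [UInt8] = []
    public init() {}

    public mutating func encode<Value: TokenEncodable>(_ value: Value) throws {
        try value.encode(to: &self)
    }

    public mutating func _encodeUInt16(_ value: UInt16, token: InternalToken) throws {
        bytes.append(UInt8(truncatingIfNeeded: value))        // low byte first
        bytes.append(UInt8(truncatingIfNeeded: value >> 8))   // then high byte
    }
}

extension UInt16: TokenEncodable {
    public func encode(to encoder: inout some TokenEncoder) throws {
        // Only first-party code can build the token needed for this call.
        try encoder._encodeUInt16(self, token: InternalToken())
    }
}

var encoder = ByteEncoder()
try encoder.encode(UInt16(0x0102))
print(encoder.bytes)   // prints [2, 1]
```

The call stays statically dispatched through the generic parameter, so this variant should not add the existential overhead of a type-erased wrapper.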

As far as I can tell, Apple does something similar in SwiftUI. The full definition of the View protocol looks like this (slightly simplified):

public protocol View {
  static func _makeView(view: _GraphValue<Self>, inputs: _ViewInputs) -> SwiftUI._ViewOutputs
  static func _makeViewList(view: _GraphValue<Self>, inputs: _ViewListInputs) -> _ViewListOutputs
  static func _viewListCount(inputs: _ViewListCountInputs) -> Int?

  associatedtype Body : SwiftUI.View
  @ViewBuilder @MainActor(unsafe) var body: Self.Body { get }
}

But only the last two lines are the actual public interface. The other methods all take arguments of public types that have no public initializers, so external clients can't call them.


When you say "the type" does that imply that BEncoder is not intended to be conformed to by other types? If so, why is it a protocol at all? Or, if other types are meant to conform to BEncoder, how could MyBEncoder provide an implementation that functions properly without BEncoder exposing an interface that somehow communicates the fact that the Int/Float/String/Data types are to be treated as primitive?


Leaving aside whether you actually need the protocol, here’s a way to enforce things based on access control, though it will do some dynamic dispatch:

public protocol BEncoder {
  mutating func _withPrimitiveEncoder(_ body: (inout OpaquePrimitiveEncoder) throws -> Void) rethrows
}

public struct OpaquePrimitiveEncoder {
  internal var rawValue: any PrimitiveEncoder
}

internal protocol PrimitiveEncoder { … }

Now implementations within your module can use try encoder._withPrimitiveEncoder { try $0.rawValue.encodeBool(…) }; those outside can access _withPrimitiveEncoder, but it does them no good.

P.S. I’m still quietly in favor of a proposal to add access control to individual protocol requirements, but that’ll need a bunch of working out.


Can there be different implementations of the BEncoder protocol?

If yes - then primitive types should be able to encode themselves not only using BinaryIOEncoder but using other implementations as well. So they need some interface to communicate with the encoder which abstracts over different implementations of the encoder. So the first approach is the way to go.

If no - you don't need the protocol at all. Just use BinaryIOEncoder directly, and make methods for encoding primitives internal or fileprivate. And place implementation of the primitive conformances in the same module/file.


Thanks ole.
More than anything I was concerned that I had missed something obvious to fix the issue. It would seem not.


Thank you Jordan for your ingenious trick. I'll give it a try but the library is explicitly designed to avoid dynamic dispatch (except for accessing userData) and I'm afraid this system would slow it down a lot.

EDIT:
Unless I'm missing something, your suggestion implies that the BinaryIOEncoder instance (which is a value type) has to be copied twice every time a primitive value is encoded, which would slow things down considerably:

public struct BinaryIOEncoder: PrimitiveEncoder {
	public mutating func _withPrimitiveEncoder(
		_ body: (inout OpaquePrimitiveEncoder) throws -> Void
	) rethrows {
        var encoder = OpaquePrimitiveEncoder(rawValue: self)	// copy
        try body( &encoder )
        self = encoder.rawValue as! Self   // copy
	}
	/* ... */
}

For this to work, BinaryIOEncoder would have to be a class, or its storage would have to be placed in a class.

The remaining code:
public protocol BEncoder {
	var userVersion: UInt32 { get }

  var userData: Any? { get }
	
  mutating func encode<Value> (_ value: Value ) throws
  where Value : BEncodable
	
	mutating func _withPrimitiveEncoder(
		_ body: (inout OpaquePrimitiveEncoder) throws -> Void
	) rethrows
}

public struct OpaquePrimitiveEncoder {
	internal var rawValue: any PrimitiveEncoder
}

internal protocol PrimitiveEncoder : BEncoder {
	mutating func encodeBool( _ value:Bool ) throws
	mutating func encodeUInt8( _ value:UInt8 ) throws
	mutating func encodeUInt16( _ value:UInt16 ) throws
	mutating func encodeUInt32( _ value:UInt32 ) throws
	mutating func encodeUInt64( _ value:UInt64 ) throws
	mutating func encodeUInt( _ value:UInt ) throws
	mutating func encodeInt8( _ value:Int8 ) throws
	mutating func encodeInt16( _ value:Int16 ) throws
	mutating func encodeInt32( _ value:Int32 ) throws
	mutating func encodeInt64( _ value:Int64 ) throws
	mutating func encodeInt( _ value:Int ) throws
	mutating func encodeFloat( _ value:Float ) throws
	mutating func encodeDouble( _ value:Double ) throws
	mutating func encodeString<T>( _ value:T ) throws
	where T:StringProtocol
	mutating func encodeData<T>( _ value:T ) throws
	where T:MutableDataProtocol
}

//	------

extension Int: BEncodable {
	public func encode(to encoder: inout some BEncoder) throws {
		try encoder._withPrimitiveEncoder {
			try $0.rawValue.encodeInt( self )
		}
	}
}
/* etc... */
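
A rough sketch of the "storage placed in a class" variant mentioned above. The names (`PrimitiveSink`, `OpaqueSink`, `ByteStorage`, `BoxedEncoder`) are hypothetical and only one primitive is shown. With the bytes held by a reference type, wrapping the encoder in the opaque struct copies a single pointer, and nothing has to be copied back afterwards:

```swift
// Hypothetical names; only the overall shape follows the thread.
internal protocol PrimitiveSink {
    mutating func encodeUInt8(_ value: UInt8) throws
}

public struct OpaqueSink {
    internal var rawValue: any PrimitiveSink
}

// Reference-type storage: copying the struct copies one pointer, not the bytes.
final class ByteStorage {
    var bytes: [UInt8] = []
}

struct BoxedEncoder: PrimitiveSink {
    private let storage = ByteStorage()
    init() {}

    var bytes: [UInt8] { storage.bytes }

    mutating func encodeUInt8(_ value: UInt8) throws {
        storage.bytes.append(value)
    }

    mutating func _withPrimitiveEncoder(
        _ body: (inout OpaqueSink) throws -> Void
    ) rethrows {
        var opaque = OpaqueSink(rawValue: self)   // cheap: shares `storage`
        try body(&opaque)
        // No copy back: the writes already landed in the shared storage.
    }
}

var encoder = BoxedEncoder()
try encoder._withPrimitiveEncoder { try $0.rawValue.encodeUInt8(7) }
try encoder._withPrimitiveEncoder { try $0.rawValue.encodeUInt8(9) }
print(encoder.bytes)   // prints [7, 9]
```

The trade-offs are that copies of `BoxedEncoder` now share one buffer, so the struct no longer has value semantics, and that each primitive write still goes through the existential, so the dynamic dispatch remains.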


Let me try to explain. The user can interact with the library in two different ways.
The first is simple, and at this level the user never interacts directly with the BinaryIOEncoder even when encoding the root of the archive, because a BEncodable extension provides the proper method so that the user can simply write:

let array	= [1,2,3,4,5]	// archive root
let data	= try array.binaryIOData() as Data

In the second (advanced) way, the user instantiates and interacts directly with BinaryIOEncoder (and, in parallel, with BinaryIODecoder), for example to control the position at which encoding or decoding takes place. This makes it possible to encode the archive so that partial sections of it can be decoded on request. BinaryIOEncoder and BinaryIODecoder therefore have "low-level" public methods for these operations.

If I use BinaryIOEncoder directly, the user can just write:

extension MyCGSize : BEncodable {
	public func encode(to encoder: inout BinaryIOEncoder) throws {
   	    encoder.position = 42	/* !!! */
	    try encoder.encode( width )
	    try encoder.encode( height )
	}
}

and destroy the archive. As I wrote, the initial version of the library used exactly this system and I switched to the protocol precisely because of this issue.

However, I want to point out that I'm just trying to figure out what is the best possible "user interface" for my library.

Then you want to separate the interface that allows changing the position from the interface for writing values. Something like this:

public class Writer {
    public var position: Int

    public func write<T: Encodable>(_ value: T) {
        value.encode(to: BinaryIOEncoder(self))
    }
}

public struct BinaryIOEncoder {
    internal init(_ writer: Writer) {}
    public func encodeBool(_ x: Bool) {}
    …
}

I agree with @Nickolas_Pohilets—it doesn’t sound like you really want a protocol here, just separation of the interface into the “basic” and “advanced” portions in two different structs.

Once you’ve split up the interface as suggested above, all the primitive encodeBool (etc.) methods can be made internal, because your first-party BEncodable conformances are the only ones that will need to use them.


If I understand correctly, you're suggesting something like this, right?

BEncodable.swift
import Foundation

typealias Bytes	= [UInt8]

protocol BEncodable {
	func encode(to encoder: inout BEncoder) throws
}

extension BEncodable {
	func binaryIOData() throws -> Bytes {
		var encoder = BinaryIOEncoder()
		try encoder.encode( self )
		return encoder.data()
	}
}
BinaryIOEncoder.swift
import Foundation

struct BEncoder {
	private var _data			= Bytes()
	fileprivate var	position 	= 0

	fileprivate mutating func encodeUInt8(_ value: UInt8 ) throws {
		if position == _data.endIndex {
			_data.append( contentsOf: [value] )
		} else if position >= _data.startIndex {
			let endIndex	= _data.index( position, offsetBy: 1 )
			let range		= position ..< Swift.min( _data.endIndex, endIndex )
			_data.replaceSubrange( range, with: [value] )
		} else {
			preconditionFailure("invalid position")
		}
		position += 1
	}
	
	fileprivate func data() -> Bytes {
		return _data
	}
	
	mutating func encode<Value:BEncodable> (_ value: Value ) throws {
		try value.encode(to: &self)
	}
}

extension UInt8: BEncodable {
	func encode(to encoder: inout BEncoder) throws {
		// primitive type see encodeUInt8()
		try encoder.encodeUInt8(self)
	}
}

struct BinaryIOEncoder {
	private var	encoder = BEncoder()

  mutating func encode<Value:BEncodable> (_ value: Value ) throws {
		try encoder.encode( value )
	}

	// change visibility
	var	position : Int {
		get { encoder.position }
		set { encoder.position = newValue }
	}

	
	// change visibility
	func data() -> Bytes {
		return encoder.data()
	}
}
main.swift
import Foundation

struct Test : BEncodable {
	let a, b, c : UInt8
	
	func encode(to encoder: inout BEncoder) throws {
		// see only encode
		try encoder.encode( a )
		try encoder.encode( b )
		try encoder.encode( c )
	}
}

do {  // standard user
	let test = Test(a: 1, b: 2, c: 3)
	let data = try test.binaryIOData()
	
	print( data )	// print [1, 2, 3]
}

do { // advanced user
	let test1 = Test(a: 1, b: 2, c: 3)
	let test2 = Test(a: 4, b: 5, c: 6)

	var encoder	= BinaryIOEncoder()
	
	try encoder.encode( test1 )
	encoder.position = 1
	try encoder.encode( test2 )

	let data = encoder.data()

	print( data )	// print [1, 4, 5, 6]
}

Thank you! It seems to work.


Yup that’s what I was imagining, though in my head it was BinaryIOEncoder that implemented the full public interface and BEncoder that stored a private BinaryIOEncoder and exposed only the simple interface. But that’s just the mental model that made sense to me—I think it’s basically isomorphic to what you have.
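
For comparison, a sketch of the inverse arrangement described here, with hypothetical names: `FullEncoder` plays the role of BinaryIOEncoder and carries the whole interface, while `SimpleEncoder` plays the role of BEncoder and wraps it privately. Only UInt8 is implemented and the write logic is simplified, so treat it as an illustration of the layering rather than of the library:

```swift
// Hypothetical names: FullEncoder ~ BinaryIOEncoder, SimpleEncoder ~ BEncoder.
protocol SimpleEncodable {
    func encode(to encoder: inout SimpleEncoder) throws
}

struct FullEncoder {
    private(set) var bytes: [UInt8] = []
    var position = 0   // advanced API, invisible through SimpleEncoder

    // Primitive write; would be internal or fileprivate in a real package.
    mutating func encodeUInt8(_ value: UInt8) {
        precondition((0...bytes.count).contains(position), "position out of range")
        if position == bytes.count { bytes.append(value) } else { bytes[position] = value }
        position += 1
    }

    mutating func encode<Value: SimpleEncodable>(_ value: Value) throws {
        var simple = SimpleEncoder(full: self)
        self = FullEncoder()            // drop our reference so the buffer stays uniquely owned
        defer { self = simple.full }    // take the state back, even if encoding throws
        try value.encode(to: &simple)
    }
}

struct SimpleEncoder {
    fileprivate var full: FullEncoder
    fileprivate init(full: FullEncoder) { self.full = full }

    // The only operation a user-written conformance can reach.
    mutating func encode<Value: SimpleEncodable>(_ value: Value) throws {
        try value.encode(to: &self)
    }
}

extension UInt8: SimpleEncodable {
    func encode(to encoder: inout SimpleEncoder) throws {
        encoder.full.encodeUInt8(self)
    }
}

struct Pair: SimpleEncodable {
    let a, b: UInt8
    func encode(to encoder: inout SimpleEncoder) throws {
        try encoder.encode(a)
        try encoder.encode(b)
    }
}

var encoder = FullEncoder()
try encoder.encode(Pair(a: 1, b: 2))
encoder.position = 1
try encoder.encode(Pair(a: 3, b: 4))
print(encoder.bytes)   // prints [1, 3, 4]
```

Resetting `self` before encoding is an assumption about how to keep the byte array uniquely referenced so that writes happen in place; without it the first write through the wrapper would copy the buffer.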
