+1 on the idea
-1 on this API
I believe diffing belongs in the standard library. It is a complicated problem that is difficult to get efficiently right. The usage of a good diff is widely applicable.
I do not like this implementation, for two huge reasons:
-
I do not like that this is essentially the naĂŻve implementation of diffing: it will only tell you what has been inserted and what has been deleted. There is minimal advantage to this instead of just doing some basic Set<T>
symmetric differences in my own code
-
The code, as it stands, does not provide a way for me to supplement the diffing algorithm with my own varieties. There is inferringMoves()
, but even that is insufficient because it does not allow me to indicate "equal but not identical" (ie, "updated")
I would much prefer to see a public protocol defined that defines the API to diff two collections. The standard library could provide a single, concrete implementation of this protocol to do the InsertAndDeleteDiff
implementation as it currently stands. The inferringMoves()
bit of the API would simply be an alternate implementation of the algorithm that delegates to InsertAndDeleteDiff
, and then layers the move inference on top of it.
However, the usage of a protocol would then allow me, as an app author, to provide yet another diffing algorithm that accounts for object updatability, so that for my problem space, I can get the semantics that I require.
I could see the API looking something like this (I'm using Array
as the collection type for brevity):
protocol DiffChange { }
protocol DiffingAlgorithm {
associatedtype ChangeType: DiffChange
func compute<T>(_ source: Array<T>, destination: Array<T>, elementsEqual: (T, T) -> Bool) -> Array<ChangeType>
}
The associated type on the protocol means that for an InsertAndDelete
implementation, you could have an enum that only has .insert
and .delete
cases. But if you used the InferringMoves
implementation, then the enum could have .insert
, .delete
, and .move
cases. (Or it doesn't even have to be an enum!)
From this you have the basic building blocks to build a diffing system that could truly be extensible (as something in the standard library should be) instead of being locked in to a single implementation that will not be useful except for the most trivial of cases.
The default implementations on BidirectionalCollection
or whatever would basically be:
extension BidirectionalCollection {
func difference<D: DiffingAlgorithm>(to other: Self, using: D) â Array<D.ChangeType> { ... }
func difference(to other: Self) â Array<InsertAndDelete.ChangeType> {
return difference(to: other, using: InsertAndDelete())
}
}