Is this a flaw in `Data` design?

Here is a simple example that crashed in my library because I had the assumption that the returned Data will have adjusted indexes which apperently it doesn't since it's not really Data but a view into a different storage.

func first(_ data: Data) {
  guard data.count == 8 else { return }
  second(data[0 ... 3])
  second(data[4 ... 7]) // This will lead to a crash due to a promotion of a slice where no slice is expected
}

func second(_ data: Data) {
  guard data.count == 4 else { return }
  print(data[0 ... 3])
}

let data = Data([0,1,2,3,4,5,6,7])
first(data)

Shouldn't this susbcript return rather something like a DataSlice instead, to prevent passing a view where it's not expected?

https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/Data.swift#L1680

One can workaround this by change the range in second into something like data.startIndex ..< data.endIndex but then why would we allow numeric subscripting in first place and I do not expect Data to be a slice of a different Data similar to how a String has a Substring or an Array has an ArraySlice which clearly signals to the user this kind of issues.

2 Likes

I definitely agree that the current slicing API for Data is sub-optimal and should be improved to better guide users towards the correct way to slice Data values (by using startIndex and endIndex when dealing with Data values where it's potentially unknown whether it's a slice or not).

There is good discussion in the comments of SR-5300 about why this is the current state of the API. To very briefly summarise, each of the potential alternatives considered have their own flaws:

  1. Using a separate slice type rather than making Data its own slice means that (without significant overhaul) the slice cannot be directly used by other frameworks that deal in terms of Data/NSData.

  2. Using an opaque index type would have source compatibility and ergonomic concerns.

  3. Making Data slices always 0 indexed would be inconsistent with other collections

2 Likes

It’s good enough for String and Array, which are far more widely-used (also in performance-sensitive code).

My point is that we’ve had this discussion. Maybe after that bug was filed.

2 Likes