Possible unsoundness with RawSpan and MutableRawSpan

I agree that it's important to be able to convert a RawSpan to a Span. I brought up the option of performing a freeze on each individual load because it avoids the difficulties of freezing the entire memory range that I described earlier.

To elaborate on those difficulties: I don't think LLVM has an instruction to freeze a range of memory; its freeze instruction operates on a single value. It's possible to do it manually with something like rawSpan[i] = freeze(rawSpan[i]) for every index i, but that has two problems:

  1. It mutates the memory, so it risks LLVM-level data races unless there's a guarantee of exclusivity. To avoid UB, getting a RawSpan would have to be a mutating operation.
  2. It would have to compile down to actually writing to memory, because of the MADV_FREE-related problems, making it an O(n) operation. The Rust RFC I linked above has given up on providing an in-place freeze operation for that reason.
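To make the two problems concrete, here is a minimal sketch of what the manual loop would look like. The `freeze` function is a hypothetical stand-in (Swift has no such operation today, and a real one would need compiler support); it is written as the identity function only so the sketch compiles, and `freezeInPlace` is likewise a made-up name. The example runs over already-initialized memory, so it only illustrates the shape of the operation, not the freezing itself.

```swift
// Hypothetical stand-in: Swift does not expose LLVM's `freeze`.
// This is just the identity function so that the sketch compiles.
@inline(never)
func freeze(_ byte: UInt8) -> UInt8 {
    return byte
}

// Hypothetical helper. It needs a *mutable* buffer because it stores to
// every byte (problem 1), and it visits every byte, so it is O(n) (problem 2).
func freezeInPlace(_ bytes: UnsafeMutableRawBufferPointer) {
    for i in bytes.indices {
        bytes[i] = freeze(bytes[i])
    }
}

// Usage on initialized memory, purely to show the access pattern.
var storage: [UInt8] = [1, 2, 3, 4]
storage.withUnsafeMutableBytes { freezeInPlace($0) }
```

The signature is the point here: any in-place freeze of this shape requires write access to the whole range, which is why obtaining the RawSpan would have to be a mutating operation.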

There are multiple dimensions of memory safety. Even if RawSpan doesn't guarantee initialization safety, I think it would still be a meaningful improvement over unsafe pointers by guaranteeing bounds safety and lifetime safety.

I also think leaving initialization safety out of RawSpan's guarantees is probably the best option. We want RawSpan to be a safer way to manipulate the raw bytes of values, and restricting it to types without padding would severely limit that use case. In that use case, dealing with uninitialized memory is generally unavoidable.
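As a small illustration of how common padding is, the sketch below compares the size of a struct with the sum of its field sizes. The `Padded` type is a made-up example, and the numbers in the comments assume the layout the current Swift compiler uses for ordinary structs (fields in declaration order, each aligned to its own alignment), which is not a documented guarantee.

```swift
// Hypothetical example type: a 1-byte field followed by a 4-byte field.
struct Padded {
    var tag: UInt8
    var value: UInt32
}

// Bytes that actually hold field data.
let fieldBytes = MemoryLayout<UInt8>.size + MemoryLayout<UInt32>.size

// Bytes a raw view of one `Padded` value would expose.
let totalBytes = MemoryLayout<Padded>.size

// With the assumed layout, `value` sits at offset 4, so the bytes between
// `tag` and `value` are padding and are never initialized by the fields.
let paddingBytes = totalBytes - fieldBytes
print("field bytes: \(fieldBytes), total bytes: \(totalBytes), padding: \(paddingBytes)")
```

Any type like this would be excluded from RawSpan if it required fully initialized memory, even though viewing its raw bytes is exactly the kind of thing RawSpan is meant for.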

What I do think is bad for safety, though, is the status quo, where RawSpan has ambiguous safety guarantees.

I agree that a function taking a RawSpan most likely relies on additional safety guarantees that aren't ensured by the type itself. For that reason, I think RawSpan should be marked as @unsafe. (To my understanding, marking a type as @unsafe is basically a heuristic to interoperate with code without unsafety annotations; strictly speaking, it is operations on types that are unsafe, not types themselves.)


The other thread suggests that RawSpan could be used as a safe abstraction over fully initialized memory, such as arrays of integers. That's useful, but it conflicts with the other major use case. In my opinion, the problem is that a single safe type is not enough to address multiple dimensions of unsafety.

Maybe it would be a good idea to let RawSpan ensure only lifetime safety and bounds safety, and to introduce a new type that also ensures initialization safety, perhaps called ByteSpan.

There's one more dimension of safety, type safety, but as @Joe_Groff mentioned earlier, that is addressed by marking MutableSpan.mutableBytes as @unsafe, which has already been done.
