Omitting Returns in String: Case Study of SE-0255
I’m a fan of @nate_chandler’s SE-0255, which allows us to omit the return keyword in single-expression functions and computed variables. Here is a case study of SE-0255 as applied to String’s internal implementation.
This is limited to String and related Unicode functionality; other areas such as SIMDVector.swift (@scanon) could similarly benefit.
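For readers who have not used the feature yet, here is a minimal sketch of what SE-0255 permits. The `Storage` and `Wrapper` types are hypothetical and are not taken from the standard library; they only mirror the forwarding pattern that shows up throughout String.

```swift
// Hypothetical storage type, used only for illustration.
struct Storage {
  var bytes: [UInt8]

  // A body that is one expression may omit `return`.
  var count: Int { bytes.count }
  var isASCII: Bool { bytes.allSatisfy { $0 < 0x80 } }
}

// Hypothetical wrapper that forwards to its storage.
struct Wrapper {
  var storage: Storage

  // Forwarding computed variables.
  var count: Int { storage.count }
  var isASCII: Bool { storage.isASCII }

  // The same rule applies to functions and operators.
  func byte(at i: Int) -> UInt8 { storage.bytes[i] }
  static func == (lhs: Wrapper, rhs: Wrapper) -> Bool {
    lhs.storage.bytes == rhs.storage.bytes
  }
  static func != (lhs: Wrapper, rhs: Wrapper) -> Bool { !(lhs == rhs) }
}

let w = Wrapper(storage: Storage(bytes: Array("abc".utf8)))
```

Every body above is a single expression, so none of them needs `return`. As soon as a second statement is added, the keyword is required again.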
Omission Wins
SE-0255 allows us to remove 393 single-expression returns. Here is the PR. Below are a few common themes in return omission.
Forwarding declarations
- internal var count: Int { return _object.count }
+ internal var count: Int { _object.count }
- internal var isSmall: Bool { return _object.isSmall }
+ internal var isSmall: Bool { _object.isSmall }
- internal var isASCII: Bool {
- return _object.isASCII
- }
+ internal var isASCII: Bool { _object.isASCII }
- internal var isNFC: Bool { return _object.isNFC }
+ internal var isNFC: Bool { _object.isNFC }
public var isBidiControl: Bool {
- return _hasBinaryProperty(__swift_stdlib_UCHAR_BIDI_CONTROL)
+ _hasBinaryProperty(__swift_stdlib_UCHAR_BIDI_CONTROL)
}
Collection boilerplate
- public var startIndex: Int {
- return 0
- }
+ public var startIndex: Int { 0 }
- public var endIndex: Int {
- return 0 + UTF8.width(value)
- }
+ public var endIndex: Int { 0 + UTF8.width(value) }
- public __consuming func makeIterator() -> Iterator {
- return Iterator(_guts)
- }
+ public __consuming func makeIterator() -> Iterator { Iterator(_guts) }
Standard boilerplate for conformances
- public var customMirror: Mirror {
- return Mirror(reflecting: description)
- }
+ public var customMirror: Mirror { Mirror(reflecting: description) }
- public var description: String { return String(self) }
+ public var description: String { String(self) }
- public var debugDescription: String { return String(self).debugDescription }
+ public var debugDescription: String { String(self).debugDescription }
Constant-wrapping vars and simple bit-level abstractions
- internal static var countMask: UInt64 { return 0x0000_FFFF_FFFF_FFFF }
+ internal static var countMask: UInt64 { 0x0000_FFFF_FFFF_FFFF }
- internal static var flagsMask: UInt64 { return ~countMask }
+ internal static var flagsMask: UInt64 { ~countMask }
- internal static var isASCIIMask: UInt64 { return 0x8000_0000_0000_0000 }
+ internal static var isASCIIMask: UInt64 { 0x8000_0000_0000_0000 }
- internal static var isNFCMask: UInt64 { return 0x4000_0000_0000_0000 }
+ internal static var isNFCMask: UInt64 { 0x4000_0000_0000_0000 }
internal static func small(isASCII: Bool) -> UInt64 {
- return isASCII ? 0xE000_0000_0000_0000 : 0xA000_0000_0000_0000
+ isASCII ? 0xE000_0000_0000_0000 : 0xA000_0000_0000_0000
}
internal var isImmortal: Bool {
- return (discriminatedObjectRawBits & 0x8000_0000_0000_0000) != 0
+ (discriminatedObjectRawBits & 0x8000_0000_0000_0000) != 0
}
public static func isLeadSurrogate(_ x: CodeUnit) -> Bool {
- return (x & 0xFC00) == 0xD800
+ (x & 0xFC00) == 0xD800
}
Operators
public static func != <RHS: StringProtocol>(lhs: Self, rhs: RHS) -> Bool {
- return !(lhs == rhs)
+ !(lhs == rhs)
}
public static func > <RHS: StringProtocol>(lhs: Self, rhs: RHS) -> Bool {
- return rhs < lhs
+ rhs < lhs
}
public static func <= <RHS: StringProtocol>(lhs: Self, rhs: RHS) -> Bool {
- return !(rhs < lhs)
+ !(rhs < lhs)
}
public static func >= <RHS: StringProtocol>(lhs: Self, rhs: RHS) -> Bool {
- return !(lhs < rhs)
+ !(lhs < rhs)
}
Cannot Omit
These are areas where one could argue that the return could be omitted, but the current language does not have the mechanisms to allow this without a loss of clarity. Below are 111 declarations grouped by theme. I’m not arguing that these must be solved, nor advocating any specific mechanism for solving them. The purpose is to provide empirical data to feature authors, irrespective of the other merits and drawbacks of their particular feature.
Precondition/Postcondition/Assertions
The standard library has 3 levels of invariant checking, mentioned in the programmer’s manual.
_precondition is checked in all build configurations of client code. For example, it is used to enforce important invariants at runtime such as memory safety.
_debugPrecondition is checked only in testing/debug builds of client code. For example, it can be used to check memory safety of “unsafe” APIs.
_internalInvariant is only checked in testing builds of the library itself, and never when run by client code with a shipping toolchain. These are commonly referred to as internal “assertions”.
Invariant checks can block return omission because each check requires a separate statement.
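As a minimal sketch of the problem, the example below uses the public `precondition` function in place of the library-internal `_precondition`, and a hypothetical `Bytes` type. Adding a single check turns a one-expression body into two statements, so the `return` comes back.

```swift
// Hypothetical type for illustration; not part of the standard library.
struct Bytes {
  var storage: [UInt8]

  // One expression, so `return` can be omitted.
  func uncheckedByte(at i: Int) -> UInt8 { storage[i] }

  // The check is a statement of its own, so the body is no longer a
  // single expression and `return` is required.
  func byte(at i: Int) -> UInt8 {
    precondition(i >= 0 && i < storage.count, "index is out of bounds")
    return storage[i]
  }
}

let bytes = Bytes(storage: [0x61, 0x62])
```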
A potential mechanism is to support pre/post conditions on declarations. Preconditions would be checked against the function’s parameters, and postconditions would additionally receive the function’s return value. This feature should be explored regardless of how it enables return omission (a minor benefit compared to the feature itself), but the examples below can help provide some fodder for a proposer.
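Swift has no declaration-level pre/post conditions, so the following is only an emulation meant to show the shape of the idea. The `checked` helper is hypothetical (it is neither a standard library API nor a proposed design): it evaluates a precondition, computes the result, and then hands the result to a postcondition. The `leadSurrogate` function is a simplified free-function variant of the declaration of the same name in the list that follows.

```swift
// Hypothetical helper: runs a precondition, evaluates the body, then
// passes the result to a postcondition. Not a standard library API.
func checked<T>(
  pre: @autoclosure () -> Bool,
  post: (T) -> Bool,
  _ body: () -> T
) -> T {
  precondition(pre(), "precondition failed")
  let result = body()
  precondition(post(result), "postcondition failed")
  return result
}

// With the checks folded into the call, the function body is a single
// expression again and `return` can be omitted.
func leadSurrogate(_ x: Unicode.Scalar) -> UInt16 {
  checked(pre: x.value >= 0x1_0000, post: { $0 >= 0xD800 && $0 <= 0xDBFF }) {
    0xD800 + UInt16(truncatingIfNeeded: (x.value - 0x1_0000) >> 10)
  }
}
```

A real language feature would presumably attach the conditions to the declaration rather than wrap the body in a closure; the helper only demonstrates that the checks and the result can be expressed together.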
Preconditions
public subscript(position: Int) -> UTF8.CodeUnit {
_precondition(position >= startIndex && position < endIndex,
"Unicode.Scalar.UTF8View index is out of bounds")
return value.withUTF8CodeUnits { $0[position] }
}
public static func leadSurrogate(_ x: Unicode.Scalar) -> UTF16.CodeUnit {
_precondition(width(x) == 2)
return 0xD800 + UTF16.CodeUnit(truncatingIfNeeded:
(x.value - 0x1_0000) &>> (10 as UInt32))
}
public static func trailSurrogate(_ x: Unicode.Scalar) -> UTF16.CodeUnit {
_precondition(width(x) == 2)
return 0xDC00 + UTF16.CodeUnit(truncatingIfNeeded:
(x.value - 0x1_0000) & (((1 as UInt32) &<< 10) - 1))
}
public func index(after i: Index) -> Index {
_debugPrecondition(i._biasedBits != 0)
return Index(_biasedBits: i._biasedBits >> 8)
}
public func index(after i: Index) -> Index {
_precondition(i < endIndex, "Cannot increment beyond endIndex")
_precondition(i >= startIndex, "Cannot increment an invalid index")
return _slice.index(after: i)
}
public func index(before i: Index) -> Index {
_precondition(i <= endIndex, "Cannot decrement an invalid index")
_precondition(i > startIndex, "Cannot decrement beyond startIndex")
return _slice.index(before: i)
}
public subscript(r: Range<Index>) -> Substring.UTF8View {
_precondition(r.lowerBound >= startIndex && r.upperBound <= endIndex,
"UTF8View index range out of bounds")
return Substring.UTF8View(_slice.base, _bounds: r)
}
public subscript(r: Range<Index>) -> Substring {
_boundsCheck(r)
return Substring(Slice(base: self, bounds: r))
}
public var utf8Start: UnsafePointer<UInt8> {
_precondition(
hasPointerRepresentation,
"StaticString should have pointer representation")
return UnsafePointer(bitPattern: UInt(_startPtrOrData))!
}
public var unicodeScalar: Unicode.Scalar {
_precondition(
!hasPointerRepresentation,
"StaticString should have Unicode scalar representation")
return Unicode.Scalar(UInt32(UInt(_startPtrOrData)))!
}
public var utf8CodeUnitCount: Int {
_precondition(
hasPointerRepresentation,
"StaticString should have pointer representation")
return Int(_utf8CodeUnitCount)
}
Assertions
String’s implementation adopts LLVM’s “assert liberally” doctrine, using internal assertions as an engineering tool for writing low-level fiddly pieces. These pieces have assumptions about the way they are called or the kinds of values they are passed, but they cannot pay the performance cost of checking that at runtime. Assertions let us catch bugs earlier and catch subtle bugs that don’t manifest as behavior differences until some future change or unanticipated use case comes along. With a rich suite of assertions in place, refactoring and other changes are less scary.
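Client code cannot call `_internalInvariant`, so as a rough analogue the sketch below uses the public `assert`, which is checked in debug builds and compiled out in optimized builds. The free function `decodeSurrogates` is hypothetical; it mirrors the shape of the `_decodeSurrogates` declaration in the list that follows.

```swift
// Hypothetical free function mirroring the shape of `_decodeSurrogates`.
func decodeSurrogates(_ lead: UInt16, _ trail: UInt16) -> Unicode.Scalar {
  // State the caller's obligations; these checks vanish in -O builds.
  assert(UTF16.isLeadSurrogate(lead), "expected a lead surrogate")
  assert(UTF16.isTrailSurrogate(trail), "expected a trail surrogate")

  let high = UInt32(lead & 0x03FF) << 10
  let low = UInt32(trail & 0x03FF)
  // The force-unwrap is only justified if the assertions above hold.
  return Unicode.Scalar(0x1_0000 + (high | low))!
}
```

The assertions are the reason the body cannot collapse to one expression, which is exactly the pattern counted in this section.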
internal func _decodeUTF8(_ x: UInt8) -> Unicode.Scalar {
_internalInvariant(UTF8.isASCII(x))
return Unicode.Scalar(_unchecked: UInt32(x))
}
internal func _foreignIndex(_ i: Index, offsetBy n: Int) -> Index {
_internalInvariant(_guts.isForeign)
return _index(i, offsetBy: n)
}
internal func _foreignIndex(
_ i: Index, offsetBy n: Int, limitedBy limit: Index
) -> Index? {
_internalInvariant(_guts.isForeign)
return _index(i, offsetBy: n, limitedBy: limit)
}
internal func _foreignDistance(from i: Index, to j: Index) -> Int {
_internalInvariant(_guts.isForeign)
return _distance(from: i, to: j)
}
internal func _foreignCount() -> Int {
_internalInvariant(_guts.isForeign)
return _distance(from: startIndex, to: endIndex)
}
internal func _foreignIndex(after i: Index) -> Index {
_internalInvariant(_guts.isForeign)
return i.nextEncoded
}
internal func _foreignIndex(before i: Index) -> Index {
_internalInvariant(_guts.isForeign)
return i.priorEncoded
}
internal func _foreignSubscript(position i: Index) -> UTF16.CodeUnit {
_internalInvariant(_guts.isForeign)
return _guts.foreignErrorCorrectedUTF16CodeUnit(at: i)
}
internal func _foreignDistance(from start: Index, to end: Index) -> Int {
_internalInvariant(_guts.isForeign)
return end._encodedOffset - start._encodedOffset
}
internal func _foreignIndex(_ i: Index, offsetBy n: Int) -> Index {
_internalInvariant(_guts.isForeign)
return i.encoded(offsetBy: n)
}
internal func _foreignCount() -> Int {
_internalInvariant(_guts.isForeign)
return endIndex._encodedOffset - startIndex._encodedOffset
}
internal func _foreignIsWithin(_ target: String.UTF16View) -> Bool {
_internalInvariant(target._guts.isForeign)
// If we're transcoding, we're a UTF-8 view index, not UTF-16.
return self.transcodedOffset == 0
}
internal static func _decodeSurrogates(
_ lead: CodeUnit,
_ trail: CodeUnit
) -> Unicode.Scalar {
_internalInvariant(isLeadSurrogate(lead))
_internalInvariant(isTrailSurrogate(trail))
return Unicode.Scalar(
_unchecked: 0x10000 +
(UInt32(lead & 0x03ff) &<< 10 | UInt32(trail & 0x03ff)))
}
internal var nextEncoded: String.Index {
_internalInvariant(self.transcodedOffset == 0)
return String.Index(_encodedOffset: self._encodedOffset &+ 1)
}
internal var priorEncoded: String.Index {
_internalInvariant(self.transcodedOffset == 0)
return String.Index(_encodedOffset: self._encodedOffset &- 1)
}
internal func transcoded(withOffset n: Int) -> String.Index {
_internalInvariant(self.transcodedOffset == 0)
return String.Index(encodedOffset: self._encodedOffset, transcodedOffset: n)
}
internal var _countAndFlags: CountAndFlags {
_internalInvariant(!isSmall)
return CountAndFlags(rawUnchecked: _countAndFlagsBits)
}
internal static func small(withCount count: Int, isASCII: Bool) -> UInt64 {
_internalInvariant(count <= _SmallString.capacity)
return small(isASCII: isASCII) | UInt64(truncatingIfNeeded: count) &<< 56
}
internal var largeFastIsTailAllocated: Bool {
_internalInvariant(isLarge && providesFastUTF8)
return _countAndFlags.isTailAllocated
}
internal var largeIsCocoa: Bool {
_internalInvariant(isLarge)
return (discriminatedObjectRawBits & 0x4000_0000_0000_0000) != 0
}
internal var smallCount: Int {
_internalInvariant(isSmall)
return _StringObject.getSmallCount(fromRaw: discriminatedObjectRawBits)
}
internal var smallIsASCII: Bool {
_internalInvariant(isSmall)
return _StringObject.getSmallIsASCII(fromRaw: discriminatedObjectRawBits)
}
internal var largeCount: Int {
_internalInvariant(isLarge)
return _countAndFlags.count
}
internal var largeAddressBits: UInt {
_internalInvariant(isLarge)
return UInt(truncatingIfNeeded:
discriminatedObjectRawBits & Nibbles.largeAddressMask)
}
internal var nativeUTF8Start: UnsafePointer<UInt8> {
_internalInvariant(largeFastIsTailAllocated)
return UnsafePointer(
bitPattern: largeAddressBits &+ _StringObject.nativeBias
)._unsafelyUnwrappedUnchecked
}
internal var nativeUTF8: UnsafeBufferPointer<UInt8> {
_internalInvariant(largeFastIsTailAllocated)
return UnsafeBufferPointer(start: nativeUTF8Start, count: largeCount)
}
internal var objCBridgeableObject: AnyObject {
_internalInvariant(hasObjCBridgeableObject)
return Builtin.reinterpretCast(largeAddressBits)
}
static func _toUTF16CodeUnit(_ x: UTF8.CodeUnit) -> UTF16.CodeUnit {
_internalInvariant(x <= 0x7f, "should only be doing this with ASCII")
return UTF16.CodeUnit(truncatingIfNeeded: x)
}
static func _fromUTF16CodeUnit(
_ utf16: UTF16.CodeUnit
) -> UTF8.CodeUnit {
_internalInvariant(utf16 <= 0x7f, "should only be doing this with ASCII")
return UTF8.CodeUnit(truncatingIfNeeded: utf16)
}
Subexpression refactoring
Refactoring subexpressions out into their own declarations can aid code readability, debuggability, and maintainability, and it presents the opportunity to provide a useful name. While it does prohibit return elision, the benefits far outweigh this.
Below are instances where I found an argument could be made for return elision with some future mechanism, though I have nothing specific in mind. These may serve as fodder for expression simplification and API enhancements unrelated to return elision.
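To illustrate the trade-off, here is a small sketch modeled loosely on the `characterStride` declaration in the list that follows. The free functions and the `rawBits` parameter are hypothetical stand-ins. The first form names the subexpression and therefore needs `return`; the second is a single expression, at the cost of spelling the masked value twice.

```swift
// Named subexpression: two statements, so `return` is required.
func strideNamed(rawBits: UInt64) -> Int? {
  let value = (rawBits & 0x3F00) >> 8
  return value > 0 ? Int(truncatingIfNeeded: value) : nil
}

// Single expression: `return` can go, but the computation is repeated
// and the intermediate value loses its name.
func strideInlined(rawBits: UInt64) -> Int? {
  ((rawBits & 0x3F00) >> 8) > 0
    ? Int(truncatingIfNeeded: (rawBits & 0x3F00) >> 8)
    : nil
}
```

The named form is arguably the clearer of the two, which is why these declarations are listed as candidates for some future mechanism rather than for rewriting today.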
Subexpressions
public var generalCategory: Unicode.GeneralCategory {
let rawValue = __swift_stdlib_UCharCategory(
__swift_stdlib_UCharCategory.RawValue(
__swift_stdlib_u_getIntPropertyValue(
icuValue, __swift_stdlib_UCHAR_GENERAL_CATEGORY)))
return Unicode.GeneralCategory(rawValue: rawValue)
}
public var canonicalCombiningClass: Unicode.CanonicalCombiningClass {
let rawValue = UInt8(__swift_stdlib_u_getIntPropertyValue(
icuValue, __swift_stdlib_UCHAR_CANONICAL_COMBINING_CLASS))
return Unicode.CanonicalCombiningClass(rawValue: rawValue)
}
public var numericType: Unicode.NumericType? {
let rawValue = __swift_stdlib_UNumericType(
__swift_stdlib_UNumericType.RawValue(
__swift_stdlib_u_getIntPropertyValue(
icuValue, __swift_stdlib_UCHAR_NUMERIC_TYPE)))
return Unicode.NumericType(rawValue: rawValue)
}
public var numericValue: Double? {
let icuNoNumericValue: Double = -123456789
let result = __swift_stdlib_u_getNumericValue(icuValue)
return result != icuNoNumericValue ? result : nil
}
public func _bufferedScalar(bitCount: UInt8) -> Encoding.EncodedScalar {
let x = UInt32(_buffer._storage) &+ 0x01010101
return _ValidUTF8Buffer(_biasedBits: x & ._lowBits(bitCount))
}
internal func withFastCChar<R>(
_ f: (UnsafeBufferPointer<CChar>) throws -> R
) rethrows -> R {
try self.withFastUTF8 { utf8 in
let ptr = utf8.baseAddress._unsafelyUnwrappedUnchecked._asCChar
return try f(UnsafeBufferPointer(start: ptr, count: utf8.count))
}
}
internal func _slowWithCString<Result>(
_ body: (UnsafePointer<Int8>) throws -> Result
) rethrows -> Result {
_internalInvariant(!_object.isFastZeroTerminated)
try String(self).utf8CString.withUnsafeBufferPointer {
let ptr = $0.baseAddress._unsafelyUnwrappedUnchecked
return try body(ptr)
}
}
internal var characterStride: Int? {
let value = (_rawBits & 0x3F00) &>> 8
return value > 0 ? Int(truncatingIfNeeded: value) : nil
}
internal var _countAndFlagsBits: UInt64 {
let rawBits = UInt64(truncatingIfNeeded: _flags) &<< 48
| UInt64(truncatingIfNeeded: _count)
return rawBits
}
public func index(before i: Index) -> Index {
let offset = _ValidUTF8Buffer(_biasedBits: i._biasedBits).count
_debugPrecondition(offset != 0)
return Index(_biasedBits: _biasedBits &>> (offset &<< 3 - 8))
}
internal var zeroTerminatedRawCodeUnits: RawBitPattern {
let smallStringCodeUnitMask = ~UInt64(0xFF).bigEndian // zero last byte
return (self._storage.0, self._storage.1 & smallStringCodeUnitMask)
}
internal func computeIsASCII() -> Bool {
let asciiMask: UInt64 = 0x8080_8080_8080_8080
let raw = zeroTerminatedRawCodeUnits
return (raw.0 | raw.1) & asciiMask == 0
}
internal subscript(_ bounds: Range<Index>) -> SubSequence {
self.withUTF8 { utf8 in
let rebased = UnsafeBufferPointer(rebasing: utf8[bounds])
return _SmallString(rebased)._unsafelyUnwrappedUnchecked
}
}
internal func withUTF8<Result>(
_ f: (UnsafeBufferPointer<UInt8>) throws -> Result
) rethrows -> Result {
var raw = self.zeroTerminatedRawCodeUnits
return try Swift.withUnsafeBytes(of: &raw) { rawBufPtr in
let ptr = rawBufPtr.baseAddress._unsafelyUnwrappedUnchecked
.assumingMemoryBound(to: UInt8.self)
return try f(UnsafeBufferPointer(start: ptr, count: self.count))
}
}
internal func _stdlib_binary_CFStringCreateCopy(
_ source: _CocoaString
) -> _CocoaString {
let result = _swift_stdlib_CFStringCreateCopy(nil, source) as AnyObject
return result
}
internal func _cocoaStringSubscript(
_ target: _CocoaString, _ position: Int
) -> UTF16.CodeUnit {
let cfSelf: _swift_shims_CFStringRef = target
return _swift_stdlib_CFStringGetCharacterAtIndex(cfSelf, position)
}
internal func _cocoaStringCompare(
_ string: _CocoaString, _ other: _CocoaString
) -> Int {
let cfSelf: _swift_shims_CFStringRef = string
let cfOther: _swift_shims_CFStringRef = other
return _swift_stdlib_CFStringCompare(cfSelf, cfOther)
}
func _toUTF16Indices(_ range: Range<Int>) -> Range<Index> {
let lowerbound = _toUTF16Index(range.lowerBound)
let upperbound = _toUTF16Index(range.lowerBound + range.count)
return Range(uncheckedBounds: (lower: lowerbound, upper: upperbound))
}
private func _stringCompareSlow(
_ leftUTF8: UnsafeBufferPointer<UInt8>,
_ rightUTF8: UnsafeBufferPointer<UInt8>,
expecting: _StringComparisonResult
) -> Bool {
let left = _StringGutsSlice(_StringGuts(leftUTF8, isASCII: false))
let right = _StringGutsSlice(_StringGuts(rightUTF8, isASCII: false))
return left.compare(with: right, expecting: expecting)
}
internal var _offsetRange: Range<Int> {
let (start, end) = (startIndex, endIndex)
_internalInvariant(
start.transcodedOffset == 0 && end.transcodedOffset == 0)
return Range(uncheckedBounds: (start._encodedOffset, end._encodedOffset))
}
final internal func character(at offset: Int) -> UInt16 {
let str = asString
return str.utf16[str._toUTF16Index(offset)]
}
public func index(_ i: Index, offsetBy n: Int) -> Index {
let result = _slice.index(i, offsetBy: n)
_precondition(
(_slice._startIndex ... _slice.endIndex).contains(result),
"Operation results in an invalid index")
return result
}
Control Flow
Control flow constructs are not expressions, though there are pitches out there to revisit this. Below are cases that could be simplified in a way that would enable return omission. Again, return omission is a relatively minor benefit compared to the feature, but this could be useful fodder.
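As a minimal sketch of the distinction, the two hypothetical functions below compute the length of a UTF-8 scalar from its first byte, loosely following `_utf8ScalarLength` from the list that follows (with the ASCII test written out as `x < 0x80`). The statement form needs explicit `return`s; the conditional operator is an expression, so the second form does not.

```swift
// Statement form: the early exit forces explicit `return`s.
func scalarLengthStatements(_ x: UInt8) -> Int {
  if x < 0x80 { return 1 }
  return (~x).leadingZeroBitCount
}

// Expression form: the conditional operator is an expression, so the
// body is a single expression and `return` can be omitted.
func scalarLengthExpression(_ x: UInt8) -> Int {
  x < 0x80 ? 1 : (~x).leadingZeroBitCount
}
```

The conditional operator only stretches so far: it cannot express `guard`, pattern matching, or a `switch`, which is where making control flow usable as an expression would help. Swift 5.9 later accepted `if` and `switch` as expressions (SE-0380), which addresses some of the patterns collected in this section.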
If-then-else and guard
internal func _utf8ScalarLength(_ x: UInt8) -> Int {
_internalInvariant(!UTF8.isContinuation(x))
if UTF8.isASCII(x) { return 1 }
return (~x).leadingZeroBitCount
}
internal func errorCorrectedScalar(
startingAt i: Int
) -> (Unicode.Scalar, scalarLength: Int) {
if _fastPath(isFastUTF8) {
return withFastUTF8 { _decodeScalar($0, startingAt: i) }
}
return foreignErrorCorrectedScalar(
startingAt: String.Index(_encodedOffset: i))
}
internal func errorCorrectedCharacter(
startingAt start: Int, endingAt end: Int
) -> Character {
if _fastPath(isFastUTF8) {
return withFastUTF8(range: start..<end) { utf8 in
return Character(unchecked: String._uncheckedFromUTF8(utf8))
}
}
return foreignErrorCorrectedGrapheme(startingAt: start, endingAt: end)
}
public func index(after i: Index) -> Index {
if _fastPath(_guts.isFastUTF8) {
return i.nextEncoded
}
return _foreignIndex(after: i)
}
public func index(before i: Index) -> Index {
precondition(!i.isZeroPosition)
if _fastPath(_guts.isFastUTF8) {
return i.priorEncoded
}
return _foreignIndex(before: i)
}
public func index(_ i: Index, offsetBy n: Int) -> Index {
if _fastPath(_guts.isFastUTF8) {
_precondition(n + i._encodedOffset <= _guts.count)
return i.encoded(offsetBy: n)
}
return _foreignIndex(i, offsetBy: n)
}
public func distance(from i: Index, to j: Index) -> Int {
if _fastPath(_guts.isFastUTF8) {
return j._encodedOffset &- i._encodedOffset
}
return _foreignDistance(from: i, to: j)
}
public var count: Int {
if _fastPath(_guts.isFastUTF8) {
return _guts.count
}
return _foreignCount()
}
public var count: Int {
if _slowPath(_guts.isForeign) {
return _foreignCount()
}
return _nativeGetOffset(for: endIndex)
}
public func _parseMultipleCodeUnits() -> (isValid: Bool, bitCount: UInt8) {
_internalInvariant(
!Encoding._isScalar(UInt16(truncatingIfNeeded: _buffer._storage)))
if _fastPath(_buffer._storage & 0xFC00_FC00 == 0xD800_DC00) {
return (true, 2*16)
}
return (false, 1*16)
}
public func _parseMultipleCodeUnits() -> (isValid: Bool, bitCount: UInt8) {
_internalInvariant(
!Encoding._isScalar(UInt16(truncatingIfNeeded: _buffer._storage)))
if _fastPath(_buffer._storage & 0xFC00_FC00 == 0xDC00_D800) {
return (true, 2*16)
}
return (false, 1*16)
}
public func withContiguousStorageIfAvailable<R>(
_ body: (UnsafeBufferPointer<Element>) throws -> R
) rethrows -> R? {
guard _guts.isFastUTF8 else { return nil }
return try _guts.withFastUTF8(body)
}
internal func withFastUTF8<R>(
_ f: (UnsafeBufferPointer<UInt8>) throws -> R
) rethrows -> R {
_internalInvariant(isFastUTF8)
if self.isSmall { return try _SmallString(_object).withUTF8(f) }
defer { _fixLifetime(self) }
return try f(_object.fastUTF8)
}
internal func withCString<Result>(
_ body: (UnsafePointer<Int8>) throws -> Result
) rethrows -> Result {
if _slowPath(!_object.isFastZeroTerminated) {
return try _slowWithCString(body)
}
return try self.withFastCChar {
return try body($0.baseAddress._unsafelyUnwrappedUnchecked)
}
}
internal var utf8Count: Int {
if _fastPath(self.isFastUTF8) { return count }
return String(self).utf8.count
}
internal func _characterStride(startingAt i: Index) -> Int {
if let d = i.characterStride { return d }
if i == endIndex { return 0 }
return _guts._opaqueCharacterStride(startingAt: i._encodedOffset)
}
internal func _characterStride(endingAt i: Index) -> Int {
if i == startIndex { return 0 }
return _guts._opaqueCharacterStride(endingAt: i._encodedOffset)
}
internal var isImmortal: Bool {
if case .immortal = self { return true }
return false
}
internal var isASCII: Bool {
if isSmall { return smallIsASCII }
return _countAndFlags.isASCII
}
internal var isNFC: Bool {
if isSmall {
return smallIsASCII
}
return _countAndFlags.isNFC
}
internal var fastUTF8: UnsafeBufferPointer<UInt8> {
_internalInvariant(self.isLarge && self.providesFastUTF8)
guard _fastPath(self.largeFastIsTailAllocated) else {
return sharedUTF8
}
return UnsafeBufferPointer(
start: self.nativeUTF8Start, count: self.largeCount)
}
internal var isFastZeroTerminated: Bool {
if _slowPath(!providesFastUTF8) { return false }
if isSmall { return true }
return largeFastIsTailAllocated
}
internal func isOnUnicodeScalarBoundary(_ index: Int) -> Bool {
guard index < count else {
_internalInvariant(index == count)
return true
}
return !UTF8.isContinuation(self[index])
}
internal var nativeCapacity: Int? {
guard hasNativeStorage else { return nil }
return _object.nativeStorage.capacity
}
internal var nativeUnusedCapacity: Int? {
guard hasNativeStorage else { return nil }
return _object.nativeStorage.unusedCapacity
}
internal var uniqueNativeCapacity: Int? {
@inline(__always) mutating get {
guard isUniqueNative else { return nil }
return _object.nativeStorage.capacity
}
}
internal var uniqueNativeUnusedCapacity: Int? {
guard isUniqueNative else { return nil }
return _object.nativeStorage.unusedCapacity
}
internal var utf8Count: Int {
if _fastPath(self.isFastUTF8) {
return _offsetRange.count
}
return Substring(self).utf8.count
}
internal func foreignHasNormalizationBoundary(
before index: String.Index
) -> Bool {
if index == range.lowerBound || index == range.upperBound {
return true
}
return _guts.foreignHasNormalizationBoundary(before: index)
}
private func _getCocoaStringPointer(
_ cfImmutableValue: _CocoaString
) -> CocoaStringPointer {
if let utf8Ptr = _cocoaUTF8Pointer(cfImmutableValue) {
return .ascii(utf8Ptr)
}
if let utf16Ptr = _swift_stdlib_CFStringGetCharactersPtr(cfImmutableValue) {
return .utf16(utf16Ptr)
}
return .none
}
internal func _stringCompare(
_ lhs: _StringGuts, _ rhs: _StringGuts, expecting: _StringComparisonResult
) -> Bool {
if lhs.rawBits == rhs.rawBits { return expecting == .equal }
return _stringCompareWithSmolCheck(lhs, rhs, expecting: expecting)
}
internal func _stringCompareInternal(
_ lhs: _StringGuts, _ rhs: _StringGuts, expecting: _StringComparisonResult
) -> Bool {
guard _fastPath(lhs.isFastUTF8 && rhs.isFastUTF8) else {
return _stringCompareSlow(lhs, rhs, expecting: expecting)
}
let isNFC = lhs.isNFC && rhs.isNFC
return lhs.withFastUTF8 { lhsUTF8 in
return rhs.withFastUTF8 { rhsUTF8 in
return _stringCompareFastUTF8(
lhsUTF8, rhsUTF8, expecting: expecting, bothNFC: isNFC)
}
}
}
internal func _stringCompare(
_ lhs: _StringGuts, _ lhsRange: Range<Int>,
_ rhs: _StringGuts, _ rhsRange: Range<Int>,
expecting: _StringComparisonResult
) -> Bool {
if lhs.rawBits == rhs.rawBits && lhsRange == rhsRange {
return expecting == .equal
}
return _stringCompareInternal(
lhs, lhsRange, rhs, rhsRange, expecting: expecting)
}
static func _tryFromUTF8(_ input: UnsafeBufferPointer<UInt8>) -> String? {
guard case .success(let extraInfo) = validateUTF8(input) else {
return nil
}
return String._uncheckedFromUTF8(input, isASCII: extraInfo.isASCII)
}
internal static func _fromSubstring(
_ substring: __shared Substring
) -> String {
if substring._offsetRange == substring.base._offsetRange {
return substring.base
}
return String._copying(substring)
}
internal static func _copying(_ str: Substring) -> String {
if _fastPath(str._wholeGuts.isFastUTF8) {
return str._wholeGuts.withFastUTF8(range: str._offsetRange) {
String._uncheckedFromUTF8($0)
}
}
return Array(str.utf8).withUnsafeBufferPointer {
String._uncheckedFromUTF8($0)
}
}
internal var _wholeGuts: _StringGuts {
if let str = self as? String {
return str._guts
}
if let subStr = self as? Substring {
return subStr._wholeGuts
}
return String(self)._guts
}
internal func _nativeIsEqual<T:_AbstractStringStorage>(
_ nativeOther: T
) -> Int8 {
if count != nativeOther.count {
return 0
}
return (start == nativeOther.start ||
(memcmp(start, nativeOther.start, count) == 0)) ? 1 : 0
}
final internal var hash: UInt {
if isASCII {
return _cocoaHashASCIIBytes(start, length: count)
}
return _cocoaHashString(self)
}
final internal func _fastCStringContents(
_ requiresNulTermination: Int8
) -> UnsafePointer<CChar>? {
if isASCII {
return start._asCChar
}
return nil
}
public static func encode(
_ source: Unicode.Scalar
) -> EncodedScalar? {
guard source.value < (1&<<7) else { return nil }
return EncodedScalar(UInt8(truncatingIfNeeded: source.value))
}
Switch
```swift
internal var _isSymbol: Bool {
switch self {
case .mathSymbol, .currencySymbol, .modifierSymbol, .otherSymbol:
return true
default: return false
}
}
internal var _isPunctuation: Bool {
switch self {
case .connectorPunctuation, .dashPunctuation, .openPunctuation,
.closePunctuation, .initialPunctuation, .finalPunctuation,
.otherPunctuation:
return true
default: return false
}
}
public static func width(_ x: Unicode.Scalar) -> Int {
switch x.value {
case 0..<0x80: return 1
case 0x80..<0x0800: return 2
case 0x0800..<0x1_0000: return 3
default: return 4
}
}
static func ==(
_ lhs: _StringComparisonResult, _ rhs: _StringComparisonResult
) -> Bool {
switch (lhs, rhs) {
case (.equal, .equal): return true
case (.less, .less): return true
default: return false
}
}
internal static func _fromUTF8Repairing(
_ input: UnsafeBufferPointer<UInt8>
) -> (result: String, repairsMade: Bool) {
switch validateUTF8(input) {
case .success(let extraInfo):
return (String._uncheckedFromUTF8(
input, asciiPreScanResult: extraInfo.isASCII
), false)
case .error(let initialRange):
return (repairUTF8(input, firstKnownBrokenRange: initialRange), true)
}
}
func hasBreakWhenPaired(_ x: Unicode.Scalar) -> Bool {
switch x.value {
case 0x3400...0xa4cf: return true
case 0x0000...0x02ff: return true
case 0x3041...0x3096: return true
case 0x30a1...0x30fc: return true
case 0x0400...0x0482: return true
case 0x061d...0x064a: return true
case 0xac00...0xd7af: return true
case 0x2010...0x2029: return true
case 0x3000...0x3029: return true
case 0xFF01...0xFF9D: return true
default: return false
}
}
internal func _cString(encoding: UInt) -> UnsafePointer<UInt8>? {
switch (encoding, isASCII) {
case (_cocoaASCIIEncoding, true):
fallthrough
case (_cocoaUTF8Encoding, _):
return start
default:
return _cocoaCStringUsingEncodingTrampoline(self, encoding)
}
}
public var _objectIdentifier: ObjectIdentifier? {
switch _form {
case ._cocoa(let object): return ObjectIdentifier(object)
case ._native(let object): return ObjectIdentifier(object)
default: return nil
}
}