How to know if something has reference semantics?

young · September 10, 2022, 5:45pm

I was re-watching Understanding Swift Performance. It talked about eliminating any properties that have reference semantics. My problem is that you can't tell whether something has reference semantics just by looking at it; you have to just know. Is there any way the compiler or Xcode can help me find anything that has reference semantics behind the scenes?

import Foundation

extension String {
    // just a dummy stub to make things compile
    var isMimeType: Bool {
        true
    }
}


print("Part 1\n=========================================")

// So this model object struct contains properties with reference semantics:
// how do we find out which properties have reference semantics so that we can replace them with value-semantic types?
// Can the compiler tell? Or Xcode?
struct Attachment {
    let fileURL: URL
    let uuid: String      // this has a reference
    let mimeType: String  // this also has a reference

    init?(fileURL: URL, uuid: String, mimeType: String) {
        guard mimeType.isMimeType else {
            return nil
        }

        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

let couldBeAnAttachment = Attachment(fileURL: URL(string: "https://apple.com")!, uuid: UUID().uuidString, mimeType: "such & such")
dump(couldBeAnAttachment!)


print("\nPart 2\n=========================================")
// ====================== Now improve Attachment by eliminating any reference semantic type ==================
// but the question is: how do we know which properties are like that and need to change to a value type?

//enum MimeType {
//    case jpeg, png, gif
//    init?(rawValue: String) {
//        switch rawValue {
//        case "image/jpeg":
//            self = .jpeg
//        case "image/png":
//            self = .png
//        case "image/gif":
//            self = .gif
//        default:
//            return nil
//        }
//    }
//}

enum MimeType: String {
    case jpeg = "image/jpeg"
    case png = "image/png"
    case gif = "image/gif"
}

// Now every property has value semantics
struct AttachmentImproved {
    let fileURL: URL
    let uuid: UUID
    let mimeType: MimeType

    init?(fileURL: URL, uuid: UUID, mimeTypeString: String) {
        guard let mimeType = MimeType(rawValue: mimeTypeString) else {
            return nil
        }

        self.fileURL = fileURL
        self.uuid = uuid
        self.mimeType = mimeType
    }
}

let couldBeAnAttachmentImproved = AttachmentImproved(fileURL: URL(string: "https://apple.com")!, uuid: UUID(), mimeTypeString: "image/jpeg")
dump(couldBeAnAttachmentImproved!)

Nickolas_Pohilets · September 10, 2022, 8:10pm

Are you only concerned about performance, or are you looking for "value semantics" as a proxy for something more specific?

For example, if you want to be able to transfer attachments between different owners in the same address space, you need Sendable. And if you want to be able to transfer attachments between different owners in different address spaces, you need Codable.

young · September 10, 2022, 8:14pm

I haven't even considered Sendable and Codable. I'm just re-watching that WWDC talk, and this is the example it shows: replace every property inside a struct that has reference semantics with a value type so there is no ARC overhead.

So how do you know which one needs the change? How can I tell? Or is this just from knowledge and experience: you just need to find out? Or are there compiler/Xcode tools that help identify these types with reference semantics?

Is there any lint-like tool that can identify types with reference semantics?

I’m only concerned with finding types with reference semantics.

Karl · September 10, 2022, 9:06pm

String has value semantics. AFAIK there isn't really a way to identify value semantics other than to read the documentation.

As for performance, types with value semantics do not eliminate any ARC overheads. The thing that differentiates reference semantics from value semantics is that, while reference-types have shared mutable state, the state of a value-type is immutable while it is being shared.

That improves your ability to locally reason about your code, without needing to perform "defensive copies" (which are often necessary in object-oriented code). You get the benefits of shared storage without the complexity.

That being said, in your example you have replaced two Strings - one with a UUID, and another with a plain enum. Those new types happen to have an additional property; they are "plain old data"/POD types (also known as "trivial" types), with a fixed size and no shared storage at all. Since they do not have shared storage, they will not involve ARC, and you will likely see improved performance because of that (although you should always test to make sure), but it has nothing to do with reference vs. value semantics.
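The "plain old data" distinction above can be probed, roughly, with the standard library's underscored `_isPOD(_:)` function. This is a minimal sketch, assuming you are willing to use an unofficial API that may change; `MimeType`, `Plain`, `WithString`, and `Ref` are illustrative names, not anything from the talk.

```swift
// Illustrative types (hypothetical names, for demonstration only).
enum MimeType: String {
    case jpeg = "image/jpeg"
    case png = "image/png"
}

struct Plain {          // only trivial stored properties
    var id: Int
    var kind: MimeType
}

struct WithString {     // String may point at heap storage managed by ARC
    var id: Int
    var name: String
}

final class Ref {}      // a class reference is always reference counted

// `_isPOD` is underscored: unofficial and subject to change.
print(_isPOD(Int.self))         // true
print(_isPOD(MimeType.self))    // true
print(_isPOD(Plain.self))       // true
print(_isPOD(String.self))      // false
print(_isPOD([Int].self))       // false
print(_isPOD(WithString.self))  // false
print(_isPOD(Ref.self))         // false
```

A `false` result only says that copying the type may involve reference counting; it says nothing about whether the type has value or reference semantics.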

young · September 10, 2022, 10:20pm

I thought those are allocated on the stack, so there is no need for a ref count? Please explain what I misunderstand.

michelf · September 10, 2022, 11:11pm

For instance: String. String is a struct allocated on the stack. But inside the struct is a reference to heap-allocated storage, which is managed by ARC. That's how a String can contain data of arbitrary length, despite occupying the space of a fixed-size struct on the stack. (There's an exception for very small strings: those store the data inline inside the struct, with no reference to the heap, because there's enough space.)
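A small sketch of the fixed-size point, assuming nothing beyond the standard library: the inline footprint of a String value does not depend on how much text it holds.

```swift
let short = "hi"
let long = String(repeating: "x", count: 10_000)

// Both values have the same inline footprint; the character data of the
// long string lives elsewhere (on the heap, as described above).
let sameFootprint = MemoryLayout.size(ofValue: short) == MemoryLayout.size(ofValue: long)
print(sameFootprint)  // true

// The exact number is platform-dependent, so it is printed here without
// assuming a particular value.
print(MemoryLayout<String>.size)
```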

String is maintaining the illusion of being a value with no reference by never mutating the data on the heap if there is more than one string pointing to it. If you mutate a string and there's more than one reference to the heap data, it'll make a copy of the data before performing the mutation on the copy. (This technique is called copy-on-write.)

Note that arrays and dictionaries work in a similar manner to strings.
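The copy-on-write technique described above can be sketched for a custom type using `isKnownUniquelyReferenced`. This is an illustrative sketch, not how String is actually implemented; `IntStorage` and `IntBuffer` are hypothetical names.

```swift
// Heap-allocated storage, managed by ARC.
final class IntStorage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

// A struct that keeps value semantics on top of shared storage.
struct IntBuffer {
    private var storage: IntStorage

    init(_ values: [Int]) {
        storage = IntStorage(values: values)
    }

    var values: [Int] { storage.values }

    mutating func append(_ x: Int) {
        // If another IntBuffer shares this storage, copy it before mutating.
        if !isKnownUniquelyReferenced(&storage) {
            storage = IntStorage(values: storage.values)
        }
        storage.values.append(x)
    }
}

let a = IntBuffer([1, 2, 3])
var b = a          // copies the struct; both now point at the same storage
b.append(4)        // storage is shared, so b copies it first

print(a.values)    // [1, 2, 3]
print(b.values)    // [1, 2, 3, 4]
```

Copying `a` into `b` still retains the shared storage, which is why types like this have value semantics yet still generate ARC traffic.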

tera · September 11, 2022, 12:01am

I don't believe this is possible in general. Even a struct like this:

struct T {
    private var val: Int
}

extension T {
    init() { ... }
    var value: Int {
        get {...}
        set {...}
    }
}

might allocate memory if it wants to, and either store a reference to it or use it temporarily
might reference some global storage
might have either value or reference semantics
ARC traffic might happen in the implementation of "value" or other methods.
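To make the point concrete, here is a hedged sketch of a struct whose declaration looks like a plain value but which behaves with reference semantics. `SharedBox` and `LooksLikeAValue` are hypothetical names chosen for illustration.

```swift
final class SharedBox {
    var value: Int
    init(_ value: Int) { self.value = value }
}

struct LooksLikeAValue {
    // The only stored property is a reference, hidden from callers.
    private let box: SharedBox

    init() { box = SharedBox(0) }

    var value: Int {
        get { box.value }
        // nonmutating: writes go through the shared reference, not the struct.
        nonmutating set { box.value = newValue }
    }
}

let first = LooksLikeAValue()
let second = first      // copies the struct, but both copies share one SharedBox
second.value = 42

print(first.value)      // 42: the "copy" was not independent
```

Nothing in the public interface of `LooksLikeAValue` reveals the sharing, which is why inspecting a declaration alone is not enough.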

young · September 11, 2022, 12:17am

What should I do if I don’t want any ARC in my semi-real-time, life-or-death app? I need deterministic real-time response.

tera · September 11, 2022, 3:01am

Worth reading this thread.

Unless we get a "@realtime" attribute, you'd need to avoid classes, many value types (like Strings, URLs, Arrays, Dictionaries), certain constructs (escaping closures, variadic functions), mutexes, semaphores, dispatch async calls, memory allocation, and to be absolutely sure you'd need to check the resulting asm. I'd say all of this is nearly impossible / impractical to do 100% right manually, although in practice you'd be able to get 99% there.

But then, even if all that is done 100% correctly, remember that a mere memory access on a system with virtual memory backed by secondary storage can take unbounded time. If I am not mistaken, a hard real-time system can't afford virtual memory / paging.

On top of that, for deterministic response you need to know the worst-case complexity of your algorithms; this would be quite a challenge to determine in the general case.

Though I am curious: what is the app you are building?

young · September 11, 2022, 3:52am

I have nowhere to run … oh what to do? 16 billion transistors and a Neural Engine, so what if my user crashes into an obstacle? I don’t want to just detect the crash. I want to avoid it.

Collision avoidance is what I’m doing! And not just for a single user but for the whole convoy.

The latest Watch and Phone only do collision detection. It’s not good enough.

Can we do better with the best hardware in the world?

If the phone can play music and video without a hiccup, then why can I not help safeguard my users' lives?

patrickgoley · September 11, 2022, 5:38am

The audio doesn’t hiccup on your phone because Core Audio is a C library and devs have been very careful when working in the real-time threads. It has been possible to do this for decades, long before Apple was a trillion-dollar company. It is also possible to write such code with Swift, more so now with the @realtime attribute to provide some static checking for common issues. However, if your app is truly life and death, then I’d recommend doing intensive testing and profiling rather than relying entirely on static analysis either way.

cukr · September 11, 2022, 7:15am

It cannot. Apple just made it reliable enough that you either don't remember the hiccups, or they don't happen to you specifically. My and my sister's iPhones randomly stop playing audio until you reconnect the headphones, and the Mac I use for work has an audio hiccup every time a breakpoint is triggered/added/removed.

young · September 11, 2022, 9:56am

That’s why I’ve reverted back to wired headphones. I just ordered the new AirPod 2. Maybe it’ll work better than AKG and SENNHEISER.

tera · September 11, 2022, 11:41am

Even if audio does hiccup every now and then, that's just an audio glitch, not a life-or-death situation. It's not wise to use a system of "best hardware in the world" + non-realtime OS on top + Swift to control life-critical things like a nuclear reactor or a space station or a plane, or indeed the thing in the above video. With some heavy customisations it might be possible (e.g. disabled VM, a seriously modified and stripped-down version of the OS, using a very limited subset of Swift which would be as restrictive as C).

hisekaldma · September 11, 2022, 12:47pm

ARC may not be fast enough for realtime, but it is deterministic. That’s one of its advantages over ”stop the world” garbage collection.

tera · September 11, 2022, 2:32pm

Without dipping into that tangent too much, I'd note that deterministic behaviour, while good for some cases, is not good enough for others. If you need a guarantee that the following code executes in a certain allowed amount of time:

item = nil

you can't do that without knowing what item holds, as

deinit itself can take unbounded time.

class A {
    deinit {
        sleep(arbitraryTime)
    }
}

or it might take time proportional to a number of "subitems",

class C {
    var next: D?
    init(next: D?) {
        self.next = next
    }
}

class D {
    var next: C?
    init(next: C?) {
        self.next = next
    }
}

The previous example will even crash if the list of subitems is deep enough (although to be fair this crash is not inherent to ARC itself but to a particular ARC implementation we have now).
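One commonly used workaround for the deep-recursion crash, sketched here under the assumption of a singly linked chain of one class (`Node` is a hypothetical name), is to unlink successors iteratively inside `deinit`. Release time is still proportional to the length of the chain, so this addresses only the stack overflow, not the timing concern.

```swift
final class Node {
    var next: Node?

    init(next: Node?) { self.next = next }

    deinit {
        // Detach successors one at a time while this node is their only owner,
        // so that each successor is destroyed here instead of recursively.
        while isKnownUniquelyReferenced(&next) {
            let successor = next
            next = successor?.next
            // `successor` is released at the end of this iteration; by then its
            // own `next` is also held by us, so its deinit does not recurse.
        }
    }
}

// Build a long chain, then drop the only reference to its head.
var head: Node? = nil
for _ in 0..<500_000 {
    head = Node(next: head)
}
head = nil
print("released without deep recursion")
```

The `isKnownUniquelyReferenced` check stops the loop as soon as some other owner also holds a node, so chains shared with other parts of the program are left intact.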

young · September 11, 2022, 9:15pm

Yes, agree completely. Ours is an AR (augmented reality) system that acts as a sensory enhancement for people operating xxx (sorry, cannot say what yet). We do not actually control the machinery; we only aid its operation. So it’s not super critical to be hard real-time, but we want instantaneous response time with no or minimal lag. We do need excellent response time so the user has enough time to react.

I wonder what OS Apple uses for their car. I don’t believe they would use QNX or WindRiver. I’m sure they would make their own RTOS.

The video above was from the Star Wars project Brilliant Pebbles, done in 1989 - 1992 (I deleted it as it’s not relevant here). We wrote our own RTOS, which provided the base to run code written in Ada.

Subsequent to Star Wars, we worked on the Space Station flight control and simulation environment. For part of the system, we used WindRiver because of its Unix-like API.