Calling object_getClass on Swift objects

I am using the following low-level debugging function to print detailed information about a memory location represented by an integer address:

func debugPrintAddressInfo(_ address: Int) {
    print(String(format: "address       : %016lx", address))
    let p = UnsafeRawPointer(bitPattern: address)! // check nil later
    let mallocSize = malloc_size(p)
    var instanceSize = 0
    print("malloc_size   : \(mallocSize)")
    
    // πŸ€”πŸ€”πŸ€”
    if let objectClass = object_getClass2(address) {
        instanceSize = class_getInstanceSize(objectClass)
        print("instance size : \(instanceSize)")

        print("instance     ", terminator: "")
        var c: AnyClass? = objectClass
        while c != nil {
            print(" : \(c!)", terminator: "")
            c = class_getSuperclass(c!)
        }
        print()
    }
    print("bytes         : ", terminator: "")
    let size = instanceSize != 0 ? instanceSize : mallocSize != 0 ? mallocSize : 16
    let bytes = p.assumingMemoryBound(to: UInt8.self)
    for i in 0 ..< size {
        print(String(format: "%02x ", bytes[i]), terminator: "")
    }
    print("\n")
}

Note that the call receives the address as an Int; static type information is not available (type(of: ...) is not applicable). Usage example:

class BaseClass {
    var values: (UInt64, UInt64, UInt64) = (0x1111111111111111, 0x2222222222222222, 0x3333333333333333)
}

class SubClass: BaseClass {}

let c = SubClass()
let intRepresentation = unsafeBitCast(c, to: Int.self)
debugPrintAddressInfo(intRepresentation)

Prints:

 address       : 0000600000c04c00
 malloc_size   : 48
 instance size : 40
 instance      : SubClass : BaseClass : _TtCs12_SwiftObject
 bytes         : 68 82 00 00 01 00 00 00 03 00 00 00 00 00 00 00 11 11 11 11 11 11 11 11 22 22 22 22 22 22 22 22 33 33 33 33 33 33 33 33 

Note that in the implementation (the line marked πŸ€”πŸ€”πŸ€”) I am using a small C wrapper call:

// wrapper.c
#import <objc/runtime.h>

Class object_getClass2(long v) {
    void* object = (void *)v;
    return object_getClass(object);
}

Interestingly, the Obj-C runtime's "object_getClass" works with Swift objects (is this accidental?)

I have two questions about this implementation:

  1. Is it OK to use object_getClass on Swift classes (neither marked @objc nor subclasses of NSObject)? This seems to work in quick tests (like the one above).

  2. How do I get rid of my C wrapper and use "object_getClass" directly from Swift?

These are the failed attempts:

    // πŸ€”πŸ€”πŸ€”
    if let objectClass = object_getClass(UnsafeRawPointer(bitPattern: address)) {
    // prints:
    instance      : __SwiftValue : NSObject

    // πŸ€”πŸ€”πŸ€”
    if let objectClass = object_getClass(address) {
    // prints:
    instance      : __NSCFNumber : NSNumber : NSValue : NSObject

    //Desired Output (like in the working example above):
    instance      : SubClass : BaseClass : _TtCs12_SwiftObject

Yep, all Swift classes are subclasses of Swift._SwiftObject (whose mangled name is _TtCs12_SwiftObject). Not sure if that's publicly documented anywhere, though.

To get object_getClass to work, you'll need to convert your integer to an Any in a way that doesn't involve boxing it into a __SwiftValue or an NSNumber.

Here's an example:

import ObjectiveC

class C {}

let someObject = C()

// An integer storing a pointer to an object
let someInt = Int(bitPattern: Unmanaged.passRetained(someObject).toOpaque())

// Converting the integer to an object reference, without accidentally boxing it:
let objectPointer = unsafeBitCast(someInt, to: AnyObject.self)
print(object_getClass(objectPointer) as Any) // => Optional(Untitled.C)
Here's a fixed example of your code:

import Foundation

func debugPrintAddressInfo(_ address: Int) {
	print(String(format: "address       : %016lx", address))
	let p = UnsafeRawPointer(bitPattern: address)! // check nil later
	let mallocSize = malloc_size(p)
	var instanceSize = 0
	print("malloc_size   : \(mallocSize)")
	
	let objectReference = unsafeBitCast(address, to: AnyObject.self)
	
	if let objectClass = object_getClass(objectReference) {
		instanceSize = class_getInstanceSize(objectClass)
		print("instance size : \(instanceSize)")
		
		print("instance     ", terminator: "")
		var c: AnyClass? = objectClass
		while c != nil {
			print(" : \(c!)", terminator: "")
			c = class_getSuperclass(c!)
		}
		print()
	}
	print("bytes         : ", terminator: "")
	let size = instanceSize != 0 ? instanceSize : mallocSize != 0 ? mallocSize : 16
	let bytes = p.assumingMemoryBound(to: UInt8.self)
	for i in 0 ..< size {
		print(String(format: "%02x ", bytes[i]), terminator: "")
	}
	print("\n")
}

class BaseClass {
	var values: (UInt64, UInt64, UInt64) = (0x1111111111111111, 0x2222222222222222, 0x3333333333333333)
}

class SubClass: BaseClass {}

let c = SubClass()
let intRepresentation = unsafeBitCast(c, to: Int.self)
debugPrintAddressInfo(intRepresentation)

Technically, object_getClass is imported into Swift as taking an Any. When it is called from Swift, that Any argument is bridged to an Objective-C object, which is then passed to the underlying function. In the general case, that bridging can work by boxing the value up into an instance of the __SwiftValue class. So the fact that you received a valid Objective-C object in your C function is not surprising, but also not necessarily informative.

In this case, yes, β€œnative” Swift classes (classes that don’t inherit from ObjC superclasses) still do use the ObjC object model on Apple platforms, so the bridging is always a no-op.
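To see the difference between the two bridging behaviors described above, here is a small sketch (on an Apple platform; `Native` is a hypothetical class name): an Int handed to an ObjC-object slot gets boxed into an NSNumber, while a native Swift class instance passes through unchanged because it already uses the ObjC object model.

```swift
import Foundation

class Native {}

// An Int bridged to AnyObject is boxed into an NSNumber
// (it prints as __NSCFNumber on Apple platforms).
let boxedInt = 42 as AnyObject
print(object_getClass(boxedInt)!)   // an NSNumber subclass

// A native Swift class instance is passed through as-is,
// so the runtime sees its real class.
let instance = Native() as AnyObject
print(object_getClass(instance)!)   // Native
```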


Good to know!

Although now I can't use this call with bogus (but readable) memory:

let a = SubClass()
let intRepresentationA = unsafeBitCast(a, to: Int.self)
debugPrintAddressInfo(intRepresentationA)

let b = malloc(13)
let intRepresentationB = unsafeBitCast(b, to: Int.self)
debugPrintAddressInfo(intRepresentationB)

// some readable address
debugPrintAddressInfo(unsafeBitCast(malloc(1)!, to: Int.self) + 1)

With object_getClass2 this prints:

address       : 0000600000c00c30
malloc_size   : 48
instance size : 40
instance      : SubClass : BaseClass : _TtCs12_SwiftObject
bytes         : 60 82 00 00 01 00 00 00 03 00 00 00 00 00 00 00 11 11 11 11 11 11 11 11 22 22 22 22 22 22 22 22 33 33 33 33 33 33 33 33 

address       : 0000600000008040
malloc_size   : 16
bytes         : 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 

address       : 0000600000008061
malloc_size   : 0
bytes         : 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 70 

while with unsafeBitCast(address, to: AnyObject.self) the casts crash with the last two examples.

How come? This is the prototype in the bridging header:

Class object_getClass2(long);

Should match "Int" on 64-bit platforms, no?

So this won't work on non-Apple platforms? Not that I need it now, but good to know upfront.

Well, non-Apple platforms don't have the Objective-C APIs anyway (such as object_getClass). There's GNUstep, but Swift isn't compatible with it IIRC.

No, but that's pretty hard to do, anyway. See Testing if an arbitrary pointer is a valid object pointer

Yep, indeed. Also found this one.

Somehow I overlooked that detail, sorry. Yes, that definitely relies on the object models interoperating and is not guaranteed on all targets. For example, even if we made interoperation with the GNU ObjC runtime work on Linux and enabled it by default, we would probably not force Swift objects to use that object model, so imported ObjC class types would probably not be compatible with AnyObject.

Thank you both.

Is it possible to implement the following to drill into Swift's objects dynamically (not necessarily in Swift – if needed in C / Obj-C):

// Pseudo code, any language. Will be called with native Swift objects:

func printInstance(address: Int, objectClass: AnyClass) {
    let instance = unsafeBitCast(address, to: UnsafePointer<UInt8>)
    
    for i in objectClass.fieldCount {
        let field = objectClass.field[i]
        print("field[\(i)].type: ", field.type)
        print("field[\(i)].offset: ", field.offset)
        print("field[\(i)].size: ", field.size)
        printBytes("field[\(i)].bytes:", instance + field.offset, size: field.size)
    }
}

Like a non-generic version of MemoryLayout, or like Objective-C's "object_getIvar" / "ivar_getOffset".
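On Apple platforms, the pseudocode above can be approximated with the public ObjC ivar APIs. This is only a sketch (the function name mirrors the pseudocode but is otherwise hypothetical): per-field sizes are not exposed, so the distance to the next ivar's offset (or to the instance size, for the last ivar) is used as an upper bound, and it may include trailing padding.

```swift
import Foundation
import ObjectiveC

class Sample {
    var a: UInt8 = 0x11
    var b: UInt32 = 0x22222222
}

// Sketch: enumerate the ivars of a (native or @objc) Swift class via
// the ObjC runtime and hex-dump each field's bytes. The reported size
// is an upper bound - the gap to the next ivar may include padding.
func printInstance(address: Int, objectClass: AnyClass) {
    guard let base = UnsafeRawPointer(bitPattern: address) else { return }
    var count: UInt32 = 0
    guard let ivars = class_copyIvarList(objectClass, &count) else { return }
    defer { free(ivars) }

    let offsets = (0..<Int(count)).map { ivar_getOffset(ivars[$0]) }
    let end = class_getInstanceSize(objectClass)

    for i in 0..<Int(count) {
        let name = String(cString: ivar_getName(ivars[i])!)
        let next = i + 1 < Int(count) ? offsets[i + 1] : end
        let bytes = base.advanced(by: offsets[i])
            .assumingMemoryBound(to: UInt8.self)
        let hex = (0..<(next - offsets[i]))
            .map { String(format: "%02x", bytes[$0]) }
            .joined(separator: " ")
        print("field[\(i)] \(name) @ \(offsets[i]): size <= \(next - offsets[i]), bytes [\(hex)]")
    }
}

let s = Sample()
printInstance(address: unsafeBitCast(s, to: Int.self),
              objectClass: object_getClass(s)!)
```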

You might be interested in checking out GitHub - Azoy/Echo: A complete reflection library for Swift

Azoy/Echo didn't work for "type erased" instances:

    let c = MyClass()
    printGenericArgs(with: [c]) // [RefTest.MyClass]
    
    let someInt = Int(bitPattern: Unmanaged.passRetained(c).toOpaque())
    let object = unsafeBitCast(someInt, to: AnyObject.self)
    printGenericArgs(with: [object]) // [Swift.AnyObject]

However I cooked something that actually does work πŸ₯³ using Obj-C APIs (class_copyIvarList, ivar_getOffset, etc.) even on native Swift class instances. For the following Swift class instance, completely type-erased first (down to its Int address):

class MyClass {
    var foo: UInt8 = 0x11
    let bar: UInt8 = 0x12
    var baz: UInt16 = 0x2222
    var qux: UInt32 = 0x33333333
    var quux: UInt64 = 0x4444444444444444
    var corge: Int16 = 0x1234
    var grault = NSObject()
    var garply: UInt32 = 0xdeadbeef
}
let c = MyClass()
let someInt = unsafeBitCast(c, to: Int.self)

I'm getting this output:

MyClass {
    pad: 16 bytes // [f0 c3 b9 02 01 00 00 00 03 00 00 00 04 00 00 00]
    var: foo: 1 bytes // [11]  (to figure out: 0x0 0x5800000000000000)
    var: bar: 1 bytes // [12]  (to figure out: 0x0 0x5800000000000000)
    var: baz: 2 bytes // [22 22]  (to figure out: 0x1 0x5800000000000000)
    var: qux: 4 bytes // [33 33 33 33]  (to figure out: 0x2 0x5800000000000000)
    var: quux: 8 bytes // [44 44 44 44 44 44 44 44]  (to figure out: 0x3 0x5800000000000000)
    var: corge: 2 bytes // [34 12]  (to figure out: 0x1 0x5800000000000000)
    pad: 6 bytes // [b8 db 01 00 00 00]
    var: grault: 8 bytes // [00 03 a5 02 00 60 00 00]  (to figure out: 0x3 0x5800000000000000)
    var: garply: 4 bytes // [ef be ad de]  (to figure out: 0x2 0x5800000000000000)
    pad: 4 bytes // [00 00 00 00]
}

The "pad" fields mark padding bytes – as expected they contain garbage that differs from run to run.

Some notes and open questions on the implementation:

  1. I was pleasantly surprised to see "ivar_getName" return field names. Coming from C, I never thought those would be put into the resulting binary (including in release builds), making the binary bigger - beware of obscene language and long variable names!
  2. Will need to figure out how to decipher the first two longs of class instances (presumably an isa pointer plus something else). What are the 0x00000003 and 0x00000004 words after the "isa" pointer? [Edit: the second word (in this example 0x00000004) is a retainCount].
  3. "ivar_getTypeEncoding" didn't return anything useful (not even for NSObject subclasses). Would it be possible to figure out the type of fields some other way? E.g. to distinguish Int8 and UInt8.
  4. Any way to distinguish "var" from "let"?
  5. I can see "ivar_getOffset" but no "ivar_getSize" API, so I had to implement the latter myself (using the struct below). Is there a better way?
  6. The fields "smth" and "smth2" (of the struct below) are a mystery.

Using this reversed engineered structure for "ivar":

// reversed engineered structure:
// Edit: found the most recent version of this struct here:
// https://opensource.apple.com/source/objc4/objc4-818.2/runtime/objc-runtime-new.h.auto.html
//
struct objc_ivar { // size of objc_ivar is 32
    uint64_t   *offset; // or int32? I'm on little endian computer, should not matter
    const char *name;
    char       *type;   // but see below
    uint32_t    alignment;
    uint32_t    size;
};

The fields "smth" and "smth2" are a mystery to me at this point; I would appreciate it if anyone can help decipher them. The log above shows them in the "to figure out" section (the "smth" field first, followed by the 8 bytes pointed to by "smth2", though I am not sure how many bytes "smth2" actually points to).

Nah, those are fun easter eggs

Make them long if it helps clarity. Shortening them you might save what, a kilobyte? Maybe?

See Type Encodings - NSHipster

Type encodings have more to do with memory layout, reference counting, and structural compatibility (e.g. for knowing how to handle the result of an objc_msgSend or NSInvocation). All objects are just @, for example. They don't carry the higher-level type information you would see in the static type system.

Ivars are completely hidden from you in Swift. There's just a notion of properties (and stored properties happen to create ivars behind the scenes). You can distinguish let foo vs var foo by whether there's both foo and setFoo:, or just foo.

The type encoding tells you the size of the value. For objects, it's like MemoryLayout<SomeClass>.size. It'll always return the size of the pointer, not the pointee (the heap-allocated object). See swift - Get the size (in bytes) of an object on the heap - Stack Overflow
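The setter-based let/var distinction described above can be sketched like this (class and function names are hypothetical; this only works for properties exposed to Objective-C):

```swift
import Foundation

class Demo: NSObject {
    @objc var mutable: Int = 1
    @objc let constant: Int = 2
}

// A `var` exposed to ObjC gets both a getter (`foo`) and a setter
// (`setFoo:`); a `let` only gets the getter. Checking whether the
// setter selector responds tells them apart.
func isVar(_ name: String, on cls: AnyClass) -> Bool {
    let setter = "set\(name.prefix(1).uppercased())\(name.dropFirst()):"
    return class_getInstanceMethod(cls, Selector(setter)) != nil
}

print(isVar("mutable", on: Demo.self))   // true
print(isVar("constant", on: Demo.self))  // false
```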

Don't introspect that struct. Its layout is not stable. That's not merely theoretical: it was heavily overhauled a few years ago. See WWDC 2020, Session 10163, Advancements in the Objective-C runtime.

Stick to the ivar_* public APIs.


Yeah, I use public APIs, when they work...

class MyClass: NSObject {
    @objc var fugazi: NSView = NSView()
    @objc let bar: Int = 0xdeadbeef
    @objc let baz: Int32 = 0x77777777
}
// ditto for swift native class

func printIvar(_ name: String, objectClass: AnyClass) {
    let ivar = class_getInstanceVariable(objectClass, name)!
    print("ivar offset: \(ivar_getOffset(ivar))")

    let chars1 = ivar_getName(ivar)!
    let name1 = String(cString: chars1)
    print("ivar name: '\(name1)' (len = \(name1.count))")
    precondition(name == name1)

    let chars2 = ivar_getTypeEncoding(ivar)!
    let name2 = String(cString: chars2)
    print("ivar typeEncoding: '\(name2)' (len = \(name2.count))")
    print()
}
func test2() {
    let c = MyClass()
    let address = unsafeBitCast(c, to: Int.self)
    let obj = unsafeBitCast(address, to: AnyObject.self)
    let objectClass: AnyClass = object_getClass(obj)!
    printIvar("fugazi", objectClass: objectClass)
    printIvar("bar", objectClass: objectClass)
    printIvar("baz", objectClass: objectClass)
    print()
}

test2()

Outputs:

ivar offset: 8                  βœ…
ivar name: 'fugazi' (len = 6)   βœ…
ivar typeEncoding: '' (len = 0) πŸ›‘

ivar offset: 16                 βœ…
ivar name: 'bar' (len = 3)      βœ…
ivar typeEncoding: '' (len = 0) πŸ›‘

ivar offset: 24                 βœ…
ivar name: 'baz' (len = 3)      βœ…
ivar typeEncoding: '' (len = 0) πŸ›‘

Switch to properties.

I can use the property_get methods on @objc-marked variables, and fall back to the "ivar_get" methods for vars that are not exposed to Obj-C (where the property_get methods don't work). For the following class instance:

class MyClass {
    @objc var foo: UInt8 = 0x11
    @objc let foo2: Int8 = 0x22
    var bar: UInt16 = 0x3333
    let bar2: Int16 = -0x3333
    var baz: UInt32 = 0x44444444
    var baz2: Int32 = -0x44444444
    var qux: UInt64 = 0x5555555555555555
    var qux2: Int64 = -0x5555555555555555
    let quux: (Int, UInt8) = (0x6666666666666666, 0x77)
}
let c = MyClass()
let someInt = unsafeBitCast(c, to: Int.self)

I'm now getting this auto-generated output:

class MyClass {
    pad                 // 16 bytes, [98 c5 42 00 01 00 00 00 03 00 00 00 02 00 00 00]
    var foo: UInt8      // 1 bytes, [11]  😁
    let foo2: Int8      // 1 bytes, [22]  😁
    var bar             // 2 bytes, [33 33] 
    var bar2            // 2 bytes, [cd cc] 
    pad                 // 2 bytes, [00 00]
    var baz             // 4 bytes, [44 44 44 44] 
    var baz2            // 4 bytes, [bc bb bb bb] 
    var qux             // 8 bytes, [55 55 55 55 55 55 55 55] 
    var qux2            // 8 bytes, [ab aa aa aa aa aa aa aa] 
    var quux            // 9 bytes, [66 66 66 66 66 66 66 66 77] 
    pad                 // 7 bytes, [00 00 00 00 00 00 00]
}

Note that for "@objc"-marked variables I was able to decipher both "Int" vs "UInt" and "var" vs "let".

Why these were exposed:

ptrdiff_t ivar_getOffset(Ivar ivar) {
    if (!ivar) return 0;
    return *ivar->offset;
}

const char * ivar_getName(Ivar ivar) {
    if (!ivar) return nil;
    return ivar->name;
}

const char * ivar_getTypeEncoding(Ivar ivar) {
    if (!ivar) return nil;
    return ivar->type;
}

but not "ivar_getSize" is somewhat beyond me. Without that call I cannot produce the above output reliably (e.g. for the "bar2" variable I'd have to show "4 bytes [33 33 cd cc]" instead of the correct "2 bytes [33 33]"), and if I peek into those fields directly I'm indeed risking crashes on future (and past) OS versions.
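Sticking to the public APIs only, the best available approximation of a field's size seems to be the distance from its offset to the next ivar's offset (or to class_getInstanceSize for the last ivar). A sketch (the function name is made up): note it over-reports whenever padding follows the field - e.g. "bar2" above would report 4, not 2 - which is exactly the gap a real ivar_getSize would close.

```swift
import Foundation
import ObjectiveC

// Upper-bound "size" of a named ivar, using only public runtime APIs.
// Returns nil if the class has no ivar with that name.
func ivarUpperBoundSize(_ name: String, in cls: AnyClass) -> Int? {
    var count: UInt32 = 0
    guard let list = class_copyIvarList(cls, &count) else { return nil }
    defer { free(list) }
    let ivars = (0..<Int(count)).map { list[$0] }
    guard let i = ivars.firstIndex(where: {
        ivar_getName($0).map { String(cString: $0) } == name
    }) else { return nil }
    // Distance to the next ivar (or to the end of the instance).
    let next = i + 1 < ivars.count
        ? ivar_getOffset(ivars[i + 1])
        : class_getInstanceSize(cls)
    return next - ivar_getOffset(ivars[i])
}
```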

Interesting fact: if the first 8/16 bytes of an arbitrary structure happen to be the bit representation of some class's isa pointer plus low numbers (for the retain count and whatever the other word is), this memory will be treated by my code as a valid class instance 🀣 Example:

func test_bogus() {
    let c = MyClass()
    let object = unsafeBitCast(c, to: AnyObject.self)
    let cls = object_getClass(object)!
    let size = class_getInstanceSize(cls)
    let block = malloc(size)!
    let p = block.assumingMemoryBound(to: UInt8.self)
    for i in 0 ..< size { p[i] = 0xAD }
    memmove(block, unsafeBitCast(c, to: UnsafeRawPointer.self), 16) // copy the first 16 bytes
    let address = unsafeBitCast(block, to: Int.self)
    printObject(at: address)
}

Outputs:

class MyClass {
    pad                 // 16 bytes, [98 85 1e 00 01 00 00 00 03 00 00 00 04 00 00 00]
    var foo: UInt8      // 1 bytes, [ad] 
    let foo2: Int8      // 1 bytes, [ad] 
    var bar             // 2 bytes, [ad ad] 
    var bar2            // 2 bytes, [ad ad] 
    pad                 // 2 bytes, [ad ad]
    var baz             // 4 bytes, [ad ad ad ad] 
    var baz2            // 4 bytes, [ad ad ad ad] 
    var qux             // 8 bytes, [ad ad ad ad ad ad ad ad] 
    var qux2            // 8 bytes, [ad ad ad ad ad ad ad ad] 
    var quux            // 9 bytes, [ad ad ad ad ad ad ad ad ad] 
    pad                 // 7 bytes, [ad ad ad ad ad ad ad]
}

As an extra precaution I could validate malloc_size and make sure it is not less than (and not much greater than) the matching class instance size.
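That precaution can be sketched as follows (the function name and the "not much greater" slack factor are both made up for illustration). A real heap object normally starts a malloc block at least as large as its instance size, while malloc(13) or a misaligned interior pointer (where malloc_size returns 0, as in the examples above) would be rejected.

```swift
import Foundation
import ObjectiveC

// Sanity check: only treat `address` as an instance of `cls` if it
// points at the start of a malloc block whose size is at least the
// class instance size and not wildly larger.
func plausiblyInstance(at address: Int, of cls: AnyClass) -> Bool {
    guard let p = UnsafeRawPointer(bitPattern: address) else { return false }
    let block = malloc_size(p)
    let needed = class_getInstanceSize(cls)
    return block >= needed && block <= needed * 2 // slack factor is a guess
}
```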