Embedded Swift

Very important point about internationalization.

Although I wouldn't be quite that dismissive of ASCII.

I don't always use it. I frequently use LEDs for debugging because serial isn't always an option, and Advent of Code on a micro... now that would be a challenge! Interesting. Maybe 2025 :)

When serial is an option and I'm designing a protocol for it, I use ASCII as a way to confirm that I'm sending what I think I'm sending, even if the actual characters don't matter semantically to humans. (Am I finding the right boundaries? Are my bits in the right order? Are my bytes in the right order? Does it indeed create the "sentence" I thought it would?) And there's message integrity through deliberately sending too much: sending the value of 2 as "02" in ASCII is way more resilient than one lone little flicker on the Tx pin.
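A minimal sketch of that "02" idea, just for illustration (the helper name is mine, not anything standard):

// Encode a small value as two ASCII decimal digits, so a single flipped
// bit on the wire can't silently turn one valid value into another.
func asciiDecimalPair(_ value: UInt8) -> [UInt8] {
    precondition(value < 100, "two digits only")
    let zero = UInt8(ascii: "0")
    return [zero + value / 10, zero + value % 10]
}

asciiDecimalPair(2)  // [0x30, 0x32], i.e. "02" on the wire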

A limited but robust byte-based number space that has two possible representations (visual and numeric) that one can make grammars out of is super useful for serialized information. Which is why I have a bunch of extensions on UInt8... and so does Apple.
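A hedged sketch of the kind of UInt8 extensions I mean (these particular members are illustrative, not Apple's API, though the stdlib's UInt8(ascii:) initializer is in the same spirit):

extension UInt8 {
    // Illustrative ASCII helpers for building byte-level grammars.
    var isASCIIDigit: Bool { (0x30...0x39).contains(self) }      // '0'...'9'
    var isASCIIUppercase: Bool { (0x41...0x5A).contains(self) }  // 'A'...'Z'
    var asciiDigitValue: UInt8? { isASCIIDigit ? self - 0x30 : nil }
}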

That's not the same need as a fully inclusive, human-facing language string. Swift Strings are geared toward people. Rightfully so, I would say, as it is the harder problem. There's a role for something that isn't. Do we overload the role of String, or...

On the other hand, smaller and smaller devices are expected to be able to play nicely with people.

MicroPython makes subsets of the stdlib available in the core and then offers "The Rest" in a separate library. That's a nice option. Graduated buy-in. (I know the video is an hour, but it's very good.)

So what I want is the hard thing.

#5 A well-documented path where embedded projects can get ASCII support early, as needed (called String or something else), but can buy in to graduated Swift String support as their devices grow to handle more HID devices and languages.

2 Likes

Agree, and you could see #1 as one of the steps in your #5 path.

1 Like

That's already more Unicode than what I want for 90% of what I personally do. (on embedded)

I do want people to be able to opt in. (I'd prefer that to opt-out because it's a teachable moment, but I'll go along with whatever.)

Strings are VERY COSTLY. Even ASCII Strings. People moving into embedded from other areas don't have that internalized sense yet ("text files are cheap." LOL.), and some of the cushy cushy chips available now mean they'll maybe never have to, which will be lovely for them. I want them to have a good time focusing on the things they want to make.

And that does mean little bitty screens with Unicode on them a lot of the time now, or custom controllers in multiple languages...

Progressive capability increases will be wanted.

ETA (Although you probably didn't mean it as the first step, so yes. I agree! :sweat_smile:)

1 Like

I'd like to highlight this new repo, which does cover several of the MCUs that were mentioned in this thread; I think folks here might find it useful :)

10 Likes

It's a bit unclear to me from the previous replies here: will Embedded Swift support inline and global assembly? I'm trying Swift for wasm32, and with it I would be able to implement the runtime required for allocation entirely without C (I need to use memory.grow) :slight_smile:

Having to create a module just for that one instruction (or to use the corresponding LLVM builtin) is very hard to organize in an elegant way. I ended up just writing all of the allocation code in the C module to avoid having that tiny bit of code randomly split off from everything else.


Another more noticeable missing feature compared to C or Zig is the inability to precisely define memory as arrays and/or packed structs (with control over the alignment of individual fields).


And lastly, could I replace the implementation of just the Swift runtime, like swift_retain? It would be useful in general, but I'm most curious about Wasm GC — it would be nice not to waste time (and binary size) on implementing something the browser does already :thinking:

Edit 2:
Very weird bug: using @_cdecl("free") stops my code from compiling, with no output at all. Thankfully it seems that make can detect that somehow; I would have been very confused otherwise. Not sure if this is the embedded mode or just Swift.

1 Like

Wasm32 is just the architecture, so it depends on what environment you run the binary in. AFAIK Wasm32 in Embedded Swift doesn't include a standard library, so there's no memory-allocation functionality by default. However, if you take a look at this example project, which uses Wasm on the web, you'll see that the authors took the memory-allocation parts of WASI libc. In essence, the project has to implement memory allocation itself (but you can copy it for any of your personal projects). I suspect that the reason compilation fails for you is a linker error, which you can usually verify by passing the -Xlinker -v flags, IIRC. I have also faced weird behavior from the linker before where linker errors aren't shown; they just stop compilation.

1 Like

Excuse the nitpick, but it does include the parts of the standard library that are not excluded in embedded mode, just like other embedded platforms. And in the same way, you have to provide your own allocator if you're not building with the -no-allocations flag. wasm32-unknown-none-wasm is not somehow a special triple; for all intents and purposes the build process is the same as for other embedded triples.

2 Likes

I did implement allocation with a basic bump allocator; the issue is moving it out of C into Swift, which would make it easier for me to improve.
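For reference, a minimal sketch of what such a bump allocator can look like in Swift (the names and the caller-supplied region here are illustrative, not my actual code). A bump allocator only ever advances a cursor and never reclaims individual blocks, which is why an empty free suffices:

// Hands out monotonically increasing addresses from a fixed region.
struct BumpAllocator {
    private var cursor: UnsafeMutableRawPointer
    private let end: UnsafeMutableRawPointer

    init(region: UnsafeMutableRawBufferPointer) {
        cursor = region.baseAddress!
        end = region.baseAddress! + region.count
    }

    mutating func allocate(byteCount: Int, alignment: Int) -> UnsafeMutableRawPointer? {
        // Round the cursor up to the requested alignment. Done in UInt so
        // high addresses don't trip signed-overflow traps (the very issue
        // discussed further down in this thread).
        let addr = UInt(bitPattern: cursor)
        let aligned = (addr + UInt(alignment) - 1) & ~(UInt(alignment) - 1)
        guard let start = UnsafeMutableRawPointer(bitPattern: aligned),
              start + byteCount <= end else { return nil }
        cursor = start + byteCount
        return start
    }

    // Individual blocks are never reclaimed; freeing is a no-op.
    mutating func free(_ pointer: UnsafeMutableRawPointer?) {}
}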

I commented out free in C

void free(void *_Nullable ptr) {}

and replaced it with

@_cdecl("free")
func free(ptr: UnsafeMutableRawPointer?) {}

after this, swiftc started to fail silently unless I name the Swift function differently, such as _free, and call it from the C free — which is not ideal.

I don't think it's the linker; I am only emitting the object file and later linking everything together with wasm-ld. It fails to compile the object file before I even get to the linker, and only specifically with @_cdecl for free, posix_memalign, etc.
I tried the flag just in case, but it shows no output.

The failing command is

swiftc -target wasm32-none-wasm -enable-experimental-feature Embedded -wmo -Xcc -fdeclspec -Xcc -fmodule-map-file=runtime/module.modulemap -parse-as-library -Osize $(SWIFT_SOURCES) -c -o build/wasm.o

I suppose even if the compiler didn't silently exit it wouldn't work: I tried defining free and posix_memalign with different names and forwarding calls to them from C. It works, but only with -Ounchecked, and unfortunately @_optimize can't do that for just those functions.

I assume it has something to do with size_t being imported as Int even though it is unsigned; pointer math uses Int too, so something overflows when calculating the new address.

I would have to write unreadable code like this to do pointer math correctly (this works now):

memory_end = UnsafeMutableRawPointer(bitPattern: UInt(bitPattern: memory_end) + UInt(memory_size()) * pageSize)

because .advanced(by:) overflows, I guess.

Is this just an unfixable language/stdlib design flaw, along with signed array indices?


Would it be possible to "fix" this with generics, allowing both signed and unsigned indices? Or at least a compiler flag to override this, because any code built on top of this incorrect size_t = Int assumption is inherently unsafe.

It feels wrong that the language itself breaks type safety so fundamentally. My code was correct, but because it was built on an incorrect assumption I didn't even make, it didn't work :confused:

free and posix_memalign, being C and POSIX standard library functions, are subject to a lot of special treatment by the optimizer, and it could well be that defining them in Swift directly exposes a bug in LLVM that would normally be obscured by free usually being an opaque, externally-defined function. That might explain why the compiler is crashing or exiting silently without succeeding.

As for size_t being imported as Int, that's not so much a flaw as an intentional tradeoff. C's ability to work with memory objects containing more than PTRDIFF_MAX elements is at best a half-truth: it is extremely difficult to write C code that correctly handles large spans like that, because basic things like pointer arithmetic are signed operations that hit undefined behavior on overflow. Swift imports these as Int because it is usually more important to be able to mix with signed quantities, and you can still bit-cast to UInt explicitly when you do need to work with large spans (which you would have to do in C as well to do it correctly; the compiler just doesn't help you).

5 Likes

Is being neither better nor worse than C a good thing?
It's not just size_t. It's weird everywhere, like having a method on pointers specifically intended for advancing the pointer, only for that method to be less correct than casting to UInt and back. It might as well not exist; it's misleading.

In C, ptr[-1] is valid syntax because -1 is simply a signed integer expression. In Swift, pointer offsetting is done with a method rather than bespoke syntax, so Swift has to choose a type for the offset. Rather than separate “advanced by” and “decremented by” methods, Swift chose a single “advanced by” method that takes a signed integer.

2 Likes

It might be reasonable to add advanced(byAbsolute: UInt)/decremented(byAbsolute: UInt) methods to the pointer types that do the right overflow-aware thing for very large offsets.
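A hypothetical sketch of what that could look like (this API does not exist in the standard library today):

extension UnsafeMutableRawPointer {
    // Offset by an unsigned distance without round-tripping through Int.
    func advanced(byAbsolute n: UInt) -> UnsafeMutableRawPointer {
        // Force-unwrap is fine here except for a result of address 0.
        UnsafeMutableRawPointer(bitPattern: UInt(bitPattern: self) + n)!
    }
    func decremented(byAbsolute n: UInt) -> UnsafeMutableRawPointer {
        // UInt subtraction traps on underflow below address 0.
        UnsafeMutableRawPointer(bitPattern: UInt(bitPattern: self) - n)!
    }
}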

1 Like

I think that something else is going on here; pointer.advanced(by:) does not do overflow checking. We would need to see more of what you're really trying to do to say what the problem might be.

Even if we don't do overflow checks, it seems possible that the optimizer sees an advance by a too-large-to-be-signed value and deletes the whole thing for being UB or something like that.

1 Like

FWIW this C approach works in Swift:

var array: [UInt8] = [0, 1, 2, 3, 4]
array.withUnsafeBytes { bp in
    let ptr = bp.baseAddress!
        .assumingMemoryBound(to: UInt8.self) // we are at 0
        .advanced(by: 2) // we are at 2
    print(ptr[-1]) // 1
}

It seems like it. I don't know exactly why it crashes, as wasm just fails with "unreachable code should not be executed".

Working C code setting up the memory pointer:

extern unsigned char memory;
void *memory_end = &memory;

size_t memory_size() {
    return __builtin_wasm_memory_size(0);
}

void memory_grow(size_t page_count) {
    __builtin_wasm_memory_grow(0, page_count);
}

void initialize() {
    memory_end += memory_size() * WASM_PAGE;
}

the equivalent (failing) function when written in Swift:

func initialize() {
    memory_end += Int(bitPattern: UInt(memory_size()) * pageSize)
}

alternatively:

func initialize() {
    memory_end = memory_end.advanced(by: Int(bitPattern: UInt(memory_size()) * pageSize))
}

and this works:

func initialize() {
    memory_end = .init(bitPattern: UInt(bitPattern: memory_end) + UInt(memory_size()) * pageSize)
}

wasm from the advanced(by:) version:

(func $$s4main10initializeyyF (type 1) (param i32 i32)
    (local i32 i32)
    block  ;; label = @1
      block  ;; label = @2
        global.get $GOT.data.internal.memory_end
        i32.load
        local.tee 2
        i32.eqz
        br_if 0 (;@2;)
        call $memory_size
        local.tee 3
        i32.const 65535
        i32.gt_u
        br_if 1 (;@1;)
        global.get $GOT.data.internal.memory_end
        local.get 2
        local.get 3
        i32.const 16
        i32.shl
        i32.add
        i32.store
        return
      end
      unreachable
      unreachable
    end
    unreachable
    unreachable)

wasm from the working version:

(func $$s4main10initializeyyF (type 1) (param i32 i32)
    (local i32)
    block  ;; label = @1
      call $memory_size
      local.tee 2
      i32.const 65535
      i32.gt_u
      br_if 0 (;@1;)
      global.get $GOT.data.internal.memory_end
      local.get 2
      i32.const 16
      i32.shl
      i32.store
      return
    end
    unreachable
    unreachable)

The difference is that in Swift, the argument to Array’s subscript is explicitly typed as Int, while in C, “The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).” [N1256 6.5.2.1 ¶2]. E2 can be either signed or unsigned; an unsigned expression won’t be converted to a signed expression.

1 Like

Interesting. Does this mean that in C this won't work (or possibly would in practice but "trigger" UB):

// assuming 64-bit computer:
char *p = somePointer + 1;
char b = p[-1];
char a = p[0xFFFFFFFFFFFFFFFF]; 
unsigned long u = 0xFFFFFFFFFFFFFFFF;
char c = p[u];
// a, b, c are referring to the same memory

Is there a reason why Swift can't overload Array's subscript to work with either?
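For what it's worth, user code can already add such an overload; a hedged sketch (illustrative, not part of the standard library):

extension Array {
    // An unsigned-index subscript that forwards to the standard Int one.
    subscript(index: UInt) -> Element {
        get { self[Int(index)] }            // traps if index > Int.max
        set { self[Int(index)] = newValue }
    }
}

let bytes: [UInt8] = [10, 20, 30]
let i: UInt = 1
print(bytes[i])  // 20

Whether the standard library should ship something like this is a separate question, since every added overload affects literal type inference and API consistency.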