Dreaming of a “harmless” language mode

sspringer · March 19, 2023, 6:50am

I was thinking about a language mode where any operations that could make the application crash are forbidden (or result in warnings during compilation).

Background: In Java or C# you can wrap code in a try/catch that catches “any” error, which is a good thing when, for example, you do not want one part of the program (such as the processing of a work item or request) to bring down the whole program. The idea is to be able to enforce such a situation in Swift.

Notes: 1) Of course, that does not save you from e.g. infinite recursion or compiler errors.
2) There then is the question what about packages that you use.
3) …and maybe this idea goes too far for Swift? (That’s why I put the word “dreaming” into the title…)

I identified the following “dangerous” operations which could cause a crash:

forced unwrapping of optionals
integer division (by 0)
number over-/underflow when using the standard operators (the overflow operators &+ etc. do not crash and you have addingReportingOverflow etc.)
accessing array members by index (see dependent types as a related forums topic)
unsafe operations (like using e.g. UnsafeMutableRawPointer; the definition of “unsafe” is: unsafe operations have an undefined behavior for some inputs)
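
As a rough illustration of the list above, here is a minimal sketch of a non-trapping counterpart for the first four items. It uses only standard-library calls; the variable names are made up for the example.

```swift
let maybeName: String? = nil
let numbers = [1, 2, 3]
let dividend = 10
let divisor = 0

// 1. Instead of `maybeName!`: supply a fallback.
let name = maybeName ?? "anonymous"

// 2. Instead of `dividend / divisor`: the reporting variant sets a flag
//    rather than trapping when the divisor is zero.
let (quotient, divisionFailed) = dividend.dividedReportingOverflow(by: divisor)

// 3. Instead of `Int.max + 1`: the reporting variant sets a flag on overflow.
let (sum, additionOverflowed) = Int.max.addingReportingOverflow(1)

// 4. Instead of `numbers[5]`: check the index first and fall back to nil.
let sixth: Int? = numbers.indices.contains(5) ? numbers[5] : nil
```

In each case the failure becomes a value the caller has to look at, which is what a "harmless" mode would have to force.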

So this mode would forbid an “integral” part of the language such as forced unwrapping of optionals, but in almost all cases where I am tempted to use “!” on a variable, my code is simply not good.

This list might be incomplete, or the whole idea might not even be sensible for a language like Swift. But exactly this is the question, so what do you think?

tera · March 19, 2023, 10:15am

Maybe it would be possible to create a special SwiftLint rule for that purpose?

sspringer · March 19, 2023, 12:07pm

Yes, this could be a good idea and very feasible, as Swift is already "mostly harmless".

To recognize unsafe operations, it may suffice to look for "unsafe" and "unmanaged" in the names.

mfilonen2 · March 20, 2023, 8:26am

Swift / ObjC boundaries are particularly dangerous. Without NS_ASSUME_NONNULL, each returned value is an implicitly unwrapped optional, and with that macro it is even worse: a non-optional type that could easily be nil and crash. Swift would be much safer if there were a version of this macro that marks input parameters as nonnull and output values as nullable.

lukasa · March 20, 2023, 1:15pm

What's the bounds of your harmless language mode? That is, what code is included, and what is excluded, in the analysis?

For example, accessing array members by index is dangerous because there is (semantically) a precondition in there that can fire. Either the language mode special-cases that behaviour, or can "discover" that precondition. Neither is ideal. If the language mode can "discover" the precondition then it can also "discover" the use of pointers inside Array (and indeed all data structures), thereby shaving all of the language down to nothing. If you special-case the indexing operation, that works better, but it omits all other data structures from protection without special-case code.

This is worth thinking through, because it raises some important questions about how you make this kind of thing work, and what trade-offs you might have to make.

sspringer · March 20, 2023, 3:17pm

I think it would be a valuable feature to be able to control our own code in this regard, guaranteeing (by linter rules?) that you do not introduce "dangerous" code yourself. For libraries that you use (could I subsume Swift / ObjC boundaries here?), you would of course then have to assume a certain quality. For a bigger solution indeed many questions could be raised. So the situation for the simpler solution would be in the direction of "this code is checked to be harmless, and only proven libraries are used". (There have been situations when I would have liked to be able to say so, think e.g. of less experienced programmers who have to program using a DSL you provide.)

For the use of index operations in this context: I consider accessing an array member by index as an integral part of the language and not as a library call (no matter how it is actually implemented). The index operation for a dictionary returns an optional value, as it should be the case in a library that you would like to use in a "harmless" way (i.e. assuming "a certain quality"). When you program your own index operation, this index operation should not be able to operate like arrays (the "dangerous" way) if you implement it with "harmless" code.
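
A "harmless" index operation of the kind described here could look roughly like the sketch below. The `safe:` argument label is a hypothetical name, not a standard-library API.

```swift
extension Collection {
    /// Hypothetical `safe:` subscript: returns nil for an index that is not
    /// in `indices` instead of trapping.
    subscript(safe index: Index) -> Element? {
        indices.contains(index) ? self[index] : nil
    }
}

let items = ["a", "b", "c"]
let second = items[safe: 1]    // in range
let missing = items[safe: 7]   // out of range, so nil
```

For Array the check is cheap because `indices` is a range; for other collections `indices.contains` may need a linear scan, so this is a sketch of the idea rather than a drop-in replacement.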

For the more technical part: I do not know how this would be implemented as a linter rule, but my gut feeling is that it should be feasible.

lukasa · March 20, 2023, 3:31pm

Great, I understand this mode of thinking. The unfortunate reality though is that Swift's Collection protocol fundamentally disagrees with you. All Swift Collection types have an Index, and also have a subscript that accepts that Index and returns Element. This is a well-documented part of the protocol.

That means every Collection inherits Array's behaviour. Dictionary does too: it has an Index-taking subscript too. So does every other Swift Collection either inside or outside the standard library.

And those Index operations are very real: the default Iterator for a Collection is IndexingIterator, which uses those indices.
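
To make this concrete, here is a small sketch (variable names are made up) that contrasts Dictionary's two subscripts and spells out the index-based iteration that IndexingIterator roughly performs on behalf of a for-in loop.

```swift
let ages = ["alice": 30]

// Key-based subscript: returns an optional, so a missing key yields nil.
let bob = ages["bob"]

// Index-based subscript required by Collection: returns a non-optional
// (key, value) element and traps if handed an invalid index such as endIndex.
let entry = ages.index(forKey: "alice").map { ages[$0] }

// Roughly what IndexingIterator does for a for-in loop over an Array:
let numbers = [10, 20, 30]
var collected: [Int] = []
var position = numbers.startIndex
while position != numbers.endIndex {
    collected.append(numbers[position])        // the Index-taking subscript
    position = numbers.index(after: position)
}
```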

The pattern you propose here essentially implies that you will never write a type that conforms to Collection. I don't think that's a particularly tenable rule in Swift.

sspringer · March 20, 2023, 3:46pm

So this mode could be useful for code that does not define a Collection. It would be a good fit for code that (simply speaking) only uses an API but does not define one itself (at least not one with Collection). This is indeed less than I had first imagined, but it is quite exactly the use case I gave as an example.

Update: …and also no substring… would need to use an alternative (giving an optional or throwing), or generally for Collection.

Karl · March 20, 2023, 4:58pm

It seems to me that what you are looking for is a way to isolate a certain piece of code, so that you can not only guarantee failure when preconditions are violated, but also guarantee graceful failure.

It is very difficult to achieve that kind of isolation between code that shares memory, but you can if you separate the code into separate processes which communicate via an XPC mechanism.

That said, you shouldn't have to rely on this kind of isolation very often - the contained code should be written in such a way that it is possible for callers to ensure they use it correctly (otherwise that's an API design defect). If you are regularly violating an API's preconditions, it is a sign that your code needs to improve its own validation logic.

Karl · March 20, 2023, 5:05pm

Technically, Collection does not require that subscripts must trap if given an invalid Index, though that is often the best thing to do.

Usability and ease-of-debugging aside - if you had a collection of integers (including bytes), you could return 0 for invalid indexes without violating Collection's semantics. Valid Collection algorithms should, of course, never even attempt to access an invalid index.
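
A minimal sketch of such a collection, assuming a hypothetical `LenientBytes` wrapper around a byte array:

```swift
/// Hypothetical type: a byte buffer whose subscript returns 0 for an
/// invalid index instead of trapping.
struct LenientBytes: Collection {
    private let storage: [UInt8]
    init(_ bytes: [UInt8]) { storage = bytes }

    var startIndex: Int { storage.startIndex }
    var endIndex: Int { storage.endIndex }
    func index(after i: Int) -> Int { i + 1 }

    subscript(position: Int) -> UInt8 {
        storage.indices.contains(position) ? storage[position] : 0
    }
}

let bytes = LenientBytes([1, 2, 3])
let total = bytes.reduce(0, +)   // generic algorithms only visit valid indices
let outside = bytes[99]          // invalid index: 0 rather than a trap
```

The standard algorithms behave as usual because they never ask for an invalid index; only direct misuse of the subscript takes the lenient path.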

John_McCall · March 20, 2023, 5:30pm

Swift's language and library design are heavily based around enforcing preconditions by taking down the process. If we revisit that, it would likely just be to offer a mode that allows you to recover from those panics, effectively creating the sort of isolation that Karl is talking about within the process. Trying to create an alternative world where preconditions silently turn into dynamic failures would be swimming upstream against a lot of powerful currents, and frankly it's likely to produce less reliable software, not more.

patrickgoley · March 20, 2023, 5:31pm

I think it’s worth noting that any code that allocates memory is potentially unsafe because you might not have enough available memory and crash. So would any code that allocates or copies memory be considered unsafe?

Also worth noting that Java has a similar problem to Swift/Obj-C in that it can call out to C++ via the JNI and potentially do all sorts of unsafe stuff. I believe that trapping in the native code will bring down the process and not be converted to a catchable Java exception.

tera · March 20, 2023, 5:38pm

Ideally – "static" (compile time) errors! (E.g. enabled via a special compiler flag set on a given file.) Swift-lint looks promising; if not that, a custom standard library which redefines the state of affairs – but that's a huge undertaking.

Indeed. Ditto for an infinite loop (or a finite loop that takes ages to complete), or stack overflow errors – those errors are out of scope here (as rightly noted in the headline post).

sspringer · March 20, 2023, 5:50pm

I guess in the foreseeable future you would use distributed actors, maybe with some additional tooling that could make starting a separate process easy…?

John_McCall · March 20, 2023, 6:01pm

That is impossible. Any operation of any real complexity will have preconditions on its meaningful behavior. Basic math has preconditions on its meaningful behavior — for example, addition on fixed-width integers can overflow. You can decide to do something "harmless", but you're just turning preconditions into quiet misbehavior.
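
A small sketch of what "quiet misbehavior" can look like in practice. The `wrappingAverage` helper is hypothetical and exists only to show the effect.

```swift
// Hypothetical helper: averages two values using the non-trapping &+ operator.
func wrappingAverage(_ x: Int8, _ y: Int8) -> Int8 {
    (x &+ y) / 2
}

// Mathematically the average is 100, but 100 &+ 100 wraps around to -56,
// so the function returns -28 without any error.
let average = wrappingAverage(100, 100)

// The reporting variant at least tells the caller that something went wrong.
let (partialSum, overflowed) = Int8(100).addingReportingOverflow(100)
```

The trap has been removed, but the violated precondition is still there; it just surfaces later as a wrong number.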

tera · March 20, 2023, 6:15pm

If I understand the intent of this topic right: with this mode switched on, additions and other failing operations would be either prohibited (at compile time) or "effectively prohibited" at run time, in favour of non-failing operations like &+ or addingReportingOverflow (whichever is appropriate for the task at hand).

Here's a sketch how to catch some failing operations at runtime:

extension Array {
    subscript(_ index: Int) -> Element {
        get { failingCall("array subscript") }
        set { failingCall("array subscript") }
    }
}
extension Int {
    static func + (lhs: Self, rhs: Self) -> Self {
        failingCall("+ prohibited, use &+ or addingReportingOverflow instead")
    }
}

func test() {
    print("start")
    catchFailingCalls({
        print("before 'failing' array access")
        _ = [0][0]
        print("after 'failing' array access")
    }, {
        print("Some failing calls caught")
    })
    print("Afterwards")
}

test()

which uses this C++ helper:

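// in C.cpp (must be compiled as Objective-C++, since it uses blocks and NSString):

#import <Foundation/Foundation.h>

typedef void (^Block)(void);

extern "C" {
    // Runs `execute`; if a C++ exception escapes from it, runs `onErrors` instead.
    // Note: unwinding a C++ exception through Swift frames is not supported by Swift.
    void catchFailingCalls(Block execute, Block onErrors) {
        try {
            execute();
        } catch (...) {
            onErrors();
        }
    }

    // Logs the title and throws; marked noreturn so that Swift imports it as returning Never.
    __attribute__((noreturn)) void failingCall(NSString* title) {
        NSLog(@"%@", title);
        throw "failingCall";
    }
}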

johnno1962 · March 20, 2023, 6:35pm

I wrote an "interesting" Swift Package looking at this problem. It won't work in the debugger and I haven't tested it for a long time, so your mileage may vary.

sspringer · March 20, 2023, 6:43pm

This is an important design decision, and it certainly deserves its own introductory section in the Language Guide (or have I overlooked something?). I did not fully grasp it until this discussion. The use of + in Java/C# (where it overflows silently) vs. Swift is a very good example.

Well, I guess you could use e.g. addingReportingOverflow, so I guess the decision for allowing code to fail was also a decision to avoid extremely inefficient or extremely complex code (think e.g. substrings). Swift tries to be as “harmless” as possible (think optionals) while also being very efficient (I think something like this is stated in the Language Guide) [update: and correct]. (I like to say “Swift is mostly harmless”.)

Even considering that I did not fully grasp the topic at the beginning and now see the limitations, I still like the idea of some linter rules that emit warnings (configurable, of course, and off by default). I would be interested to try this out.

I will also play around with distributed actors.

…and something off-topic: The Swift community is again and again very nice and helpful, even with such a “speculative” topic, I learned a lot. Thank you.

sspringer · March 20, 2023, 6:58pm

Thank you for the link, very interesting article (link in the README). It is stated that the solution is not perfect if I understand it correctly (“… the undesirable effects should be limited to leaking of memory and system resources”), and I guess a “perfect” implementation would be what @John_McCall meant with “If we revisit that, it would likely just be to offer a mode that allows you to recover from those panics”.

tera · March 20, 2023, 7:45pm

The following code:

extension Array {
    subscript(_ index: Int) -> Element {
        get {
            if modeEnabled { failingCall("array subscript") }
            else {
                // somehow use the original subscript operation 🤔
            }
        }
        set { ... }
    }
}

surfaces the question: how do I call through to the original implementation?

A side note about arithmetic:

The C++ exceptions hack above aside, you can do another set of wrappers with either throwing operations:

let c = try a + b

or operations returning optionals:

let c: Int? = a + b
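
A sketch of how such wrappers could be built on addingReportingOverflow. It uses a named function instead of overloading `+`, and both `checkedAdd` and `OverflowError` are hypothetical names.

```swift
/// Hypothetical error type for the example.
struct OverflowError: Error {}

/// Hypothetical throwing wrapper: throws instead of trapping on overflow.
func checkedAdd(_ a: Int, _ b: Int) throws -> Int {
    let (sum, overflow) = a.addingReportingOverflow(b)
    if overflow { throw OverflowError() }
    return sum
}

// Throwing form, handled with do/catch.
var message = ""
do {
    let c = try checkedAdd(Int.max, 1)
    message = "sum is \(c)"
} catch {
    message = "overflow"
}

// Optional form: `try?` turns the thrown error into nil.
let d = try? checkedAdd(2, 3)
```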

Neither is overly convenient and I remember I did another set of wrappers, basically:

struct SmartInt {
    var value: Int
    var overflow: Bool
}

with +, -, etc. conveniently defined and implemented via xxxReportingOverflow. Naturally, once an overflow has happened, the overflow flag is preserved and propagated across subsequent operations (similar to the propagation of floating-point NaN or infinity).
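
A minimal sketch of how this could be filled in, showing only + and -. The initializer and the operator bodies are my assumptions about what such wrappers would look like, not the original code.

```swift
struct SmartInt {
    var value: Int
    var overflow: Bool

    init(_ value: Int, overflow: Bool = false) {
        self.value = value
        self.overflow = overflow
    }

    // The result is flagged if either operand was already flagged
    // or if this operation itself overflows.
    static func + (lhs: SmartInt, rhs: SmartInt) -> SmartInt {
        let (result, didOverflow) = lhs.value.addingReportingOverflow(rhs.value)
        return SmartInt(result, overflow: lhs.overflow || rhs.overflow || didOverflow)
    }

    static func - (lhs: SmartInt, rhs: SmartInt) -> SmartInt {
        let (result, didOverflow) = lhs.value.subtractingReportingOverflow(rhs.value)
        return SmartInt(result, overflow: lhs.overflow || rhs.overflow || didOverflow)
    }
}

let big = SmartInt(Int.max) + SmartInt(1)   // overflows, flag is set
let later = big - SmartInt(10)              // flag is carried along
let fine = SmartInt(2) + SmartInt(3)        // no overflow
```

The caller checks `overflow` once at the end of a computation instead of after every step.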

In many algorithms the most negative signed integer value and the most positive unsigned integer value are not important and could be used to represent the overflow flag.

struct SmartInt8 {
    var value: Int8 // -127 ... 127, -128 == overflow
}
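
A sketch of the sentinel variant, again with only + shown. The `isOverflow` property, the `overflowed` constant and the initializer are hypothetical names chosen for the example.

```swift
struct SmartInt8 {
    var value: Int8 // -127 ... 127, -128 == overflow

    init(_ value: Int8) { self.value = value }

    /// The reserved bit pattern that stands for "overflow happened".
    static let overflowed = SmartInt8(Int8.min)

    var isOverflow: Bool { value == Int8.min }

    static func + (lhs: SmartInt8, rhs: SmartInt8) -> SmartInt8 {
        // An overflowed operand poisons the result.
        if lhs.isOverflow || rhs.isOverflow { return .overflowed }
        let (sum, didOverflow) = lhs.value.addingReportingOverflow(rhs.value)
        // A sum of exactly -128 is outside the usable range and therefore
        // also ends up as the overflow marker.
        return didOverflow ? .overflowed : SmartInt8(sum)
    }
}

let small = SmartInt8(1) + SmartInt8(2)
let tooBig = SmartInt8(100) + SmartInt8(100)
let stillBad = tooBig + SmartInt8(-100)
```

Compared with a separate Bool, this keeps the type at one byte, at the cost of giving up -128 as an ordinary value.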