SE-0330: Conditionals in Collections

I hear you, but this leaves us precisely where we were three years ago. Given these constraints, only a tactical, incremental change such as the one in this proposal (fleshed out and rebased) is a way to take a step forward at this point. This would not preclude moving to the lexer at a later date, as it would be functionally the same. There also seems to be a constituency of tooling users who would not be in favour of that move anyway, so why hold a proposal like this in limbo in perpetuity?

Returned for Revision

This review is circling around two points:

  1. Within the current syntax-based model for conditional code, the proposal cannot be reasonably evaluated because the implications are not fully clear in either the written proposal or the implementation (which is not up-to-date on main). Because the implementation is not up-to-date, the impact of its complexity on the compiler is difficult to assess, and the behavior of various corner cases is not clear.

  2. Whether or not the syntax-based approach for conditional code should be abandoned entirely for a different approach, such as a lexer-based one. This creates an existential question of whether this proposed change should even be considered.

The review has waded into debating the second point. To that end, @allevato raised the following reasonable concern:

The core team concurs that such a significant change in direction should have its own properly considered pitch/discussion given the potentially seismic implications. A decision along these lines will not be made as part of this review.

However, the points raised in favor of a lexer-based approach contend that the complexity of the current approach continues to stack up, and is potentially unpredictable to the user. @Chris_Lattner3 framed this perspective succinctly:

In order for this proposal to be considered, the implications of the proposed change need to be clear. Several questions raised in the review would need to be answered before accepting this proposal as a refinement to the current conditional code model. Thus, the proposal is returned for revision, with the following requests:

  1. Update the implementation to work with main so that the implications of the implementation can be understood.

  2. Revise the proposal so that it speaks to the cases brought up here and sorts through the broader concerns raised. An updated implementation will be helpful to this end.

With an updated proposal and implementation, the community and core team can properly weigh the cost/benefit tradeoff of this refinement to the current approach, while also providing signal for a potential discussion of moving conditional compilation to a lexer-based approach. To be clear, a lexer-based approach is not a foregone conclusion. As @allevato points out, this deserves sufficient discourse to make such a call.

On behalf of the core team, I'd like to thank both @johnno1962 as the proposal author and everyone who participated in this discussion.

Thank you @tkremenek for bringing this to review, even if the outcome wasn't entirely conclusive. In order to avoid situations like this in future, could I suggest, with appropriate temerity, that we introduce a "pre-review" state into the Swift Evolution pipeline? There are 34 open PRs on the swift-evolution repo that could come to review at any time. Would it be possible to schedule proposals of interest to the core team formally by the act of merging the PR (automatically notifying the proposal author and allocating the SE-NNNN number) at least two weeks ahead of time? This would create a sort of "pre-review lobby", which could be on a dashboard somewhere, where authors would be motivated to fine-tune their proposal, ensure the implementation is up to date, and receive final comments from the community before the proposal text is finalised.

All that said, I was able to reproduce a toolchain for the original implementation last night and can give definitive if rather belated answers to some of the questions that came up during the review. I'll not post the toolchain as it is from a time before ABI stability and only builds/works with Xcode 10.1.

This was marked as an error by the implementation. Internally, parsing of collection literals reads ahead an expression and a token (skipping any conditionals) to see whether the token is ':' or ',', which decides the dictionary/array-ness of the literal. It then goes down one of two distinct code paths once that has been determined.

In the original implementation, which you can see from my reply to @rintaro's post, it was decided to require the trailing ',' inside conditionals. This wasn't strictly necessary and is probably a decision I'd revisit, as it creates the need to add commas inside a conditional even if it defines the last element of the collection.
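To make the lookahead concrete, here is a minimal sketch of the idea in Swift. It is not the compiler's code: `Token`, `LiteralKind` and `classifyLiteral` are hypothetical names, the token stream is a toy model in which each element expression is already a single token, and treating a closing bracket right after the first expression as "array" is my own assumption.

```swift
// Toy token model. These names are hypothetical and are not the compiler's types.
enum Token: Equatable {
    case expr(String)                     // a whole element expression
    case colon, comma, rightSquare
    case poundIf(String), poundElse, poundEndif
}

enum LiteralKind { case array, dictionary }

/// Looks past the first element expression, skipping conditional directives,
/// and classifies the literal by the token that follows it.
/// Returns nil when no element expression precedes the deciding token.
func classifyLiteral(_ tokens: [Token]) -> LiteralKind? {
    var sawExpression = false
    for token in tokens {
        switch token {
        case .poundIf, .poundElse, .poundEndif:
            continue                      // conditionals are skipped during lookahead
        case .expr:
            sawExpression = true
        case .colon:
            return sawExpression ? .dictionary : nil
        case .comma, .rightSquare:        // assumption: `[x]` counts as an array
            return sawExpression ? .array : nil
        }
    }
    return nil
}

// Models: [ #if os(macOS) "a": 1, #endif ]
let conditionalDictionary: [Token] = [
    .poundIf("os(macOS)"), .expr("\"a\""), .colon, .expr("1"), .comma,
    .poundEndif, .rightSquare,
]
print(classifyLiteral(conditionalDictionary) == .dictionary)
```

Once the kind is known, a real parser would commit to the array or dictionary code path for every remaining element, which is why a literal that mixes the two forms across branches has to be diagnosed.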

Hi Ted,

Could the core team provide some guidance on whether the lexer-based approach should also be considered? My read of your summary here is that you're asking for a better-formed version of the array-only proposal. Is the core team interested in evaluating the larger problem?

-Chris

@Ben_Cohen @Douglas_Gregor Is there any guidance available from the core team here? I'm curious which way you want to see this go.

The review of SE-0335 exhibits another case where lexer-based conditional compilation would help: SE-0335: Introduce existential `any` - #24 by gwendal.roue (with the exhibition of an interesting use case: sharing one doc comment for multiple (conditionalized) declarations of a single documented item).

There is existing guidance from the SE-0308 acceptance:

The core team believes that additional refinement of conditional compilation is something that the language would benefit from. As such, proposals extending support towards conditional compilation of repeated lexical constructs, e.g. elements within an array literal, would be welcome. For non-repeated constructs, there are potential issues with the parsing, for example in the removal of a binary operator may change the operator precedence of subsequent expressions, and this would require careful, deliberate handling. Such a change is not unreasonable, but would demand an appropriate level of design and care towards an implementation.

This guidance stands, and SE-0330 aligns with it. The Core Team is open to considering the lexer-based model, but it would need to be presented as its own standalone pitch/proposal. The possibility that such a proposal could be written and accepted should not be used as a reason to reject proposals like SE-0330 that would be subsumed by the lexer model.

(Removes Core Team hat)

I personally would rather see a comprehensive proposal than a lot of piecemeal proposals. For the grammar-based approach, that would be a walk through the whole grammar with updates to allow #if's everywhere they make sense. For the lexer-based approach, it's a model change that removes #if handling from the grammar. I am personally inclined toward the grammar-based approach because I appreciate the ability to build basic syntax trees for the branch not taken, but I'm willing to be convinced.

Doug

To follow up a bit, the grammar-based approach is about creating syntax trees in a context-free manner (independent of build settings). The lexer-based approach will practically kill this valuable property of the language.

How is this handled in other languages? :thinking:

Many other mainstream languages these days, outside of C/C++/Objective-C, don't have a preprocessor that allows for the exclusion of arbitrary regions of text. The closest ones to Swift that I'm remotely familiar with that offer a form of conditional compilation are:

C#

C# has #if-style conditional compilation directives that fall somewhere in the middle of C's and Swift's. The untaken branches aren't stripped out completely by a true preprocessor pass; instead, their parsing API appears to take a set of compilation conditions as its input so that it can parse only the branches that are taken into nodes. But the #if directives themselves are preserved in the parsed AST, and the untaken branches are represented as trivia (unparsed raw text).

So, C# allows bizarre constructs like this that Swift would forbid, because it only actually parses the branches taken (notice how Func wouldn't be syntactically correct if NOT_DEFINED was defined):

using System;
					
public class Program
{
	public static void Main()
	{
		double x = 100.0;
		Func(
#if NOT_DEFINED
			x +
#else
			x -
#endif
			10.5);
	}
	
	public static void Func(
#if NOT_DEFINED
		int x
	) {}
#else
		double x
#endif
	) { Console.WriteLine(x); }
}

Rust

I'm far less familiar with Rust, but it looks like it offers a couple different approaches to conditional compilation:

  • A #[cfg()] attribute, which tells the compiler to ignore the language element it's attached to.
  • A cfg! macro that evaluates to true/false at compile-time.

The attribute can handle situations similar to Swift wrapping an entire declaration in #if/#endif, but also applies to lower-level language elements. For example, it supports array literal exclusion:

fn main() {
    println!(
      "{:?}",
      [1, 2,
       #[cfg(target_os = "macos")]
       3,
       4, 5]);
    // prints [1, 2, 4, 5] on a non-Mac system
}

By only being a prefix instead of wrapping the beginning and end of an element, the Rust solution appears to skirt the issue of "should a trailing comma go inside or outside the #if".

It also works nicely for the example @gwendal.roue linked to, although I'm not sure if it's possible to conditionalize the entire function signature—you can conditionalize individual arguments though, which works better in their example anyway:

fn blah(
  #[cfg(target_os = "macos")]
  x: SomeType,
  #[cfg(not(target_os = "macos"))]
  x: SomeDifferentType
) {}

While the previous example highlighted an advantage of Rust's approach being a prefix instead of wrapping, this example highlights a drawback; the condition has to be repeated and inverted instead of just using #else.

But what's also really interesting here is that since this is just a parameter list where each parameter has an attribute attached, rather than excluding a wrapped region of code, the first one must be terminated by a comma, but the second one need not be.

Separately, the cfg! macro can be used in expression contexts, but AFAICT it just evaluates to a true/false boolean, so both branches of an if/else have to be syntactically and semantically valid. For example, you can't mix types:

fn main() {
    println!("{:?}", if cfg!(target_os = "macos") {
        50
    } else {
        "x"
     // ^^^ expected integer, found `&str`
    });
}

Unfortunately, I have no idea what Rust's standard parsing solution is (if it has one), so I don't know if/how these elements are reflected in their syntax tree.

Also, someone please correct me if I've misspoken about any of Rust's capabilities here.


Despite the similarities in C#'s overall parsing and syntax tree API, their approach would lose the benefits that @Douglas_Gregor and @akyrtzi mentioned—being able to parse the entire file irrespective of build conditions is a major benefit, and it makes tools like swift-format possible at all (in their current implementation).

Having researched Rust's approach some more, I'm really fond of it, but I imagine it's a non-starter for Swift since it would be too much of a departure from the #if syntax we already have. But conceptually, it seems like it solves the problems that folks here want solved. I think the way forward is @Douglas_Gregor's suggestion: identify the places in the grammar where #ifs make sense, and then figure out how to adapt Rust's idea to a model based on open/close delimiters instead of a prefix-only one.

Then, it's a matter of figuring out how to update the SwiftSyntax APIs to make it easier to peer through the #ifs to get at the actual nodes for each branch. Right now, if a node can be optionally surrounded by an #if, its type degrades to the base Syntax type, and you have to do runtime type-checking/casting to figure out what it actually is. This isn't ideal because losing the strong typing makes it hard to reason about what the tree content is; you can't guarantee that you've covered every possibility exhaustively. If we went this route, then maybe there's a way to represent this with a generic container instead: an IfConfigsAreAllowedHereContainer<ActualNodeType> wrapper that provides accessors for all of its branches, or for the single "null" branch if it's not actually an #if but just regular language elements.
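A rough sketch of what such a wrapper could look like, purely to illustrate the shape of the idea. None of these types exist in SwiftSyntax; `IfConfigContainer`, `IfConfigClause` and `ArrayElement` are hypothetical names, and conditions are kept as plain strings to keep the example small.

```swift
// Hypothetical sketch; these are not SwiftSyntax types.
struct IfConfigClause<Node> {
    let condition: String?    // nil models a trailing #else
    let nodes: [Node]
}

enum IfConfigContainer<Node> {
    case unconditional(Node)                   // the single "null" branch
    case conditional([IfConfigClause<Node>])   // one entry per #if/#elseif/#else

    /// Every node that could appear here under some build configuration.
    var allNodes: [Node] {
        switch self {
        case .unconditional(let node):
            return [node]
        case .conditional(let clauses):
            return clauses.flatMap { $0.nodes }
        }
    }
}

// Stand-in for a strongly typed syntax node.
struct ArrayElement { let text: String }

// Models: [1, #if os(macOS) 2, #else 3, #endif]
let elements: [IfConfigContainer<ArrayElement>] = [
    .unconditional(ArrayElement(text: "1")),
    .conditional([
        IfConfigClause(condition: "os(macOS)", nodes: [ArrayElement(text: "2")]),
        IfConfigClause(condition: nil, nodes: [ArrayElement(text: "3")]),
    ]),
]

let texts = elements.flatMap { $0.allNodes }.map { $0.text }
print(texts)  // ["1", "2", "3"]
```

Because the container is generic over the node type, a visitor still sees `ArrayElement` values rather than the base type, and switching over the two cases is checked for exhaustiveness by the compiler.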

Got it. To summarize my understanding, it sounds like you're comfortable taking micro proposals that expand the support for #ifdef even at the expense of language complexity, but you're also open to a unified theory that subsumes them all.

-Chris

I could quibble that "additional grammar productions" is not necessarily indicative of language complexity in a pragmatic sense, but... yes ;)

Doug

I wasn't referring to implementation complexity. I see it as an accumulation of special cases that make the language more complicated for end users to know "what will work" and "what will not".

The fundamental problem we face (as we project forward further into the future) is that some things will be #if'able and others won't, and the difference will come down to the difficult-to-understand history of how Swift evolved. Why can't I ifdef out a function attribute, or a function parameter, or ...?

-Chris

Yes, agreed, which is why I said:

I personally would rather see a comprehensive proposal than a lot of piecemeal proposals.

Doug

It could be possible to use good old C preprocessor before swift compilation phase :rofl:

Tried using C preprocessor with Swift – it is possible, although not very convenient.

  1. I added the to-be-preprocessed "FileName.swift" file to the Xcode project but did not include it in the target. †
  2. I have a build script that runs before the Compile Sources phase. This build script does the following:
     • cd's to the directory with the sources
     • duplicates the to-be-preprocessed FileName.swift file under a name ending with ".c" ††
     • runs clang -P -CC -E FileName.swift.c -o FileName-preprocessed.swift
     • ideally it would also add a comment to the top of the file ("DO NOT EDIT") and lock the file from editing; I am not doing this yet.
  3. Once I run this script once → the preprocessed file is created → I add it to the Xcode project next to the original to-be-preprocessed FileName.swift file.
  4. That's it. Next time I change the original file and press cmd+R, the preprocessed file is rebuilt and the app runs the new version.
  5. There are some obvious major inconveniences and limitations with this approach: there are two files in the project instead of one; you edit one file but debug the other; you can't change the second file directly, as all changes need to go via the first file; empty lines are removed from the preprocessed output †††; #file and #line, if used, refer to the preprocessed file, not the original; etc. Interestingly, syntax highlighting and code completion work even in the original file (which is not included in the target).

† – if you add a swift file to a target and it contains errors - Xcode will show and complain about those errors even after you remove this swift file from the target, so the trick is not to add the source to any target in the first place

†† – For some reason C understands certain extensions only (like .c, .cpp, .cxx, .h, etc.) but it can't be used with arbitrarily named files (e.g. .swift or files without an extension). Is there any secret option to use it with arbitrary files, e.g. by passing the file content in through a pipe?

††† – any way to preserve empty lines in the preprocessed output?

Overall, I am not recommending this approach. Use it only as a last resort hack when you can't think of anything better and only when you need the superpowers of C-style preprocessor beyond what could be achieved by standard swift means.

I think that's because you're invoking it as clang so it expects to be running on C source code. If you instead use cpp -P -CC foo it ignores the extension. It also outputs to standard out and will generate odd error messages if you try to use -o to save to a file, so you'll need to redirect stdout to wherever you want to save it, e.g., cpp -P -CC foo.swift.cpp > foo.swift

I think that should work but I really can't recommend it. (Among other things, the C preprocessor will probably be confused by Swift's string quoting rules… multiline strings, for example, would most likely trip it up. I'm sure there are other places too where Swift syntax diverges too much from C to keep cpp happy.)

Thank you, using "cpp" instead of "clang -E" kills two birds with one stone: both the file extension issue and the empty-line preservation issue are fixed. The command line is:

cpp -P FileName.swift >FileName-preprocessed.swift

I checked the Swift multiline string scenario: the preprocessor gave a warning but generated correct output, so that particular scenario works.

There was a famous saying that "the C preprocessor doesn't know C (the language)". I guess that works in our favour here: the less the C preprocessor knows about C, the better.

How does #if swift play into this?

For some context, #if swift works differently than other #ifs:

#if arch(arm)
(
#endif
// Error, even on x86: no close paren

#if swift(>=6)
(
#endif
// Swift 5 is okay with this

If I understand correctly how that works, that means the compiler won't be able to reject code like this:

let x = [
#if swift(>=6)
  "five":
#endif
  5
]

Not that that's necessarily a dealbreaker, but it is something to be aware of.

Sidenote about this failing in older versions of Swift

This was fixed in Swift 5.7, but before then this was accepted:

#if swift(>=0)
print("Hello, World!")
#else
(
#endif
"""
Closing quotes? Who needs those?

You are right, swift's usage of #if is not supported.

I managed to cook up a surrogate version of it:

print(SWIFT_VERSION) // 5.0

#if SWIFT_MAJOR_VERSION >= 5
    print("RECENT") // RECENT
#else
    print("OLD")
#endif

#if defined(__aarch64__)
    print("ARM") // ARM
#else
    print("NOT ARM")
#endif

which works as expected. For that to work I modified the command line:

cpp -P -DSWIFT_MAJOR_VERSION=5 -DSWIFT_VERSION=$SWIFT_VERSION FileName.swift >FileName-preprocessed.swift

For it to work automatically I somehow need to grab the first number (in this case 5) out of $SWIFT_VERSION which can probably be a string like "5.1" or "5.1.1b2" (not sure about the latter format variation). TBD. How do I check against >5.3? Perhaps with this:

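#if SWIFT_NUMERIC_VERSION >= 503
    print("RECENT")
#endif

where SWIFT_NUMERIC_VERSION would be, say, 501 for Swift 5.1, and 5.3 is spelled 503 (a name like 5_3 cannot be defined as a macro, because C identifiers cannot start with a digit).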
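For what it's worth, pulling the numbers out of the version string is straightforward. Below is a small sketch in Swift of one way to do it (a real build script would more likely do this in shell). `versionDefines(for:)` is a hypothetical helper, and the `major * 100 + minor` encoding is an assumption chosen to match the 501/503 values above; it only works while the minor version stays below 100.

```swift
/// Derives the two numbers from a version string such as "5.1" or "5.1.1b2".
/// Returns nil if the string does not start with a number.
/// Hypothetical helper; the numeric encoding is major * 100 + minor.
func versionDefines(for version: String) -> (major: Int, numeric: Int)? {
    // Keep only the leading ASCII digits of each dot-separated component,
    // so "1b2" is read as 1.
    let parts = version.split(separator: ".").map { component in
        Int(String(component.prefix(while: { ("0"..."9").contains($0) })))
    }
    guard let first = parts.first, let major = first else { return nil }
    let minor = parts.count > 1 ? (parts[1] ?? 0) : 0
    return (major, major * 100 + minor)
}

if let defines = versionDefines(for: "5.1.1b2") {
    // Flags in the same style as the cpp command line above.
    print("-DSWIFT_MAJOR_VERSION=\(defines.major) -DSWIFT_NUMERIC_VERSION=\(defines.numeric)")
}
```

For "5.1.1b2" this prints -DSWIFT_MAJOR_VERSION=5 -DSWIFT_NUMERIC_VERSION=501, which can be appended to the cpp invocation.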

NB, "elseif" spelling in C preprocessor is "elif":

#if SWIFT_MAJOR_VERSION >= 5
    print("RECENT")
#elif SWIFT_MAJOR_VERSION >= 4
    print("OLD")
#else
    print("ANCIENT")
#endif

but realistically, #if had better not appear at all in the to-be-preprocessed Swift file.

Interestingly, if #available (iOS 14, *) {...} works fine (C preprocessor doesn't complain about it, leaves it as is), unless I write it on two lines:

if
    #available (iOS 14, *) {...}

In this case the preprocessor reports an error.

With the surrogate version it looks like this:

    let x = [
    #if SWIFT_MAJOR_VERSION >= 5
      "five":
    #endif
      5
    ]
    
    print(type(of: x)) // Dictionary<String, Int>
    // or Array<Int> if I change the condition

And it does give either a dictionary or an array depending on the Swift version, so no, the wanted rejection does not happen here. Realistically, though, actual user code would hit an immediate compilation error as soon as the array is treated like a dictionary or vice versa, so this is perhaps not a big concern. Normally the code would be written more sanely:

    let x = [
    #if SWIFT_MAJOR_VERSION >= 5
      "five" : 5
    #else
       :
    #endif
    ]