Questions about ARC

Hi,

I am new to Swift, and I have several questions about how ARC works in
Swift.

1. I read in one of the previous discussions on the swift-evolution list
(https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160208/009422.html)
that ARC operations are currently not atomic, as Swift has no memory model
or concurrency model. Does that mean that the compiler generates non-atomic
instructions for updating reference counts (e.g. using incrementNonAtomic()
instead of increment() in RefCount.h)?

2. If not, when does it use non-atomic ARC operations? Is there an
optimization pass to recognize local objects?

3. Without the concurrency model in the language, if not using GCD (e.g.
all Swift benchmark applications), I assume Swift applications are
single-threaded. Then, I think we can safely use non-atomic ARC
operations. Am I right?

4. Lastly, is there a way to measure the overhead of ARC (e.g. a compiler
flag to disable ARC)?

Thanks,
Jiho

1. I read from one of the previous discussions in the swift-evolution list (https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160208/009422.html) that ARC operations are currently not atomic as Swift has no memory model and concurrency model. Does it mean that the compiler generates non-atomic instructions for updating reference counts (e.g. using incrementNonAtomic() instead of increment() in RefCount.h)?

No. We have the ability to do non-atomic reference counting as an optimization, but we only trigger it when we can prove that an object hasn't escaped yet. Therefore, at the user level, retain counts are atomic.

Swift ARC is non-atomic in the sense that a read/write or write/write race on an individual property/variable/whatever has undefined behavior and can lead to crashes or leaks. This differs from Objective-C ARC only in that a (synthesized) atomic strong or weak property in Objective-C does promise correctness even in the face of race conditions. But this guarantee is not worth much in practice because a failure to adequately synchronize accesses to a class's instance variables is likely to have all sorts of other unpleasant effects, and it is quite expensive, so we decided not to make it in Swift.
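As a rough sketch of the distinction above (not code from this thread): copying a reference to an already-shared object from several threads only touches its retain count, which is atomic, whereas replacing the reference stored in a shared variable is a write that the program has to synchronize itself. The names `Node`, `LockedBox`, and `runSharedReferenceDemo` are hypothetical, and the optimizer is free to remove some of the retain/release traffic shown in the comments.

```swift
import Foundation

// Immutable payload: safe to share by reference once constructed.
final class Node: Sendable {
    let value: Int
    init(value: Int) { self.value = value }
}

// Hypothetical wrapper: every read or write of the mutable `node`
// property goes through a lock, so those accesses never race.
final class LockedBox: @unchecked Sendable {
    private let lock = NSLock()
    private var node = Node(value: 0)

    func load() -> Node {
        lock.lock(); defer { lock.unlock() }
        return node
    }

    func store(_ newNode: Node) {
        lock.lock(); defer { lock.unlock() }
        node = newNode
    }
}

func runSharedReferenceDemo() -> Int {
    let shared = Node(value: 42)
    let box = LockedBox()
    DispatchQueue.concurrentPerform(iterations: 8) { i in
        // Copying a strong reference may retain and release the object.
        // Those count updates are atomic, so no lock is needed here.
        var sum = 0
        for _ in 0..<1_000 {
            let local = shared
            sum += local.value
        }
        precondition(sum == 42_000)
        // Replacing a stored reference is a write to a shared variable.
        // Without the lock this would be a write/write race.
        box.store(Node(value: i))
        _ = box.load().value
    }
    return box.load().value
}
```

Which iteration's node ends up in the box depends on scheduling; the point is only that the program stays well-defined because the property accesses are serialized.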

2. If not, when does it use non-atomic ARC operations? Is there an optimization pass to recognize local objects?

3. Without the concurrency model in the language, if not using GCD (e.g. all Swift benchmark applications), I assume Swift applications are single-threaded. Then, I think we can safely use non-atomic ARC operations. Am I right?

When we say that we don't have a concurrency model, we mean that (1) we aren't providing a more complete language solution than the options available to C programmers and (2) like C pre-C11/C++11, we have not yet formalized a memory model for concurrency that provides formal guarantees about what accesses are guaranteed to not conflict if they do race. (For example, we are unlikely to guarantee that accesses to different properties of a struct can occur in parallel, but we may choose to make that guarantee for different properties of a class.)

4. Lastly, is there a way to measure the overhead of ARC (e.g. a compiler flag to disable ARC)?

No, because ARC is generally necessary for correctness.

John.

···

On Nov 30, 2016, at 8:33 AM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:

Well, that would be the basic approach, but you still have to define what accesses are non-interfering. C, for example, says that ordinary fields can be independently accessed without racing, but bitfields generally cannot. That rule (among others) prohibits certain layout optimizations that we'd like to still do in Swift, e.g. packing Bool properties together.
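To make the packing concern concrete, here is a hand-written sketch (an assumption for illustration, not what the compiler currently emits) of two Bool properties sharing one byte. The type `PackedFlags` and its property names are hypothetical. Setting either flag is a read-modify-write of the whole byte, so two threads writing "different" properties would still touch the same storage, which is why a C-style guarantee of independent field access would rule this layout out.

```swift
// Hypothetical hand-packed layout: two Bool properties share one byte.
struct PackedFlags {
    private var bits: UInt8 = 0

    var isVisible: Bool {
        get { bits & 0b01 != 0 }
        // Read-modify-write of the shared byte.
        set { bits = newValue ? bits | 0b01 : bits & ~0b01 }
    }

    var isEnabled: Bool {
        get { bits & 0b10 != 0 }
        // Also rewrites the byte that holds `isVisible`.
        set { bits = newValue ? bits | 0b10 : bits & ~0b10 }
    }
}
```

Used from a single thread the two properties behave independently; the interference only appears when the writes are concurrent.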

John.

···

On Nov 30, 2016, at 11:58 AM, Alexis <abeingessner@apple.com> wrote:

4. Lastly, is there a way to measure the overhead of ARC (e.g. a
compiler flag to disable ARC)?

No, because ARC is generally necessary for correctness.

It is imperfect, but you can get a good sense of the direct overhead of ARC
for a particular workload by using a profiling tool (e.g. perf on Linux) and
seeing what fraction of CPU cycles are spent in swift_retain and
swift_release. The actual overhead of ARC is almost certainly higher,
since the CPU samples don't account for lost optimization opportunities,
but the profile data is easy to get and I have found it to be a useful
lower bound.
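The arithmetic behind that lower bound is simple enough to sketch. The helper `arcSampleFraction` and the sample counts below are made up for illustration; only the symbol names swift_retain and swift_release come from the description above.

```swift
// Fraction of profile samples attributed to the two ARC entry points.
func arcSampleFraction(samples: [String: Int]) -> Double {
    let total = samples.values.reduce(0, +)
    guard total > 0 else { return 0 }
    let arc = (samples["swift_retain"] ?? 0) + (samples["swift_release"] ?? 0)
    return Double(arc) / Double(total)
}

// Hypothetical per-symbol sample counts, not real measurements.
let hypotheticalProfile = [
    "swift_retain": 120,
    "swift_release": 130,
    "main": 500,
    "other": 250,
]

// (120 + 130) / 1000 samples
let fraction = arcSampleFraction(samples: hypotheticalProfile)
```

As noted, a number like this only bounds the direct cost from below, since it cannot see optimizations that ARC prevented.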

--dave

···

On Nov 30, 2016, at 8:33 AM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:

I’d actually been intending to ask about this. Is there any reason why the lowest level won’t just be a fairly vanilla copy-paste of the C11 concurrency model? That is, there are non-atomic/relaxed/acquire/release/seqcst loads and stores, and all the happens-before graph stuff that comes along with it. I imagine doing anything else would be fairly challenging for LLVM?

···

On Nov 30, 2016, at 12:41 PM, John McCall via swift-dev <swift-dev@swift.org> wrote:

Thanks for the clarifications. I have a couple of follow-up questions.

1. Could you please provide more information (e.g. source code location)
about the optimization applying non-atomic reference counting? What's the
scope of the optimization? Is it method-based?

2. Looking at the source code, I assume Swift implements immediate
reference counting (i.e. immediate reclamation of dead objects), without
the explicit garbage collection phase required by techniques such as
deferred reference counting or coalescing multiple updates. Is that right?
If so, is there any plan to implement such techniques?

···

On Wed, Nov 30, 2016 at 11:41 AM John McCall <rjmccall@apple.com> wrote:

1. Could you please provide more information (e.g. source code location) about the optimization applying non-atomic reference counting? What's the scope of the optimization? Is it method-based?

The optimization itself is not merged yet, but all the required machinery (e.g. non-atomic versions of the ARC operations, a special non-atomic flag on SIL instructions, etc.) is already in place.

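As for the prototype implementation, you can find it here, on my local branch:
https://github.com/swiftix/swift/blob/30409865ff49a4268363cd359f82f29c9a90cce8/lib/SILOptimizer/Transforms/NonAtomicRC.cpp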

2. Looking at the source code, I assume Swift implements immediate reference counting (i.e. immediate reclamation of dead objects), without the explicit garbage collection phase required by techniques such as deferred reference counting or coalescing multiple updates. Is that right? If so, is there any plan to implement such techniques?

Yes, that is a correct understanding.
Different extensions like deferred reference counting were discussed, but there are no plans to implement them anytime soon.
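A small sketch of what immediate reclamation looks like from user code, assuming a hypothetical `Resource` class that reports its own lifecycle: the deinitializer runs as part of dropping the last strong reference, before the next statement executes, rather than in a later collection phase.

```swift
// Hypothetical class that logs when it is used and when it is destroyed.
final class Resource {
    private let log: (String) -> Void
    init(log: @escaping (String) -> Void) { self.log = log }
    func use() { log("used") }
    deinit { log("deinit") }
}

func reclamationOrder() -> [String] {
    var events: [String] = []
    var resource: Resource? = Resource { events.append($0) }
    events.append("created")
    resource?.use()
    // Dropping the only strong reference destroys the object right here.
    resource = nil
    events.append("after release")
    return events
}
```

The recorded order is "created", "used", "deinit", "after release", i.e. destruction happens synchronously at the assignment to nil.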

-Roman

···

On Nov 30, 2016, at 6:25 PM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:

2. Looking at the source code, I assume Swift implements immediate reference counting (i.e. immediate reclamation of dead objects), without the explicit garbage collection phase required by techniques such as deferred reference counting or coalescing multiple updates. Is that right? If so, is there any plan to implement such techniques?

We do coalesce multiple updates, although only late, at the LLVM level, and in a very limited way. The current ARC work is creating the ability to represent ARC pairings in SIL. This will allow us to perform ARC optimizations in a much simpler manner with a much simpler algorithm. This is the Semantic SIL work.
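As a toy model only (this is not the Swift optimizer, and `RCOp` and `cancelAdjacentPairs` are invented names), coalescing can be pictured as cancelling a retain against a release of the same object when nothing happens in between. A real pass has to prove that no intervening instruction can observe or depend on the reference count; this sketch sidesteps that by only cancelling directly adjacent operations.

```swift
// Hypothetical, simplified stream of reference-counting operations.
enum RCOp: Equatable {
    case retain(String)
    case release(String)
    case use(String)
}

// Toy peephole: a release that directly follows a retain of the same
// object cancels it, so the pair never has to execute.
func cancelAdjacentPairs(_ ops: [RCOp]) -> [RCOp] {
    var result: [RCOp] = []
    for op in ops {
        if case .release(let name) = op, result.last == RCOp.retain(name) {
            result.removeLast()
        } else {
            result.append(op)
        }
    }
    return result
}
```

For example, `[retain a, retain a, release a, use a, release a]` reduces to `[retain a, use a, release a]`: the inner pair disappears, while the retain that keeps the object alive across the use is preserved.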

Michael

Thanks for providing the pointer.
Do you have any preliminary result or goal (e.g. the replacement ratio) of
the optimization? Is it going to replace all ARC operations with
non-atomic ones for single-threaded applications?

···

On Wed, Nov 30, 2016 at 8:50 PM Roman Levenstein <rlevenstein@apple.com> wrote:

Thanks for providing the pointer.
Do you have any preliminary result or goal (e.g. the replacement ratio) of the optimization?
Is it going to replace all ARC operations with non-atomic ones for single-threaded applications?

In the ideal world, it would be nice to replace all ARC operations with non-atomic ones for single-threaded applications.

But in reality, it is far more difficult than it may seem at first glance.

If this needs to happen without any hints from the developer, purely by means of static analysis of the program, then it is rather difficult. The main problem is that the compiler needs to reason about whether a given reference may escape to another thread. For references created inside a function, we have a rather good chance of figuring out whether a reference escapes the thread. But if the origin of a given reference (i.e. how it was created or whether it has escaped before) is unknown, which is typically the case for function parameters or references stored inside class instances, then the compiler has to assume that any such reference has escaped its original thread, and thus it needs to use atomic ARC operations. Some sort of global, whole-module/whole-program analysis may help here somewhat. But even if we introduced such an analysis, dynamic libraries and frameworks would likely remain a problem, because they cannot know or reason about which arguments passed to their exposed APIs have escaped in the user code.
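The three situations described above can be sketched as follows. The type `Token` and the function names are hypothetical, and the comments describe what an escape analysis could conclude in principle, not what any merged pass actually does.

```swift
import Foundation

final class Token: Sendable {
    let id: Int
    init(id: Int) { self.id = id }
}

// Created and consumed locally: the reference provably never leaves
// this function, so its count updates are candidates for non-atomic ops.
func localOnly() -> Int {
    let token = Token(id: 1)
    let alias = token
    return token.id + alias.id
}

// Origin unknown: the caller may already have shared `token` with
// another thread, so the conservative assumption is "escaped".
func unknownOrigin(_ token: Token) -> Int {
    let alias = token
    return alias.id
}

// Captured by a closure that runs on another thread: the reference has
// escaped, and atomic updates are required from that point on.
func escapesToAnotherThread() -> Int {
    let token = Token(id: 3)
    let group = DispatchGroup()
    DispatchQueue.global().async(group: group) {
        _ = token.id
    }
    group.wait()
    return token.id
}
```

A library that exports something like `unknownOrigin` is in the position described at the end of the paragraph: it cannot see its callers, so it cannot rule out that the argument is shared.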

Alternatively, a developer could provide a hint and assure the compiler that the app is single-threaded. One simple possibility would be a special -single-threaded compiler option, which would basically claim that the app being developed is single-threaded and thus there is no need to perform atomic ARC operations. In this case, all ARC operations would be marked non-atomic by default in the code emitted from the user code. The problem with this option is that if a user app starts multiple threads directly or indirectly (e.g. it calls a library API that starts a new thread), even though the option claimed the app would not do so, and some references are shared between threads, then the execution of such an app may become unpredictable and end up with hard-to-find crashes. Mixing object files and libraries where one subset is compiled with this option and another without it is another recipe for disaster. So, one would need to be extremely cautious when using this option.
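To see why a broken single-threaded promise leads to unpredictable behavior, here is a deterministic, single-threaded simulation of one bad interleaving. It is a sketch with invented names (`SimulatedRefCount` and the two functions), not runtime code: a non-atomic increment is modeled as a separate load and store, and two "threads" are interleaved by hand.

```swift
// Hypothetical model of a reference count updated without atomics:
// an increment is a plain load followed by a plain store.
struct SimulatedRefCount {
    private(set) var value: Int
    init(value: Int) { self.value = value }
    func load() -> Int { value }
    mutating func store(_ newValue: Int) { value = newValue }
}

// Two retains whose load/store halves interleave.
func interleavedNonAtomicRetains() -> Int {
    var count = SimulatedRefCount(value: 1)
    let seenByA = count.load()   // thread A reads 1
    let seenByB = count.load()   // thread B reads 1
    count.store(seenByA + 1)     // A writes 2
    count.store(seenByB + 1)     // B writes 2 again: A's retain is lost
    return count.value
}

// The same two retains when each one completes before the next starts.
func serializedRetains() -> Int {
    var count = SimulatedRefCount(value: 1)
    count.store(count.load() + 1)
    count.store(count.load() + 1)
    return count.value
}
```

With one retain lost, the count reaches zero one release too early and the object is destroyed while still referenced, which is the kind of hard-to-find crash mentioned above.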

There could also be something in between, where one would use special attributes indicating something related to the thread-safety of a given reference/type/function/etc. These hints could help the compiler reason about references and check whether they may escape to a different thread.

-Roman

···

On Nov 30, 2016, at 9:40 PM, Jiho Choi <jray319@gmail.com> wrote:

Thanks for the explanation.
One last question: why do non-atomic ARC operations still use atomic loads
and stores? Wouldn't regular memory operations be enough?

···

On Thu, Dec 1, 2016 at 11:46 AM Roman Levenstein <rlevenstein@apple.com> wrote:

On Nov 30, 2016, at 9:40 PM, Jiho Choi <jray319@gmail.com> wrote:

Thanks for providing the pointer.
Do you have any preliminary result or goal (e.g. the replacement ratio) of
the optimization?

Is it going to replace all ARC operations with non-atomic ones for
single-threaded applications?

In the ideal world, it would be nice to replace all ARC operations with
non-atomic ones for single-threaded applications.

But in reality, it is way more difficult as it may seem at the first
glance.

If this needs to happen without any hints from the developer, just by
means of a static analysis of a program, then it is rather difficult. The
main problem is that the compiler needs to reason whether a given reference
may escape to another thread. For references created inside a function, we
have rather good chances to figure out if a reference escapes the thread.
But if the origin (i.e. how it was created or if it has escaped before) of
a given reference is unknown, which is a typical case with function
parameters or references inside class instances, then the compiler has to
assume that any such reference has escaped its original thread and thus it
needs to use atomic ARC-operations. Some sort of a global,
whole-module/whole-program analysis may help here somewhat. But even if we
would introduce such kind of analysis, it is likely to remain a problem for
dynamic libraries and frameworks, because they don’t know and cannot reason
which parameters required by their exposed APIs escaped in the user-code.

Alternatively , a developer could provide a hint and assure that compiler
that the app is single-threaded. One simple possibility could be to have a
special -single-threaded compiler option, which would basically claim that
the app being developed is single threaded and thus there is no need for
performing the atomic ARC operations. In this case, all ARC operations
would be marked non-atomic by default in the code emitted from the
user-code. The problem with this option could be that if a user app starts
multiple threads directly or indirectly (e.g. it calls a library API, which
starts a new thread), even though the option claimed the app would not do
it, and some references will be shared between threads, then the execution
of such an app may become unpredictable and end up with hard to find
crashes. Mixing object files and libraries where a subset is compiled with
this option and another part without is another receipt for a disaster. So,
one would need to be extremely cautious when using this option.

There could be also something in between, where one would use special
attributes indicating something related to thread-safety of a given
reference/type/function/etc. These hints could help a compiler to reason
about references and check if they may escape to a different thread.

-Roman

On Wed, Nov 30, 2016 at 8:50 PM Roman Levenstein <rlevenstein@apple.com> wrote:

On Nov 30, 2016, at 6:25 PM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:

Thanks for the clarifications. I have a couple of follow-up questions.

1. Could you please provide more information (e.g. source code location)
about the optimization applying non-atomic reference counting? What's the
scope of the optimization? Is it method-based?

The optimization itself is not merged yet. But all the required machinery,
e.g. non-atomic versions of the ARC operations, a special non-atomic flag on
SIL instructions, etc., is already in place.

As for the prototype implementation, you can find it here, on my local
branch:

https://github.com/swiftix/swift/blob/30409865ff49a4268363cd359f82f29c9a90cce8/lib/SILOptimizer/Transforms/NonAtomicRC.cpp

2. Looking at the source code, I assume Swift implements immediate
reference counting (i.e. immediate reclamation of dead objects) without
the explicit garbage-collection phase required by techniques such as
deferred reference counting or coalescing multiple updates. Is that right?
If so, is there any plan to implement such techniques?

Yes, that understanding is correct.
Different extensions like deferred reference counting were discussed, but
there are no short-term plans to implement them.

-Roman

On Wed, Nov 30, 2016 at 11:41 AM John McCall <rjmccall@apple.com> wrote:

On Nov 30, 2016, at 8:33 AM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:
Hi,

I am new to Swift, and I have several questions about how ARC works in
Swift.

1. I read from one of the previous discussions in the swift-evolution list
(https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160208/009422.html)
that ARC operations are currently not atomic as Swift has no memory model
and concurrency model. Does it mean that the compiler generates non-atomic
instructions for updating reference counts (e.g. using incrementNonAtomic()
instead of increment() in RefCount.h)?

No. We have the ability to do non-atomic reference counting as an
optimization, but we only trigger it when we can prove that an object
hasn't escaped yet. Therefore, at the user level, retain counts are atomic.

Swift ARC is non-atomic in the sense that a read/write or write/write race
on an individual property/variable/whatever has undefined behavior and can
lead to crashes or leaks. This differs from Objective-C ARC only in that a
(synthesized) atomic strong or weak property in Objective-C does promise
correctness even in the face of race conditions. But this guarantee is not
worth much in practice because a failure to adequately synchronize accesses
to a class's instance variables is likely to have all sorts of other
unpleasant effects, and it is quite expensive, so we decided not to make it
in Swift.

2. If not, when does it use non-atomic ARC operations? Is there an
optimization pass to recognize local objects?

3. Without the concurrency model in the language, if not using GCD (e.g.
all Swift benchmark applications), I assume Swift applications are
single-threaded. Then, I think we can safely use non-atomic ARC
operations. Am I right?

When we say that we don't have a concurrency model, we mean that (1) we
aren't providing a more complete language solution than the options
available to C programmers and (2) like C pre-C11/C++11, we have not yet
formalized a memory model for concurrency that provides formal guarantees
about what accesses are guaranteed to not conflict if they do race. (For
example, we are unlikely to guarantee that accesses to different properties
of a struct can occur in parallel, but we may choose to make that guarantee
for different properties of a class.)

4. Lastly, is there a way to measure the overhead of ARC (e.g. a compiler
flag to disable ARC)?

No, because ARC is generally necessary for correctness.

John.

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Thanks for the explanation.
One last question is why non-atomic ARC operations still use atomic load and store. Wouldn't regular memory operations be enough?

They use atomic operations with __ATOMIC_RELAXED, which basically means that they load/store the counts as one single entity (i.e. a single 64-bit word) rather than in multiple parts. Aside from that, they do not incur the overheads associated with CAS/fetch-and-add-style atomic operations.

You can look at the assembly for these functions to see which machine instructions are generated.
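As a rough C++ illustration of the difference described above, the two functions below are hypothetical stand-ins for an atomic and a "non-atomic" increment. They are not copied from RefCount.h. Both operate on a `std::atomic<uint64_t>`, but only the first performs an indivisible read-modify-write.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomic increment: one indivisible read-modify-write (fetch-and-add).
inline uint64_t incrementAtomic(std::atomic<uint64_t> &count) {
    return count.fetch_add(1, std::memory_order_relaxed) + 1;
}

// "Non-atomic" increment: a relaxed load followed by a relaxed store. Each
// access still reads or writes the 64-bit word as a whole, but the pair is
// not indivisible, so it is only safe when no other thread updates the count.
inline uint64_t incrementNonAtomic(std::atomic<uint64_t> &count) {
    uint64_t value = count.load(std::memory_order_relaxed);
    count.store(value + 1, std::memory_order_relaxed);
    return value + 1;
}

// With a single thread, both variants produce the same counts.
inline void singleThreadedCheck() {
    std::atomic<uint64_t> a{0};
    std::atomic<uint64_t> b{0};
    for (int i = 0; i < 1000; ++i) {
        incrementAtomic(a);
        incrementNonAtomic(b);
    }
    assert(a.load() == b.load());
}
```

Compiling this with something like `clang++ -O2 -S` and comparing the two functions shows what each one costs on a given target. On x86-64 one would typically expect a `lock`-prefixed instruction for the first and ordinary move/add instructions for the second, though the exact output depends on the compiler and architecture.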

According to the C++ memory model, all operations on an atomic<T> must be atomic. It is undefined behavior to cast a pointer to the atomic into a T* and perform normal loads and stores. "Relaxed" loads and stores, however, imply no ordering with other threads, so they compile down to plain non-atomic CPU instructions.

-Joe

On Dec 1, 2016, at 4:55 PM, Jiho Choi via swift-dev <swift-dev@swift.org> wrote:

Thanks for the explanation.
One last question is why non-atomic ARC operations still use atomic load and store. Wouldn't regular memory operations be enough?