The way to understand what happens under the hood is to understand the difference between node.data = Data() and node.data = i.
In the first case, the program has to do the following things:
- Allocate a new Data.
- Reduce the reference count of the old Data stored in node.data, potentially freeing it if it is no longer referenced.
- Move the new Data into the node.data field.
In the second case, the program has to do the following:
- Move the integer i into node.data.
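The two cases can be sketched roughly as follows. This is a minimal single-threaded illustration, not the original code: ClassNode, IntNode, and both helper functions are hypothetical names, and Data here is an empty stand-in class. The weak reference is only a way to observe that the old object went away when the field was reassigned.

```swift
// Hypothetical stand-ins; only the names `Data` and `node.data` come from the text.
final class Data {}

final class ClassNode {
    var data = Data()   // field holds a pointer to a reference-counted object
}

final class IntNode {
    var data = 0        // field holds the integer itself
}

// Class-typed field: one assignment hides three steps.
func replaceClassField() -> Bool {
    let node = ClassNode()
    weak var old = node.data   // observes the old object without owning it
    node.data = Data()         // allocate new Data, release the old one, store the new pointer
    return old == nil          // the old Data had no other owner, so it should be gone
}

// Integer field: the assignment is just a store.
func replaceIntField(_ i: Int) -> Int {
    let node = IntNode()
    node.data = i              // no allocation, no reference counting
    return node.data
}
```

Run on a single thread, both functions are safe. The extra release step in the first one is what becomes dangerous once several threads assign to the same field.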
The important difference is the second step in the class-based model: we have to modify the reference count of the object that node.data points to, and potentially free that object. To do so, we have to dereference the pointer stored in node.data.
The problem is that we are racing: multiple threads run this sequence at the same time. That means the two threads may interleave their operations like this:
THREAD 1              | THREAD 2
----------------------+----------------------
Allocate new Data     |
                      |
                      | Allocate new Data
                      |
Load pointer to       |
old Data              |
                      |
Reduce reference      |
count of old Data     |
                      |
                      | Load pointer to
                      | old Data
                      |
Free old Data         |
                      |
                      | Reduce reference
                      | count of old Data
                      | (!)
                      |
                      | Free old Data (!)
                      |
Store new Data        |
                      |
                      | Store new Data (!)
There are a number of issues with that sequence of operations. In particular, thread 2 is holding a dangling pointer: a pointer to memory that thread 1 has already operated on and freed. Any number of problems may follow, but the most common one is a segmentation fault, because thread 2 tries to dereference that pointer after thread 1 has freed it. Other issues can occur too. Notice also that we are at risk of leaking one of the new Data objects, because thread 2 may not correctly reduce the reference count of the one stored by thread 1.
Compare this to the operations with an integer:
THREAD 1              | THREAD 2
----------------------+----------------------
Store integer         |
                      |
                      | Store integer
Lots of problems can happen here: you can get tearing, or you can end up with an unexpected final value. However, on an Intel CPU this interleaving of operations will never cause a crash: you will just end up with unexpected (and potentially invalid) data.
Note that I said "on an Intel CPU" because many CPUs do not promise that doing this will not cause a crash. More generally, you cannot assume that a type being trivial makes it safe to write from multiple threads. To be clear: you should not assume that the code is OK, or that it could never crash, or anything else. By sheer good luck the code you wrote does not crash today, but it might crash tomorrow, or next week, or on a different machine, or in bad weather. The compiler is even allowed to assume that what you wrote cannot happen, and so rewrite your code entirely to avoid the operation.
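One possible way to rule out the interleavings above is to serialize every access to the field with a lock. The sketch below assumes Foundation's NSLock and Dispatch's concurrentPerform are available; LockedNode, replaceData, writes, and hammer are hypothetical names introduced for illustration, and Data is again an empty stand-in class. The same pattern would apply to the integer field.

```swift
import Foundation

// Hypothetical stand-in for the Data class discussed above.
final class Data {}

// Hypothetical node type: every access to `data` goes through one lock, so
// "load old pointer, release it, store new pointer" cannot interleave.
// `@unchecked Sendable` is our own promise that the lock makes sharing safe.
final class LockedNode: @unchecked Sendable {
    private let lock = NSLock()
    private var data = Data()
    private(set) var writes = 0   // counts completed replacements

    func replaceData() {
        let fresh = Data()        // allocation can happen outside the lock
        lock.lock()
        defer { lock.unlock() }
        data = fresh              // release old + store new, one thread at a time
        writes += 1
    }
}

// Replaces the field from several threads at once and reports how many
// replacements completed.
func hammer(threads: Int, iterations: Int) -> Int {
    let node = LockedNode()
    DispatchQueue.concurrentPerform(iterations: threads) { _ in
        for _ in 0..<iterations { node.replaceData() }
    }
    // concurrentPerform returns only after all iterations have finished,
    // so reading `writes` here does not race with the writers.
    return node.writes
}
```

With the lock in place, each thread finishes its release-and-store pair before another thread can load the pointer, so no thread ever holds a pointer to an already-freed Data. Whether a lock is the right tool depends on the surrounding code; this is only meant to show the shape of a fix.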