statically initialized arrays

Hi,

I’m about to implement statically initialized arrays. The idea is to allocate storage for arrays in the data section rather than on the heap.

Info: the array storage is a heap object. So in the following I’m using the general term “object” but the optimization will (probably) only handle array buffers.

This optimization can be done for array literals containing only other literals as elements.
Example:

func createArray() -> [Int] {
  return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.
It avoids heap allocations for array literals in cases where stack promotion can’t kick in. It also saves code size.

What’s needed for this optimization?

1) An optimization pass (GlobalOpt) which detects such array literal initialization patterns and “outlines” those into a statically initialized global variable
2) A representation of statically initialized global variables in SIL
3) IRGen to create statically initialized objects as global llvm-variables

ad 2) Changes in SIL:

Currently a statically initialized sil_global is represented by a reference to a globalinit function, which has to match a very specific pattern (e.g. it must contain a single store to the global).
This is somewhat quirky and would get even more complicated for statically initialized objects.

I’d like to change that so that the sil_global itself contains the initialization value.
This part is not yet related to statically initialized objects. It just improves the representation of statically initialized globals in general.

@@ -1210,7 +1210,9 @@ Global Variables
::

   decl ::= sil-global-variable
+ static-initializer ::= '{' sil-instruction-def* '}'
   sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
+ (static-initializer)?

SIL representation of a global variable.

@@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
Once a global's storage has been initialized, ``global_addr`` is used to
project the value.

+A global can also have a static initializer if its initial value can be
+composed of literals. The static initializer is represented as a list of
+literal and aggregate instructions where the last instruction is the top-level
+value of the static initializer::


+
+ sil_global hidden @_T04test3varSiv : $Int {
+ %0 = integer_literal $Builtin.Int64, 27
+ %1 = struct $Int (%0 : $Builtin.Int64)
+ }
+
+In case a global has a static initializer, no ``alloc_global`` is needed before
+it can be accessed.
+

Now to represent a statically initialized object, we need a new instruction. Note that this “instruction” can only appear in the initializer of a sil_global.

+object
+``````
+::
+
+ sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
+
+ object $T (%a : $A, %b : $B, ...)
+ // $T must be a non-generic or bound generic reference type
+ // The first operands must match the stored properties of T
+ // Optionally there may be more elements, which are tail-allocated to T
+
+Constructs a statically initialized object. This instruction can only appear
+as the final instruction in a global variable static initializer list.

Finally we need an instruction to use such a statically initialized global object.

+global_object
+`````````````
+::
+
+ sil-instruction ::= 'global_object' sil-global-name ':' sil-type
+
+ %1 = global_object @v : $T
+ // @v must be a global variable with a statically initialized object
+ // $T must be a reference type
+
+Creates a reference to the address of a global variable which has a static
+initializer which is an object, i.e. the last instruction of the global's
+static initializer list is an ``object`` instruction.

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

One way to call this runtime function is dynamically at the global_object instruction whenever the metadata pointer is still null (via swift_once).
Another possibility would be to call it in a global constructor.
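
As a rough illustration of the first option, here is a toy Swift sketch of once-style header initialization. Everything in it (`ToyObject`, `toyInitImmortalObject`, `ToyGlobals`) is a hypothetical stand-in and not the real runtime entry point or object layout. It only relies on the language guarantee that static stored properties are initialized lazily and exactly once, even with concurrent access.

```swift
// Toy stand-in for a statically allocated object. `metadata == nil` models
// a header that has not been filled in yet (hypothetical layout).
struct ToyObject {
    var metadata: String?
    let elements: [Int]
}

// Hypothetical analogue of the proposed runtime call: it fills in the header
// and leaves the already-initialized payload untouched.
func toyInitImmortalObject(metadata: String, object: ToyObject) -> ToyObject {
    var initialized = object
    initialized.metadata = metadata
    return initialized
}

enum ToyGlobals {
    // A static stored property is initialized on first access, exactly once.
    // This plays the role of the once-guarded header initialization.
    static let arrayBuffer: ToyObject = toyInitImmortalObject(
        metadata: "ToyArrayBuffer<Int>",
        object: ToyObject(metadata: nil, elements: [1, 2, 3]))
}

// First access triggers the one-time initialization; later accesses reuse it.
let first = ToyGlobals.arrayBuffer
let second = ToyGlobals.arrayBuffer
print(first.metadata ?? "uninitialized", second.elements)
```

Compared with a global constructor, this shape defers the work until the object is actually used, at the cost of a check on each access.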

If you have any feedback, please let me know.

Thanks,
Erik


Please do not use a global constructor. :-) Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.

To be clear, this sort of operation is only safe when the layout of the instance is statically known. The layout of an array buffer is especially brittle, since we use trailing storage, so this kind of operation really will be hardcoded in that case. I think that's fine.
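
To make the trailing-storage point concrete, here is a small Swift sketch that uses the standard library's `ManagedBuffer` as a stand-in for an array buffer. This is an assumption for illustration only: the real array storage is a separate internal class, and the header used here (just an element count) is hypothetical.

```swift
// Toy stand-in for an array buffer: a header (the element count) followed by
// tail-allocated elements in the same heap allocation.
let values = [1, 2, 3]
let buffer = ManagedBuffer<Int, Int>.create(minimumCapacity: values.count) { _ in
    values.count  // header: number of initialized elements
}

// Initialize the trailing storage element by element.
buffer.withUnsafeMutablePointerToElements { elements in
    for (index, value) in values.enumerated() {
        (elements + index).initialize(to: value)
    }
}

// Read the elements back through the same trailing-storage pointer.
let contents = buffer.withUnsafeMutablePointerToElements { elements in
    Array(UnsafeMutableBufferPointer(start: elements, count: buffer.header))
}
print(buffer.header, contents)
```

The elements have no named stored properties of their own; their position depends on the header size and alignment, which is why a static initializer for such an object has to hardcode the layout.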

Jordan


On Jun 14, 2017, at 11:24, Erik Eckstein via swift-dev <swift-dev@swift.org> wrote:


Hi,

I’m about to implement statically initialized arrays. The idea is to
allocate storage for arrays in the data section rather than on the
heap.

W00t! I'd like to do the same for String, i.e. encode the entire buffer
in the data section. I was looking for Array example code to follow but
couldn't find it.

Info: the array storage is a heap object. So in the following I’m
using the general term “object” but the optimization will (probably)
only handle array buffers.

This optimization can be done for array literals containing only other
literals as elements. Example:

func createArray() -> [Int] {
  return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically
initialized global llvm-variable with a reference count of 2 to make
it immortal.

Why not 1?


I just ask that you keep in mind that we'll eventually want the same
capability for other types, and try to write the code to make that
feasible.

Thanks,


--
-Dave

The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.

I was thinking about this a little bit. IMO, we probably actually want a bit in the header. The reason is that even though setting the ref count to be unbalanced makes an object immortal, retain/release will still modify the reference count, meaning that the statically initialized constant cannot be in read-only memory. On the other hand, if we had a bit in the header, we could use read-only memory.
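
The difference between the two approaches can be sketched with a toy model in Swift. The `ToyHeader` type, its fields, and the `writes` counter are all hypothetical and do not reflect the real runtime's header format; the sketch only counts how often each approach would modify header memory.

```swift
// Toy model of an object header; NOT the real Swift runtime layout.
struct ToyHeader {
    var strongCount: Int
    let isImmortal: Bool
    var writes = 0  // how many times header memory was modified

    mutating func retain() {
        if isImmortal { return }  // flag approach: no write at all
        strongCount += 1
        writes += 1
    }

    // Returns true if the object would be deallocated.
    mutating func release() -> Bool {
        if isImmortal { return false }
        strongCount -= 1
        writes += 1
        return strongCount == 0
    }
}

// Approach A: unbalanced count. Starts at 2, but only one logical owner exists.
var unbalanced = ToyHeader(strongCount: 2, isImmortal: false)
unbalanced.retain()
let freedA1 = unbalanced.release()
let freedA2 = unbalanced.release()  // the last owner goes away; count is still 1

// Approach B: immortal bit. Same operations, header never modified.
var flagged = ToyHeader(strongCount: 1, isImmortal: true)
flagged.retain()
let freedB1 = flagged.release()
let freedB2 = flagged.release()

print(unbalanced.writes, flagged.writes)
```

In both cases the object is never freed, but only the flagged variant leaves the header untouched, which is the property that would allow placement in read-only memory.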

Michael


I did some work along these lines already so that KeyPaths could be immortal heap objects. I added an entry point _swift_instantiateInertHeapObject that does exactly this. We could un-underscore it and promote it to a SWIFT_RUNTIME_EXPORT.

-Joe


ad 2) Changes in SIL:

Currently a statically initialized sil_global is represented by a reference to a globalinit function, which has to match a very specific pattern (e.g. it must contain a single store to the global).
This is somewhat quirky and would get even more complicated for statically initialized objects.

I’d like to change that so that the sil_global itself contains the initialization value.
This part is not yet related to statically initialized objects. It just improves the representation of statically initialized globals in general.

@@ -1210,7 +1210,9 @@ Global Variables
::

   decl ::= sil-global-variable
+ static-initializer ::= '{' sil-instruction-def* '}'
   sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
+ (static-initializer)?

SIL representation of a global variable.

@@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
Once a global's storage has been initialized, ``global_addr`` is used to
project the value.

+A global can also have a static initializer if its initial value can be
+composed of literals. The static initializer is represented as a list of
+literal and aggregate instructions where the last instruction is the top-level
+value of the static initializer::
+
+ sil_global hidden @_T04test3varSiv : $Int {
+ %0 = integer_literal $Builtin.Int64, 27
+ %1 = struct $Int (%0 : $Builtin.Int64)
+ }
+
+In case a global has a static initializer, no ``alloc_global`` is needed before
+it can be accessed.
+

This sounds reasonable.

Now to represent a statically initialized object, we need a new instruction. Note that this “instruction” can only appear in the initializer of a sil_global.

+object
+``````
+::
+
+ sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
+
+ object $T (%a : $A, %b : $B, ...)
+ // $T must be a non-generic or bound generic reference type
+ // The first operands must match the stored properties of T
+ // Optionally there may be more elements, which are tail-allocated to T
+
+Constructs a statically initialized object. This instruction can only appear
+as the final instruction in a global variable static initializer list.

Finally we need an instruction to use such a statically initialized global object.

+global_object
+`````````````
+::
+
+ sil-instruction ::= 'global_object' sil-global-name ':' sil-type
+
+ %1 = global_object @v : $T
+ // @v must be a global variable with a statically initialized object
+ // $T must be a reference type
+
+Creates a reference to the address of a global variable which has a static
+initializer which is an object, i.e. the last instruction of the global's
+static initializer list is an ``object`` instruction.

The designs here both sound good to me. If we need to introduce a higher-level global access function in `global_object`, perhaps we should instead raise the abstraction level of global_addr and sil_global to hide the details of the once-style access, so that the `sil_global` declaration for a runtime-initialized variable declares its relationship to an initialization token, and global_addr includes whatever access protocol is needed to access the referenced global. It might be nice for SIL not to have to pattern-match the details of how we `once`-initialize global values, and it might make it easier to use specialized access patterns for single- or double-word globals like Greg mentioned.

-Joe


ad 2) Changes in SIL:

Currently a statically initialized sil_global is represented by a reference to a globalinit function, which has to match a very specific pattern (e.g. it must contain a single store to the global).
This is somewhat quirky and would get even more complicated for statically initialized objects.

I’d like to change that so that the sil_global itself contains the initialization value.
This part is not yet related to statically initialized objects. It just improves the representation of statically initialized globals in general.

@@ -1210,7 +1210,9 @@ Global Variables
::

   decl ::= sil-global-variable
+ static-initializer ::= '{' sil-instruction-def* '}'
   sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
+ (static-initializer)?

SIL representation of a global variable.

@@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
Once a global's storage has been initialized, ``global_addr`` is used to
project the value.

+A global can also have a static initializer if its initial value can be
+composed of literals. The static initializer is represented as a list of
+literal and aggregate instructions where the last instruction is the top-level
+value of the static initializer::
+
+ sil_global hidden @_T04test3varSiv : $Int {
+ %0 = integer_literal $Builtin.Int64, 27
+ %1 = struct $Int (%0 : $Builtin.Int64)
+ }
+
+In case a global has a static initializer, no ``alloc_global`` is needed before
+it can be accessed.
+

I just talked to MichaelG in person. He pointed out that the static initializer should not look like a function. Also the implicit convention that the last value is the actual top-level value is not obvious.
I think it makes sense to add some syntactic sugar to make this clearer (adding ‘=’ and ‘init_value’):

+ sil_global hidden @_T04test3varSiv : $Int = {
+ %0 = integer_literal $Builtin.Int64, 27
+ init_value = struct $Int (%0 : $Builtin.Int64)
+ }
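
To make the role of the explicit `init_value` concrete, here is a minimal C++ sketch (not compiler code) that models a static initializer as a list of literal and aggregate instructions and folds it into one constant. The names `Kind`, `Inst`, `StaticInit` and `evaluate` are hypothetical and exist only for this illustration; the only SIL facts assumed are the two instructions shown in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the two instruction kinds used in the example:
// an integer literal and a single-field struct wrapping an earlier value.
enum class Kind { IntegerLiteral, Struct };

struct Inst {
  Kind kind;
  int64_t literal;  // used by IntegerLiteral
  size_t operand;   // index of the wrapped value, used by Struct
};

// A static initializer: the instruction list plus an explicit index naming
// the top-level value (the role 'init_value' plays in the proposed syntax).
struct StaticInit {
  std::vector<Inst> insts;
  size_t initValue;
};

// Folds the initializer into the constant that would be emitted as data.
// Operands may only refer to earlier instructions, so one pass suffices.
inline int64_t evaluate(const StaticInit &init) {
  std::vector<int64_t> values;
  for (const Inst &inst : init.insts) {
    if (inst.kind == Kind::IntegerLiteral) {
      values.push_back(inst.literal);
    } else {
      assert(inst.operand < values.size() && "operand must be defined earlier");
      values.push_back(values[inst.operand]);  // a one-field struct wraps its operand
    }
  }
  assert(init.initValue < values.size());
  return values[init.initValue];
}
```

Naming the top-level value explicitly means a reader (or a verifier) does not have to rely on the "last instruction wins" convention.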

Now to represent a statically initialized object, we need a new instruction. Note that this “instruction” can only appear in the initializer of a sil_global.

+object
+``````
+::
+
+ sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
+
+ object $T (%a : $A, %b : $B, ...)
+ // $T must be a non-generic or bound generic reference type
+ // The first operands must match the stored properties of T
+ // Optionally there may be more elements, which are tail-allocated to T
+
+Constructs a statically initialized object. This instruction can only appear
+as the final instruction in a global variable's static initializer list.
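
As a rough picture of what the operands of `object` describe (stored properties first, then tail-allocated elements), the following C++ sketch lays out a buffer for `[1, 2, 3]` as one statically initialized aggregate. `HeaderModel`, `ArrayBufferModel` and all field names are assumptions made for this illustration; they do not reflect the real Swift object layout, which the thread notes is not part of the ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the object header (the real format is not ABI).
struct HeaderModel {
  const void *metadata;  // left null here; filled in at run time
  uint64_t refCount;
};

// Hypothetical buffer: header, then stored properties, then tail elements.
template <size_t N>
struct ArrayBufferModel {
  HeaderModel header;
  int64_t count;     // stored property
  int64_t capacity;  // stored property
  int64_t tail[N];   // tail-allocated elements
};

// Statically initialized instance modelling the buffer for [1, 2, 3].
// With static storage duration it typically lives in the data section.
static ArrayBufferModel<3> literalBuffer = {
    {nullptr, 0},  // header: initialized later by a runtime call
    3, 3,          // operands matching the stored properties
    {1, 2, 3}      // remaining operands: the tail elements
};

// Sums the tail elements, reading the element count from the stored property.
inline int64_t sumElements(const ArrayBufferModel<3> &buffer) {
  assert(buffer.count <= 3 && "count must not exceed the tail size");
  int64_t sum = 0;
  for (int64_t i = 0; i < buffer.count; ++i) sum += buffer.tail[i];
  return sum;
}
```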

Finally we need an instruction to use such a statically initialized global object.

+global_object
+`````````````
+::
+
+ sil-instruction ::= 'global_object' sil-global-name ':' sil-type
+
+ %1 = global_object @v : $T
+ // @v must be a global variable with a statically initialized object
+ // $T must be a reference type
+
+Creates a reference to the address of a global variable which has a static
+initializer which is an object, i.e. the last instruction of the global's
+static initializer list is an ``object`` instruction.

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

One way to call this runtime function is dynamically at the global_object instruction whenever the metadata pointer is still null (via swift_once).
Another possibility would be to call it in a global constructor.
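
The lazy variant can be sketched in C++ as follows. This is only a model under stated assumptions: a function-local static stands in for `swift_once`, `initImmortalHeader` stands in for the proposed runtime call, and `HeaderModel`, `StaticObjectModel`, `globalObject`, `initCalls` and `fakeMetadata` are hypothetical names that do not exist in the Swift runtime.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header model; the real header format is not ABI.
struct HeaderModel {
  const void *metadata = nullptr;
  uint64_t refCount = 0;
};

struct StaticObjectModel {
  HeaderModel header;
  int64_t payload[3] = {1, 2, 3};  // the statically initialized part
};

static StaticObjectModel staticObject;
static int initCalls = 0;           // only here to observe the behaviour
static const int fakeMetadata = 0;  // placeholder for real type metadata

// Stand-in for the proposed runtime call that fills in the header.
inline void initImmortalHeader(const void *metadata, StaticObjectModel *object) {
  object->header.metadata = metadata;
  object->header.refCount = 2;  // the value proposed in the thread
  ++initCalls;
}

// Models the lazy variant: every access passes a once-guard, so the header
// is initialized exactly once, on first use. The function-local static plays
// the role of the once-guard here.
inline StaticObjectModel *globalObject() {
  static const bool initialized = [] {
    initImmortalHeader(&fakeMetadata, &staticObject);
    return true;
  }();
  (void)initialized;
  return &staticObject;
}
```

The global-constructor alternative would instead run `initImmortalHeader` once at load time, which removes the guard from each access but moves the cost to program startup.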

If you have any feedback, please let me know

Please do not use a global constructor. :-) Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.

To be clear, this sort of operation is only safe when the layout of the instance is statically known. The layout of an array buffer is especially brittle, since we use trailing storage, so this kind of operation really will be hardcoded in that case. I think that's fine.

I assume you are referring to the fact that the tail allocated array buffer layout was implemented in the stdlib originally. But this is not the case anymore. It’s already hard-coded in the compiler (we have a dedicated instruction for this).

···

On Jun 14, 2017, at 12:03 PM, Jordan Rose <jordan_rose@apple.com> wrote:

On Jun 14, 2017, at 11:24, Erik Eckstein via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:

Jordan

Hi,

I’m about implementing statically initialized arrays. It’s about allocating storage for arrays in the data section rather than on the heap.

Info: the array storage is a heap object. So in the following I’m using the general term “object” but the optimization will (probably) only handle array buffers.

This optimization can be done for array literals containing only other literals as elements.
Example:

func createArray() -> [Int] {
  return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.

I was thinking about this a little bit. IMO, we probably actually want a bit in the header. The reason is that even though setting the ref count to be unbalanced makes an object immortal, retain/release will still modify the reference count, meaning that the statically initialized constant cannot be in read-only memory. On the other hand, if we had a bit in the header, we could use read-only memory.

Except that we currently have no way to initialize the object header other than through a runtime call.
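
The header-bit idea can be illustrated with a small C++ sketch in which retain and release check an immortal bit and return without storing anything. The bit position, `kImmortalBit`, `RefCountModel`, `retain` and `release` are all assumptions for this illustration and not the real Swift reference-count layout; real reference counting is also atomic, which this model leaves out. The sketch does not address the caveat just raised, that the header still has to be written once by a runtime call.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header word: the top bit marks the object as immortal and the
// remaining bits hold the reference count.
constexpr uint64_t kImmortalBit = uint64_t{1} << 63;

struct RefCountModel {
  uint64_t bits;
};

// Retain that never writes to an immortal object's header.
inline void retain(RefCountModel &rc) {
  if (rc.bits & kImmortalBit) return;  // read only, no store
  ++rc.bits;
}

// Release that never writes to an immortal object's header.
// Returns true when the caller should deallocate the object.
inline bool release(RefCountModel &rc) {
  if (rc.bits & kImmortalBit) return false;  // read only, no store
  assert(rc.bits > 0 && "over-release");
  return --rc.bits == 0;
}
```

By contrast, an unbalanced count of 2 keeps the object alive but every retain and release still stores to the header, which is what rules out read-only memory.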

···

On Jun 14, 2017, at 4:04 PM, Michael Gottesman <mgottesman@apple.com> wrote:

On Jun 14, 2017, at 11:24 AM, Erik Eckstein via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:

Michael


Thanks,
Erik
_______________________________________________
swift-dev mailing list
swift-dev@swift.org <mailto:swift-dev@swift.org>
https://lists.swift.org/mailman/listinfo/swift-dev


The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.

I was thinking about this a little bit. IMO, we probably actually want a bit in the header. The reason is that even though setting the ref count to be unbalanced makes an object immortal, retain/release will still modify the reference count, meaning that the statically initialized constant cannot be in read-only memory. On the other hand, if we had a bit in the header, we could use read-only memory.

Michael

This is also surprisingly perf-relevant in concurrent situations. We ran into a scenario in Foundation this year where contention on the refcount of a global object was enough to cause an order of magnitude slowdown when the test was run on 10 threads.

  David

···

On Jun 14, 2017, at 4:04 PM, Michael Gottesman via swift-dev <swift-dev@swift.org> wrote:

On Jun 14, 2017, at 11:24 AM, Erik Eckstein via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:


Hi,

I’m about implementing statically initialized arrays. It’s about
allocating storage for arrays in the data section rather than on the
heap.

W00t! I'd like to do the same for String, i.e. encode the entire buffer
in the data section. I was looking for Array example code to follow but
couldn't find it.

We have support for constant string buffers as of PR 8701 and PR 8692. The former PR shows the protocol that has to be implemented.

(The implementation currently exposes the ref count ABI. This can/needs to be fixed when we move to a stable ABI by running a once initializer.)

Info: the array storage is a heap object. So in the following I’m
using the general term “object” but the optimization will (probably)
only handle array buffers.

This optimization can be done for array literals containing only other
literals as elements. Example:

func createArray() -> [Int] {
  return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically
initialized global llvm-variable with a reference count of 2 to make
it immortal.

Why not 1?

Mutation must force copying.
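
A minimal C++ sketch of that argument, assuming a copy-on-write array that mutates in place only when its buffer's reference count is exactly 1. `BufferModel`, `isUniquelyReferenced` and `appendCopyOnWrite` are hypothetical names for this illustration, and the model deliberately ignores how retains and releases of the old buffer are accounted for.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical model of an array buffer with a reference count.
struct BufferModel {
  uint64_t refCount;
  std::vector<int64_t> elements;
};

// The uniqueness check that guards in-place mutation.
inline bool isUniquelyReferenced(const BufferModel &buffer) {
  return buffer.refCount == 1;
}

// Copy-on-write append. Returns nullptr if the buffer was mutated in place,
// otherwise a fresh private copy that holds the appended value.
inline std::unique_ptr<BufferModel> appendCopyOnWrite(BufferModel &buffer,
                                                      int64_t value) {
  if (isUniquelyReferenced(buffer)) {
    buffer.elements.push_back(value);
    return nullptr;
  }
  std::unique_ptr<BufferModel> copy(new BufferModel{1, buffer.elements});
  copy->elements.push_back(value);
  return copy;
}
```

With a count of 2 the check fails, so an append copies and the static data stays untouched; with a count of 1 the same append would write into the statically allocated buffer.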

···

On Jun 14, 2017, at 2:56 PM, Dave Abrahams via swift-dev <swift-dev@swift.org> wrote:

on Wed Jun 14 2017, Erik Eckstein <swift-dev-AT-swift.org> wrote:


If you have any feedback, please let me know

I just ask that you keep in mind that we'll eventually want the same
capability for other types, and try to write the code to make that
feasible.

Thanks,

--
-Dave


Please do not use a global constructor.

What’s the objection to a global constructor about? We’re worried about dyld performance in this case?

:-)

:-/

Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.

These array literals aren’t Swift globals to begin with so I’m not sure what that means. Introducing a swift-once accessor everywhere they’re used means we can’t do the global optimizations that we normally do for globals with constant initializers. That might be the right choice but it would be good to understand why we’re making it.

-Andy

···

On Jun 14, 2017, at 12:03 PM, Jordan Rose via swift-dev <swift-dev@swift.org> wrote:


ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

I did some work along these lines already so that KeyPaths could be immortal heap objects. I added an entry point _swift_instantiateInertHeapObject that does exactly this. We could un-underscore it and promote it to a SWIFT_RUNTIME_EXPORT.

Sounds good. If I understood it correctly, _swift_instantiateInertHeapObject currently initializes the object with a reference count of 1. For array buffers we would need 2. But I think for KeyPaths you are fine with 2 as well.

···

On Jun 19, 2017, at 8:53 AM, Joe Groff <jgroff@apple.com> wrote:

On Jun 14, 2017, at 11:24 AM, Erik Eckstein via swift-dev <swift-dev@swift.org> wrote:

-Joe

I know, but I assumed each reference formed to this buffer would
increment the reference count.

···

on Thu Jun 15 2017, Arnold <aschwaighofer-AT-apple.com> wrote:

On Jun 14, 2017, at 2:56 PM, Dave Abrahams via swift-dev <swift-dev@swift.org> wrote:

on Wed Jun 14 2017, Erik Eckstein <swift-dev-AT-swift.org> wrote:


--
-Dave

Great, I'll have to look into that when/if I get back to this.

···

on Thu Jun 15 2017, Arnold <aschwaighofer-AT-apple.com> wrote:

W00t! I'd like to do the same for String, i.e. encode the entire buffer
in the data section. I was looking for Array example code to follow but
couldn't find it.

We have support for constant string buffers as of PR 8701 and PR
8692. The former PR shows the protocol that has to be implemented.

(The implementation currently exposes the ref count ABI. This
can/needs to be fixed when we move to a stable ABI by running a once
initializer.)

--
-Dave

Hi,

I’m about implementing statically initialized arrays. It’s about
allocating storage for arrays in the data section rather than on the
heap.

W00t! I'd like to do the same for String, i.e. encode the entire buffer
in the data section. I was looking for Array example code to follow but
couldn't find it.

We have support for constant string buffers as of PR 8701 and PR 8692. The former PR shows the protocol that has to be implemented.

(The implementation currently exposes the ref count ABI. This can/needs to be fixed when we move to a stable ABI by running a once initializer.)

It would be reasonable to arrange for a specific bit pattern to be a guaranteed "do not reference-count this" pattern even under a stable ABI. We just have to be careful about picking it.

If we didn't want to do that, the best solution would be an absolute symbol — although we'd have to use an entire word for the refcount, even on 64-bit.

John.

···

On Jun 15, 2017, at 10:52 AM, Arnold via swift-dev <swift-dev@swift.org> wrote:

On Jun 14, 2017, at 2:56 PM, Dave Abrahams via swift-dev <swift-dev@swift.org> wrote:

on Wed Jun 14 2017, Erik Eckstein <swift-dev-AT-swift.org> wrote:

Info: the array storage is a heap object. So in the following I’m
using the general term “object” but the optimization will (probably)
only handle array buffers.

This optimization can be done for array literals containing only other
literals as elements. Example:

func createArray() -> [Int] {
return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically
initialized global llvm-variable with a reference count of 2 to make
it immortal.

Why not 1

Mutation must force copying.

It avoids heap allocations for array literals in cases stack-promotion
can’t kick in. It also saves code size.

What’s needed for this optimization?

1) An optimization pass (GlobalOpt) which detects such array literal
initialization patterns and “outlines” those into a statically
initialized global variable

2) A representation of statically initialized global variables in SIL

3) IRGen to create statically initialized objects as global
llvm-variables

ad 2) Changes in SIL:

Currently a static initialized sil_global is represented by having a reference to a globalinit
function which has to match a very specific pattern (e.g. must contain a single store to the
global).
This is somehow quirky and would get even more complicated for statically initialized objects.

I’d like to change that so that the sil_global itself contains the initialization value.
This part is not yet related to statically initialized objects. It just improves the representation
of statically initialized global in general.

@@ -1210,7 +1210,9 @@ Global Variables
::

decl ::= sil-global-variable
+ static-initializer ::= '{' sil-instruction-def* '}'
sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
+ (static-initializer)?

SIL representation of a global variable.

@@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
Once a global's storage has been initialized, ``global_addr`` is used to
project the value.

+A global can also have a static initializer if its initial value can be
+composed of literals. The static initializer is represented as a list of
+literal and aggregate instructions where the last instruction is the top-level
+value of the static initializer::
+
+ sil_global hidden @_T04test3varSiv : $Int {
+ %0 = integer_literal $Builtin.Int64, 27
+ %1 = struct $Int (%0 : $Builtin.Int64)
+ }
+
+In case a global has a static initializer, no ``alloc_global`` is needed before
+it can be accessed.
+

Now, to represent a statically initialized object, we need a new instruction. Note that this
“instruction” can only appear in the initializer of a sil_global.

+object
+``````
+::
+
+ sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
+
+ object $T (%a : $A, %b : $B, ...)
+ // $T must be a non-generic or bound generic reference type
+ // The first operands must match the stored properties of T
+ // Optionally there may be more elements, which are tail-allocated to T
+
+Constructs a statically initialized object. This instruction can only appear
+as final instruction in a global variable static initializer list.

Finally we need an instruction to use such a statically initialized global object.

+global_object
+`````````````
+::
+
+ sil-instruction ::= 'global_object' sil-global-name ':' sil-type
+
+ %1 = global_object @v : $T
+ // @v must be a global variable with a static initialized object
+ // $T must be a reference type
+
+Creates a reference to the address of a global variable which has a static
+initializer which is an object, i.e. the last instruction of the global's
+static initializer list is an ``object`` instruction.

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object
header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in
case of a bound generic type (e.g. array buffers) the metadata is not statically available.

One way to call this runtime function is dynamically at the global_object instruction whenever the
metadata pointer is still null (via swift_once).
Another possibility would be to call it in a global constructor.

If you have any feedback, please let me know

I just ask that you keep in mind that we'll eventually want the same
capability for other types, and try to write the code to make that
feasible.

Thanks,

--
-Dave

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Erik answered this. I forgot we need to initialize metadata. I agree we don’t want to do that in a global constructor. The fast path won’t go through swift-once, so the accessor will be just a bit of extra code size.
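
A rough C++ sketch of such an accessor, to show why the fast path stays cheap. Everything here is a hypothetical stand-in: `StaticObject`, `Metadata`, `initImmortalObjectSketch` and `getGlobalObject` are illustrative names, the header layout is made up, and a function-local static plays the role of swift_once. The accessor only enters the once-guarded slow path while the metadata pointer is still null.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical metadata and object layout; the real header is not ABI.
struct Metadata { const char *name; };

struct StaticObject {
  std::atomic<const Metadata *> metadata{nullptr};  // null until initialized
  long refCount{0};
  int elements[3]{1, 2, 3};                         // statically initialized payload
};

static Metadata arrayMetadata = {"ArrayBuffer<Int>"};
static StaticObject storage;
static int slowPathCalls = 0;  // only touched inside the once-guarded initializer

// Stand-in for the proposed runtime call: fills in the header only.
void initImmortalObjectSketch(const Metadata *m, StaticObject *o) {
  o->refCount = 2;  // "immortal" initial count from the proposal
  o->metadata.store(m, std::memory_order_release);
}

// Slow path: a function-local static is initialized exactly once, even with
// concurrent callers (used here in place of swift_once).
static void initializeStorageOnce() {
  static const bool done = [] {
    ++slowPathCalls;
    initImmortalObjectSketch(&arrayMetadata, &storage);
    return true;
  }();
  (void)done;
}

// Accessor as it might be emitted for a global_object use.
StaticObject *getGlobalObject() {
  // Fast path: one load and a null check once the header is set up.
  if (storage.metadata.load(std::memory_order_acquire) == nullptr)
    initializeStorageOnce();
  return &storage;
}
```

After the first call, later calls to `getGlobalObject` see a non-null metadata pointer and skip the once machinery entirely.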

-Andy

···

On Jun 15, 2017, at 4:26 PM, Andrew Trick <atrick@apple.com> wrote:

On Jun 14, 2017, at 12:03 PM, Jordan Rose via swift-dev <swift-dev@swift.org> wrote:

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

One way to call this runtime function is dynamically at the global_object instruction whenever the metadata pointer is still null (via swift_once).
Another possibility would be to call it in a global constructor.

If you have any feedback, please let me know

Please do not use a global constructor.

What’s the objection to a global constructor about? We’re worried about dyld performance in this case?

:-)

:-/

Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.

These array literals aren’t Swift globals to begin with so I’m not sure what that means. Introducing a swift-once accessor everywhere they’re used means we can’t do the global optimizations that we normally do for globals with constant initializers. That might be the right choice but it would be good to understand why we’re making it.

Yeah, 2 is probably a more semantically-correct value for global objects anyway to ward off uniqueness checks, at least till we get a real "inert" bit pattern.

-Joe

···

On Jun 19, 2017, at 1:45 PM, Erik Eckstein <eeckstein@apple.com> wrote:

On Jun 19, 2017, at 8:53 AM, Joe Groff <jgroff@apple.com> wrote:

On Jun 14, 2017, at 11:24 AM, Erik Eckstein via swift-dev <swift-dev@swift.org> wrote:

Hi,

I’m about to implement statically initialized arrays. The idea is to allocate storage for arrays in the data section rather than on the heap.

Info: the array storage is a heap object. So in the following I’m using the general term “object” but the optimization will (probably) only handle array buffers.

This optimization can be done for array literals containing only other literals as elements.
Example:

func createArray() -> [Int] {
  return [1, 2, 3]
}

The compiler can allocate the whole array buffer as a statically initialized global llvm-variable with a reference count of 2 to make it immortal.
It avoids heap allocations for array literals in cases stack-promotion can’t kick in. It also saves code size.

What’s needed for this optimization?

1) An optimization pass (GlobalOpt) which detects such array literal initialization patterns and “outlines” those into a statically initialized global variable
2) A representation of statically initialized global variables in SIL
3) IRGen to create statically initialized objects as global llvm-variables

ad 2) Changes in SIL:

Currently a statically initialized sil_global is represented by a reference to a globalinit function, which has to match a very specific pattern (e.g. it must contain a single store to the global).
This is somewhat quirky and would get even more complicated for statically initialized objects.

I’d like to change that so that the sil_global itself contains the initialization value.
This part is not yet related to statically initialized objects. It just improves the representation of statically initialized globals in general.

@@ -1210,7 +1210,9 @@ Global Variables
::

  decl ::= sil-global-variable
+ static-initializer ::= '{' sil-instruction-def* '}'
  sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
+ (static-initializer)?

SIL representation of a global variable.

@@ -1221,6 +1223,19 @@ SIL instructions. Prior to performing any access on the global, the
Once a global's storage has been initialized, ``global_addr`` is used to
project the value.

+A global can also have a static initializer if its initial value can be
+composed of literals. The static initializer is represented as a list of
+literal and aggregate instructions where the last instruction is the top-level
+value of the static initializer::
+
+ sil_global hidden @_T04test3varSiv : $Int {
+ %0 = integer_literal $Builtin.Int64, 27
+ %1 = struct $Int (%0 : $Builtin.Int64)
+ }
+
+In case a global has a static initializer, no ``alloc_global`` is needed before
+it can be accessed.
+

Now, to represent a statically initialized object, we need a new instruction. Note that this “instruction” can only appear in the initializer of a sil_global.

+object
+``````
+::
+
+ sil-instruction ::= 'object' sil-type '(' (sil-operand (',' sil-operand)*)? ')'
+
+ object $T (%a : $A, %b : $B, ...)
+ // $T must be a non-generic or bound generic reference type
+ // The first operands must match the stored properties of T
+ // Optionally there may be more elements, which are tail-allocated to T
+
+Constructs a statically initialized object. This instruction can only appear
+as final instruction in a global variable static initializer list.

Finally we need an instruction to use such a statically initialized global object.

+global_object
+`````````````
+::
+
+ sil-instruction ::= 'global_object' sil-global-name ':' sil-type
+
+ %1 = global_object @v : $T
+ // @v must be a global variable with a static initialized object
+ // $T must be a reference type
+
+Creates a reference to the address of a global variable which has a static
+initializer which is an object, i.e. the last instruction of the global's
+static initializer list is an ``object`` instruction.

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

I did some work along these lines already so that KeyPaths could be immortal heap objects. I added an entry point _swift_instantiateInertHeapObject that does exactly this. We could un-underscore it and promote it to a SWIFT_RUNTIME_EXPORT.

Sounds good. If I understand it correctly, _swift_instantiateInertHeapObject currently initializes the object with a reference count of 1. For array buffers we would need 2. But I think for KeyPaths you are fine with 2 as well.

I don't think the absolute symbol size is a problem. We are unlikely to ever make that part of the object header less than a full word. The refcount itself can be a subset of that word, as long as the number of different initial values for the entire word is reasonably small.

There are some other possible tricks. One is to hardcode the initial value and emit all such constant objects into a dedicated section. Then any future runtime that wants to use a new ABI can find all of the old objects and update them when they are loaded. A dedicated section would also help tools like the linker and shared cache to find these objects for optimization purposes. (That's probably less important for arrays than it is for strings. Note that C strings have their own section for just this reason.) And if these objects are read-only when the ABI is unchanged then the section helps keep the clean objects away from dirty pages.

···

On Jun 15, 2017, at 9:24 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Jun 15, 2017, at 10:52 AM, Arnold via swift-dev <swift-dev@swift.org> wrote:

On Jun 14, 2017, at 2:56 PM, Dave Abrahams via swift-dev <swift-dev@swift.org> wrote:

on Wed Jun 14 2017, Erik Eckstein <swift-dev-AT-swift.org> wrote:

I’m about to implement statically initialized arrays. The idea is to
allocate storage for arrays in the data section rather than on the
heap.

W00t! I'd like to do the same for String, i.e. encode the entire buffer
in the data section. I was looking for Array example code to follow but
couldn't find it.

We have support for constant string buffers as of PR 8701 and PR 8692. The former PR shows the protocol that has to be implemented.

(The implementation currently exposes the ref count ABI. This can/needs to be fixed when we move to a stable ABI by running a once initializer.)

It would be reasonable to arrange for a specific bit pattern to be a guaranteed "do not reference-count this" pattern even under a stable ABI. We just have to be careful about picking it.

If we didn't want to do that, the best solution would be an absolute symbol — although we'd have to use an entire word for the refcount, even on 64-bit.

--
Greg Parker gparker@apple.com Runtime Wrangler

swift_once is not the most efficient solution here - we don't need separate once token storage in this case - but it's fine for now. If necessary we can get rid of the once token requirement in a backwards-compatible way.

(You can do without a separate token if (1) your storage is two pointers or less in size, (2) your storage is sufficiently well-aligned that it does not cross cache line boundaries, (3) your uninitialized value is zero, and (4) your initialized value is not zero. If all of these are true then you can perform the same trick that swift-once and dispatch-once use on the once token, but apply it directly to the data instead. And if your data is a single pointer you can do it faster than dispatch-once on weakly-ordered platforms that are not Alpha.)
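
For the single-pointer case, the general shape of "no separate token" can be sketched in C++ as follows. This is a simpler, related variant and not the exact swift-once/dispatch-once trick: racing initializers may each build a candidate value and the loser discards its own, rather than blocking on the winner. `Key`, `slot`, `created` and `getKey` are hypothetical names. The storage itself is the state: null means uninitialized and any non-null value means initialized.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical lazily created value; a single pointer is the whole storage.
struct Key { int id; };

static std::atomic<Key *> slot{nullptr};  // zero == not yet initialized
static std::atomic<int> created{0};       // counts candidate objects built

Key *getKey() {
  // Fast path: a non-null pointer means initialization already happened.
  Key *k = slot.load(std::memory_order_acquire);
  if (k) return k;

  // Slow path: build a candidate and try to publish it with one CAS.
  Key *fresh = new Key{42};
  created.fetch_add(1, std::memory_order_relaxed);
  Key *expected = nullptr;
  if (slot.compare_exchange_strong(expected, fresh,
                                   std::memory_order_acq_rel,
                                   std::memory_order_acquire))
    return fresh;

  // Another thread won the race: drop our candidate and use the winner's.
  delete fresh;
  return expected;
}
```

The design trade-off is that the initializer must be safe to run more than once; when that is not acceptable, a blocking once primitive is still needed.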

···

On Jun 15, 2017, at 4:39 PM, Andrew Trick via swift-dev <swift-dev@swift.org> wrote:

On Jun 15, 2017, at 4:26 PM, Andrew Trick <atrick@apple.com> wrote:

On Jun 14, 2017, at 12:03 PM, Jordan Rose via swift-dev <swift-dev@swift.org> wrote:

ad 3) IRGen support

Generating statically initialized globals is already done today for structs and tuples.
What’s needed is the handling of objects.
In addition to creating the global itself, we also need a runtime call to initialize the object header. In other words: the object is statically initialized, except the header.

HeapObject *swift::swift_initImmortalObject(HeapMetadata const *metadata, HeapObject *object)

There are 2 reasons for that: first, the object header format is not part of the ABI. And second, in case of a bound generic type (e.g. array buffers) the metadata is not statically available.

One way to call this runtime function is dynamically at the global_object instruction whenever the metadata pointer is still null (via swift_once).
Another possibility would be to call it in a global constructor.

If you have any feedback, please let me know

Please do not use a global constructor.

What’s the objection to a global constructor about? We’re worried about dyld performance in this case?

:-)

:-/

Globals are already set up to handle one-time initialization; the fact that that initialization is now cheaper is still a good thing.

These array literals aren’t Swift globals to begin with so I’m not sure what that means. Introducing a swift-once accessor everywhere they’re used means we can’t do the global optimizations that we normally do for globals with constant initializers. That might be the right choice but it would be good to understand why we’re making it.

Erik answered this. I forgot we need to initialize metadata. I agree we don’t want to do that in a global constructor. The fast path won’t go through swift-once, so the accessor will be just a bit of extra code size.

--
Greg Parker gparker@apple.com Runtime Wrangler

That’s a good point. I’ll switch the runtime’s TLS key initialization over to that to save a load.

I assume the “two pointers or less” requirement is for an atomic store. And the single pointer requirement is for a barrier-free load? Although couldn’t you just atomically load two pointers?

-Andy

···

On Jun 15, 2017, at 8:30 PM, Greg Parker <gparker@apple.com> wrote:

(You can do without a separate token if (1) your storage is two pointers or less in size, (2) your storage is sufficiently well-aligned that it does not cross cache line boundaries, (3) your uninitialized value is zero, and (4) your initialized value is not zero. If all of these are true then you can perform the same trick that swift-once and dispatch-once use on the once token, but apply it directly to the data instead. And if your data is a single pointer you can do it faster than dispatch-once on weakly-ordered platforms that are not Alpha.)