SIL proposal: remove multiple results from SILValue

Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

Erik

This would mean generating a lot of extra instructions for a lot of very common cases, like alloc_stack.

Was your bug related to the implicit conversion of a SILInstruction* -> SILValue that silently uses result 0? I’ve been thinking for a while that we should make that conditional on the SILInstruction having exactly one result.

John.

···

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com> wrote:
Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

Like John said, this would cause us to generate a lot more SIL IR, and potentially more LLVM IR as well, since the projected address is produced as part of the lowering of alloc_stack and alloc_box. Multiple values are handy for things like alloc_stack and alloc_box where both values are immediately used in the common case. Looking forward, I think we'd also like function applications to be able to produce arbitrarily many values at the SIL level, which would save a bunch of tuple imploding and exploding that's currently necessary.

-Joe

···

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com> wrote:

Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

This would mean generating a lot of extra instructions for a lot of very common cases, like alloc_stack.

A quick grep on the optimized stdlib sil shows that we would need only about 5% more instructions. And probably a smaller fraction of more instruction-related memory, because the project_* instructions are small.
About compile time, I'm not sure. On the one hand we will have more instructions, on the other hand we would not have to mask out the ValueDef/ResultNumbers from SILValue every time we use it.
IMO this is a reasonable tradeoff for making the SIL simpler.

Was your bug related to the implicit conversion of a SILInstruction* -> SILValue that silently uses result 0? I’ve been thinking for a while that we should make that conditional on the SILInstruction having exactly one result.

Some sort of: it was a value.getDef() which should just be the value.

···

On Dec 4, 2015, at 9:34 AM, John McCall <rjmccall@apple.com> wrote:

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com> wrote:

John.

Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

This would mean generating a lot of extra instructions for a lot of very common cases, like alloc_stack.

A quick grep on the optimized stdlib sil shows that we would need only about 5% more instructions.

I have no idea what that’s supposed to tell us, though, because that’s purely the end result of an optimization pipeline that’s heavily focused on removing alloc_stack, alloc_box, etc.

About compile time, I'm not sure. On the one hand we will have more instructions, on the other hand we would not have to mask out the ValueDef/ResultNumbers from SILValue every time we use it.

If that’s important (and masking is really cheap, but maybe it’s important), there are other representational differences we could make that would optimize that. For example, we could split the use lists, so that each result stores its type and use list (meaning that instructions without results don’t need use lists at all).

IMO this is a reasonable tradeoff for making the SIL simpler.

Was your bug related to the implicit conversion of a SILInstruction* -> SILValue that silently uses result 0? I’ve been thinking for a while that we should make that conditional on the SILInstruction having exactly one result.

Some sort of: it was a value.getDef() which should just be the value.

Okay, but that’s what I mean: we should go through and make sure that its generally not possible to accidentally use a SILInstruction with multiple (or zero) results in any position where you really need a specific SILValue.

John.

···

On Dec 4, 2015, at 9:51 AM, Erik Eckstein <eeckstein@apple.com> wrote:

On Dec 4, 2015, at 9:34 AM, John McCall <rjmccall@apple.com> wrote:

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com> wrote:

ad alloc_box:

We have the project_box instruction (which is needed for Joe's upcoming change).
So it's actually a redundancy in SIL that we can obtain the same "value" in 2 completely different ways:

%1 = alloc_box
%2 = project_box %1#0
// %2 == %1#1

ad alloc_stack:

We need the 2 result values only because of dealloc_stack.
AFAIK we don't use dealloc_stack currently anywhere in SIL and also not in IRGen.
We verify that the dealloc_stacks are on the right place, i.e. are properly nested, so we could also dynamically compute this information.
So let me ask this question: do we really need dealloc_stack?
If no, we also don't need a project_stack for alloc_stack.

Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

This would mean generating a lot of extra instructions for a lot of very common cases, like alloc_stack.

A quick grep on the optimized stdlib sil shows that we would need only about 5% more instructions.

I have no idea what that’s supposed to tell us, though, because that’s purely the end result of an optimization pipeline that’s heavily focused on removing alloc_stack, alloc_box, etc.

The optimized SIL is much larger than the SILGen'd SIL, so it's more relevant for memory consumption. But yes, it does not tell us what's the status in between.

···

On Dec 4, 2015, at 10:26 AM, John McCall <rjmccall@apple.com> wrote:

On Dec 4, 2015, at 9:51 AM, Erik Eckstein <eeckstein@apple.com> wrote:

On Dec 4, 2015, at 9:34 AM, John McCall <rjmccall@apple.com> wrote:

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com> wrote:

About compile time, I'm not sure. On the one hand we will have more instructions, on the other hand we would not have to mask out the ValueDef/ResultNumbers from SILValue every time we use it.

If that’s important (and masking is really cheap, but maybe it’s important), there are other representational differences we could make that would optimize that. For example, we could split the use lists, so that each result stores its type and use list (meaning that instructions without results don’t need use lists at all).

IMO this is a reasonable tradeoff for making the SIL simpler.

Was your bug related to the implicit conversion of a SILInstruction* -> SILValue that silently uses result 0? I’ve been thinking for a while that we should make that conditional on the SILInstruction having exactly one result.

Some sort of: it was a value.getDef() which should just be the value.

Okay, but that’s what I mean: we should go through and make sure that its generally not possible to accidentally use a SILInstruction with multiple (or zero) results in any position where you really need a specific SILValue.

John.

ad alloc_box:

We have the project_box instruction (which is needed for Joe's upcoming change).
So it's actually a redundancy in SIL that we can obtain the same "value" in 2 completely different ways:

%1 = alloc_box
%2 = project_box %1#0
// %2 == %1#1

ad alloc_stack:

We need the 2 result values only because of dealloc_stack.
AFAIK we don't use dealloc_stack currently anywhere in SIL and also not in IRGen.

dealloc_stack actually expands to code when the variable is dynamically-sized, and it will also be used for lifetime markers when somebody gets around to it. Lifetime markers are a surprisingly important optimization.

We verify that the dealloc_stacks are on the right place, i.e. are properly nested, so we could also dynamically compute this information.

Er, no, if we removed all the dealloc_stacks, there wouldn’t be anything that allowed us to recreate that short of dominance frontiers.

So let me ask this question: do we really need dealloc_stack?
If no, we also don't need a project_stack for alloc_stack.

It is certainly true that we could implement a project_stack operation.

Please take the idea about giving each result an independent use list seriously, though. How many of these problems would that address?

John.

···

On Dec 4, 2015, at 10:58 AM, Erik Eckstein <eeckstein@apple.com> wrote:

On Dec 4, 2015, at 10:26 AM, John McCall <rjmccall@apple.com <mailto:rjmccall@apple.com>> wrote:

On Dec 4, 2015, at 9:51 AM, Erik Eckstein <eeckstein@apple.com <mailto:eeckstein@apple.com>> wrote:

On Dec 4, 2015, at 9:34 AM, John McCall <rjmccall@apple.com <mailto:rjmccall@apple.com>> wrote:

On Dec 4, 2015, at 9:06 AM, Erik Eckstein <eeckstein@apple.com <mailto:eeckstein@apple.com>> wrote:
Hi,

currently some instructions, like alloc_stack and alloc_box (I think they are the only ones) return 2 values. For this we have the "result number" in SILValue, e.g. %3#0, %3#1

We could represent the same by using a projection instruction, e.g.

%3 = alloc_stack $Type // == the old %3#0
%4 = project_stack %1 // == the old %3#1

And actually we already have the project_box instruction.

This would simplify SILValues and many places in optimization passes/analysis. Yesterday I spent several hours to debug a problem where in some pass the wrong result number was used.

What do you think?

This would mean generating a lot of extra instructions for a lot of very common cases, like alloc_stack.

A quick grep on the optimized stdlib sil shows that we would need only about 5% more instructions.

I have no idea what that’s supposed to tell us, though, because that’s purely the end result of an optimization pipeline that’s heavily focused on removing alloc_stack, alloc_box, etc.

The optimized SIL is much larger than the SILGen'd SIL, so it's more relevant for memory consumption. But yes, it does not tell us what's the status in between.

About compile time, I'm not sure. On the one hand we will have more instructions, on the other hand we would not have to mask out the ValueDef/ResultNumbers from SILValue every time we use it.

If that’s important (and masking is really cheap, but maybe it’s important), there are other representational differences we could make that would optimize that. For example, we could split the use lists, so that each result stores its type and use list (meaning that instructions without results don’t need use lists at all).

IMO this is a reasonable tradeoff for making the SIL simpler.

Was your bug related to the implicit conversion of a SILInstruction* -> SILValue that silently uses result 0? I’ve been thinking for a while that we should make that conditional on the SILInstruction having exactly one result.

Some sort of: it was a value.getDef() which should just be the value.

Okay, but that’s what I mean: we should go through and make sure that its generally not possible to accidentally use a SILInstruction with multiple (or zero) results in any position where you really need a specific SILValue.

John.