[RFC] SIL syntax for debug information Part 2: Scopes and Locations

Hi Everybody,

I’d like to solicit comments on extending the textual .sil assembler
language with even more debug information.

At the moment it is only possible to test the effects that SIL
optimization passes have on debug information by observing the
effects of a full .swift -> LLVM IR compilation. To enable us
to write targeted test cases for single SIL optimization passes,
I'd like to propose a serialization format for scope and location
information for textual SIL.

The format is inspired by LLVM IR's metadata representation, but
with a couple of improvements that reduce the number of
individual metadata records and improve the overall readability
by moving the metadata definitions closer to their (first) use.

Each SIL instruction is extended by a location and scope reference.
  sil-instruction-def ::= (sil-value-name '=')? sil-instruction ',' sil-loc ',' sil-scope
  sil-loc ::= 'loc' md-name
  sil-scope ::= 'scope' md-name
  md-name ::= '!' [0-9]+

The individual metadata nodes are defined in a global context, as
they may be shared between individual SIL functions.

  decl ::= md-def
  md-def ::= md-name '=' md-node
  md-node ::= 'loc' ',' 'line' [0-9]+ ',' 'column' [0-9]+ ',' 'file' string-literal
  md-node ::= 'scope' ',' 'loc' md-name ',' 'parent' scope-parent (',' 'inlinedCallSite' md-name )?
  scope-parent ::= sil-function-name
  scope-parent ::= md-name
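
As a rough illustration of the grammar above, the following Swift sketch parses a single md-def line into a small value type. All type and function names (ScopeParent, MetadataNode, parseMDDef, and so on) are hypothetical and not part of the Swift compiler. The grammar requires a 'file' operand on 'loc' nodes, but the example output in this message also contains a 'loc' without one, so the sketch assumes the file is optional. It also assumes that operands never contain commas.

```swift
// Illustrative model of the proposed metadata records (all names are hypothetical).
enum ScopeParent: Equatable {
    case function(String)   // e.g. "@_TF9inlinedAt1hFSiSi"
    case scope(Int)         // e.g. !5
}

enum MetadataNode: Equatable {
    case loc(line: Int, column: Int, file: String?)
    case scope(loc: Int, parent: ScopeParent, inlinedCallSite: Int?)
}

// Strips leading and trailing spaces.
func trimSpaces(_ text: Substring) -> String {
    var t = text
    while t.first == " " { t = t.dropFirst() }
    while t.last == " " { t = t.dropLast() }
    return String(t)
}

// Parses an md-name such as "!42" into 42.
func parseMDName(_ text: String) -> Int? {
    guard text.hasPrefix("!") else { return nil }
    return Int(String(text.dropFirst()))
}

// Parses one md-def line. Simplification: operands must not contain commas.
func parseMDDef(_ line: String) -> (id: Int, node: MetadataNode)? {
    let halves = line.split(separator: "=", maxSplits: 1)
    guard halves.count == 2, let id = parseMDName(trimSpaces(halves[0])) else { return nil }
    let fields = halves[1].split(separator: ",").map { trimSpaces($0) }
    guard let kind = fields.first else { return nil }

    // Collect "keyword value" operands such as "line 101" or "parent !5".
    var operands: [String: String] = [:]
    for field in fields.dropFirst() {
        let parts = field.split(separator: " ", maxSplits: 1)
        guard parts.count == 2 else { return nil }
        operands[String(parts[0])] = trimSpaces(parts[1])
    }

    switch kind {
    case "loc":
        guard let lineNumber = operands["line"].flatMap({ Int($0) }),
              let column = operands["column"].flatMap({ Int($0) }) else { return nil }
        // Drop the surrounding quotes of the string literal, if a file is given.
        let file = operands["file"].map { String($0.dropFirst().dropLast()) }
        return (id: id, node: MetadataNode.loc(line: lineNumber, column: column, file: file))
    case "scope":
        guard let loc = operands["loc"].flatMap({ parseMDName($0) }),
              let parentText = operands["parent"] else { return nil }
        let parent: ScopeParent
        if parentText.hasPrefix("@") {
            parent = ScopeParent.function(parentText)
        } else if let parentID = parseMDName(parentText) {
            parent = ScopeParent.scope(parentID)
        } else {
            return nil
        }
        let callSite = operands["inlinedCallSite"].flatMap({ parseMDName($0) })
        return (id: id, node: MetadataNode.scope(loc: loc, parent: parent, inlinedCallSite: callSite))
    default:
        return nil
    }
}
```

For instance, parseMDDef("!5 = scope, loc !4, parent !3") would yield id 5 with a scope node whose parent is another scope, while a parent starting with '@' is kept as a function name.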

Let me know what you think!

-- adrian

PS:
Below is an example of what this would look like in practice for the following program:

Swift source code
-----------------

#line 100 "abc.swift"
@inline(__always)
public func h(k : Int) -> Int { // 101
  return k // 102
}

#line 200 "abc.swift"
@inline(__always)
public func g(j : Int) -> Int { // 201
  return h(j) // 202
}

#line 301 "abc.swift"
public func f(i : Int) -> Int { // 301
  return g(i) // 302
}

Verbose SIL output
------------------

Note that metadata is defined before its first use:

!1 = loc, line 0, column 0
!2 = scope, loc !1, parent @main
[...]
!3 = loc, line 101, column 15, file "abc.swift"
!4 = loc, line 101, column 13, file "abc.swift"
!5 = scope, loc !4, parent @_TF9inlinedAt1hFSiSi
!6 = loc, line 102, column 3, file "abc.swift"
!7 = loc, line 103, column 1, file "abc.swift"
!8 = scope, loc !7, parent !5

// h(Int) -> Int
sil [always_inline] @_TF9inlinedAt1hFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "k", argno 1, loc !3, scope !5 // id: %1 line:101:15:in_prologue
  return %0 : $Int, loc !6, scope !8 // id: %2 line:102:3:return
}

!9 = loc, line 201, column 15, file "abc.swift"
!10 = loc, line 201, column 13, file "abc.swift"
!11 = scope, loc !10, parent @_TF9inlinedAt1gFSiSi
!12 = loc, line 203, column 1, file "abc.swift"
!13 = scope, loc !12, parent !11
!14 = loc, line 202, column 13, file "abc.swift"
!15 = scope, loc !14, parent !13
!16 = scope, loc !4, parent !5, inlinedCallSite !15
!17 = loc, line 202, column 3, file "abc.swift"

// g(Int) -> Int
sil [always_inline] @_TF9inlinedAt1gFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "j", argno 1, loc !9, scope !11 // id: %1 line:201:15:in_prologue
  debug_value %0 : $Int, let, name "k", argno 1, loc !3, scope !16 // id: %2 line:101:15:in_prologue: h(Int) -> Int perf_inlined_at /Volumes/Data/swift/swift/test/DebugInfo/inlinedAt.swift:202:10
  return %0 : $Int, loc !17, scope !13 // id: %3 line:202:3:return
}

!18 = loc, line 301, column 15, file "abc.swift"
!19 = loc, line 301, column 13, file "abc.swift"
!20 = scope, loc !19, parent @_TF9inlinedAt1fFSiSi
!21 = loc, line 303, column 1, file "abc.swift"
!22 = scope, loc !21, parent !20
!23 = loc, line 302, column 13, file "abc.swift"
!24 = scope, loc !23, parent !22
!25 = scope, loc !10, parent !11, inlinedCallSite !24
!26 = scope, loc !4, parent !16, inlinedCallSite !24
!27 = loc, line 302, column 3, file "abc.swift"

I'm not a fan of generic metadata syntax. I think there's little enough information here that we should just print and parse it inline, something like this maybe:

debug_value %0 : $Int, let, name "k", argno 1,
  loc "foo.swift":line:column,
  scope @foo -> "foo.swift":parentLine:column -> "foo.swift":line:column

which I think is easier to read and test. I always have trouble mentally piecing together the DAG when updating LLVM debug info tests. If you don't think that's practical, I would still prefer proper declarations to metadata syntax:

debug_value <...>, scope 123, loc 456

sil_scope 123 { parent_scope 124 loc 457 }
sil_loc 456 { "foo.swift":line:column }

-Joe

···

On Feb 10, 2016, at 11:34 AM, Adrian Prantl <aprantl@apple.com> wrote:

[...]

I'm not a fan of generic metadata syntax. I think there's little enough information here that we should just print and parse it inline, something like this maybe:

debug_value %0 : $Int, let, name "k", argno 1,
loc "foo.swift":line:column,
scope @foo -> "foo.swift":parentLine:column -> "foo.swift":line:column

which I think is easier to read and test. I always have trouble mentally piecing together the DAG when updating LLVM debug info tests. If you don't think that's practical,

The main problem is that the inline information in the scopes forms a tree, and explicitly printing it would quickly explode. It would also make ingesting the SIL complicated because we’d need to unique the scopes before building the instructions.
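
To make the uniquing step mentioned above concrete, here is a minimal Swift sketch of what a parser would have to do if every instruction spelled out its scope inline: structurally identical scopes must be mapped to one shared entry. ScopeKey and ScopeTable are hypothetical names used only for illustration, and the choice of fields is an assumption, not the compiler's actual representation.

```swift
// Hypothetical structural description of a scope; the field choice is an assumption.
struct ScopeKey: Hashable {
    var file: String
    var line: Int
    var column: Int
    var parent: Int?      // index of the parent scope, nil for a function-level scope
    var inlinedAt: Int?   // index of the call-site scope, nil if not inlined
}

// Interning table: each distinct ScopeKey is stored exactly once.
struct ScopeTable {
    private(set) var scopes: [ScopeKey] = []
    private var indexByKey: [ScopeKey: Int] = [:]

    // Returns the index of an identical, previously seen scope,
    // or appends the key and returns its new index.
    mutating func intern(_ key: ScopeKey) -> Int {
        if let existing = indexByKey[key] {
            return existing
        }
        let newIndex = scopes.count
        scopes.append(key)
        indexByKey[key] = newIndex
        return newIndex
    }
}
```

With numbered scope declarations, the numbers already play the role of these indices, so the parser would not need such a table just to recover sharing.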

I would still prefer proper declarations to metadata syntax:

debug_value <...>, scope 123, loc 456

Right, the ‘!’ prefix is redundant because the parser already knows that the next token is a scope.

sil_scope 123 { parent_scope 124 loc 457 }

I tried to model the syntax after how SIL instructions are represented. That said, using curly braces here instead of a comma-separated list of operands is a lot more readable.

What about an even more dictionary-like syntax:

  sil_scope 123 { parent_scope: 124, loc: 457 }
  sil_scope 124 { parent_function: @func, loc: 457 }

?

sil_loc 456 { "foo.swift":line:column }

I like this one for its compactness! Since the locations are leaf nodes, we could also inline them everywhere. The filenames might still cause the lines to become very long, but it’s worth trying.

thanks,
adrian

···

On Feb 10, 2016, at 11:43 AM, Joe Groff <jgroff@apple.com> wrote:

[...]

What about the following updated syntax (with inlined locations and first-class sil_scope declarations)?

-- adrian

sil_scope 2 { loc "abc.swift":101:13, parent @_TF9inlinedAt1hFSiSi }
sil_scope 3 { loc "abc.swift":103:1, parent 2 }
                                // ^ I’m unsure about this comma, but it appears to be more readable.

// h(Int) -> Int
sil [always_inline] @_TF9inlinedAt1hFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 2 // id: %1
  return %0 : $Int, loc "abc.swift":102:3, scope 3 // id: %2
}

sil_scope 4 { loc "abc.swift":201:13, parent @_TF9inlinedAt1gFSiSi }
sil_scope 5 { loc "abc.swift":203:1, parent 4 }
sil_scope 6 { loc "abc.swift":202:13, parent 5 }
sil_scope 7 { loc "abc.swift":101:13, parent 2, inlined_at 6 }

// g(Int) -> Int
sil [always_inline] @_TF9inlinedAt1gFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "j", argno 1, loc "abc.swift":201:15, scope 4 // id: %1
  debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 7 // id: %2
  return %0 : $Int, loc "abc.swift":202:3, scope 5 // id: %3
}

sil_scope 8 { loc "abc.swift":301:13, parent @_TF9inlinedAt1fFSiSi }
sil_scope 9 { loc "abc.swift":303:1, parent 8 }
sil_scope 10 { loc "abc.swift":302:13, parent 9 }
sil_scope 11 { loc "abc.swift":201:13, parent 4, inlined_at 10 }
sil_scope 12 { loc "abc.swift":101:13, parent 7, inlined_at 10 }

// f(Int) -> Int
sil @_TF9inlinedAt1fFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3, %4
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "i", argno 1, loc "abc.swift":301:15, scope 8 // id: %1
  debug_value %0 : $Int, let, name "j", argno 1, loc "abc.swift":201:15, scope 11 // id: %2
  debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 12 // id: %3
  return %0 : $Int, loc "abc.swift":302:3, scope 9 // id: %4
}
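
To make the parent and inlined_at links in the example above easier to follow, here is a small Swift sketch that transcribes the sil_scope declarations into a table and walks them. ParentRef, ScopeRecord, enclosingFunction and inlinedAtChain are hypothetical names. The code only follows the links as written in the example and makes no claim about how the compiler itself interprets them.

```swift
// Hypothetical in-memory form of the sil_scope declarations from the example.
enum ParentRef {
    case function(String)
    case scope(Int)
}

struct ScopeRecord {
    var loc: String
    var parent: ParentRef
    var inlinedAt: Int?
}

let exampleScopes: [Int: ScopeRecord] = [
    2: ScopeRecord(loc: "abc.swift:101:13", parent: .function("@_TF9inlinedAt1hFSiSi"), inlinedAt: nil),
    3: ScopeRecord(loc: "abc.swift:103:1", parent: .scope(2), inlinedAt: nil),
    4: ScopeRecord(loc: "abc.swift:201:13", parent: .function("@_TF9inlinedAt1gFSiSi"), inlinedAt: nil),
    5: ScopeRecord(loc: "abc.swift:203:1", parent: .scope(4), inlinedAt: nil),
    6: ScopeRecord(loc: "abc.swift:202:13", parent: .scope(5), inlinedAt: nil),
    7: ScopeRecord(loc: "abc.swift:101:13", parent: .scope(2), inlinedAt: 6),
    8: ScopeRecord(loc: "abc.swift:301:13", parent: .function("@_TF9inlinedAt1fFSiSi"), inlinedAt: nil),
    9: ScopeRecord(loc: "abc.swift:303:1", parent: .scope(8), inlinedAt: nil),
    10: ScopeRecord(loc: "abc.swift:302:13", parent: .scope(9), inlinedAt: nil),
    11: ScopeRecord(loc: "abc.swift:201:13", parent: .scope(4), inlinedAt: 10),
    12: ScopeRecord(loc: "abc.swift:101:13", parent: .scope(7), inlinedAt: 10),
]

// Follows parent links until a function is reached.
// Returns nil for an unknown scope number or a cyclic table.
func enclosingFunction(of id: Int, in scopes: [Int: ScopeRecord]) -> String? {
    var current = id
    var steps = 0
    while let record = scopes[current], steps <= scopes.count {
        switch record.parent {
        case .function(let name):
            return name
        case .scope(let next):
            current = next
        }
        steps += 1
    }
    return nil
}

// Collects every inlined_at reference found while walking up the parent links.
func inlinedAtChain(of id: Int, in scopes: [Int: ScopeRecord]) -> [Int] {
    var result: [Int] = []
    var current = id
    var steps = 0
    while let record = scopes[current], steps <= scopes.count {
        if let site = record.inlinedAt {
            result.append(site)
        }
        guard case .scope(let next) = record.parent else { break }
        current = next
        steps += 1
    }
    return result
}
```

For scope 12, the parent links lead through scopes 7 and 2 to @_TF9inlinedAt1hFSiSi, and the inlined_at references met on the way are 10 and then 6.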

···

On Feb 10, 2016, at 12:02 PM, Adrian Prantl via swift-dev <swift-dev@swift.org> wrote:

[...]

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

What about the following updated syntax (with inlined locations and first-class sil_scope declarations)?

-- adrian

sil_scope 2 { loc "abc.swift":101:13, parent @_TF9inlinedAt1hFSiSi }
sil_scope 3 { loc "abc.swift":103:1, parent 2 }
                               // ^ I’m unsure about this comma, but it appears to be more readable.

Looks good. In other places, like sil_witness_table, we don't use comma separators and print with newlines and indentation:

sil_scope 3 {
  loc "abc.swift":103:1
  parent 2
}

The information here is probably compact enough not to need the newlines, but leaving off commas would be a bit more consistent.
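
One way to see that the commas carry no information is a parser that treats commas, braces, and newlines alike as separators, so the one-line and the indented forms produce the same result. The Swift sketch below is only an illustration: SILScopeDecl and parseScopeDecl are hypothetical names, and it assumes file names contain no spaces, commas, or colons.

```swift
// Hypothetical parsed form of a sil_scope declaration.
struct SILScopeDecl: Equatable {
    var id: Int
    var file: String
    var line: Int
    var column: Int
    var parent: String    // a scope number or an "@function" name, kept as text
    var inlinedAt: Int?
}

// Parses "sil_scope N { loc "file":line:column parent P inlined_at M }".
// Commas and newlines between the fields are accepted but not required.
func parseScopeDecl(_ text: String) -> SILScopeDecl? {
    let separators: Set<Character> = [" ", ",", "{", "}", "\n", "\t"]
    let tokens = text.split(whereSeparator: { separators.contains($0) }).map { String($0) }
    guard tokens.count >= 6, tokens[0] == "sil_scope", let id = Int(tokens[1]) else {
        return nil
    }

    // The remaining tokens are "keyword value" pairs.
    var fields: [String: String] = [:]
    var i = 2
    while i + 1 < tokens.count {
        fields[tokens[i]] = tokens[i + 1]
        i += 2
    }
    guard let locText = fields["loc"], let parent = fields["parent"] else { return nil }

    // Split "file":line:column into its three parts.
    let parts = locText.split(separator: ":")
    guard parts.count == 3, let line = Int(parts[1]), let column = Int(parts[2]) else {
        return nil
    }
    let file = String(parts[0].dropFirst().dropLast())   // strip the quotes
    let inlinedAt = fields["inlined_at"].flatMap { Int($0) }
    return SILScopeDecl(id: id, file: file, line: line, column: column,
                        parent: parent, inlinedAt: inlinedAt)
}
```

Because the keywords loc, parent, and inlined_at already delimit the fields, dropping the commas does not make the declaration ambiguous under these assumptions.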

···

On Feb 16, 2016, at 12:18 PM, Adrian Prantl <aprantl@apple.com> wrote:

[...]

What about the following updated syntax (with inlined locations and first-class sil_scope declarations)?

-- adrian

sil_scope 2 { loc "abc.swift":101:13, parent @_TF9inlinedAt1hFSiSi }
sil_scope 3 { loc "abc.swift":103:1, parent 2 }
                              // ^ I’m unsure about this comma, but it appears to be more readable.

Looks good. In other places, like sil_witness_table, we don't use comma separators and print with newlines and indentation:

sil_scope 3 {
  loc "abc.swift":103:1
  parent 2
}

The information here is probably compact enough not to need the newlines, but leaving off commas would be a bit more consistent.

Ok. I’ll leave them off to keep the grammar sane.

thanks,
adrian

···

On Feb 16, 2016, at 12:21 PM, Joe Groff <jgroff@apple.com> wrote:

On Feb 16, 2016, at 12:18 PM, Adrian Prantl <aprantl@apple.com> wrote:

// h(Int) -> Int
sil [always_inline] @_TF9inlinedAt1hFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2
bb0(%0 : $Int):
debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 2 // id: %1
return %0 : $Int, loc "abc.swift":102:3, scope 3 // id: %2
}

sil_scope 4 { loc "abc.swift":201:13, parent @_TF9inlinedAt1gFSiSi }
sil_scope 5 { loc "abc.swift":203:1, parent 4 }
sil_scope 6 { loc "abc.swift":202:13, parent 5 }
sil_scope 7 { loc "abc.swift":101:13, parent 2, inlined_at 6 }

// g(Int) -> Int
sil [always_inline] @_TF9inlinedAt1gFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "j", argno 1, loc "abc.swift":201:15, scope 4 // id: %1
  debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 7 // id: %2
  return %0 : $Int, loc "abc.swift":202:3, scope 5 // id: %3
}

sil_scope 8 { loc "abc.swift":301:13, parent @_TF9inlinedAt1fFSiSi }
sil_scope 9 { loc "abc.swift":303:1, parent 8 }
sil_scope 10 { loc "abc.swift":302:13, parent 9 }
sil_scope 11 { loc "abc.swift":201:13, parent 4, inlined_at 10 }
sil_scope 12 { loc "abc.swift":101:13, parent 7, inlined_at 10 }

// f(Int) -> Int
sil @_TF9inlinedAt1fFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3, %4
bb0(%0 : $Int):
  debug_value %0 : $Int, let, name "i", argno 1, loc "abc.swift":301:15, scope 8 // id: %1
  debug_value %0 : $Int, let, name "j", argno 1, loc "abc.swift":201:15, scope 11 // id: %2
  debug_value %0 : $Int, let, name "k", argno 1, loc "abc.swift":101:15, scope 12 // id: %3
  return %0 : $Int, loc "abc.swift":302:3, scope 9 // id: %4
}
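To see how the parent and inlined_at links in the example above fit together, the following sketch transcribes the sil_scope declarations into a table and walks it. The table layout and the two helper names are hypothetical, and the order in which call sites are reported is simply the order in which they are met along the parent chain, not a statement about how the compiler interprets them.

```python
# Scope table transcribed from the sil_scope declarations in this message:
# id -> (parent, inlined_at). A parent is either another scope id or a
# function name; inlined_at is a scope id or None.
SCOPES = {
    2: ("@_TF9inlinedAt1hFSiSi", None),
    3: (2, None),
    4: ("@_TF9inlinedAt1gFSiSi", None),
    5: (4, None),
    6: (5, None),
    7: (2, 6),
    8: ("@_TF9inlinedAt1fFSiSi", None),
    9: (8, None),
    10: (9, None),
    11: (4, 10),
    12: (7, 10),
}


def enclosing_function(scope_id, scopes=SCOPES):
    """Follow parent links until a function name is reached."""
    seen = set()
    current = scope_id
    while not isinstance(current, str):
        if current in seen:
            raise ValueError(f"cycle in parent chain at scope {current}")
        seen.add(current)
        current, _ = scopes[current]
    return current


def inlined_call_sites(scope_id, scopes=SCOPES):
    """Collect the inlined_at scopes met while walking the parent chain."""
    sites = []
    current = scope_id
    while not isinstance(current, str):
        parent, inlined_at = scopes[current]
        if inlined_at is not None:
            sites.append(inlined_at)
        current = parent
    return sites


if __name__ == "__main__":
    # Scope 12 is the copy of h's scope that ends up in f.
    print(enclosing_function(12), inlined_call_sites(12))
```

For scope 12 this reports h's function name together with scopes 10 and 6, which are the call sites at 302:13 in f and 202:13 in g.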

On Feb 10, 2016, at 12:02 PM, Adrian Prantl via swift-dev <swift-dev@swift.org> wrote:

On Feb 10, 2016, at 11:43 AM, Joe Groff <jgroff@apple.com> wrote:

I'm not a fan of generic metadata syntax. I think there's little enough information here that we should just print and parse it inline, something like this maybe:

debug_value %0 : $Int, let, name "k", argno 1,
loc "foo.swift":line:column,
scope @foo -> "foo.swift":parentLine:column -> "foo.swift":line:column

which I think is easier to read and test. I always have trouble mentally piecing together the DAG when updating LLVM debug info tests. If you don't think that's practical,

The main problem is that the inline information in the scopes forms a tree, and printing it explicitly would quickly explode. It would also complicate ingesting the SIL, because we’d need to unique the scopes before building the instructions.

I would still prefer proper declarations to metadata syntax:

debug_value <...>, scope 123, loc 456

Right, the ‘!’ prefix is redundant because the parser already knows that the next token is a scope.

sil_scope 123 { parent_scope 124 loc 457 }

I tried to model the syntax after how SIL instructions are represented. That said, using curly braces here instead of a comma-separated list of operands is a lot more readable.

What about an even more dictionary-like syntax:

sil_scope 123 { parent_scope: 124, loc: 457 }
sil_scope 124 { parent_function: @func, loc: 457 }

?

sil_loc 456 { "foo.swift":line:column }

I like this one for its compactness! Since the locations are leaf nodes, we could also inline them everywhere. The filenames might still cause the lines to become very long, but it’s worth trying.
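As a rough illustration of what inlining the locations means for a reader of the text, this sketch splits an instruction line into the instruction proper, its inline location, and its scope number. It assumes the trailing `loc "file":line:column, scope N` form used in the follow-up examples in this thread; split_debug_trailer is a hypothetical helper, not part of any SIL tooling.

```python
import re

# Matches a trailing `, loc "file":line:col, scope N`, optionally followed
# by a `//` comment, at the end of an instruction line.
_TRAILER_RE = re.compile(
    r',\s*loc\s+"(?P<file>[^"]*)":(?P<line>\d+):(?P<col>\d+)'
    r'\s*,\s*scope\s+(?P<scope>\d+)\s*(?://.*)?$'
)


def split_debug_trailer(instruction: str):
    """Return (instruction_without_trailer, (file, line, column), scope_id)."""
    m = _TRAILER_RE.search(instruction)
    if m is None:
        raise ValueError("no 'loc ..., scope N' trailer found")
    loc = (m["file"], int(m["line"]), int(m["col"]))
    return instruction[:m.start()].rstrip(), loc, int(m["scope"])


if __name__ == "__main__":
    line = 'return %0 : $Int, loc "abc.swift":102:3, scope 3 // id: %2'
    print(split_debug_trailer(line))
```

Because the location is a leaf, nothing has to be looked up to interpret it; only the scope number still refers to a separate declaration.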

thanks,
adrian

-Joe

On Feb 10, 2016, at 11:34 AM, Adrian Prantl <aprantl@apple.com> wrote:

Hi Everybody,

I’d like to solicit comments on extending the textual .sil assembler
language with even more debug information.

At the moment it is only possible to test the effects that SIL
optimization passes have on debug information by observing the
effects of a full .swift -> LLVM IR compilation. To enable us
to write targeted test cases for individual SIL optimization passes,
I'd like to propose a serialization format for scope and location
information in textual SIL.

The format is inspired by LLVM IR's metadata representation, but
with a couple of improvements that reduce the number of
individual metadata records and improve the overall readability
by moving each metadata definition closer to its (first) use.

Each SIL instruction is extended by a location and scope reference.
  sil-instruction-def ::= (sil-value-name '=')? sil-instruction, sil-loc, sil-scope
  sil-loc ::= 'loc' md-name
  sil-scope ::= 'scope' md-name
  md-name ::= '!' [0-9]+

The individual metadata nodes are defined in a global context, as
they may be shared between individual SIL functions.

  decl ::= md-def
  md-def ::= md-name '=' md-node
  md-node ::= 'loc' ',' 'line' [0-9]+ ',' 'column' [0-9]+ ',' 'file' string-literal
  md-node ::= 'scope' ',' 'loc' md-name ',' 'parent' scope-parent (',' 'inlinedCallSite' md-name )?
  scope-parent ::= sil-function-name
  scope-parent ::= md-name
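The md-def grammar above can be exercised with a small recognizer like the following sketch. The function parse_md_def and the tuple shapes it returns are made up for illustration. One deliberate deviation from the grammar as written: the 'file' part of a loc node is treated as optional, because the example output later in this message contains a node without it (`!1 = loc, line 0, column 0`).

```python
import re

# md-node ::= 'loc' ',' 'line' N ',' 'column' N (',' 'file' string-literal)
# The file part is optional here (an assumption, see the note above).
_LOC_RE = re.compile(
    r'!(?P<id>\d+)\s*=\s*loc\s*,\s*line\s+(?P<line>\d+)\s*,\s*column\s+(?P<col>\d+)'
    r'(?:\s*,\s*file\s+"(?P<file>[^"]*)")?$'
)

# md-node ::= 'scope' ',' 'loc' md-name ',' 'parent' scope-parent
#             (',' 'inlinedCallSite' md-name)?
_SCOPE_RE = re.compile(
    r'!(?P<id>\d+)\s*=\s*scope\s*,\s*loc\s+!(?P<loc>\d+)\s*,\s*parent\s+'
    r'(?P<parent>@\w+|!\d+)(?:\s*,\s*inlinedCallSite\s+!(?P<site>\d+))?$'
)


def parse_md_def(line: str):
    """Parse one md-def line into a ('loc', ...) or ('scope', ...) tuple."""
    line = line.strip()
    m = _LOC_RE.match(line)
    if m:
        return ("loc", int(m["id"]), int(m["line"]), int(m["col"]), m["file"])
    m = _SCOPE_RE.match(line)
    if m:
        parent = m["parent"]
        # A parent is either a function name (kept as "@name") or an md-name.
        parent = parent if parent.startswith("@") else int(parent[1:])
        site = int(m["site"]) if m["site"] else None
        return ("scope", int(m["id"]), int(m["loc"]), parent, site)
    raise ValueError(f"not an md-def: {line!r}")


if __name__ == "__main__":
    print(parse_md_def('!16 = scope, loc !4, parent !5, inlinedCallSite !15'))
```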

Let me know what you think!

-- adrian

PS:
Below is an example of what this would look like in practice for the following program:

Swift source code
-----------------

#line 100 "abc.swift"
@inline(__always)
public func h(k : Int) -> Int { // 101
  return k // 102
}

#line 200 "abc.swift"
@inline(__always)
public func g(j : Int) -> Int { // 201
  return h(j) // 202
}

#line 301 "abc.swift"
public func f(i : Int) -> Int { // 301
  return g(i) // 302
}

Verbose SIL output
------------------

Note that metadata is defined before its first use:

!1 = loc, line 0, column 0
!2 = scope, loc !1, parent @main
[...]
!3 = loc, line 101, column 15, file "abc.swift"
!4 = loc, line 101, column 13, file "abc.swift"
!5 = scope, loc !4, parent @_TF9inlinedAt1hFSiSi
!6 = loc, line 102, column 3, file "abc.swift"
!7 = loc, line 103, column 1, file "abc.swift"
!8 = scope, loc !7, parent !5

// h(Int) -> Int
sil [always_inline] @_TF9inlinedAt1hFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2
bb0(%0 : $Int):
debug_value %0 : $Int, let, name "k", argno 1, loc !3, scope !5 // id: %1 line:101:15:in_prologue
return %0 : $Int, loc !6, scope !8 // id: %2 line:102:3:return
}

!9 = loc, line 201, column 15, file "abc.swift"
!10 = loc, line 201, column 13, file "abc.swift"
!11 = scope, loc !10, parent @_TF9inlinedAt1gFSiSi
!12 = loc, line 203, column 1, file "abc.swift"
!13 = scope, loc !12, parent !11
!14 = loc, line 202, column 13, file "abc.swift"
!15 = scope, loc !14, parent !13
!16 = scope, loc !4, parent !5, inlinedCallSite !15
!17 = loc, line 202, column 3, file "abc.swift"

// g(Int) -> Int
sil [always_inline] @_TF9inlinedAt1gFSiSi : $@convention(thin) (Int) -> Int {
// %0 // users: %1, %2, %3
bb0(%0 : $Int):
debug_value %0 : $Int, let, name "j", argno 1, loc !9, scope !11 // id: %1 line:201:15:in_prologue
debug_value %0 : $Int, let, name "k", argno 1, loc !3, scope !16 // id: %2 line:101:15:in_prologue: h(Int) -> Int perf_inlined_at /Volumes/Data/swift/swift/test/DebugInfo/inlinedAt.swift:202:10
return %0 : $Int, loc !17, scope !13 // id: %3 line:202:3:return
}

!18 = loc, line 301, column 15, file "abc.swift"
!19 = loc, line 301, column 13, file "abc.swift"
!20 = scope, loc !19, parent @_TF9inlinedAt1fFSiSi
!21 = loc, line 303, column 1, file "abc.swift"
!22 = scope, loc !21, parent !20
!23 = loc, line 302, column 13, file "abc.swift"
!24 = scope, loc !23, parent !22
!25 = scope, loc !10, parent !11, inlinedCallSite !24
!26 = scope, loc !4, parent !16, inlinedCallSite !24
!27 = loc, line 302, column 3, file "abc.swift"
