Potential contributions to compilation time reporting?

Hello all!

I'm looking for a body of work to do on the Swift compiler for an upcoming
programmer retreat I'm attending in January [1]. I've read a lot of blog
posts with tips for diagnosing slow Swift compile times [2], and I was
wondering if I could contribute to tooling in this area.

(Just to make it clear: I'm talking about improving and expanding the set
of tools developers have to figure out why their projects take a long time
to compile. I'm *not* talking about working on speeding Swift compile times
-- although the tools may indirectly help with that.)

# Current state: many options, but few "supported" ones

Most of the blog posts I've been reading talk about using the
`-driver-time-compilation`, `-debug-time-compilation`,
`-debug-time-function-bodies`, `-warn-long-function-bodies=`, and
`-warn-long-expression-type-checking=` options. I saved some sample output
from each of these options here
<https://gist.github.com/modocache/3fb21f5dc7fec6f300cacbe6c74c59d2>.

Using these options, developers can find function bodies and expressions
that took longer to compile than others.

However, it should be noted that, of these options, only the first is a
user-facing "supported" option. The others are `swift -frontend` options,
and as such the Swift team has been clear that this means the options may
be changed or removed at any point in the future [3].

What's more, several contributors have noted the behavior of the options
themselves could also be improved. Here's what I gathered from reading
several JIRA bugs, commit messages, and mailing list discussions:

- SR-2910 <https://bugs.swift.org/browse/SR-2910> points out that
`-debug-time-function-bodies` prints just `get {}` and `set {}` for struct
getters and setters, and that this could be improved by printing the
variable name as well.
- The commit that added `-warn-long-function-message=` notes in its commit
message
<https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5>
that
the option only measures some aspects of type-checking, that it doesn't
provide any information on how checking a function for the first time will
take longer, doesn't report on other phases of compilation, and doesn't
catch anything being type-checked multiple times.

In part because the options don't provide the functionality described in
the commit message above, the impression I've received has been that the
Swift team is not enthusiastic about exposing these options outside of
`-frontend`. From the comments on SR-2741
<https://bugs.swift.org/browse/SR-2741>, Jordan Rose writes:

I'm leery of "productizing" `-debug-time-function-bodies` in any way

because it's already not a true story. Import and decl-checking costs that
come up for the first time during a particular function get ascribed to
that function. SILGen, optimization, and LLVM costs are ignored entirely.
Decl-checking costs that are *not* within a particular function are also
ignored. The only useful purpose of `-debug-time-function-bodies` is to see
if the type checker is spinning on a particular function.

# The question of whether to "productize" these options

Further, other contributors have expressed misgivings about "productizing"
these as well. In an email thread named "Reporting/Debugging Slow Swift
Compile Time"
<https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20161003/003099.html>,
Ben Asher writes:

I understand Jordan's response (in SR-2741) of being "leery of

'productizing'" these flags. Developing with Swift shouldn't involve
fighting the compiler to get the best compile time, so making this more
than a debug flag does seem odd/worrisome.

I sympathize with this viewpoint, but I am concerned by three things:

1. The Swift team has been very clear that these options are "unsupported",
so my impression is that they're not quite the same as "debug flags".
`-driver-print-jobs` or `-###`, for example, are handy debugging options
that are, I think, "supported". They're not hidden behind the `-frontend`
option.
2. It seems like the reality is that many professional Swift developers use
these *unsupported* options (again, not simply "debugging options", but
unsupported `-frontend` options) every day in order to measure their
compilation times and keep them under control.
3. One argument for keeping these options "unsupported" is that these
professional developers won't need these `-frontend` options once the Swift
compiler becomes fast enough, at which point they could potentially be
removed without any room for complaint. But some large mobile application
teams work on codebases that take over an hour to compile, and are compiled
and hundreds of times each day. For teams such as these, the Swift compiler
will never be "fast enough". In environments where shaving 1% or 2% off of
compilation time results in large CPU and energy consumption savings, I
think engineers will always want to narrow down compilation time
bottlenecks, and those engineers would probably want a "supported" set of
options in order to do so -- that is, they'd rather not use options that
"may be removed without notice at any future date."

So, it's my opinion that work should be done to provide a "supported" set
of options to help diagnose why a Swift compilation is taking a long time,
and I'd like to work on that in January.

# Ideas for "supported" options, and other future work?

I'm 100% in agreement that options like `-debug-time-function-bodies`
should not be brought out from behind the `-frontend` option in their
current form. As I noted above, members of the Swift team consider these
options to provide inaccurate data. Even if the data was accurate, its
output is in an ad-hoc textual format, which is more difficult to
machine-parse than other formats.

I'd like to solicit ideas for future work here:

- At the very least, I could fix SR-2910
<https://bugs.swift.org/browse/SR-2910>.
- I'd also like to address some of the issues mentioned in this commit
message
<https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5>,
but I would like to confirm they're still something the Swift team would
like work to be done on. Jordan, are these still relevant?
- I'm wondering if anyone else has some work in-flight here, or if they
have ideas. If you've been longing for a "killer feature", please reply
here and let me know!

- Brian Gesiak

[1] See the blog post
<https://www.recurse.com/blog/121-come-to-rc-for-a-one-week-retreat> for
more details. Please let me know if you're also applying to go, especially
if you're thinking of working on Swift! It would be great to have a buddy :slight_smile:
[2] Examples: 1 <https://github.com/fastred/Optimizing-Swift-Build-Times>, 2
<https://www.swiftbysundell.com/posts/improving-swift-compile-times>, and 3
<http://irace.me/swift-profiling>.
[3] This commit message
<https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5>
states
“As a frontend option, this is UNSUPPORTED and may be removed
without notice at any future date.”

Hello all!

I'm looking for a body of work to do on the Swift compiler for an upcoming programmer retreat I'm attending in January [1]. I've read a lot of blog posts with tips for diagnosing slow Swift compile times [2], and I was wondering if I could contribute to tooling in this area.

Hey, this is great! Sorry I didn't get back to you earlier. Happy to talk this over.

(Just to make it clear: I'm talking about improving and expanding the set of tools developers have to figure out why their projects take a long time to compile. I'm *not* talking about working on speeding Swift compile times -- although the tools may indirectly help with that.)

This seems plausible to me; though like others, I'm a bit hesitant too. There are often _many_ things going on in a given compilation, and one needs to be careful surfacing any particular signal as a putative "singular cause" of a slow compilation, since the user may see that signal and respond by making significant adjustments to their code, only to find the overall time isn't helped much. But I totally get the motive.

Using these options, developers can find function bodies and expressions that took longer to compile than others.

Sadly, there's not always a 1:1 mapping between source entities and time like that. Certainly _some_ cases can be so egregious (say, typechecking time on an expression that's triggering an exponential-time inference) that they dominate compile time, and can be identified-and-fixed in isolation; but often the total amount of work attributable to a given source entity is spread around the compilation, occurs in multiple phases, emerges out of interaction between entities, overlaps with others, etc.

That's not to say that one couldn't build machinery to do a better job attributing costs to source entities than what we have now; just that it'll be a bit of work to get there from here. Maybe fun work! I'd be happy to discuss more :slight_smile:

However, it should be noted that, of these options, only the first is a user-facing "supported" option. The others are `swift -frontend` options, and as such the Swift team has been clear that this means the options may be changed or removed at any point in the future [3].

Yeah, there are a few good reasons for being careful which options get surfaced; but I generally agree with you that _some_ at least partly-supported ones would be nice. IMO it's easiest and most flexible to aim to _export_ information out of the compiler in a machine-readable form, so you can post-process with other tools. I've mostly been using json and csv in recent efforts (eg. see -stats-output-dir or -trace-stats-events). But I don't have an especially strong feeling about the degree-of-support / stability of such features; I'm going to have to leave that part of your question to others.

What's more, several contributors have noted the behavior of the options themselves could also be improved. Here's what I gathered from reading several JIRA bugs, commit messages, and mailing list discussions:

- SR-2910 <https://bugs.swift.org/browse/SR-2910> points out that `-debug-time-function-bodies` prints just `get {}` and `set {}` for struct getters and setters, and that this could be improved by printing the variable name as well.
- The commit that added `-warn-long-function-message=` notes in its commit message <https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5> that the option only measures some aspects of type-checking, that it doesn't provide any information on how checking a function for the first time will take longer, doesn't report on other phases of compilation, and doesn't catch anything being type-checked multiple times.

Yes, this latter comment is the part I'm most concerned about. I think if you're going to go this way -- providing a diagnostic mode that assigns costs to source-entities rather than compiler subsystems -- the difficult/interesting parts will be (a) making sure you capture enough of the costs of compilation to approximate "total time" meaningfully and (b) making sure you attribute "causes" meaningfully. The latter is really quite tricky since a lot of the compiler's work is lazy / demand-driven: the first time some (lazy and memoized) work is demanded, it's tempting to attribute the cost of the work to the entity first requesting it. But if a dozen other entities also demand that work gets done (reusing it, whichever requests it first) then the attribution will be ... unhelpful to the user trying to change their code to avoid the cost.

I'd like to solicit ideas for future work here:

- At the very least, I could fix SR-2910 <https://bugs.swift.org/browse/SR-2910>.
- I'd also like to address some of the issues mentioned in this commit message <https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5>, but I would like to confirm they're still something the Swift team would like work to be done on. Jordan, are these still relevant?
- I'm wondering if anyone else has some work in-flight here, or if they have ideas. If you've been longing for a "killer feature", please reply here and let me know!

I'll let the others who feel strongly about the "official support" aspect speak to that; myself I'd suggest looking at one or another of two approaches:

  1. See if you can leverage the existing counters and stats-gathering / reporting machinery in the compiler; for example the -trace-stats-events infrastructure lets you bundle together work done on behalf of a given source range and any changes to compiler counters during that work, as a single virtual "stats-event" in processing, and trace those events to a .csv file. Maybe something related to that would be helpful for the task you're interested in?

  2. Consider going up a level from declarations or functions to _files_, and see if there's a useful way to visualize hot-spots in the inter-file dependency graph that the driver interprets, during incremental compilation. The units of work at this level are likely to be large (especially if they involve cascading dependencies, that invalidate "downstream" files) and often cutting or changing an edge in that graph can be a simpler matter of moving a declaration, or changing its visibility: reasonably easy changes that don't cost the user much to experiment with.

Hope that helps! Happy to discuss any of this further.

-Graydon

···

On Nov 12, 2017, at 8:30 AM, Brian Gesiak via swift-dev <swift-dev@swift.org> wrote:

Thanks for the reply, Graydon, and for your other email on the topic
<https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20171113/006001.html>
!

I need to take more time to look into some of the things you mentioned, but
I won't be able to do so in earnest for another few days. In the meantime,
I'll just reply with a few uninformed opinions -- feel free not to respond
to these :slight_smile:

Sadly, there's not always a 1:1 mapping between source entities and time
like that. Certainly _some_ cases can be so egregious (say, typechecking
time on an expression that's triggering an exponential-time inference) that
they dominate compile time, and can be identified-and-fixed in isolation;
but often the total amount of work attributable to a given source entity is
spread around the compilation, occurs in multiple phases, emerges out of
interaction between entities, overlaps with others, etc.

It's these particularly egregious cases that I had in mind when I wrote my
email. Many blog posts on "how to reduce your Swift project compile times"
suggest one of the following approaches:

1. Use `-debug-time-function-bodies` and `-debug-time-expression-checking`
to print a list of times, sort those lists in descending order of time
spent, and add explicit types to, or otherwise simplify, the slowest
functions and expressions.
2. Use `-warn-long-function-bodies=` and
`-warn-long-expression-type-checking=`.
Compile the project several times, gradually lowering the thresholds passed
to these options, in order to surface the slowest functions and
expressions. Add explicit types to or otherwise simplify the slowest
functions and expressions.

My original idea was to expand or improve these options, since I think
they'll continue to be useful for large Swift projects. But I'm certainly
open to working on something else instead.

I don't have an especially strong feeling about the degree-of-support /
stability of such features; I'm going to have to leave that part of your
question to others.

Yeah, this is definitely still an open question of mine. In the meantime
I'll try looking at some of the other approaches you've suggested (thanks!).

  1. See if you can leverage the existing counters and stats-gathering /

reporting machinery in the compiler; for example the -trace-stats-events
infrastructure lets you bundle together work done on behalf of a given
source range and any changes to compiler counters during that work, as a
single virtual "stats-event" in processing, and trace those events to a
.csv file. Maybe something related to that would be helpful for the task
you're interested in?

I wasn't aware of `-trace-stats-events`, thanks! When using
`-stats-output-dir` with a primary-file compilation mode, I can see that
the stats include the amount of time spent on each Swift module being
produced. I'll need to take a closer look at how `-trace-stats-events`
works, though -- I get an empty .csv file when I use that option in
conjunction with `-stats-output-dir`, and I'm not sure yet how to use it
with source ranges. I'll look into this further.

  2. Consider going up a level from declarations or functions to _files_,
and see if there's a useful way to visualize hot-spots in the inter-file
dependency graph that the driver interprets, during incremental
compilation. The units of work at this level are likely to be large
(especially if they involve cascading dependencies, that invalidate
"downstream" files) and often cutting or changing an edge in that graph can
be a simpler matter of moving a declaration, or changing its visibility:
reasonably easy changes that don't cost the user much to experiment with.

Hope that helps! Happy to discuss any of this further.

Yes, thank you! I'm also interested in the "improving incremental mode"
section of your other email, so thanks for writing all that down.

- Brian Gesiak

···

On Fri, Nov 17, 2017 at 12:54 AM, Graydon Hoare <ghoare@apple.com> wrote:

Thanks for the reply, Graydon, and for your other email on the topic <https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20171113/006001.html>!

I need to take more time to look into some of the things you mentioned, but I won't be able to do so in earnest for another few days. In the meantime, I'll just reply with a few uninformed opinions -- feel free not to respond to these :slight_smile:

Sadly, there's not always a 1:1 mapping between source entities and time like that. Certainly _some_ cases can be so egregious (say, typechecking time on an expression that's triggering an exponential-time inference) that they dominate compile time, and can be identified-and-fixed in isolation; but often the total amount of work attributable to a given source entity is spread around the compilation, occurs in multiple phases, emerges out of interaction between entities, overlaps with others, etc.

It's these particularly egregious cases that I had in mind when I wrote my email. Many blog posts on "how to reduce your Swift project compile times" suggest one of the following approaches:

1. Use `-debug-time-function-bodies` and `-debug-time-expression-checking` to print a list of times, sort those lists in descending order of time spent, and add explicit types to, or otherwise simplify, the slowest functions and expressions.
2. Use `-warn-long-function-bodies=` and `-warn-long-expression-type-checking=`. Compile the project several times, gradually lowering the thresholds passed to these options, in order to surface the slowest functions and expressions. Add explicit types to or otherwise simplify the slowest functions and expressions.

One of the drawbacks with the type checker timers is that they do not exclude the time spent in client code called by the type checker, e.g. in deserialization code. The result is that you can e.g. have a simple expression involving a stdlib type like String, and have it be reported as taking 500ms to type check, when in fact it’s nearly instantaneous to type check, but we spent 500ms deserializing things while type checking that expression. What gets really confusing about this is some of this deserializing only happens if you don’t use explicit types, so adding an explicit type to one of these expressions can mean that this expression is now fast, but some other expression which was previously reported as being fast now suddenly “regresses” and takes 500ms due to that deserialization happening later.

I haven’t looked closely, but offhand I don’t know of a clean, robust, maintainable way of excluding this time.

I’m a bit concerned about productizing these without solving that problem.

It’s also not necessarily reasonable that we’re doing this deserialization when an explicit type is not used so there might be something useful to investigate and fix there as well.

Mark

···

On Nov 19, 2017, at 9:02 PM, Brian Gesiak via swift-dev <swift-dev@swift.org> wrote:
On Fri, Nov 17, 2017 at 12:54 AM, Graydon Hoare <ghoare@apple.com <mailto:ghoare@apple.com>> wrote:

My original idea was to expand or improve these options, since I think they'll continue to be useful for large Swift projects. But I'm certainly open to working on something else instead.

I don't have an especially strong feeling about the degree-of-support / stability of such features; I'm going to have to leave that part of your question to others.

Yeah, this is definitely still an open question of mine. In the meantime I'll try looking at some of the other approaches you've suggested (thanks!).

  1. See if you can leverage the existing counters and stats-gathering / reporting machinery in the compiler; for example the -trace-stats-events infrastructure lets you bundle together work done on behalf of a given source range and any changes to compiler counters during that work, as a single virtual "stats-event" in processing, and trace those events to a .csv file. Maybe something related to that would be helpful for the task you're interested in?

I wasn't aware of `-trace-stats-events`, thanks! When using `-stats-output-dir` with a primary-file compilation mode, I can see that the stats include the amount of time spent on each Swift module being produced. I'll need to take a closer look at how `-trace-stats-events` works, though -- I get an empty .csv file when I use that option in conjunction with `-stats-output-dir`, and I'm not sure yet how to use it with source ranges. I'll look into this further.

  2. Consider going up a level from declarations or functions to _files_, and see if there's a useful way to visualize hot-spots in the inter-file dependency graph that the driver interprets, during incremental compilation. The units of work at this level are likely to be large (especially if they involve cascading dependencies, that invalidate "downstream" files) and often cutting or changing an edge in that graph can be a simpler matter of moving a declaration, or changing its visibility: reasonably easy changes that don't cost the user much to experiment with.

Hope that helps! Happy to discuss any of this further.

Yes, thank you! I'm also interested in the "improving incremental mode" section of your other email, so thanks for writing all that down.

- Brian Gesiak

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev

Thanks for the reply, Graydon, and for your other email on the topic <https://lists.swift.org/pipermail/swift-dev/Week-of-Mon-20171113/006001.html>!

I need to take more time to look into some of the things you mentioned, but I won't be able to do so in earnest for another few days. In the meantime, I'll just reply with a few uninformed opinions -- feel free not to respond to these :slight_smile:

Sadly, there's not always a 1:1 mapping between source entities and time like that. Certainly _some_ cases can be so egregious (say, typechecking time on an expression that's triggering an exponential-time inference) that they dominate compile time, and can be identified-and-fixed in isolation; but often the total amount of work attributable to a given source entity is spread around the compilation, occurs in multiple phases, emerges out of interaction between entities, overlaps with others, etc.

It's these particularly egregious cases that I had in mind when I wrote my email. Many blog posts on "how to reduce your Swift project compile times" suggest one of the following approaches:

1. Use `-debug-time-function-bodies` and `-debug-time-expression-checking` to print a list of times, sort those lists in descending order of time spent, and add explicit types to, or otherwise simplify, the slowest functions and expressions.
2. Use `-warn-long-function-bodies=` and `-warn-long-expression-type-checking=`. Compile the project several times, gradually lowering the thresholds passed to these options, in order to surface the slowest functions and expressions. Add explicit types to or otherwise simplify the slowest functions and expressions.

One of the drawbacks with the type checker timers is that they do not exclude the time spent in client code called by the type checker, e.g. in deserialization code. The result is that you can e.g. have a simple expression involving a stdlib type like String, and have it be reported as taking 500ms to type check, when in fact it’s nearly instantaneous to type check, but we spent 500ms deserializing things while type checking that expression. What gets really confusing about this is some of this deserializing only happens if you don’t use explicit types, so adding an explicit type to one of these expressions can mean that this expression is now fast, but some other expression which was previously reported as being fast now suddenly “regresses” and takes 500ms due to that deserialization happening later.

On the flip side, I would argue that if referencing a single stdlib declaration like String results in 500ms spent in deserialization, we have a problem, even if it s a one-time cost. In multiple-frontend mode, every source file will end up paying the 500ms penalty.

···

On Nov 28, 2017, at 5:01 PM, Mark Lacey via swift-dev <swift-dev@swift.org> wrote:

On Nov 19, 2017, at 9:02 PM, Brian Gesiak via swift-dev <swift-dev@swift.org <mailto:swift-dev@swift.org>> wrote:
On Fri, Nov 17, 2017 at 12:54 AM, Graydon Hoare <ghoare@apple.com <mailto:ghoare@apple.com>> wrote:

I haven’t looked closely, but offhand I don’t know of a clean, robust, maintainable way of excluding this time.

I’m a bit concerned about productizing these without solving that problem.

It’s also not necessarily reasonable that we’re doing this deserialization when an explicit type is not used so there might be something useful to investigate and fix there as well.

Mark

My original idea was to expand or improve these options, since I think they'll continue to be useful for large Swift projects. But I'm certainly open to working on something else instead.

I don't have an especially strong feeling about the degree-of-support / stability of such features; I'm going to have to leave that part of your question to others.

Yeah, this is definitely still an open question of mine. In the meantime I'll try looking at some of the other approaches you've suggested (thanks!).

  1. See if you can leverage the existing counters and stats-gathering / reporting machinery in the compiler; for example the -trace-stats-events infrastructure lets you bundle together work done on behalf of a given source range and any changes to compiler counters during that work, as a single virtual "stats-event" in processing, and trace those events to a .csv file. Maybe something related to that would be helpful for the task you're interested in?

I wasn't aware of `-trace-stats-events`, thanks! When using `-stats-output-dir` with a primary-file compilation mode, I can see that the stats include the amount of time spent on each Swift module being produced. I'll need to take a closer look at how `-trace-stats-events` works, though -- I get an empty .csv file when I use that option in conjunction with `-stats-output-dir`, and I'm not sure yet how to use it with source ranges. I'll look into this further.

  2. Consider going up a level from declarations or functions to _files_, and see if there's a useful way to visualize hot-spots in the inter-file dependency graph that the driver interprets, during incremental compilation. The units of work at this level are likely to be large (especially if they involve cascading dependencies, that invalidate "downstream" files) and often cutting or changing an edge in that graph can be a simpler matter of moving a declaration, or changing its visibility: reasonably easy changes that don't cost the user much to experiment with.

Hope that helps! Happy to discuss any of this further.

Yes, thank you! I'm also interested in the "improving incremental mode" section of your other email, so thanks for writing all that down.

- Brian Gesiak

_______________________________________________
swift-dev mailing list
swift-dev@swift.org <mailto:swift-dev@swift.org>
https://lists.swift.org/mailman/listinfo/swift-dev

_______________________________________________
swift-dev mailing list
swift-dev@swift.org <mailto:swift-dev@swift.org>
https://lists.swift.org/mailman/listinfo/swift-dev