Skew in test running times: break up tests/move some to validation-test?

Since Luciano recently brought up the time it takes to run tests, I figured, let's measure. I measured the timings for the tests (using lit's --time-tests) on my machine, and here are the results. (Total time is 481.49s on a 10-core Intel Xeon.)

Slowest Tests:
--------------------------------------------------------------------------
181.05s: Swift(macosx-x86_64) :: Prototypes/CollectionTransformers.swift
101.47s: Swift(macosx-x86_64) :: Interpreter/dynamic_replacement.swift
57.76s: Swift(macosx-x86_64) :: IDE/complete_value_expr.swift
45.86s: Swift(macosx-x86_64) :: AutoDiff/validation-test/forward_mode_simd.swift
45.40s: Swift(macosx-x86_64) :: SourceKit/InterfaceGen/gen_stdlib.swift
45.31s: Swift(macosx-x86_64) :: SIL/Serialization/deserialize_stdlib.sil
44.11s: Swift(macosx-x86_64) :: IDE/complete_override.swift
42.00s: Swift(macosx-x86_64) :: SourceKit/CursorInfo/cursor_info.swift
41.45s: Swift(macosx-x86_64) :: IDE/complete_at_top_level.swift
40.02s: Swift(macosx-x86_64) :: IDE/complete_call_arg.swift
39.97s: Swift(macosx-x86_64) :: IDE/complete_accessor.swift
36.74s: Swift(macosx-x86_64) :: IDE/complete_stmt_controlling_expr.swift
35.99s: Swift(macosx-x86_64) :: IDE/complete_in_accessors.swift
34.19s: Swift(macosx-x86_64) :: Syntax/round_trip_stdlib.swift
32.64s: Swift(macosx-x86_64) :: IDE/complete_unresolved_members.swift
31.03s: Swift(macosx-x86_64) :: Runtime/protocol_conformance_collision.swift
30.70s: Swift(macosx-x86_64) :: IDE/complete_type_in_func_param.swift
29.66s: Swift(macosx-x86_64) :: Casting/Casts.swift
29.26s: Swift(macosx-x86_64) :: IDE/complete_keywords.swift
29.07s: Swift(macosx-x86_64) :: Prototypes/DoubleWidth.swift.gyb

Tests Times:
--------------------------------------------------------------------------
[   Range   ] :: [               Percentage               ] :: [  Count  ]
--------------------------------------------------------------------------
[180s,190s) :: [                                        ] :: [   1/7209]
[170s,180s) :: [                                        ] :: [   0/7209]
[160s,170s) :: [                                        ] :: [   0/7209]
[150s,160s) :: [                                        ] :: [   0/7209]
[140s,150s) :: [                                        ] :: [   0/7209]
[130s,140s) :: [                                        ] :: [   0/7209]
[120s,130s) :: [                                        ] :: [   0/7209]
[110s,120s) :: [                                        ] :: [   0/7209]
[100s,110s) :: [                                        ] :: [   1/7209]
[ 90s,100s) :: [                                        ] :: [   0/7209]
[ 80s, 90s) :: [                                        ] :: [   0/7209]
[ 70s, 80s) :: [                                        ] :: [   0/7209]
[ 60s, 70s) :: [                                        ] :: [   1/7209]
[ 50s, 60s) :: [                                        ] :: [   0/7209]
[ 40s, 50s) :: [                                        ] :: [   8/7209]
[ 30s, 40s) :: [                                        ] :: [  11/7209]
[ 20s, 30s) :: [                                        ] :: [  19/7209]
[ 10s, 20s) :: [                                        ] :: [ 102/7209]
[  0s, 10s) :: [*************************************** ] :: [7066/7209]
--------------------------------------------------------------------------
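In case anyone wants to slice the "Slowest Tests" listing differently, here is a rough sketch that sums the reported times per top-level test directory. It assumes the line format shown above (`<seconds>s: <suite> :: <path>`); the function name `time_by_directory` and the three-line sample are just for illustration, not part of lit.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   57.76s: Swift(macosx-x86_64) :: IDE/complete_value_expr.swift
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)s:\s+\S+\s+::\s+(\S+)\s*$")


def time_by_directory(report):
    """Sum reported seconds per top-level directory (hypothetical helper)."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headers, separators, and histogram rows
        seconds = float(match.group(1))
        top_level = match.group(2).split("/", 1)[0]
        totals[top_level] += seconds
    return dict(totals)


# Three lines copied from the listing above, used only as sample input.
sample = """\
181.05s: Swift(macosx-x86_64) :: Prototypes/CollectionTransformers.swift
57.76s: Swift(macosx-x86_64) :: IDE/complete_value_expr.swift
44.11s: Swift(macosx-x86_64) :: IDE/complete_override.swift
"""

for directory, seconds in sorted(time_by_directory(sample).items(),
                                 key=lambda item: -item[1]):
    print(f"{seconds:8.2f}s  {directory}")
```

Feeding it the full listing instead of the sample should make it easier to see which directories dominate among the slowest tests.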

Do people have thoughts on breaking up some of the larger tests into smaller ones? Maybe we can split the SourceKit tests into smaller ones and move the longer ones into the validation test suite? (cc @akyrtzi @blangmuir)

From the perspective of CI, it doesn't really matter if a test lives in test/ or validation-test/, but if one is running the ~7k tests together (I don't know about y'all, but I try to do this before pushing to reduce back-and-forth with swift-ci), it seems a bit wasteful to have a large amount of time being spent in a handful of tests. Also, a long-running test is kind of a liability in the sense that, if it fails, you need to break it up so that you can iterate quickly. It can also hurt parallelism.
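To make the parallelism point concrete, here is a minimal sketch of the usual lower bound on wall-clock time for a parallel run: no schedule can finish before the longest single test, nor before the total work divided by the number of workers. The durations below are made up for illustration (they are not the measurements above), and `wall_time_lower_bound` is a hypothetical helper, not something lit provides.

```python
def wall_time_lower_bound(durations, workers):
    """Lower bound on wall-clock time for running tests on `workers` cores.

    A run cannot finish before its longest test does, and it cannot beat
    perfectly even load balancing (total work / workers).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not durations:
        return 0.0
    return max(max(durations), sum(durations) / workers)


# Hypothetical suite: one 180s test plus a hundred 1s tests, 10 workers.
one_big_test = [180.0] + [1.0] * 100
print(wall_time_lower_bound(one_big_test, 10))   # 180.0

# Same total work, but the big test is split into six 30s pieces.
split_up = [30.0] * 6 + [1.0] * 100
print(wall_time_lower_bound(split_up, 10))       # 30.0
```

In this made-up case the single long test alone pins the run at 180s no matter how many cores are available, while splitting it brings the bound down close to the even-balancing limit of 28s.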

Thank you for the measurements, Varun!

That is exactly the reason that made me bring the topic up on the other thread.

I agree, but maybe only for long-running tests. In general, there is a potential downside to splitting too much and ending up with too many test files.
In fact, there is a recommendation in the documentation to avoid creating new files here

57.76s: Swift(macosx-x86_64) :: IDE/complete_value_expr.swift
32.64s: Swift(macosx-x86_64) :: IDE/complete_unresolved_members.swift
30.70s: Swift(macosx-x86_64) :: IDE/complete_type_in_func_param.swift
29.26s: Swift(macosx-x86_64) :: IDE/complete_keywords.swift
44.11s: Swift(macosx-x86_64) :: IDE/complete_override.swift
41.45s: Swift(macosx-x86_64) :: IDE/complete_at_top_level.swift
40.02s: Swift(macosx-x86_64) :: IDE/complete_call_arg.swift
39.97s: Swift(macosx-x86_64) :: IDE/complete_accessor.swift
36.74s: Swift(macosx-x86_64) :: IDE/complete_stmt_controlling_expr.swift
35.99s: Swift(macosx-x86_64) :: IDE/complete_in_accessors.swift

@rintaro could we port these to use fast completion?

34.19s: Swift(macosx-x86_64) :: Syntax/round_trip_stdlib.swift

This can't be split up, so maybe it should move to validation. @rintaro @akyrtzi what do you think?

42.00s: Swift(macosx-x86_64) :: SourceKit/CursorInfo/cursor_info.swift
45.40s: Swift(macosx-x86_64) :: SourceKit/InterfaceGen/gen_stdlib.swift

The cursor_info one at least should not be moved to validation since it's a core test for cursor_info functionality. I think they could both be split up though.

SGTM

Sure. A few completion options, e.g. -code-complete-call-pattern-heuristics, are not compatible with fast completion (because they are modeled as compiler arguments). So we need to split them up (or we need to model them differently).

Varun, did you get those numbers on a Debug build? Just curious, because I ran with a ReleaseAssert build and the numbers are a little better, even on a not-so-good machine with only 4 cores. Which build do people normally use for running those tests? By default I always use ReleaseAssert.

Those are numbers for a ReleaseAssert build; I also use that when running a large number of tests.
