Testing, validation, and many-core machines

David_Zarzycki · January 17, 2018, 4:23pm

Hello,

In docs/Testing.md, “long test” is acknowledged but never defined. When should tests be marked “long_test”? I’m asking because there are arguably a number of latent long_test tests that are exposed on many-core machines.

Speaking of many-core machines, there really isn’t a practical difference between smoke/primary testing and secondary/validation testing. Both test suites rush through 95-99% of the tests in very short order, and then wait a relatively long time for the remaining tests to finish. Can we move some of the exhaustive/stress tests in the “primary” suite to the validation suite? Likewise, can we move some of the short/functional tests in the validation directory to the primary test directory? That would match people’s expectations better, would it not?

Dave

Michael_Gottesman · January 17, 2018, 9:06pm

> Hello,
>
> In docs/Testing.md, “long test” is acknowledged but never defined. When should tests be marked “long_test”? I’m asking because there are arguably a number of latent long_test tests that are exposed on many-core machines.

I think about it this way. A long test is a test that we want to run, but that would bog down PR testing, or that is enough of a validation test that the extra signal does not justify the delay it adds to PR testing. That being said, I think of it as an approximation.
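For readers unfamiliar with the mechanics, here is a minimal sketch of what opting a test into this category can look like. The `REQUIRES: long_test` line is the lit feature discussed in this thread; the `stress` function, its iteration count, and the checksum are hypothetical stand-ins for a slow workload and are not taken from any existing test.

```swift
// RUN: %target-run-simple-swift
// REQUIRES: executable_test
// REQUIRES: long_test

// Hypothetical workload: many iterations that add run time but
// little extra signal per iteration.
func stress(iterations: Int) -> Int {
  var checksum = 0
  for i in 0..<iterations {
    // Wrapping add, then reduce, so the value stays in a small range.
    checksum = (checksum &+ i) % 1_000_003
  }
  return checksum
}

// The iteration count is an assumption chosen only for illustration.
let result = stress(iterations: 5_000_000)
precondition(result >= 0 && result < 1_000_003)
print("done")
```

With a marker like this, lit skips the test unless the `long_test` feature is enabled, so the cost is only paid in configurations that ask for it.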

> Speaking of many-core machines, there really isn’t a practical difference between smoke/primary testing and secondary/validation testing. Both test suites rush through 95-99% of the tests in very short order, and then wait a relatively long time for the remaining tests to finish. Can we move some of the exhaustive/stress tests in the “primary” suite to the validation suite? Likewise, can we move some of the short/functional tests in the validation directory to the primary test directory? That would match people’s expectations better, would it not?

Do you have specific tests in mind? Again, it isn't just about test time. It is about the trade-off between signal and length of time.

David_Zarzycki · January 17, 2018, 9:59pm

>> Hello,
>>
>> In docs/Testing.md, “long test” is acknowledged but never defined. When should tests be marked “long_test”? I’m asking because there are arguably a number of latent long_test tests that are exposed on many-core machines.
>
> I think about it this way. A long test is a test that we want to run, but that would bog down PR testing, or that is enough of a validation test that the extra signal does not justify the delay it adds to PR testing. That being said, I think of it as an approximation.

Hi Michael,

How should an open source developer who doesn’t know much about the PR machinery decide when to use “long_test”?

>> Speaking of many-core machines, there really isn’t a practical difference between smoke/primary testing and secondary/validation testing. Both test suites rush through 95-99% of the tests in very short order, and then wait a relatively long time for the remaining tests to finish. Can we move some of the exhaustive/stress tests in the “primary” suite to the validation suite? Likewise, can we move some of the short/functional tests in the validation directory to the primary test directory? That would match people’s expectations better, would it not?
>
> Do you have specific tests in mind? Again, it isn't just about test time. It is about the trade-off between signal and length of time.

Right. I’m proposing that we improve the signal ratio by:

1) Moving blatant stress tests to validation. I’d start with these “maybe catch a race, maybe not” stress tests:

test/Runtime/weak-reference-racetests.swift
test/SILOptimizer/string_switch.swift
test/Sanitizers/tsan-ignores-arc-locks.swift
test/Sanitizers/tsan-type-metadata.swift
test/Sanitizers/witness_table_lookup.swift
test/stdlib/Inputs/CommandLineStressTest/CommandLineStressTest.swift

2) Moving fast tests that live in validation-test into test. These tend to be “compiler crash” regression tests, which arguably fill in holes in the functional test suite until proper errors can be emitted.

Ideally after this swap, “build-script -t” should take the same amount of time but test more stuff.

Dave
