Collection library in Swift: GSoC2019

Gumichocopengin8 · March 2, 2019, 1:38am

Hello,
This is my original proposal for Google Summer of Code 2019.
I'd be happy if this proposal interests you. Please feel free to comment on it.
Also, I need one or more mentors to start this proposal as a project, so I'd be glad if you would be a mentor.

I'd like to implement basic collections such as a stack, a queue, a binary tree, etc., because when I wanted to use a stack in Swift, I found that none was available. I know programmers can write their own collections, but I think Swift should have its own collection library, like Java and other languages do.

Thank you

lorentey · March 5, 2019, 6:54pm

Hello @Gumichocopengin8,

We are definitely interested in adding new collection types to the standard library. Some of these would make great GSoC projects.

Of the collections you listed, I particularly like stack and queue. A double-ended queue type backed by a ring buffer would cover both of these, and it would have clear engineering applications. (It would also be a useful stepping stone towards tree-based collections -- in the form of in-memory B-trees, whose nodes could be represented by ring buffers.)
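As a rough illustration of how one ring buffer can serve as both a stack and a queue, here is a minimal sketch. The type name `RingDeque` and all of its members are hypothetical, chosen only for this example; it is not the prototype mentioned in this thread and not stdlib API. It assumes a simple doubling growth strategy and stores elements as optionals for brevity.

```swift
// Hypothetical sketch: `RingDeque` and its members are illustrative names, not stdlib API.
struct RingDeque<Element> {
    private var storage: [Element?]
    private var head = 0          // physical slot of the first element
    private(set) var count = 0    // number of stored elements

    init(minimumCapacity: Int = 4) {
        storage = Array(repeating: nil, count: Swift.max(1, minimumCapacity))
    }

    var isEmpty: Bool { count == 0 }

    // Maps a logical offset (0 = front) to a physical slot, wrapping around.
    private func slot(_ offset: Int) -> Int {
        (head + offset) % storage.count
    }

    // Doubles the buffer when it is full and unwraps the elements so `head` is 0 again.
    private mutating func growIfFull() {
        guard count == storage.count else { return }
        var newStorage = [Element?](repeating: nil, count: storage.count * 2)
        for i in 0..<count { newStorage[i] = storage[slot(i)] }
        storage = newStorage
        head = 0
    }

    mutating func append(_ element: Element) {
        growIfFull()
        storage[slot(count)] = element
        count += 1
    }

    mutating func prepend(_ element: Element) {
        growIfFull()
        head = (head - 1 + storage.count) % storage.count
        storage[head] = element
        count += 1
    }

    mutating func popFirst() -> Element? {
        guard count > 0 else { return nil }
        let element = storage[head]
        storage[head] = nil
        head = (head + 1) % storage.count
        count -= 1
        return element
    }

    mutating func popLast() -> Element? {
        guard count > 0 else { return nil }
        let index = slot(count - 1)
        let element = storage[index]
        storage[index] = nil
        count -= 1
        return element
    }
}

// Stack usage: append + popLast. Queue usage: append + popFirst.
var deque = RingDeque<Int>(minimumCapacity: 2)
deque.append(2)
deque.append(3)
deque.prepend(1)            // the buffer is full here, so it grows first
print(deque.popFirst()!)    // 1
print(deque.popLast()!)     // 3
```

A production version would likely manage raw storage directly instead of an array of optionals, but the wrap-around index arithmetic would look much the same.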

This wouldn't be an easy project -- I would put it somewhere between medium and hard. While the actual algorithms involved are relatively straightforward (I myself have a prototype dating back to Swift 2), successfully integrating a double-ended queue type into the standard library has some interesting challenges. It requires careful study of the API-level functionality provided by Collection protocols, as well as deep knowledge of the stdlib's internals. All functionality needs to be fully tested by newly written unit tests in the stdlib's test suite, including exhaustive tests covering baseline collection behavior. New benchmarks need to be written to allow us to track the performance of the implementation across releases. And, of course, the new public APIs must be discussed on the evolution forums, concerns raised by peers need to be addressed, culminating in a proposal that (hopefully) passes review and is accepted by the Core Team.
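To give a feel for what "baseline collection behavior" might cover, here is a small sketch of a generic checker. The function `baselineCollectionFailures` is a hypothetical helper written for this example; the stdlib's real checks live in its own test infrastructure and are far more thorough. The sketch only assumes the standard `Collection` requirements.

```swift
// Hypothetical helper for illustration; not part of the stdlib's test infrastructure.
// Returns the names of the baseline checks that failed (empty means all passed).
func baselineCollectionFailures<C: Collection>(
    _ collection: C,
    expected: [C.Element]
) -> [String] where C.Element: Equatable {
    var failures: [String] = []

    if collection.count != expected.count { failures.append("count") }
    if collection.isEmpty != expected.isEmpty { failures.append("isEmpty") }

    // Sequence-style iteration must produce the expected elements in order.
    if Array(collection) != expected { failures.append("iteration") }

    // Walking indices from startIndex to endIndex must visit the same elements.
    var visited: [C.Element] = []
    var index = collection.startIndex
    while index != collection.endIndex {
        visited.append(collection[index])
        collection.formIndex(after: &index)
    }
    if visited != expected { failures.append("index traversal") }

    // distance(from:to:) over the whole collection must agree with the element count.
    let distance = collection.distance(from: collection.startIndex, to: collection.endIndex)
    if distance != expected.count { failures.append("distance") }

    // Slicing the full index range must preserve the contents.
    let fullSlice = collection[collection.startIndex..<collection.endIndex]
    if Array(fullSlice) != expected { failures.append("slicing") }

    return failures
}

// An ArraySlice has a nonzero startIndex, which catches code that assumes indices start at 0.
print(baselineCollectionFailures([0, 1, 2, 3].dropFirst(), expected: [1, 2, 3]))   // []
```

Because the helper is generic over `Collection`, the same checks could be pointed at a new deque type without modification, which is the kind of reuse the existing tests aim for.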

All this needs a level of determination to see things through, even if the list of tasks seems daunting at first. That said, I do believe implementing a standard deque would be possible within the constraints of a GSoC project, and I'd be happy to mentor it.

(If you'd prefer a less high-profile, lower risk project, check out the project idea for "Scalability benchmarks for the Swift Standard Library"!)

Gumichocopengin8 · March 6, 2019, 4:22am

Hello @lorentey,

Thank you for being interested in my proposal.
I understand the difficulty of the project, but I am still interested in doing it because I want to use these collections in my iOS applications and learn Swift in depth.
What should I do to gain the knowledge of the stdlib (and so on) that I need to write a GSoC proposal?

Thank you

lorentey · March 7, 2019, 4:06am

That's good motivation!

I believe the first task for a prospective GSoC student is to come up with a convincing project proposal.

But before we get to that, if I were you, I would want to make sure I'd enjoy working with the stdlib. A GSoC project takes a lot of commitment. So get some practice -- download and compile Swift, and start breaking things! Some ideas:

Can you make sense of the code? (The stdlib is written in a somewhat unusual Swift dialect, and it may take some getting used to. It deals a lot with low-level builtins and even some language-level constructs that don't typically occur in regular Swift development. Not to mention gyb!)
Try adding a new public type, or some extension method -- figure out what the editing/recompiling experience is like. (You'd probably spend a good chunk of time doing it during the project.)
Try running the test suite; can you figure out where/how it is implemented? Find and look through the tests we run for collection types. Is it easy to make sense of them? Think about how a new Collection type like a deque could be integrated into the existing tests. What tests can we reuse? What new tests will we need to add?
Make some changes to an existing stdlib method that will obviously break it; does it cause a test failure in the test suite? (If not, please submit a PR with a new test!) What does a test failure look like? Try debugging it while pretending you don't know the cause.
If you come across a bug you're able to fix, submit a PR for it! This will give you a feel for what our review process is like.
Select some stdlib algorithm and find its benchmark(s). Try to optimize the algorithm by tweaking its code, and see if it improves benchmark results. (Submit a PR if so!) If you don't find benchmarks for the algorithm you selected, try writing a PR to add one.
For a more esoteric task: can you figure out how the debugger interprets Swift's collections? It is able to print Arrays, Dictionaries etc without actually running Swift code in the target process. What would it take to add support for a new stdlib type?

(Feel free to ask me to review any PRs you might submit. If you hit a stumbling block, let me know -- I can help (or we can figure it out together), and we may want to add documentation or make changes to smooth it out for the next person. These things make great starter PRs, too.)

Obviously this isn't part of the proposal process, but these experiments will probably tell you whether you'd enjoy spending a few months (or more!) on our codebase, and they will give you a great start on scoping out the potential deliverables in the proposal. As I said, implementing a collection type is just a start -- integrating it into the standard library takes a lot more effort! You need to be aware of the areas that need to be touched, and you need to figure out how much you can reasonably achieve in the time available. Then you need to convincingly communicate this in the proposal.

It would be impressive if the proposal included a draft of the public API for the type you'd like to implement -- just as if you were writing the Motivation and Design Details parts of a Swift Evolution proposal. By doing this well, you'd demonstrate proficiency with Swift's conventions, and a sense of good taste in API design. Naturally the interface you propose would be just a rough draft -- we'd refine it together during the project, and it will evolve even further when we pitch it on the evolution forums.
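For illustration only, a first API draft might look something like the sketch below. Every name in it (`DequeDraft`, `prepend`, and so on) is an assumption made for this example, not an agreed design, and an `Array` stands in for the real storage so that the draft compiles and can be tried out.

```swift
/// Hypothetical API draft; all names are assumptions for illustration.
/// An Array is a placeholder for the real storage so the draft compiles.
public struct DequeDraft<Element> {
    private var elements: [Element] = []

    /// Creates an empty deque.
    public init() {}

    /// Adds an element at the back.
    public mutating func append(_ newElement: Element) {
        elements.append(newElement)
    }

    /// Adds an element at the front. (O(n) with this placeholder storage.)
    public mutating func prepend(_ newElement: Element) {
        elements.insert(newElement, at: 0)
    }

    /// Removes and returns the first element, or nil if the deque is empty.
    public mutating func popFirst() -> Element? {
        elements.isEmpty ? nil : elements.removeFirst()
    }

    /// Removes and returns the last element, or nil if the deque is empty.
    public mutating func popLast() -> Element? {
        elements.popLast()
    }
}

// Integer indices make random access and in-place mutation straightforward to express.
extension DequeDraft: RandomAccessCollection, MutableCollection {
    public var startIndex: Int { 0 }
    public var endIndex: Int { elements.count }

    public subscript(position: Int) -> Element {
        get { elements[position] }
        set { elements[position] = newValue }
    }
}

extension DequeDraft: ExpressibleByArrayLiteral {
    public init(arrayLiteral elements: Element...) {
        self.elements = elements
    }
}

var draft: DequeDraft<Int> = [2, 3]
draft.prepend(1)
draft[1] = 20                      // from MutableCollection
print(Array(draft))                // [1, 20, 3]
print(Array(draft.reversed()))     // [3, 20, 1]
```

Writing the draft as compiling code makes it easy to see which protocol conformances come for free and which naming decisions (for example `prepend` versus an alternative spelling) would need to be argued for in the proposal.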

Please don't hesitate to write; I'm happy to answer questions and to discuss ideas.

Gumichocopengin8 · March 7, 2019, 6:14am

Thank you very much for your advice!

I pulled the Swift code from GitHub, so I'll practice with the ideas you gave me.

> Please don't hesitate to write; I'm happy to answer questions and to discuss ideas.

I appreciate it. I will definitely ask you some questions.

Again, thank you very much.
I am really looking forward to contributing to Swift with you.
I'll email you soon to introduce myself.

shreyaspapi · March 13, 2019, 6:46pm

Hey @lorentey ,

The idea of adding a stack, a queue, a circular linked list, and a doubly linked list to a collection library is great.
If possible, I would also like to work on this in GSoC 2019. Is it possible for two people to work on the same project?

Please help me proceed.

Thank You

lorentey · March 13, 2019, 9:23pm

As I understand it, projects are limited to a single student. There is a limited number of students accepted each year, and having more than one student working on a project would be excessive. It also makes it difficult to evaluate each student on their own merits, and potentially changes the one-to-one mentor--mentee dynamic into a team environment.

However, I believe it's okay to have students propose similar projects. Obviously, competing like that does add another level of difficulty -- so it's worth exploring other potential projects before settling on this. However, if you're confident you'd be the right person to work on something, then by all means, go for it!

Gumichocopengin8 · March 20, 2019, 5:44pm

Hello,

I have recently been working on the ideas you mentioned.
I have some questions about them because I couldn't find some of the commands or detailed instructions.

> Can you make sense of the code? (The stdlib is written in a somewhat unusual Swift dialect, and it may take some getting used to. It deals a lot with low-level builtins and even some language-level constructs that don't typically occur in regular Swift development. Not to mention gyb!)

The code mostly makes sense to me. There were parts I couldn't follow at first, but I understood them after googling.
I'd like to confirm which directories matter.
I need to work in swift/stdlib/public/ and test/stdlib, right?
I don't need to look at anything under swift/stdlib/ other than swift/stdlib/public/, right?

> Try running the test suite; can you figure out where/how it is implemented? Find and look through the tests we run for collection types. Is it easy to make sense of them? Think about how a new Collection type like a deque could be integrated into the existing tests. What tests can we reuse? What new tests will we need to add?

I ran utils/build-script -Rt, but I think it's the wrong command for running the stdlib test suites.
What command should I use to run the tests in test/stdlib?
I'm confused about how to run the test suites. I read https://github.com/apple/swift/blob/master/docs/Testing.md, but it is unclear to me.

> Select some stdlib algorithm and find its benchmark(s). Try to optimize the algorithm by tweaking its code, and see if it improves benchmark results. (Submit a PR if so!) If you don't find benchmarks for the algorithm you selected, try writing a PR to add one.

I ran utils/build-script -RtB.
Is that the correct way to find and run its benchmarks?

Could you please tell me about the process you use to implement and debug the Swift stdlib?
That would help me work more quickly.

As I asked before in the "Swift compile error" thread (post #5), I can run utils/build-script --debug --Xcode, but I still cannot run utils/build-script --debug. The compiler reports error: 'futimens' is only available on macOS 10.13 or newer [-Werror,-Wunguarded-availability-new], but my macOS version is the latest. Is there a problem?

I'm sorry to ask a lot of questions.

Thank you.

lorentey · March 26, 2019, 7:24pm

The standard collection types live in stdlib/public/core; the test infrastructure is implemented in modules under stdlib/private -- the StdlibCollectionUnittest folder is of particular interest there.

The standard library's tests are split between test/stdlib and validation-test/stdlib. The latter folder contains some important collection tests.

build-script -Rt runs all tests in test/, including compiler, stdlib and SDK overlay tests, amongst others. build-script -RT runs all of the tests, including tests in validation-test.

These build commands enable runtime assertions in the stdlib, so they aren't appropriate for benchmarking, and don't run tests that require an optimized stdlib. To run those, you'll need to create an optimized build using build-script -RT --no-swift-stdlib-assertions.

The Testing.md document you linked below describes how to run only the stdlib tests. This is a useful time-saving measure if you're only working on the stdlib, but there is no requirement to use it.

To understand how testing works, and to add new tests, you'll need to learn a bit about LLVM's flexible testing tools lit and FileCheck that drive Swift's testing. Every test file under test/ and validation-test/ is a standalone lit script that describes how to run the test.

// RUN: %target-run-simple-swift | %FileCheck %s
// REQUIRES: executable_test

// CHECK: Hello
print("Hello")

Lit looks at these meta-level comments embedded in the test files themselves to figure out what to do, literate-programming style. In this case, FileCheck is used to perform the actual tests based on what the program prints on its output.

LLVM's docs will tell you what RUN, REQUIRES, CHECK mean. Lit is configured by lit.cfg files in the test directories; these are Python scripts that define Swift-specific details like when the executable_test requirement is fulfilled, and how %target-run-simple-swift expands to an actual command. I think it's best to learn it by example -- look at the existing tests, and figure out how they work.
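As one more example in the same style, here is a sketch of a lit test that checks several lines of output in order. It uses only the directives already shown above plus FileCheck's CHECK-NEXT, and it exercises Array as a stack purely for illustration; it is not an existing test from the Swift repository.

```swift
// RUN: %target-run-simple-swift | %FileCheck %s
// REQUIRES: executable_test

// Illustrative only: use an Array as a stack and let FileCheck
// verify that the printed lines appear in the expected order.
var stack: [Int] = []
stack.append(1)
stack.append(2)

// CHECK: popped 2
print("popped \(stack.removeLast())")
// CHECK-NEXT: popped 1
print("popped \(stack.removeLast())")
// CHECK-NEXT: empty: true
print("empty: \(stack.isEmpty)")
```

Since the directives are ordinary comments, the file is also a valid standalone Swift program, which makes it easy to run by hand while writing the test.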

As for whether something is wrong with your setup: I don't believe so! With Xcode 10.2, build-script -RT should work just fine.

The "futimens" error in CMakeError.log is part of CMake's configuration process; it does not indicate an actual problem. (I can't tell you what went wrong without the relevant parts of the actual build output.) To try to resolve it, I'd start by running utils/update-checkout to make sure my repositories are up to date, then I'd remove the whole build folder, and I'd try running utils/build-script -rT.

If I were you, I would not use --Xcode; it's not what we use to build the project, and it may or may not actually work.

Gumichocopengin8 · April 1, 2019, 6:55pm

@lorentey
Hello,

Thanks to your advice, I now understand the commands for running tests, benchmarks, etc.
Thank you!!

I submitted a pull request.
Could you please review it?
Also, I sent you a link to my draft proposal via email and the GSoC platform.
Could you also review it and give me feedback?

Thank you,
Keita