Upstreaming the Python module

rxwei · April 3, 2019, 9:23am

Swift 5.0 now has both dynamic member lookup and dynamic callable. The main motivation in those proposals was to embrace dynamic languages (e.g. Python) with syntactic sugar for interoperability, and Python interoperability was the first adopter of these features.
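For readers new to the two features, here is a minimal sketch of how they desugar. `DynamicValue` is a hypothetical stand-in invented for illustration, not the actual `PythonObject` type from either library; it only records the operations applied to it instead of calling into Python.

```swift
// Hypothetical stand-in for a dynamically typed value (NOT the real PythonObject).
@dynamicMemberLookup
@dynamicCallable
struct DynamicValue {
    // Textual trace of the dynamic operations applied so far.
    let path: String

    // `value.name` is rewritten by the compiler to `value[dynamicMember: "name"]`.
    subscript(dynamicMember name: String) -> DynamicValue {
        return DynamicValue(path: path + "." + name)
    }

    // `value(a, b)` is rewritten to `value.dynamicallyCall(withArguments: [a, b])`.
    func dynamicallyCall(withArguments args: [Int]) -> DynamicValue {
        let rendered = args.map { String($0) }.joined(separator: ", ")
        return DynamicValue(path: path + "(" + rendered + ")")
    }
}

let np = DynamicValue(path: "np")
// Reads like Python, but every step goes through the two hooks above.
let result = np.arange(1, 10).sum()
print(result.path)
```

A real interop layer would forward these hooks to the Python runtime rather than building a string, but the call-site syntax is the same.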

There are multiple implementations of Python interoperability. Having originated and evolved from @Chris_Lattner3's original prototype, the most notable ones are:

Python, a core module maintained under the tensorflow branch and released as part of Swift for TensorFlow toolchains, and
PythonKit, a Swift package maintained by @pvieito.

These two libraries are equivalent and kept in sync. The library maintainers have been working on improvements together (thanks to @pvieito). This has worked well so far, but an issue is becoming obvious:

Users of Swift for TensorFlow, who are more familiar with machine learning and data science, are really enjoying immediate access to Python by importing Python as a core library like Foundation. In official Swift releases, Python is not part of the Swift core libraries, so users need to add PythonKit as a package dependency, which is less convenient for writing Swift scripts that interoperate with Python.
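To make the extra step concrete, here is a rough sketch of a package manifest that pulls in PythonKit. The package name `PyScript` is hypothetical, and the repository URL and the `master` branch are assumptions about where PythonKit is hosted; check the PythonKit README before relying on them.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "PyScript",  // hypothetical package name
    dependencies: [
        // Assumed location and branch of PythonKit; verify against its README.
        .package(url: "https://github.com/pvieito/PythonKit.git", .branch("master")),
    ],
    targets: [
        // The executable target can then `import PythonKit` in its sources.
        .target(name: "PyScript", dependencies: ["PythonKit"]),
    ]
)
```

A single-file script cannot carry this manifest with it, which is the inconvenience described above.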

As such, I'd like to bring up a few questions about upstreaming.

Should the Python module be upstreamed to an official Swift.org repository (e.g. apple/swift-python)?
Should it be available as a core library like Foundation, distributed as part of the toolchain?
Should there be a formal process for considering this kind of addition?
If Python gets to become a core library...
- What should the module be called? If there are no name conflicts, the name should be obvious. However, macOS has a Python.framework, which contains the Python runtime C APIs that you can import via import Python.
- How should its APIs evolve?

jayton · April 3, 2019, 9:47am

I suggest not. Instead, Python should be removed from Swift for TensorFlow in favour of using PythonKit.

The inconvenience of adding a dependency is a tooling problem that should be solved through better tools, not by packaging common or convenient libraries with the language.

lukasa · April 3, 2019, 1:14pm

I continue to want to ride the “fewer things in the main distribution” train. In particular, if the Python module gets upstreamed, feature requests for supporting other languages in the main distribution also need to be treated reasonably.

I believe we should be investing in making obtaining dependencies easy and reliable. PythonKit seems to be easy to install, so I don’t think we have much of a worry there for now. I think we should keep investing in that support.

nuclearace · April 3, 2019, 1:27pm

I’ll echo what’s already been said. I think it’s better to keep that out as a separate dependency. IMO it’d be a shame if that Python package suddenly was burdened by the massively heavyweight process that is Swift Evolution. Having it be separate means it can evolve at a much faster rate, with possibly better/more features (even experimental ones, something that really isn’t available in Swift yet.)

I’d see about centralizing the TensorFlow dependency with PythonKit if there’s fear they might drift.

But I think it’s enough that we have the machinery to build excellent wrappers around dynamic languages. I don’t see why we need to slow down the evolution of the actual wrappers by making them core Swift projects.

rxwei · April 3, 2019, 1:41pm

I personally agree with the sentiment expressed here, though I can also see some pragmatic benefits.

I am neither for nor against making Python a core library; I am just starting a discussion based on Chris's request here :)

pvieito · April 3, 2019, 1:52pm

I think we should split this question into two parts:

Should the Python interop module be part of the official Swift projects? Yes! We could create a new official Swift repo with the Python interop module (apple/swift-python, akin to apple/swift-log?).
Should the Python interop be distributed with the toolchain? I am very ambivalent about this. Swift already has built-in support for C and Objective-C interop, so built-in Python interop support would not be alien to the language. On the other hand, the Python interop module does not require any extra compiler support and can easily be distributed as a separate Swift PM package, which would let it evolve and be developed more quickly.

In conclusion, I think the Python interop module should be moved to its own official Swift repo (as a Swift PM package), but it should not be distributed with the official toolchains for now.

Yes, I think we should be able to use Swift Evolution to propose packages/modules to become part of the official Swift project.

Python (deprecating the macOS module), PythonKit, PythonInterop?

rxwei · April 3, 2019, 2:10pm

This is the discussion I’m looking for! I also believe it should be in its own repo regardless of whether it’s going to be distributed with the toolchain. I split the first question into two.

Chris_Lattner3 · April 3, 2019, 3:24pm

One related question: how does this all relate to the Python overlay module on the Mac?

masters3d · April 3, 2019, 4:14pm

Yes to upstreaming. Could it be its own repo? Or could we have a Swift repo called OtherLangSupport with support for Python and others via local packages vended as one?

I'm thinking keeping it a package is the best route. If it is included in the toolchain, how does one switch Python versions?

Lantua · April 3, 2019, 4:25pm

I’d prefer to have a default package folder (much like ~/.local/bin) that the REPL can search when I do import Python.

rxwei · April 3, 2019, 7:16pm

If we are to make Python a SwiftPM package instead of a core library, I don't see any problem (I think the system module gets shadowed when you have a local SwiftPM dependency). If we wanted Python to be a core library, then either we should rename Python or Apple should rename the system Python module to something like CPython. I think it's more accurate for the system module to be called CPython.

compnerd · January 21, 2020, 6:08pm

It has been a while, but I would like to continue where this thread left off.

I think that keeping the Python interop in a repository external to the actual compiler but allowing it to be built optionally (and subsequently packaged into a distribution image) is a reasonable compromise between the two approaches.

Longer term, I think that splitting up the monolithic build and enabling different toolchains to vend a differing set of libraries is a reasonably flexible approach to enable use cases which we may not have explicitly considered. This could both set up some of that infrastructure and make the Python interop module available on the master branch.

I am hoping to attempt this approach soon so we should at least have the ability to see what that workflow feels like. For simplicity, I am currently considering using the PythonKit repository as a starting point.

rxwei · January 21, 2020, 6:25pm

That would be great. Thanks Saleem! If I remember correctly, PythonKit is identical to what’s in the tensorflow branch, so starting with that feels reasonable.