Violet - Python VM written in Swift

Violet - Python VM written in Swift on GitHub

Features:

It also uses some advanced Swift features like:

  • Unsafe[Raw/Mutable]Pointers + a lot of pointer arithmetic - a single Python object is represented as an UnsafeRawPointer with manually calculated offsets for fields.
  • ManagedBufferPointer + tagged pointers - our custom BigInt implementation is a union (via tagged pointer) of Int32 (called Smi, after V8) and a heap allocation (magnitude + sign representation) with ARC for garbage collection.
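The tagged-pointer idea can be sketched roughly like this (a simplified illustration, not Violet's actual code; the tag-bit choice and all names are assumptions):

```swift
// Simplified sketch of a tagged-pointer union: heap pointers are at least
// 2-byte aligned, so their low bit is always 0 - we can use that bit as a tag
// to store a small Int32 ("Smi") inline instead of allocating on the heap.
struct TaggedValue {
  private let bits: UInt

  init(smi: Int32) {
    // Shift the value left and set the tag bit: ...value...1
    self.bits = (UInt(bitPattern: Int(smi)) << 1) | 1
  }

  init(heapPtr: UnsafeRawPointer) {
    // An aligned pointer already has the low bit cleared.
    self.bits = UInt(bitPattern: heapPtr)
  }

  var isSmi: Bool {
    return self.bits & 1 == 1
  }

  var asSmi: Int32? {
    guard self.isSmi else { return nil }
    // Arithmetic shift right restores the (sign-extended) value.
    return Int32(truncatingIfNeeded: Int(bitPattern: self.bits) >> 1)
  }
}

let small = TaggedValue(smi: -42)
assert(small.isSmi && small.asSmi == -42)
```

The trick relies on heap allocations being aligned, so a "real" pointer can never have the low bit set.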
46 Likes

woah this is awesome!

1 Like

That's pretty cool! What's the performance like? How does it compare to Python's usual VM?

2 Likes

I have not worked on performance, so I am not really sure.

Also, comparing Violet with CPython is not exactly apples-to-apples:

  • They have garbage collection, we don't.

    We allocate objects, but the only way to deallocate them is to call py.destroy() (py represents a Python context - the owner of all of the objects). This is why running our Python tests (make pytest) will take 300 MB of memory, just to release all of it when this particular test run ends.

    There are also other upsides to garbage collection, for example you can reuse deallocated objects. In CPython releasing a tuple does not destroy it; instead it is put on a free list, so that the next time you need a tuple of the same size you can just reuse the existing object.

  • They have a global interpreter lock (GIL); we don't need one, because we do not expose any concurrency primitives. Keynote on Concurrency by Raymond Hettinger is a very good introduction to the GIL.

  • CPython has .pyc files, we don't.
    A .pyc file is a cached compilation result, so that the next time you run the code you do not have to parse it (this is also the reason why you have to run the tests twice if you want to contribute to CPython). So, obviously, there is no single Python performance measurement, because there are 2 input modes (parsing or loading .pyc).

  • We use Swift primitives to build Python objects.

    This has pros and cons. For example, both PyTuple and PyList store elements in [PyObject]. This is nice since we can create a tuple from a list by just copying the [PyObject] without copying the underlying buffer.

    But since tuples are immutable we could store elements in the tail allocation - allocate more space after the tuple and store the elements there (this is called a flexible array member in C). The final memory layout would look like this:

    Current: [Object header][Pointer to array buffer]
    New:     [Object header][Tuple elements]
    

    This reduces pointer chasing and could be better for the CPU cache.

    We also use Swift.String.unicodeScalars in PyString. The better way would be to have our own string storage (probably UTF-8, maybe with tail allocation, because str is immutable). CPython has 3 string representations.

  • etc.
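The tail-allocation idea from the tuple bullet can be approximated in today's Swift with ManagedBuffer (a rough sketch, not Violet's code; the Int elements and all names are just for illustration):

```swift
// Header stores the element count; the elements live in the tail allocation
// of the same heap block - one allocation, no separate array buffer.
final class TailTuple: ManagedBuffer<Int, Int> {

  static func make(_ elements: [Int]) -> TailTuple {
    let buffer = TailTuple.create(minimumCapacity: elements.count) { _ in
      elements.count
    }
    buffer.withUnsafeMutablePointerToElements { ptr in
      for (index, element) in elements.enumerated() {
        (ptr + index).initialize(to: element)
      }
    }
    // 'create' allocates an instance of the class it was called on,
    // so this downcast is safe.
    return unsafeDowncast(buffer, to: TailTuple.self)
  }

  var count: Int {
    return self.header
  }

  subscript(index: Int) -> Int {
    return self.withUnsafeMutablePointerToElements { $0[index] }
  }
}

let tuple = TailTuple.make([10, 20, 30])
assert(tuple.count == 3 && tuple[1] == 20)
```

With non-trivial element types you would also need a deinit that deinitializes the tail storage; Int is trivial, so the sketch skips it.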

The same thing (but on a smaller scale) happens with our BigInt - while we do expose everything that you would expect from a BigInt library, we are quite different from other BigInt libraries (I think that attaswift/BigInt is the most popular one):

  • we only have BigInt, while most libraries have to support both BigInt and BigUInt. This has a performance cost that we do not need (most of the time a BigInt is just a BigUInt with an isNegative: Bool property); in Violet we store everything together (signAndCount: UInt followed by digits).

  • BigInts are really complicated; you run into questions like: what rounding method should right shift (>>) use? For example: -1932735284 >> 5 = ? (-1932735284 / 32 = -60397977.625).

    Engine           | Result
    -----------------|-----------
    attaswift/BigInt | -60397977
    Wolfram Alpha    | -60397977
    Node v17.5.0     | -60397978
    Python 3.7.4     | -60397978
    Swift 5.3.2      | -60397978
    Violet           | -60397978

    Surprisingly all of the answers are correct.

  • we optimize for small integers (Int32) and store them inside the pointer to avoid heap allocation. This may or may not make sense for a general-purpose library (that depends on the use-cases it targets).
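The rounding difference in the table above comes down to floor division vs truncating division; both conventions can be seen in plain Swift:

```swift
let x = -1932735284

// Arithmetic shift right rounds toward negative infinity (floor division),
// which is what Python, Node, Swift's '>>' and Violet do:
let shifted = x >> 5      // -60397978

// Swift's integer '/' truncates toward zero instead,
// which matches the attaswift/BigInt and Wolfram Alpha results:
let divided = x / 32      // -60397977

assert(shifted == -60397978)
assert(divided == -60397977)
```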

Arguably, the better way (for Violet) would be to store the digits inside PyInt with tail allocation (ints are immutable in Python). Since int is a very popular type, we could take this even further and make every pointer to a Python object either an actual pointer or a small inlined integer - just like JavaScript engines (V8/SpiderMonkey) do (this holds true even with pointer compression).

Btw. I have never worked on Violet performance. From my experience these types of things are quite unpredictable - the only way to check if something is a problem is to run Instruments and not guess. For example, Swift-plays-Pokemon (a Game Boy emulator) had a problem with the following code:

public final class Cpu {
  private unowned let interrupts: Interrupts

  private func getAwaitingInterruptHandlingRoutine() -> InterruptHandlingRoutine? {
    if self.interrupts.isEnabledAndSet(.vBlank) { return .vBlank }
    if self.interrupts.isEnabledAndSet(.lcdStat) { return .lcdStat }
    if self.interrupts.isEnabledAndSet(.timer) { return .timer }
    if self.interrupts.isEnabledAndSet(.serial) { return .serial }
    if self.interrupts.isEnabledAndSet(.joypad) { return .joypad }

    return nil
  }
}

The Swift compiler inserted retain/release before every self.interrupts access (which does make sense, since the variable has to be available for the whole duration of the method call, even if it is unowned). The fix was quite simple: retain it just once. That was worth 20 fps in a release build.
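My reconstruction of that fix (not the exact commit; the Interrupts stub below is invented so the sketch compiles on its own): load the unowned reference into a strong local once, so the compiler emits a single retain/release pair instead of one per access.

```swift
enum InterruptHandlingRoutine { case vBlank, lcdStat, timer, serial, joypad }

// Minimal stub so the example is self-contained.
final class Interrupts {
  func isEnabledAndSet(_ routine: InterruptHandlingRoutine) -> Bool {
    return routine == .timer // illustrative: pretend only 'timer' is pending
  }
}

final class Cpu {
  private unowned let interrupts: Interrupts

  init(interrupts: Interrupts) {
    self.interrupts = interrupts
  }

  func getAwaitingInterruptHandlingRoutine() -> InterruptHandlingRoutine? {
    // Load the unowned reference into a strong local ONCE: one retain/release
    // pair for the whole method instead of one around every access.
    let interrupts = self.interrupts

    if interrupts.isEnabledAndSet(.vBlank) { return .vBlank }
    if interrupts.isEnabledAndSet(.lcdStat) { return .lcdStat }
    if interrupts.isEnabledAndSet(.timer) { return .timer }
    if interrupts.isEnabledAndSet(.serial) { return .serial }
    if interrupts.isEnabledAndSet(.joypad) { return .joypad }

    return nil
  }
}
```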

5 Likes

Have you thought about making a version of PythonKit that supports Violet instead of CPython? I'd be happy to help, as I regularly contribute to PythonKit. Depending on how quickly it compiles, I would like to experiment with using it for Swift-Colab. That could provide tighter low-level integration with the Python interpreter and prevent a lot of conversions between Swift and Python objects.

Edit: Making a Swift-Colab backend out of this could also serve as a great validation test and result in addition of new features. That's how subclassing, lambdas, and kwargs got into PythonKit.

2 Likes

I forgot to mention that Violet only implements the core language itself, without any additional modules. This means that importing anything except the most basic modules (like builtins, sys, _imp, _os, _warnings and importlib) is not supported.

Although you can import other Python files, so technically you could import the Python standard library. This is what RustPython (a Python interpreter written in Rust) does. Beware: I have not tested it, and also:

  • in some cases CPython implements things in C, so we would have to implement them in Swift (this is the RustPython version of this).
  • they sometimes use things that we do not have, like comprehensions.

Also, Violet is not CPython-API compatible, as in: we do not expose all of the CPython functions like Py_IncRef etc. This means that a special adapter would be needed for numpy etc.

PythonKit

Have you thought about making a version of PythonKit that supports Violet instead of CPython?

Most interpreters have separate object and execution layers (this is not exactly a 'runtime'). In Violet all of our objects are implemented in Sources/Objects and execution in Sources/VM. In CPython objects are stored in Objects and execution in Python.

The main goal of the 'execution' layer is manipulating objects to achieve a specific goal. So, for example, this is how stack-based VMs implement 2 + 2:

  /// Implements `TOS = TOS1 + TOS`.
  internal func binaryAdd() -> InstructionResult {
    let right = self.stack.pop()
    let left = self.stack.top

    switch self.py.add(left: left, right: right) {
    case let .value(result):
      self.stack.top = result
      return .ok
    case let .error(e):
      return .exception(e)
    }
  }

To answer your question: Violet already has a built-in 'PythonKit' inside the Objects module. You can use and create Python objects without using the VM as the execution engine. This is what we do in our unit tests:

func test__eq__() {
  // Create Python context
  let py = Py(config: self.config,
              delegate: self.delegate,
              fileSystem: self.fileSystem)

  // 'asObject' converts from PyInt -> PyObject
  let leftObject = py.newInt(3).asObject
  let rightObject = py.newInt(4).asObject
  let result = py.isEqual(leftObject, rightObject)

  // Check if 'result' is correct.
}

Compilation

Depending on how quickly it compiles, I would like to experiment with using it for Swift-Colab.

While Violet implements a sizeable portion of the language, I think that CPython is a much… much… better choice.

If the compilation performance is a problem then:

  • try to use newer Python versions - for example PEP 617 added a PEG parser; while it is not a holy grail, it is a tiny bit faster than the previous one (from what I have seen).
  • check if the .pyc files are used - they contain compiled bytecode, so the interpreter can skip the compilation altogether.

Conversions Swift <-> Python

That could provide tighter low-level integration with the Python interpreter and prevent a lot of conversions between Swift and Python objects.

I don't think marshalling can be avoided regardless of the choice of the underlying Python interpreter. Well… unless you have an interpreter built specially for this, but I think that Python as a language is too complex/big for that.

For example, this is what Lua does (based on this GitHub mirror; side note: I have never used Lua, I just looked up their sources to answer this question, so I may be totally wrong):

/* Union of all Lua values */
typedef union Value {
  struct GCObject *gc;    /* collectable objects */
  void *p;         /* light userdata */
  lua_CFunction f; /* light C functions */
  lua_Integer i;   /* integer numbers */
  lua_Number n;    /* float numbers */
} Value;


/* Tagged Values. This is the basic representation of values in Lua: an actual value plus a tag with its type. */
#define TValuefields	Value value_; lu_byte tt_

typedef struct TValue {
  TValuefields;
} TValue;

#define LUA_INTEGER		long long
typedef LUA_INTEGER lua_Integer;

LUA_API void lua_pushinteger(lua_State *L, lua_Integer n) {
  lua_lock(L);

  // Expanding 'setivalue(s2v(L->top), n);' macro:
  TValue *io=((&(L->top)->val));
  val_(io).i=(n);
  settt_(io, LUA_VNUMINT);

  // Increments the top of the register stack or whatever…
  // From what I see registers are kept in the run-time stack (array).
  api_incr_top(L);
  lua_unlock(L);
}

Basically in the register they store TValue { Value.i = n, tt_ = LUA_VNUMINT }.

So, yeah… they store what is essentially a C long long directly in the register without wrapping it inside a Lua/Python object (like PyInt in Violet or PyLongObject in CPython). But the tradeoff is (and I may be wrong on this, so Lua people don't shout at me) that Lua is much simpler than Python.

EDIT: 'Lua is much simpler than Python' is not exactly correct. I think that the better phrasing would be: Lua was not designed to be (or compete with) Python. Just like Python was not designed to be a more-readable/better C. Python is not slow, it is just that it was not designed to be fast. In real life I would rather write a server in Python/TypeScript than in C (unless it is totally necessary).

1 Like

@LiarPrincess thanks for your well-thought out response to my question.

My concern was about compiling the Swift package itself, not about compiling Python code. My overhaul to Swift-Colab was focused on reducing first-time startup time, decreasing it from 54 seconds to 31 seconds. There were also other improvements like caching toolchains and Swift packages build products.

I had to include the PythonKit source files inside the Swift-Colab repository to chop a few seconds off of the load time. It takes more than one second to call git clone from the PythonKit repository. Compiling PythonKit by itself takes 2 seconds with -c release -Xswiftc -Onone, because it's just a few Swift files. You stated that your repo was one of the largest Swift repositories in existence, so I expect it to take orders of magnitude longer.

My primary concern about marshalling data between Python and Swift was the old mechanism for getting image data from IPython; the one used in the old google/swift-jupyter and Swift-Colab 1.0. It converted data between Python and Swift objects several times. My new Swift-Colab eliminates those data transfers, although I haven't measured whether XPC through the disk from LLDB dwarfed the cost of them. Or whether it even matters.

Although both of my questions had answers that seem discouraging, I still stand by the idea that making a Swift-Colab backend out of this is a good idea. It may be months or years away, and too slow or take too long to load for anyone to use it practically. But it would be an amazing validation test and stress test. Many Python features used in the ipythonkernel Python code base are not yet supported by Violet, such as yield and async. The effort would push Violet to greater parity with CPython.

I made it a goal to minimize the reliance on Python libraries in Swift-Colab 2.0. It should be possible to remove the dependency on sys, which is only required as a workaround for expanding paths. I could also rely on Swift's JSON decoding instead of the Python json library. The interaction with Python objects is minimal enough that I could even do with using CPython functions directly instead of the high-level PythonKit (although it would be tedious).

Long-term, do you think it would be a good idea to at least attempt to build an alternative Swift-Colab backend based on Violet instead of PythonKit?

Ohhh… Now I think I know what you mean.

Looking at swift-colab/blob/main/install_swift.sh:

  1. Install swift toolchain (or reuse existing if possible)
  2. Clone swift-colab
  3. LLDB stuff
  4. swiftc JupyterKernel
  5. call JupyterKernel.JupyterKernel_registerSwiftKernel

Few thoughts:

  • Instead of git clone --single-branch I think you can use --depth 1. (This is also a good tip for local development - nobody has time to download the whole git repository).

  • Can't you create a package with already compiled binaries? Just a tiny .zip file with everything inside. This package would be Swift version specific, but that's not a problem (you just host package-5.5.0.zip for Swift 5.5.0). Then you just need to download Swift toolchain and the .zip package. If the toolchain for a given Swift version is already present then the .zip was also already installed.

    [Offtopic] In a business setting you (almost) never want to calculate things when the user asks for them; you want to calculate them BEFORE. This is the main idea behind data warehouses: you use an ETL process to 'compile' the data (into a cube or something), and when the user wants something the response is almost instantaneous.

    As for the %install (for example: %install '.package(url: "https://github.com/pvieito/PythonKit.git", .branch("master"))' PythonKit) - I don't know how it works (I have never used Google colab or notebooks). Maybe most common packages can also be precompiled and then just downloaded in their final form (or even included in this .zip)?

Answers

My concern was about compiling the Swift package itself, not about compiling Python code. My overhaul to Swift-Colab was focused on reducing first-time startup time, decreasing it from 54 seconds to 31 seconds. There were also other improvements like caching toolchains and Swift packages build products.

Violet compilation times would be too long. Due to language complexity, Swift build times are quite long (especially for clean builds, which is what we are talking about).

I still stand by the idea that making a Swift-Colab backend out of this is a good idea. It may be months or years away, and too slow or take too long to load for anyone to use it practically.

CPython is maintained by some of the best developers in the world; if there is a problem, multiple people will look into it. Also, your users will expect a more-or-less up-to-date Python version with all of the goodies (like numpy). Violet has none of those.

My primary concern about marshalling data between Python and Swift was the old mechanism for getting image data from IPython; the one used in the old google/swift-jupyter and Swift-Colab 1.0. It converted data between Python and Swift objects several times. My new Swift-Colab eliminates those data transfers, although I haven't measured whether XPC through the disk from LLDB dwarfed the cost of them. Or whether it even matters.

You can't avoid this if you want to use Python libraries, though I'm not sure where LLDB comes into this.

The interaction with Python objects is minimal enough that I could even do with using CPython functions directly instead of the high-level PythonKit (although it would be tedious).

Maybe combining most common packages and installing them regardless of whether the user asked for them could be a solution? You could install PythonKit at startup.

Offtop

Btw. I'm kind of surprised that there are no Swift playgrounds in the Swift language guide. For example, when they explain the for loop there could be a small playground with a code snippet that the user can run. Though I guess this is different from your use case.

Your analysis of my script was on-point! I cache Swift toolchains so that you can switch back and forth without re-downloading them. There is also functionality for switching back to the Python Jupyter kernel (required for downloading a new toolchain without factory resetting the runtime). All of these features were introduced in Swift-Colab 2.0.

This is getting off topic, so I'm going to hide my responses to prevent them from clouding the discussion.

I just incorporated that into main. Downloading the Git repo happens too quickly to measure, but even the tiniest speedup is desired.

I could, but it's going to take some time to implement. This task has such a massive scope that it will necessitate a new major release (Swift-Colab 3.0). I just got done rewriting Swift-Colab from scratch, and I'm too burnt out to finish the job (documentation is still out of date).

I would likely pull the .build directory out of a Swift package, strip unused files, and put it in a .zip. There would need to be a standardized, extremely foolproof and backwards-compatible way of doing this. You would need the ability to compile a package in one toolchain (e.g. release 5.7) and use it from a different one (e.g. release 5.8). I'm using only development toolchains to compile Swift for TensorFlow (S4TF), where AutoDiff is rife with compiler bugs and not ABI stable.

Then, we have the problem of who is going to host the compiled packages. I would likely be the only developer who compiles their own modules into binaries for Colab, and keeps those binaries up to date. I am maintaining S4TF, which takes 2 minutes to compile with -c release -Xswiftc -Onone and 3 minutes with the default debug build. Once S4TF can compile on release toolchains, I would host Colab zip files under the releases tab on GitHub.

%install is the built-in way of downloading a Swift package, building it, and inserting it into the cache. If you call the %install line with the same configuration of SwiftPM flags (%install-swiftpm-flags) and the same toolchain as you did in the previous runtime session, SwiftPM automatically skips rebuilding it. It also imports the package into LLDB, so that you can import its Swift modules. Again, it would really help if I updated the documentation like I should have a while ago.

I looked into swift-colab.

Ooo… I get why you are using LLDB: you don't know the whole Swift source code ahead of time, so you keep LLDB in process to submit user inputs. Clever, very clever.

Compilation

I think that the simplest thing would be to have a server that periodically (~every 30 min) checks if there is a new version of a given package. If so, it compiles it and publishes it at some URL. There would be a 1:1 mapping between Swift version and module, because most of the time users want the newest version. If the module is not pre-compiled then you would compile it at runtime.

If Google Colab does not allow fetching things from random domains (aka mining bitcoins) then you can just create a GitHub repo with branches for every Swift version and push the .zip there.

At runtime (when the user executes import) how do you know if the package is pre-compiled (and hosted on GitHub)? Well… while initializing Swift-Colab you can also include a .txt file with all of the modules and their URLs.

Alternatively, the server can return 404, which would mean the package has to be compiled during the notebook runtime. This is nice because you could store all of those 404s in a database and use this information to decide which modules to pre-compile.

If compilation has to happen inside the Google Colab environment (and not on a plain Ubuntu box) then maybe Puppeteer - Headless Chrome Node.js API can help. This is basically Chrome controlled from JavaScript. Yes, this is a hack…

As for the server cost:

  • AWS gives you EC2 free for a year - although this year will expire surprisingly quickly and then you have to pay.
  • GCP has a free tier that includes e2-micro VM free forever.

Measuring performance

When it comes to performance I would go with a data-driven approach, which basically means: measure it. This is the only way of knowing what the problem is.

For example, for command execution you would measure how long each stage takes.

Then you would execute some notebooks and check the measurements. Though the choice of the specific notebooks matters - it has to be the expected user input. It is tempting to go with a lot of graphs/images etc., but I would assume that in the real world they are <5% of the inputs.

A few years ago Google used a Game Boy emulator to optimize V8 performance. This sounds great for a JIT compiler - a perfect test case. Well… they dropped it later - JavaScript on most websites has totally different characteristics than a Game Boy, so they were optimizing for something that was not their use-case.

Tbh. my approach to performance is: don't worry about it. If it is bad then use Instruments/dotTrace/SQL statistics/whatever to find the real problem. Then fix that problem and don't worry about other things. There is a possibility of death by a thousand cuts (everything takes 5% and you don't know what to do), but from my experience most of the time you get 1 or 2 main culprits. But maybe I'm just really bad at guessing where the problem is.

EDIT: A small note on measuring time on Google Colab: this is a VM/docker/whatever. This may mean that during execution your task was paused, time went on, and then your task was resumed. This pause time will be included in your measurement! You don't want this. To solve this you can run it multiple times. You can also capture the whole distribution of times and then analyse it.
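A minimal sketch of the 'run it multiple times and look at the distribution' idea (illustrative only; names are my own):

```swift
import Dispatch

/// Run 'body' multiple times and return all wall-clock timings in seconds,
/// sorted ascending - so you can inspect the whole distribution
/// (median, minimum, outliers caused by VM pauses, etc.).
func measure(samples: Int, _ body: () -> Void) -> [Double] {
  var times: [Double] = []
  for _ in 0..<samples {
    let start = DispatchTime.now().uptimeNanoseconds
    body()
    let end = DispatchTime.now().uptimeNanoseconds
    times.append(Double(end - start) / 1_000_000_000)
  }
  return times.sorted()
}

let times = measure(samples: 5) {
  _ = (0..<1_000).reduce(0, +)
}

// The median is more robust to a single VM pause than the mean.
let median = times[times.count / 2]
assert(times.count == 5)
assert(times.first! <= median && median <= times.last!)
```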

2 Likes

Thanks for your insight! I had thought of some of those ideas before rewriting Swift-Colab, but it seemed too daunting to piece together into something tangible. If there is a Swift-Colab 3.0, it will definitely incorporate your suggestions.

I am happy to see Violet here. You made a lot of progress!

One additional comment about RustPython native modules.
We recently split them into core builtin modules and an extended stdlib.
The linked stdlib is not related to the VM itself, so skipping it just means those module features are missing.
On the other hand, the other collection, vm/stdlib, is tied to VM functions. If the goal of Violet is to fully support VM functions, this part will be necessary in the future.
I wish we had a shared pure-Python stdlib, so we would not have to write everything natively for each implementation.

Ooo… I didn't know that you are also on the Swift forums.

I am happy to see Violet here. You made a lot of progress!

The whole Python object representation has changed.

In Rust you have repr(C), which gives you C layout with all of its nice guarantees. This means that you can do the following (simplified RustPython code):

pub type PyIntRef = PyRef<PyInt>;

#[repr(transparent)]
pub struct PyRef<T: PyObjectPayload> {
    ptr: Box<PyInner<T>>,
}

#[repr(C)]
struct PyInner<T> {
    ref_count: RefCount,
    typ: PyRwLock<PyTypeRef>,
    dict: Option<InstanceDict>,
    payload: T,
}

In Swift we do not have repr(C). Also, a struct is not guaranteed to be layout-compatible with a similar struct. So, for example, this is not allowed:

struct PyObjectHeader { var type: String }
struct PyObject { var header: PyObjectHeader }
struct PyInt { var header: PyObjectHeader; var value: Int }

let int: UnsafePointer<PyInt> = …
// Illegal: PyInt is not guaranteed to be layout-compatible with PyObject.
let intObject = UnsafeRawPointer(int).assumingMemoryBound(to: PyObject.self)
print(intObject.pointee.header.type)

I don't want to bore anyone with the alternatives, but eventually Violet settled on manually calculated layouts: allocate enough memory to fit the data type and then use calculated offsets to get specific (and hopefully properly aligned) members.
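The 'manually calculated layouts' idea boils down to the usual align-then-advance computation. A self-contained sketch (not Violet's actual generator; the type name is invented):

```swift
/// Compute aligned field offsets the way a C compiler would:
/// round each offset up to the field's alignment, then advance by its size.
struct ManualLayout {
  let offsets: [Int]
  let size: Int

  init(fields: [(size: Int, alignment: Int)]) {
    var offset = 0
    var offsets = [Int]()

    for field in fields {
      // Round 'offset' up to the next multiple of 'field.alignment'
      // (alignment must be a power of two for this bit trick).
      offset = (offset + field.alignment - 1) & -field.alignment
      offsets.append(offset)
      offset += field.size
    }

    self.offsets = offsets
    self.size = offset
  }
}

// [UInt8, Int64, Int32] -> offsets 0, 8, 16 (7 bytes of padding after UInt8).
let layout = ManualLayout(fields: [(1, 1), (8, 8), (4, 4)])
assert(layout.offsets == [0, 8, 16])
assert(layout.size == 20)
```

Accessing a member is then just `ptr.advanced(by: offset)` plus a memory-bound load at that (properly aligned) address.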

To simplify things we use Sourcery - a meta-programming library - so that our type definition looks like this:

// sourcery: pytype = object
struct PyObject {
  // sourcery: storedProperty
  var type: PyType { return self.typePtr.pointee }
  // sourcery: storedProperty
  var __dict__: Lazy__dict__ { return self.__dict__Ptr.pointee }
}

Then Sourcery picks up all of the storedProperty declarations and generates a layout and all of the xxxPtr accessors.

This is nice because on the Swift side you only declare what you need and everything else is taken care of behind the scenes. However, it also means that adding a new stored property is a bit complicated:

  1. Add
    // sourcery: storedProperty
    var type: PyType { fatalError() }
    
  2. Compile
  3. Run make gen to regenerate Sourcery files
  4. Replace fatalError() with self.typePtr.pointee

Anyway, really good job on #14 Implement positional-only arguments. I am really impressed. Even for me, this would be a complicated thing to do (and I know Violet quite well).

No technical comments about Violet, but as someone who likes Python, it's cool to see another VM!

@LiarPrincess wrote:
I'm kind of surprised that there are no Swift playgrounds in Swift language guide. For example when they are explaining the for loop there would be a small playground with code snippet that the user can run.

(For context, I'm the current writer maintaining "The Swift Programming Language".)

The chapter A Swift Tour was written exactly like you mention — you can download an Xcode playground version of the chapter, to run each code listing and make changes to it. We did consider other approaches, like making every code listing downloadable as a playground, but found it didn't generalize well. Code listings in a playground seem to work best when they're written like the tour, with at least one natural change that the reader can make.

1 Like

Thoughts about garbage collection…

Currently we allocate objects, but the only way to deallocate them is to call py.destroy() (py represents a Python context - the owner of all of the objects). So basically this is a really bad implementation of an arena allocator.

Manual retain/release

Nope…

Writing proper garbage collection

That would be a “serial, stop-the-world, non-compacting, generational garbage collector”. This is non-trivial and requires a lot of code.

Move-only pointers

This would give us compiler-checked reference counting (no unbalanced retain/release).

Semantics

  • assign (=) is move

  • copy is explicit

  • deinit is always called at lifetime end

  • methods allow specifying whether the argument is:

    • Moved - transfers ownership (calls move). Syntax: elsa: consume Princess or elsa: Princess (better).
    • Borrowed - does not call move/copy, does not affect lifetime. Syntax: elsa: borrow Princess or elsa: Princess&.
    • Copied - calls some predefined method like copy(other: Self) or init(copy: Self). This one is optional, you can just copy before the call and move into the argument. Syntax: elsa: copy Princess.

Then we would define a type with following semantics:

  • move = nop
  • copy = retain
  • deinit = release, then if refCount == 0: deallocate

Collections/aggregates would be a pain to deal with, but oh… well… (also, we can define our own collections with get = copy and delete = take-ownership semantics).

Obviously, we would still need a garbage collector to resolve reference cycles…

Implementation

// Helper protocol to which all of the types conform to (already exists).
protocol PyObjectMixin: MoveOnly {
  var ptr: RawPtr { get }
  init(ptr: RawPtr)
}

extension PyObjectMixin {
  /// Create a new reference to this object.
  func copyRef() -> Self {
    let result = Self(ptr: self.ptr)
    result.referenceCount += 1 // Stored on the heap at 'self.ptr + someOffset'
    return result
  }
}

struct PyInt: PyObjectMixin, MoveOnly {
  let ptr: RawPtr

  init(ptr: RawPtr) {
    self.ptr = ptr
  }

  // 'MoveOnly' enables 'deinit' on 'struct'
  deinit {
    self.referenceCount -= 1
    if self.referenceCount == 0 {
      // Let it go, let it go
      // Can't hold it back anymore
      // Let it go, let it go
      // Turn away and slam the door
    }
  }
}

Usage:

func foo() {
  let int: PyInt = …
  let intCopy = int.copyRef() // ref count += 1
  bar(int: intCopy) // consumes 'intCopy', calls 'ref count -= 1'

  print("foo: \(int)") // ok
  print("foo: \(intCopy)") // compiler error, 'intCopy' was consumed by 'bar'
}

func bar(int: consume PyInt) {
  print("bar: \(int)")
  // end of lifetime: ref count -= 1
}

Downcasting

Downcasting would be weird, because it would always retain. We can't express: "valid cast -> move", "failed cast -> nop" semantics:

let object: PyObject = …
if let int = py.cast.asInt(object) {
  // 'object' was moved into 'int'
} else {
  // 'object' is still valid
}

The only way to do this would be to return CastResult = succeeded(PyInt) | failed(PyObject) type (which always moves).

Making some types copyable

It would be tempting to opt out of reference counting for some types (for example None - it is immortal). The problem is that they store move-only (reference-counted) members, and as soon as you store at least one move-only member you also become move-only. Can we convince the compiler that we know what we are doing?