Memory management without ARC?

Hi everyone,

I'm currently writing a Scheme interpreter in Swift. My original plan was
to use Swift's memory management to handle objects in Scheme.
Unfortunately, Scheme's data structures are inherently cyclic and I'm
struggling to find a good approach to manage memory via ARC.

An example that highlights the problem are arrays that contain elements
that refer to the array itself. In Scheme, such arrays can be easily
created like this:

  (define vec (vector 1 2 3))
  (vector-set! vec 1 vec)

Vector vec now refers to itself via index 1.

For representing a Scheme array, my interpreter simply uses an object that
wraps a Swift array. The array elements point at Scheme objects. So, they
could point at the array itself. Since I don't want to loose objects that
are linked off the array only, I need to use strong references in the
array. Obviously, this creates a serious memory leak. Here's some code
illustrating the design and the construction of vec:

  enum SchemeValue {
    case Nil
    case Num(Int)
    indirect case Cons(SchemeValue, SchemeValue)
    case Vector(WrappedArray<SchemeValue>)
  }

  class WrappedArray<T> {
    var array: [T]
    init(_ elem: T...) {
      self.array = elem
    }
  }

  let wa = WrappedArray<SchemeValue>(.Num(1), .Num(2), .Num(3))
  let vec = SchemeValue.Vector(wa)
  wa.array[1] = vec

I haven't found a good way to address the issue. What I would ideally need
is manual access to retain/release functions. I would also be willing to
write my own memory manager for such objects, but also this seems to be
impossible in Swift. As a workaround, I currently built a mark/sweep
garbage collector for all Scheme objects I allocate (by keeping them all in
a weak object pool — also hand-written because it doesn't exist in Swift).
But I consider this application-specific garbage collector that runs on top
of ARC a huge heck.

So, my question is: What are Swift programmers supposed to do if they are
dealing with inherently cyclic data structures that cannot be broken up via
weak references? It almost seems like there's some memory management
functionality missing in Swift that makes this possible.

== Matthias

You need to write code that walks through the data structures, recognizes which parts are no longer live, and clears the strong references to them. This is one of the age-old fundamental problems with reference counting — I remember dealing with it in Smalltalk-80, whose GC was based on refcounts.

I think you’re going to have to end up writing a mark/sweep garbage collector for your Scheme interpreter. A basic implementation is really simple.

Just off the top of my head, I think you’d use only unowned references between your Scheme objects. Then to keep the objects alive, you have a master array/set/whatever that holds strong references to all of them. After doing the recursive marking of all your objects, you traverse the master set and remove all the objects whose mark flags aren’t set.

—Jens

···

On Jan 30, 2016, at 7:17 AM, Matthias Zenger via swift-users <swift-users@swift.org> wrote:

So, my question is: What are Swift programmers supposed to do if they are dealing with inherently cyclic data structures that cannot be broken up via weak references? It almost seems like there's some memory management functionality missing in Swift that makes this possible.

Doing your own GC is probably the best you can do today. You can manually retain and release references to Swift classes using the Unmanaged<T> type, which might help interoperate Swift ARC objects with your GC.

-Joe

···

On Jan 30, 2016, at 7:17 AM, Matthias Zenger via swift-users <swift-users@swift.org> wrote:

Hi everyone,

I'm currently writing a Scheme interpreter in Swift. My original plan was to use Swift's memory management to handle objects in Scheme. Unfortunately, Scheme's data structures are inherently cyclic and I'm struggling to find a good approach to manage memory via ARC.

An example that highlights the problem are arrays that contain elements that refer to the array itself. In Scheme, such arrays can be easily created like this:

  (define vec (vector 1 2 3))
  (vector-set! vec 1 vec)

Vector vec now refers to itself via index 1.

For representing a Scheme array, my interpreter simply uses an object that wraps a Swift array. The array elements point at Scheme objects. So, they could point at the array itself. Since I don't want to loose objects that are linked off the array only, I need to use strong references in the array. Obviously, this creates a serious memory leak. Here's some code illustrating the design and the construction of vec:

  enum SchemeValue {
    case Nil
    case Num(Int)
    indirect case Cons(SchemeValue, SchemeValue)
    case Vector(WrappedArray<SchemeValue>)
  }

  class WrappedArray<T> {
    var array: [T]
    init(_ elem: T...) {
      self.array = elem
    }
  }

  let wa = WrappedArray<SchemeValue>(.Num(1), .Num(2), .Num(3))
  let vec = SchemeValue.Vector(wa)
  wa.array[1] = vec

I haven't found a good way to address the issue. What I would ideally need is manual access to retain/release functions. I would also be willing to write my own memory manager for such objects, but also this seems to be impossible in Swift. As a workaround, I currently built a mark/sweep garbage collector for all Scheme objects I allocate (by keeping them all in a weak object pool — also hand-written because it doesn't exist in Swift). But I consider this application-specific garbage collector that runs on top of ARC a huge heck.

So, my question is: What are Swift programmers supposed to do if they are dealing with inherently cyclic data structures that cannot be broken up via weak references? It almost seems like there's some memory management functionality missing in Swift that makes this possible.

So, my question is: What are Swift programmers supposed to do if they are dealing with inherently cyclic data structures that cannot be broken up via weak references? It almost seems like there's some memory management functionality missing in Swift that makes this possible.

You need to write code that walks through the data structures, recognizes which parts are no longer live, and clears the strong references to them. This is one of the age-old fundamental problems with reference counting — I remember dealing with it in Smalltalk-80, whose GC was based on refcounts.

I think you’re going to have to end up writing a mark/sweep garbage collector for your Scheme interpreter. A basic implementation is really simple.

That's what I ended up doing. It just feels like a huge hack: it messes up your code and the impact on the performance of the interpreter is serious.

Just off the top of my head, I think you’d use only unowned references between your Scheme objects. Then to keep the objects alive, you have a master array/set/whatever that holds strong references to all of them. After doing the recursive marking of all your objects, you traverse the master set and remove all the objects whose mark flags aren’t set.

Yes, that was my first attempt. The problem is that Swift 2 has no good support for weak references beyond weak properties. I'm using arrays throughout my code and those use strong references. Furthermore, my basic data structure is an enum with associated values. I haven't figured out a way to refer to the associated values weakly. This doesn't work:

  enum SchemeValue {
    ...
    case Vector(weak WrappedArray<SchemeValue>)
  }

So, what I ended up doing is to implement a mark/sweep garbage collector which is holding only weak references in an object pool. All references in the application are strong. After the mark phase, I then clear objects that are not marked but that are still weakly linked in the object pool. These are the objects that have cyclic dependencies. I just clear them (reset all references) and ARC deallocates them.

This approach has the same implementation problem I describe above when it comes to implement an array of weak references. I use something like this:

  struct WeakReference<T: AnyObject> {
    weak var obj: T?
  }
  class ObjectPool<T: AnyObject> {
    var references: [WeakReference<T>]
    ...
  }

It does have the benefit that this ugly workaround via struct WeakReference<T> is only needed in the ObjectPool<T> implementation and isn't spread throughout the whole codebase.

Nevertheless, this collaborative garbage collection scheme where the mark/sweep collector handles cycles and ARC handles the rest doesn't feel like a good solution. It's, at best, a workaround that is impossible to optimize. This is why I was wondering if there's anything else I could do...

== Matthias

Doing your own GC is probably the best you can do today. You can manually retain and release references to Swift classes using the Unmanaged<T> type, which might help interoperate Swift ARC objects with your GC.

I found some references to Unmanaged<T> on the web as well, but couldn't figure out how to use it with Swift classes. It seems like Unmanaged<T> exists for interfacing with objects that are not created/managed via the Swift runtime. But as soon as I instantiate a Swift class, it uses the Swift runtime. Is there a way to create an object of a Swift class such that it's not managed by the Swift runtime?

== Matthias

Yes, this is best answer. Just plain ordinary Swift.

-david

···

On Sat, Jan 30, 2016 at 12:36 PM, Andrew W. Donoho via swift-users < swift-users@swift.org> wrote:

I suspect your best answer is to actually build something out of swift
collection and sequence primitives.

It's inconvenient, but you can use a wrapper struct to work around this limitation:

struct Weak<T: class> { weak var value: T? }

enum SchemeValue { case Vector(Weak<WrappedArray<SchemeValue>>) }

-Joe

···

On Jan 30, 2016, at 10:56 AM, Matthias Zenger via swift-users <swift-users@swift.org> wrote:

Yes, that was my first attempt. The problem is that Swift 2 has no good support for weak references beyond weak properties. I'm using arrays throughout my code and those use strong references. Furthermore, my basic data structure is an enum with associated values. I haven't figured out a way to refer to the associated values weakly. This doesn't work:

  enum SchemeValue {
    ...
    case Vector(weak WrappedArray<SchemeValue>)
  }

Matthias,

  What you’ve described is essentially the Python memory management system. It is reference counted along with a cycle analyzer. It works quite well. I would not reject it out of hand too soon.

  While the standard answer is to use Swift structures for everything, there are Cocoa structures that work with with weak references. NSPointerFunctionsWeakMemory is one such. There are others, <NSPointerFunctions | Apple Developer Documentation.

  I suspect your best answer is to actually build something out of swift collection and sequence primitives.

Anon,
Andrew

···

On Jan 30, 2016, at 12:56 , Matthias Zenger via swift-users <swift-users@swift.org> wrote:

Nevertheless, this collaborative garbage collection scheme where the mark/sweep collector handles cycles and ARC handles the rest doesn't feel like a good solution. It's, at best, a workaround that is impossible to optimize. This is why I was wondering if there's anything else I could do…

____________________________________
Andrew W. Donoho
Donoho Design Group, L.L.C.
andrew.donoho@gmail.com, +1 (512) 666-7596, twitter.com/adonoho

New: Spot marks the taX™ App, <http://SpotMarksTheTaX.com>
Retweever Family: <http://Image.Retweever.com>, <http://Retweever.com>

No risk, no art.
  No art, no reward.
    -- Seth Godin