Swift Reflection

I know there has been some discussion about improving reflection in Swift, and I wanted to add to it with some of the work I have been trying to do in Swift. I have been investigating using Swift to create a framework that provides a programming API to process data and execute functions in parallel on a cluster. The framework needs to be able to instantiate these functions on the cluster workers and have the data processed by them. The plan is to use one of the existing cluster managers, such as Spark or Storm; so far I have been looking at Spark. There would be a predefined set of supported functions, such as map, filter, join, etc., as defined by the cluster manager.

In my experiments, I have run into a number of issues that I haven't been able to solve because of the limited support for reflection in Swift. In describing them, I'm going to use APIs based on Spark, since that is the cluster manager I have been working with.

Parameter and Return types

The following is an example of a Swift class that maps to the RDD class in Spark.

public class RDD<T> {
    public func collect() throws -> [T] {
        ....
    }
}

The value of T could be anything from a basic type to a class. Even if the types are limited to basic types and known Spark types, the list of possibilities is large. In one of the Spark examples, T would be:

Tuple2<Int32, Tuple3<Int32, Int32, Double>>

The number of possible type combinations is too large to hard-code, given that Spark supports tuples with up to 22 element types. I can get the type of T as a string, but I haven't found a way to instantiate the type from that string. Is there some way around this problem?
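A minimal sketch of one partial workaround follows, under loudly stated assumptions: `WireDecodable`, `TypeRegistry`, and the Swift `Tuple2`/`Tuple3` structs are hypothetical names invented for illustration and are not part of Swift or of any Spark binding. The idea is to key a table of metatypes by the string form of each type, so that a worker can go from a type-name string back to a value. It only covers generic instantiations that were registered ahead of time in the worker binary, so it does not remove the need for real reflection.

```swift
// Hypothetical protocol: a type the framework can rebuild from a flat list of wire values.
protocol WireDecodable {
    init?(wire: inout ArraySlice<Any>)
}

extension Int32: WireDecodable {
    init?(wire: inout ArraySlice<Any>) {
        guard let raw = wire.popFirst(), let number = raw as? Int32 else { return nil }
        self = number
    }
}

extension Double: WireDecodable {
    init?(wire: inout ArraySlice<Any>) {
        guard let raw = wire.popFirst(), let number = raw as? Double else { return nil }
        self = number
    }
}

// Hypothetical Swift stand-ins for Spark's tuple types.
struct Tuple2<A: WireDecodable, B: WireDecodable>: WireDecodable {
    let _1: A
    let _2: B
    init?(wire: inout ArraySlice<Any>) {
        guard let a = A(wire: &wire), let b = B(wire: &wire) else { return nil }
        _1 = a
        _2 = b
    }
}

struct Tuple3<A: WireDecodable, B: WireDecodable, C: WireDecodable>: WireDecodable {
    let _1: A
    let _2: B
    let _3: C
    init?(wire: inout ArraySlice<Any>) {
        guard let a = A(wire: &wire), let b = B(wire: &wire), let c = C(wire: &wire) else { return nil }
        _1 = a
        _2 = b
        _3 = c
    }
}

// Hypothetical registry: maps the string form of a type back to its metatype.
struct TypeRegistry {
    var types: [String: WireDecodable.Type] = [:]

    mutating func register<T: WireDecodable>(_ metatype: T.Type) {
        types[String(reflecting: metatype)] = metatype
    }

    // Returns nil when the name was never registered or the wire values do not match.
    func instantiate(typeNamed name: String, from wire: inout ArraySlice<Any>) -> WireDecodable? {
        guard let metatype = types[name] else { return nil }
        return metatype.init(wire: &wire)
    }
}

typealias Row = Tuple2<Int32, Tuple3<Int32, Int32, Double>>

var registry = TypeRegistry()
registry.register(Row.self)

// The name is all the worker receives; the registry turns it back into a value.
let typeName = String(reflecting: Row.self)
var wire: ArraySlice<Any> = [Int32(1), Int32(2), Int32(3), 4.5]
let value = registry.instantiate(typeNamed: typeName, from: &wire)
print(String(describing: value))
```

The nested generics compose through the protocol, so `Tuple2` never needs to know which concrete types it holds. The cost is that every concrete combination still has to be named once at compile time, which is exactly the enumeration the question above is trying to avoid.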

User-Defined Functions

A programmer would define functions that are executed on a cluster to process data, without having to do any special packaging of those functions. The programmer would write a filter function for the cluster the same way as a filter function for a Swift array. For instance, for a filter method such as the following:

let result = RDD.filter({ (value) -> Bool in
    return value > 15
})

The framework would need to reflect on the function to get the information needed to instantiate and call it on the cluster workers. The following is some of the information needed:

Module name
Class/Struct name
Function name
Parameter names and type information
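For comparison, here is a small sketch of what the standard library exposes for a closure today. It uses only `type(of:)`, `String(describing:)`, `String(reflecting:)`, and `Mirror`; the `predicate` closure is an illustrative stand-in for the filter function above. The function type is available only as a string, and none of the items in the list (module, enclosing type, function name, parameter names) can be read from the value.

```swift
// Illustrative stand-in for the user's filter function.
let predicate = { (value: Int) -> Bool in
    return value > 15
}

// The function type is only available in string form.
let shortName = String(describing: type(of: predicate))
let qualifiedName = String(reflecting: type(of: predicate))
print(shortName)
print(qualifiedName)

// Mirror treats closures as opaque values, so there is nothing to walk here.
let mirror = Mirror(reflecting: predicate)
print(mirror.children.count)
```

Recovering the parameter and return types from `shortName` means parsing that string, and even then the string says nothing about where the closure's code lives.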

Once on the cluster the framework would need to do the following:

  1. Instantiate the parameters. Again, a parameter could be anything from a basic type to a class.
  2. Dynamically load/import the module containing the function.
  3. Find the function in the module that matches the signature.
  4. Call the function.
  5. Handle the return type.
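Steps 2 through 4 can be sketched for the narrow case of a symbol with a C calling convention, using the POSIX `dlopen` and `dlsym` calls available on Linux. This is a sketch under assumptions, not a solution: `loadFunction` and `UnaryInt32` are hypothetical names, and the example looks up the C library's `abs` only because that symbol is always present. Ordinary Swift functions have mangled symbol names, closures have no stable name at all, and exporting a Swift function under a C name relies on the underscored `@_cdecl` attribute, so none of this applies directly to a user's filter closure.

```swift
#if canImport(Glibc)
import Glibc
#elseif canImport(Darwin)
import Darwin
#endif

// Assumed C-ABI signature of the symbol being looked up.
typealias UnaryInt32 = @convention(c) (Int32) -> Int32

// Hypothetical helper: loads a library (or the running program when path is nil)
// and returns the named symbol reinterpreted as a C function pointer.
func loadFunction(named symbol: String, fromLibraryAt path: String?) -> UnaryInt32? {
    let handle: UnsafeMutableRawPointer?
    if let path = path {
        handle = dlopen(path, RTLD_NOW)
    } else {
        // A nil path asks the loader for the already-loaded program image.
        handle = dlopen(nil, RTLD_NOW)
    }
    guard let library = handle, let address = dlsym(library, symbol) else {
        return nil
    }
    // The caller must know the real signature; nothing checks it at run time.
    return unsafeBitCast(address, to: UnaryInt32.self)
}

if let absolute = loadFunction(named: "abs", fromLibraryAt: nil) {
    print(absolute(-15))
}
```

The unchecked `unsafeBitCast` is the weak point: without reflection metadata the worker has no way to confirm that the symbol it found matches the signature it expects.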

With the existing Swift support for reflection, I couldn't get all of the information that is needed, and what I could get wasn't in a very convenient form. In some cases, I needed to parse a string to get the individual parameter types. Even with that information, I didn't see a way to use it to load the module and execute the function. My plan is to require the programmer to pass, at application startup, the location of the modules and dependencies that need to be deployed to the cluster workers. Given the limitations of reflection in Swift, I don't see how this framework could be implemented. Since this needs to run on Linux, I want to avoid any solution that uses Objective-C.

Thanks

Bob

Robert Goodman

If you’re going to build on top of something like Spark it seems you’d have better luck wrapping the JNI and using Swift protocols to try to automate away as much of the boilerplate of creating JNI classes dynamically.
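As a rough illustration of the kind of boilerplate Swift protocols could absorb here, the sketch below derives JNI type signatures from Swift types. `JNIRepresentable` and `jniMethodSignature` are hypothetical names, not part of any existing JNI wrapper; the signature letters themselves (`I` for int, `D` for double, `Z` for boolean, a leading `[` for arrays) come from the JNI specification.

```swift
// Hypothetical protocol: a Swift type that knows its JNI type signature.
protocol JNIRepresentable {
    static var jniSignature: String { get }
}

extension Int32: JNIRepresentable {
    static var jniSignature: String { return "I" }
}

extension Double: JNIRepresentable {
    static var jniSignature: String { return "D" }
}

extension Bool: JNIRepresentable {
    static var jniSignature: String { return "Z" }
}

// Arrays compose: a JNI array signature is "[" followed by the element signature.
extension Array: JNIRepresentable where Element: JNIRepresentable {
    static var jniSignature: String { return "[" + Element.jniSignature }
}

// Builds a JNI method signature of the form "(arguments)return".
func jniMethodSignature(arguments: [JNIRepresentable.Type],
                        returns: JNIRepresentable.Type) -> String {
    let argumentPart = arguments.map { $0.jniSignature }.joined()
    return "(" + argumentPart + ")" + returns.jniSignature
}

print(jniMethodSignature(arguments: [Int32.self, [Double].self], returns: Bool.self))
// prints "(I[D)Z"
```

This covers only signature strings; looking up classes and invoking methods would still go through the JNI C API.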

~Robert Widmann

On Oct 18, 2016, at 9:40 AM, Robert Goodman via swift-evolution <swift-evolution@swift.org> wrote:


In my experiments, I did use JNI with Swift protocols to solve some of the problems. I also tried some of the techniques that Spark uses to support Python and R. Without Swift reflection, I haven't found a way to solve the remaining problems and still provide an acceptable programming model. For instance, the goal is that a developer writes a filter function just as they would for a Swift Array:

let result = RDD.filter({ (value) -> Bool in
    return value > 15
})

I haven't been able to find a solution to that problem without having reflection.

Thanks

Bob

Robert Goodman

----- Original message -----
From: Robert Widmann devteam.codafi@gmail.com
To: Robert Goodman/Austin/IBM@IBMUS
Cc: swift-evolution@swift.org
Subject: Re: [swift-evolution] Swift Reflection
Date: Sat, Oct 22, 2016 9:23 PM
