Adding a new filter method which returns 2 arrays


(Arman Shan) #1

Hi,

I was wondering if we could add a new filter method to the stdlib, where
instead of returning just one array, it returns a tuple of 2 arrays, one
with items that the closure returned true for and one with items that the
closure returned false for. What do you think?

Thank you,

Arman


(Nate Birkholz) #2

That seems like a solid idea.

···

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution < swift-evolution@swift.org> wrote:

Hi,

I was wondering if we could add a new filter method to the stdlib, where
instead of returning just one array, it returns a tuple of 2 arrays, one
with items that the closure returned true for and one with items that the
closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz


(Robert Widmann) #3

If we're going to add a partition method, can the original filter be rewritten in terms of it?

~Robert Widmann

2016/01/15 12:07、Nate Birkholz via swift-evolution <swift-evolution@swift.org> のメッセージ:

···

That seems like a solid idea.

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution <swift-evolution@swift.org> wrote:
Hi,

I was wondering if we could add a new filter method to the stdlib, where instead of returning just one array, it returns a tuple of 2 arrays, one with items that the closure returned true for and one with items that the closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


New 'partioned(by:)' method
(Tino) #4

I would prefer a more general approach:
Return value could be a dictionary whose values are arrays, and the supplied function could by anything that maps target contents to possible keys (Bool would be possible, but also attributes like "lastName", "age"…).
But it isn't that hard to write this on your own, so I'm not sure if it's fundamental enough for stdlib...

Tino


(Arman Shan) #5

Yes. I was thinking we would just copy over the filter implementation, and it should only need a small modification to keep track of the items being filtered, and finally return 2 arrays.

···

On Jan 15, 2016, at 12:11 PM, Developer <devteam.codafi@gmail.com> wrote:

If we're going to add a partition method, can the original filter be rewritten in terms of it?

~Robert Widmann

2016/01/15 12:07、Nate Birkholz via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> のメッセージ:

That seems like a solid idea.

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Hi,

I was wondering if we could add a new filter method to the stdlib, where instead of returning just one array, it returns a tuple of 2 arrays, one with items that the closure returned true for and one with items that the closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Arman Shan) #6

The dictionary return will be a more complicated version of the partition method. The reason I thought the partition method would be a trivial addition to the stdlib is because filter pretty much does all that work; it just doesn’t return the “other” items. so this function would be sort of a “filter2”.

···

On Jan 16, 2016, at 7:46 AM, Tino Heth <2th@gmx.de> wrote:

I would prefer a more general approach:
Return value could be a dictionary whose values are arrays, and the supplied function could by anything that maps target contents to possible keys (Bool would be possible, but also attributes like "lastName", "age"…).
But it isn't that hard to write this on your own, so I'm not sure if it's fundamental enough for stdlib...

Tino


(David Sweeris) #7

I’m not sure if it’s currently done this way… In theory, filter can allocate enough contiguous memory to store all the results up front, skip all the overhead of expanding an array, and if it doesn’t all get used, well not caring (as much) about memory fragmentation is (one reason) why we have a 64-bit address space. This would be harder to do (or necessarily double the storage requirements) if filter could return two arrays, either of which could potentially contain as many elements as the original array.

Which is not to say I’m against having this because it does sound handy, I just don’t want it instead of the current filter function.

- Dave Sweeris

···

On Jan 15, 2016, at 09:59, Arman Shan via swift-evolution <swift-evolution@swift.org> wrote:

Yes. I was thinking we would just copy over the filter implementation, and it should only need a small modification to keep track of the items being filtered, and finally return 2 arrays.

On Jan 15, 2016, at 12:11 PM, Developer <devteam.codafi@gmail.com <mailto:devteam.codafi@gmail.com>> wrote:

If we're going to add a partition method, can the original filter be rewritten in terms of it?

~Robert Widmann

2016/01/15 12:07、Nate Birkholz via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> のメッセージ:

That seems like a solid idea.

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Hi,

I was wondering if we could add a new filter method to the stdlib, where instead of returning just one array, it returns a tuple of 2 arrays, one with items that the closure returned true for and one with items that the closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(David Sweeris) #8

That only works if the “infrastructure” storage doesn’t have to be in a specific location relative to the element storage. If so, then yes, you’re right. The elements of the one built from the end would be in reverse order, but that’s an entirely solvable problem, IMHO, either through a post-filtering reverse, or just by making the behavior clear in the API notes.

Or we could just not care… I’m not even sure this type of optimization makes a difference these days, with CPUs so often being instruction starved (although embedded and lower-end CPUs like the Core-M might not have as many cycles to spare, so maybe it does matter).

···

On Jan 15, 2016, at 13:08, Jaden Geller wrote:

The total space necessary for both arrays will be exactly that of the original array (plus whatever constant space details need be stored about the array). Thus, the filter implementation could just move items to the front or the back of this contiguous storage space, and then return arrays backed by this storage. I don't necessarily think this level of optimization is necessary here, but it's definitely possible.

Jaden Geller
Sent from my iPhone

On Jan 15, 2016, at 11:22 AM, Dave via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

I’m not sure if it’s currently done this way… In theory, filter can allocate enough contiguous memory to store all the results up front, skip all the overhead of expanding an array, and if it doesn’t all get used, well not caring (as much) about memory fragmentation is (one reason) why we have a 64-bit address space. This would be harder to do (or necessarily double the storage requirements) if filter could return two arrays, either of which could potentially contain as many elements as the original array.

Which is not to say I’m against having this because it does sound handy, I just don’t want it instead of the current filter function.

- Dave Sweeris

On Jan 15, 2016, at 09:59, Arman Shan via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Yes. I was thinking we would just copy over the filter implementation, and it should only need a small modification to keep track of the items being filtered, and finally return 2 arrays.

On Jan 15, 2016, at 12:11 PM, Developer <devteam.codafi@gmail.com <mailto:devteam.codafi@gmail.com>> wrote:

If we're going to add a partition method, can the original filter be rewritten in terms of it?

~Robert Widmann

2016/01/15 12:07、Nate Birkholz via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> のメッセージ:

That seems like a solid idea.

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Hi,

I was wondering if we could add a new filter method to the stdlib, where instead of returning just one array, it returns a tuple of 2 arrays, one with items that the closure returned true for and one with items that the closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Arman Shan) #9

I definitely agree that this function should not replace the current filter function. We can give it a new name like "partition" or something.

Regarding the memory constraint: since both filter and partition methods will be provided, the developer may choose whether they prefer performance over memory fragmentation or not.

Or maybe at the end of the partition method, if one of the result arrays is significantly smaller than the other, we copy the data into a smaller array and then return.

···

On Jan 15, 2016, at 2:22 PM, Dave via swift-evolution <swift-evolution@swift.org> wrote:

I’m not sure if it’s currently done this way… In theory, filter can allocate enough contiguous memory to store all the results up front, skip all the overhead of expanding an array, and if it doesn’t all get used, well not caring (as much) about memory fragmentation is (one reason) why we have a 64-bit address space. This would be harder to do (or necessarily double the storage requirements) if filter could return two arrays, either of which could potentially contain as many elements as the original array.

Which is not to say I’m against having this because it does sound handy, I just don’t want it instead of the current filter function.

- Dave Sweeris

On Jan 15, 2016, at 09:59, Arman Shan via swift-evolution <swift-evolution@swift.org> wrote:

Yes. I was thinking we would just copy over the filter implementation, and it should only need a small modification to keep track of the items being filtered, and finally return 2 arrays.

On Jan 15, 2016, at 12:11 PM, Developer <devteam.codafi@gmail.com> wrote:

If we're going to add a partition method, can the original filter be rewritten in terms of it?

~Robert Widmann

2016/01/15 12:07、Nate Birkholz via swift-evolution <swift-evolution@swift.org> のメッセージ:

That seems like a solid idea.

On Fri, Jan 15, 2016 at 8:02 AM, Arman Shanjani via swift-evolution <swift-evolution@swift.org> wrote:
Hi,

I was wondering if we could add a new filter method to the stdlib, where instead of returning just one array, it returns a tuple of 2 arrays, one with items that the closure returned true for and one with items that the closure returned false for. What do you think?

Thank you,

Arman

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
Nate Birkholz
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Jeremy Pereira) #10

The dictionary return will be a more complicated version of the partition method. The reason I thought the partition method would be a trivial addition to the stdlib is because filter pretty much does all that work; it just doesn’t return the “other” items. so this function would be sort of a “filter2”.

I think the point of the dictionary return is that it gives you a generalised version of partition (why would you only ever want to partition a set into only two subsets?). The naive implementation is not *very* complicated. I just wrote it in about five minutes. It lets you do some interesting things, but whether it is worth having in the standard library is for others to decide.

···

On 17 Jan 2016, at 05:55, Arman Shan via swift-evolution <swift-evolution@swift.org> wrote:

On Jan 16, 2016, at 7:46 AM, Tino Heth <2th@gmx.de> wrote:

I would prefer a more general approach:
Return value could be a dictionary whose values are arrays, and the supplied function could by anything that maps target contents to possible keys (Bool would be possible, but also attributes like "lastName", "age"…).
But it isn't that hard to write this on your own, so I'm not sure if it's fundamental enough for stdlib...

Tino

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(David Sweeris) #11

I definitely agree that this function should not replace the current filter function. We can give it a new name like "partition" or something.

+1 on “partition”. That makes the semantics much clearer, IMHO.

Regarding the memory constraint: since both filter and partition methods will be provided, the developer may choose whether they prefer performance over memory fragmentation or not.

Or maybe at the end of the partition method, if one of the result arrays is significantly smaller than the other, we copy the data into a smaller array and then return.

Oh, yeah… I was thinking the excess storage would be deallocated at the end anyway. Honestly, this whole bit about super-optimizing the memory allocation assumes that the actual element storage is managed like C arrays (with a pointer to [0], and using pointer math to get the other elements) behind the scenes, and I don’t actually know that it’s done that way.

- Dave Sweeris

···

On Jan 15, 2016, at 11:49, Arman Shan <armanshan12@gmail.com> wrote:


(Arman Shan) #12

We could do it the easy way and just allocate 2 arrays of original size and return them.

We can also do the optimized way which isn't so user-friendly: make a new structure that contains an array and an index internally. The array will be the original array just "sorted" based on the predicate function. And the index is for where the second half starts. The public API would show 2 arrays somehow and gives the illusion that there are 2 arrays.

- Arman

···

On Jan 15, 2016, at 5:37 PM, Dave via swift-evolution <swift-evolution@swift.org> wrote:

On Jan 15, 2016, at 11:49, Arman Shan <armanshan12@gmail.com> wrote:

I definitely agree that this function should not replace the current filter function. We can give it a new name like "partition" or something.

+1 on “partition”. That makes the semantics much clearer, IMHO.

Regarding the memory constraint: since both filter and partition methods will be provided, the developer may choose whether they prefer performance over memory fragmentation or not.

Or maybe at the end of the partition method, if one of the result arrays is significantly smaller than the other, we copy the data into a smaller array and then return.

Oh, yeah… I was thinking the excess storage would be deallocated at the end anyway. Honestly, this whole bit about super-optimizing the memory allocation assumes that the actual element storage is managed like C arrays (with a pointer to [0], and using pointer math to get the other elements) behind the scenes, and I don’t actually know that it’s done that way.

- Dave Sweeris

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Brent Royal-Gordon) #13

We can also do the optimized way which isn't so user-friendly: make a new structure that contains an array and an index internally. The array will be the original array just "sorted" based on the predicate function. And the index is for where the second half starts. The public API would show 2 arrays somehow and gives the illusion that there are 2 arrays.

The problem is, you would still need some kind of array to tell you which elements fell into each partition. So I don't think that helps us very much.

···

--
Brent Royal-Gordon
Architechies


(David Sweeris) #14

Yeah… If the “optimized” route isn’t available, I don’t think trying to fake it is worth adding complexity to *Arrays*. If this were some heavy-weight object where it already has tons of overhead, then maybe, but not something as relatively basic and low-level as an Array.

- Dave Sweeris

···

On Jan 15, 2016, at 19:03, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

We can also do the optimized way which isn't so user-friendly: make a new structure that contains an array and an index internally. The array will be the original array just "sorted" based on the predicate function. And the index is for where the second half starts. The public API would show 2 arrays somehow and gives the illusion that there are 2 arrays.

The problem is, you would still need some kind of array to tell you which elements fell into each partition. So I don't think that helps us very much.

--
Brent Royal-Gordon
Architechies

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution