Pitch: Method to sum numeric arrays

I’m not sure if you’re trying to expand the pitch, or use a slippery-slope argument in opposition to it, or what.

Regardless, average doesn’t work for integer types (at least, the return type would have to be different from the element type), and I’m not sure product is used anywhere near as often as sum.

I might say that a hypothetical Math library, one which sits alongside the standard library but is only available if you write import Math, could accommodate product, average, variance, and the like.

But sum is probably the only one that occurs frequently enough in general code to be worth considering for the standard library.

There would need to be a stronger motivation than this. Unlike min, there is a natural answer to summing an empty sequence: zero. If nearly all uses of an optional-returning sum would be written sum() ?? 0, that undermines the already-slim readability benefit. For those (I suspect extremely rare) cases where you want to distinguish the empty sequence from a non-empty sequence that sums to zero, there are other alternatives that don't force the optionality on all users of the method.
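As a concrete sketch of the non-optional spelling being discussed (the extension and its name are illustrative, not part of any accepted proposal):

```swift
extension Sequence where Element: AdditiveArithmetic {
    /// Hypothetical spelling: a non-optional sum that returns .zero
    /// for an empty sequence, unlike min()/max(), which return nil.
    func sum() -> Element {
        reduce(.zero, +)
    }
}

// [1, 2, 3].sum()  // 6
// [Int]().sum()    // 0, nothing to unwrap
```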

7 Likes

Nope, just genuinely inquiring about any future directions we might have in mind, and how we can preempt them in our design.

1 Like

My concern is that once we bake *some* behavior into sum over FloatingPoint, it would probably be unacceptable to change the result later even to make it more accurate.

So, we need to decide up-front if we want accurate rounding, or compensated summation, or block accumulation, or naively calling reduce, or what.
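For concreteness, here is what the compensated-summation option from that list might look like; this is a sketch of classic Kahan summation, not anything any implementation has settled on:

```swift
extension Sequence where Element: FloatingPoint {
    /// Compensated (Kahan) summation: tracks the low-order bits
    /// lost at each addition in a separate compensation term.
    func compensatedSum() -> Element {
        var sum = Element.zero
        var compensation = Element.zero  // running error term
        for value in self {
            let y = value - compensation
            let t = sum + y
            compensation = (t - sum) - y  // bits of y lost when forming t
            sum = t
        }
        return sum
    }
}
```

The accumulator and compensation term together carry roughly twice the working precision, which is why this result could differ from a naive reduce(0, +), and why the choice has to be made up front.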

The .NET implementation in C# that inspired this pitch is really simple, just a foreach (take a look at https://github.com/Microsoft/referencesource/blob/master/System.Core/System/Linq/Enumerable.cs):

public static float Sum(this IEnumerable<float> source) {
    // Error.ArgumentNull is an internal .NET helper that builds an ArgumentNullException
    if (source == null) throw Error.ArgumentNull("source");
    double sum = 0;                       // note: accumulates in double precision
    foreach (float v in source) sum += v;
    return (float)sum;                    // single rounding back to float at the end
}
1 Like

Note that it uses double precision for the sum variable when summing single precision floats though.

1 Like

The question is what does it do with doubles?

All the other types (int, long, double, decimal) use themselves as the accumulator in the .NET implementation. Only float is the exception, using double as the accumulator and then casting the result back to float.

+1 from me. I encountered a .reduce(0, +) instance as late as today, and tried to replace it with .sum(). I was a bit surprised that it wasn't available; I work a lot with inherited code, and it must have been part of at least two other projects I've worked on, giving me the impression that it was already part of the standard library. Things which are implemented repeatedly as extensions probably deserve uplifting into the standard library. As simple as these things may be, the fact that I can rely on their availability and a uniform naming and syntax makes life as a consultant so much easier.

I think if sum over numeric arrays is added with this pitch, then product over numeric arrays should also be added within the same proposal, as both sums over indices and products over indices appear in math.

It would be inconsistent and surprising if summing over arrays does work, but multiplying over arrays does not.

Why is this better?

3 Likes

map uses underestimatedCount to reserve space for the result, so that we don't need to re-allocate as many times while constructing the result array (or at all, if the estimate is accurate). In the case of sum, there is no dynamically-sized result to avoid resizing, so there's no conceivable benefit from doing this.

I'm also a bit confused that you agree that it should mirror max/min, but your sketch explicitly varies from max/min even more dramatically than the pitch does (not only is the result non-optional, but you've made it a property instead of a function).

Thanks for the clarification. I was under the impression that there is also some inherent benefit to using for _ in instead of while let.

iterator.next() and the accumulator are the loop-carried dependencies no matter how you spell the loop. It is plausible that there would be small codegen differences, but they'll be swamped by what happens in the .next() calls (or the accumulation, for types that don't have simple addition) in almost all scenarios.

With everyone discussing the implementation specifics now, does it mean a "naive" implementation (equivalent to reduce(0,+)) would be OK?

Unfortunately I don’t think so. If we introduce the simple reduce version for AdditiveArithmetic, it will apply to all types thereunto conforming, which includes FloatingPoint types.

If we subsequently decide it would be better to change the implementation for FloatingPoint, that would alter the behavior of existing (at that future time) code. Perhaps it might be acceptable to document that the behavior for FloatingPoint is not guaranteed to be correct, and may change in the future, but that seems undesirable.
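One way to leave that door open would be to pair the generic entry point with a more-constrained overload that the compiler prefers for floating-point elements, so a later change to the floating-point strategy wouldn't touch integer behavior. A purely illustrative sketch:

```swift
extension Sequence where Element: AdditiveArithmetic {
    /// Generic entry point: plain left-to-right accumulation.
    func sum() -> Element { reduce(.zero, +) }
}

extension Sequence where Element: FloatingPoint {
    /// More-constrained overload, chosen at compile time for
    /// floating-point elements. Its strategy could be swapped out
    /// later (compensated summation, etc.) without affecting integers.
    func sum() -> Element { reduce(.zero, +) }  // placeholder strategy
}
```

Note the caveat that overload resolution here is static: generic code constrained only to AdditiveArithmetic would still get the generic version even for Double, so this narrows but does not eliminate the compatibility concern.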

To move this pitch forward I think it is time to ask: What are the requirements for the implementation?

As mentioned earlier, .NET Core just has a special case for float, summing it as double and then converting the result back to float; all other numeric types are summed in a straightforward way.

Sorry to bump this thread, but I think the while loop is 8X faster than calling reduce or running a for-in loop.

I don't believe my own benchmarking. Please check me.

Could it be that you benchmarked a debug build? If so, that's probably the explanation. See also the following topic:
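For reference, these are the usual ways to make sure a benchmark runs with optimizations enabled (assuming either a SwiftPM package or a single file):

```shell
# SwiftPM: build and run the release configuration, not the default debug one
swift build -c release
swift run -c release

# Single file: pass -O to the compiler explicitly
swiftc -O Benchmark.swift -o benchmark && ./benchmark
```

Debug builds skip most optimizations and add overhead (e.g. extra checks and unspecialized generics), which routinely distorts relative timings of reduce vs. hand-written loops.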

2 Likes

Ha! Yes. Thank you.

I'm tempted to withdraw my post, but will leave it for posterity.

2 Likes