On Mon, Oct 10, 2016 at 4:02 PM, Nevin Brackett-Rozinsky <nevin.brackettrozinsky@gmail.com> wrote:
I rolled my own (rather simple) statistics struct. It had been using
Double and Array, but I just went back and made it work generically with
FloatingPoint and Sequence. Here’s what it looks like:
    struct Statistic<Number: FloatingPoint> {
        // Running sum of squared deviations from the mean.
        private var ssqDev: Number = 0
        private(set) var count: Number = 0
        private(set) var average: Number = 0
        // The extremes start at opposite infinities so that the first value replaces both.
        private(set) var maximum: Number = -Number.infinity
        private(set) var minimum: Number = Number.infinity

        // Sample variance; only meaningful once at least two values have been added.
        var variance: Number { return ssqDev / (count - 1) }
        var standardDeviation: Number { return sqrt(variance) }

        init() {}

        init<T: Sequence> (values: T) where T.Iterator.Element == Number {
            addValues(values)
        }

        mutating func addValues<T: Sequence> (_ vals: T) where
            T.Iterator.Element == Number {
            for val in vals { addValue(val) }
        }

        mutating func addValue(_ value: Number) {
            count += 1 as Number
            let diff = value - average
            let frac = diff / count
            average += frac
            ssqDev += diff * (diff - frac)
            minimum = min(minimum, value)
            maximum = max(maximum, value)
        }
    }
(Sorry for the lack of syntax highlighting—Gmail strips the formatting
when I paste it.)
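One way to sanity-check the incremental update in “addValue” is to compare it with a direct two-pass computation. The following is only a sketch: it uses Double instead of a generic type to stay short, and the function names are hypothetical, not part of the struct above.

```swift
// Hypothetical helper names; Double is used only to keep the sketch short.

/// One-pass sum of squared deviations, using the same update as addValue.
func onePassSumOfSquares(_ values: [Double]) -> Double {
    var count = 0.0
    var average = 0.0
    var ssqDev = 0.0
    for value in values {
        count += 1
        let diff = value - average      // deviation from the old mean
        let frac = diff / count
        average += frac                 // average is now the new mean
        ssqDev += diff * (diff - frac)  // diff - frac equals value - new mean
    }
    return ssqDev
}

/// Two-pass reference: compute the mean first, then sum the squared deviations.
func twoPassSumOfSquares(_ values: [Double]) -> Double {
    guard !values.isEmpty else { return 0 }
    let mean = values.reduce(0.0, +) / Double(values.count)
    return values.reduce(0.0) { $0 + ($1 - mean) * ($1 - mean) }
}

let sample: [Double] = [2, 4, 4, 4, 5, 5, 7, 9]
let onePass = onePassSumOfSquares(sample)
let twoPass = twoPassSumOfSquares(sample)
// The mean of the sample is 5, so both results should be close to 32.
print(onePass, twoPass)
```

The two results should agree up to floating-point rounding; dividing either by (count - 1) gives the sample variance.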
Some notes:
• The approach is to look at each data point once and keep the statistics
correct for the numbers seen so far. This saves memory if the values are
being computed or fetched, since you don’t need to store them. However it
also means that the median cannot be found.
• The calculation to update “average” and “ssqDev” is simplified from the
online algorithm described in “Algorithms for calculating variance” on
Wikipedia. (“ssqDev” stores the sum of squared deviations from the mean,
which is just the sample variance times (count - 1).)
• If you want to ignore NaNs, just add “if value.isNaN { return }” at the
top of “addValue”.
• The “as Number” coercion shouldn’t be necessary, but I was getting an
“ambiguous use of +=” error without it.
• All occurrences of “count” were originally “n”, which was private, and I
had a computed “count” that just returned Int(n). But when I switched from
“Double” to “Number: FloatingPoint” I lost the ability to write “Int(n)”.
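A possible way around the last point, sketched under assumptions: keep “count” as an Int and convert it to “Number” where arithmetic needs it, relying on the integer initializer that FloatingPoint requires. The type name RunningStatistic is hypothetical, the minimum/maximum tracking is left out for brevity, and the NaN check from the earlier note is included.

```swift
// Hypothetical variant of the struct above; the name RunningStatistic is not from the thread.
struct RunningStatistic<Number: FloatingPoint> {
    private var ssqDev: Number = 0
    private(set) var count: Int = 0
    private(set) var average: Number = 0

    init() {}

    /// Sample variance; only meaningful once at least two values have been added.
    var variance: Number { return ssqDev / Number(count - 1) }

    mutating func addValue(_ value: Number) {
        if value.isNaN { return }        // optional: ignore NaNs
        count += 1
        let diff = value - average
        let frac = diff / Number(count)  // FloatingPoint types can be built from an Int
        average += frac
        ssqDev += diff * (diff - frac)
    }
}

var stat = RunningStatistic<Double>()
let inputs: [Double] = [1.0, .nan, 2.0, 3.0]
for value in inputs { stat.addValue(value) }
print(stat.count, stat.average, stat.variance)  // prints "3 2.0 1.0"
```

With an Int counter, “count += 1” is plain integer arithmetic, so the “as Number” coercion should not be needed there either.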
Nevin
On Mon, Oct 10, 2016 at 1:13 PM, Harlan Haskins via swift-users <swift-users@swift.org> wrote:
Oh yeah, I'd love contributions and feedback! I'm essentially
implementing this as I learn things in stats 101 so it's probably woefully
inadequate. 
-- Harlan
On Oct 10, 2016, at 1:04 PM, Michael Ilseman <milseman@apple.com> wrote:
On Oct 8, 2016, at 11:29 AM, Georgios Moschovitis via swift-users <swift-users@swift.org> wrote:
Hey everyone,
I would like to implement a few statistics functions in Swift (e.g.
variance, standardDeviation, etc) that are computed over a collection.
I am aware of this library:
GitHub - evgenyneu/SigmaSwiftStatistics: A collection of functions for statistical calculation written in Swift.
My problem is that it only supports Doubles and Arrays. Also the API
doesn’t look very ‘swifty’ to me.
You might find this library to be more Swifty: harlanhaskins/Probably
(GitHub)
It’s not as generic as possible nor has all the features you might need,
but the author is very responsive to feedback.
I am wondering how someone would implement such functionality in a more
generic way: to allow usage of multiple collections (even custom, e.g. a
RingBuffer) and multiple value types (e.g. Decimal, Double). Extra points
for being 'swifty'.
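One possible direction, as a sketch only: put the statistics in a constrained extension on Sequence, so that any sequence of floating-point values (Array, ArraySlice, Set, or a custom RingBuffer that conforms to Sequence) gets them for free. The method name meanAndVariance is hypothetical, and the spelling “Element” assumes a recent Swift; Swift 3 would need “Iterator.Element”. As far as I know, Foundation’s Decimal does not conform to FloatingPoint, so it would need a different constraint.

```swift
// Hypothetical API; the method name is illustrative only.
extension Sequence where Element: FloatingPoint {
    /// Mean and sample variance computed in a single pass.
    /// Returns nil for an empty sequence; the variance is NaN for a single element.
    func meanAndVariance() -> (mean: Element, variance: Element)? {
        var count: Element = 0
        var mean: Element = 0
        var ssqDev: Element = 0
        for value in self {
            count += 1
            let diff = value - mean          // deviation from the old mean
            mean += diff / count
            ssqDev += diff * (value - mean)  // uses the updated mean
        }
        guard count > 0 else { return nil }
        return (mean, ssqDev / (count - 1))
    }
}

// The same method works across different collections and element types.
let fromArray = [1.0, 2.0, 3.0, 4.0].meanAndVariance()
let fromSlice = [1.0, 2.0, 3.0, 4.0].dropFirst().meanAndVariance()
let floats: Set<Float> = [2, 4]
let fromSet = floats.meanAndVariance()
```

Because the constraint is on the element type rather than on a concrete collection, nothing here depends on Array or Double.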
Thanks in advance for any ideas.
-g.
_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users