I've ported the descriptive, summary and frequency statistics from scipy.stats to Gleam, in a package called Gelman.
This implementation strives to be:
- 🥚 Simple to use, with a consistent interface. Every dataset is a `List(Float)`, so the user needs to convert `Int`s accordingly!
- ⭐ Gleam-y. Functions can be chained together in pipelines, or their results used in other computations.
- 🌟 Purely functional in pure Gleam. The `fold` combinator is used extensively. `map(g)` followed by `fold(f)` is rewritten as `fold(f(acc, g(a)))` to avoid the memory overhead of intermediate lists.
- 🦦 Efficient. Gelman endeavors to make a single pass over the data, even for higher-order moments like skewness and kurtosis. If sorting is required, Gelman sorts only once, unless absolutely necessary.
- 🧪 Extensively tested. Tests are borrowed from scipy/stats, so the results are checked against the same reference values that SciPy itself is tested against.
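
To make the `map`-then-`fold` rewrite concrete, here is a minimal sketch that uses only the Gleam standard library (`gleam/list` and `gleam/int`). The function names `sum_of_squares_two_pass` and `sum_of_squares_fused` are made up for illustration and are not part of Gelman's API; the sketch only shows the general idea, not how Gelman implements it.

```gleam
import gleam/int
import gleam/list

// Hypothetical helper, not part of Gelman: map then fold.
// list.map builds an intermediate list before the fold runs.
pub fn sum_of_squares_two_pass(xs: List(Float)) -> Float {
  xs
  |> list.map(fn(x) { x *. x })
  |> list.fold(0.0, fn(acc, x) { acc +. x })
}

// Hypothetical helper, not part of Gelman: the squaring step is moved
// inside the fold, so the data is traversed once and no intermediate
// list is allocated.
pub fn sum_of_squares_fused(xs: List(Float)) -> Float {
  list.fold(xs, 0.0, fn(acc, x) { acc +. x *. x })
}

pub fn main() {
  // Datasets are List(Float), so Ints are converted up front.
  let data = list.map([1, 2, 3], int.to_float)
  sum_of_squares_fused(data)
}
```

Both functions return the same value for the same input; the fused version simply skips the extra list.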
Please have a look here: https://hex.pm/packages/gelman
Gelman is named after Andrew Gelman, whose contributions to Bayesian statistics are fundamental, and whose name is a happy near-anagram of Gleam.
If you'd like me to implement other statistical functions, let me know! I plan on implementing the remaining descriptive, summary and frequency statistics in scipy/stats, and then I'd like to start writing some statistical tests. I may reach out if I hit a roadblock, as many mathematical functions (such as C's special-functions) aren't available in Gleam yet. Please let me know if you are interested in helping out. Your comments are welcome. Have a lovely holiday season! 🎅🎅🎅