fastmath.stats

Statistics functions.

Descriptive statistics
Correlation / covariance
Outliers
Confidence intervals
Extents
Effect size
Student’s t-test
Histogram
ACF/PACF
Bootstrap
Binary measures

All functions are backed by Apache Commons Math or SMILE libraries. All work with Clojure sequences.

Descriptive statistics

The all-in-one function stats-map returns a map with the following keys:

:Size - size of the samples, (count ...)
:Min - minimum value
:Max - maximum value
:Range - range of values
:Mean - mean/average
:Median - median, see also: median-3
:Mode - mode, see also: modes
:Q1 - first quartile, use: percentile, quantile
:Q3 - third quartile, use: percentile, quantile
:Total - sum of all samples
:SD - sample standard deviation
:Variance - variance
:MAD - median-absolute-deviation
:SEM - standard error of mean
:LAV - lower adjacent value, use: adjacent-values
:UAV - upper adjacent value, use: adjacent-values
:IQR - interquartile range, (- q3 q1)
:LOF - lower outer fence, (- q1 (* 3.0 iqr))
:UOF - upper outer fence, (+ q3 (* 3.0 iqr))
:LIF - lower inner fence, (- q1 (* 1.5 iqr))
:UIF - upper inner fence, (+ q3 (* 1.5 iqr))
:Outliers - list of outliers, samples which are outside inner fences
:Kurtosis - kurtosis
:Skewness - skewness

Note: percentile and quantile support 10 different interpolation strategies. See docs.
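
The fence keys listed above follow directly from Q1 and Q3. A minimal sketch in plain Clojure, assuming the quartiles are already known; fences-sketch is a hypothetical helper for illustration and not part of fastmath:

```clojure
;; Hypothetical helper, not part of fastmath: derives IQR and the four
;; fences from precomputed quartiles using the formulas listed above.
(defn fences-sketch
  [q1 q3]
  (let [iqr (- q3 q1)]
    {:IQR iqr
     :LIF (- q1 (* 1.5 iqr))
     :UIF (+ q3 (* 1.5 iqr))
     :LOF (- q1 (* 3.0 iqr))
     :UOF (+ q3 (* 3.0 iqr))}))

(fences-sketch -1.0 7.0)
;;=> {:IQR 8.0, :LIF -13.0, :UIF 19.0, :LOF -25.0, :UOF 31.0}
```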

acf

(acf data)
(acf data lags)

Calculate acf (autocorrelation function) for a given number of lags or a list of lags.

If lags is omitted, the function returns values for the maximum possible number of lags.
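
As a rough illustration of what an autocorrelation function computes, the sketch below uses the common estimator (lagged products of the demeaned series divided by the lag-0 sum of squares). It is an assumption that fastmath uses this exact estimator; acf-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn acf-sketch
  "Autocorrelation of xs for lags 0..max-lag."
  [xs max-lag]
  (let [n     (count xs)
        m     (/ (reduce + xs) (double n))
        d     (mapv #(- % m) xs)            ; demeaned series
        denom (reduce + (map * d d))]       ; lag-0 sum of squares
    (for [lag (range (inc max-lag))]
      ;; map stops at the shorter sequence, so only overlapping pairs are used
      (/ (reduce + (map * d (drop lag d))) denom))))

(acf-sketch [1 2 3 4 5 4 3 2 1] 2)
;; approximately (1.0 0.5397 -0.0135)
```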

Examples

Usage

(acf (repeatedly 1000 r/grand) 5)
;;=> (1.0
;;=>  0.0056672021105804715
;;=>  0.02683192836034792
;;=>  0.003505061419148288
;;=>  0.017117838382944242
;;=>  -0.014709355084377094)
(acf (repeatedly 1000 r/grand) [10 20 100 500])
;;=> (0.03608425929253231
;;=>  -0.04862331077397911
;;=>  0.0026191550786753507
;;=>  0.006099538382009882)
(acf [1 2 3 4 5 4 3 2 1])
;;=> (1.0
;;=>  0.5396825396825397
;;=>  -0.013492063492063475
;;=>  -0.4666666666666665
;;=>  -0.6269841269841269
;;=>  -0.3015873015873015
;;=>  -0.011904761904761935
;;=>  0.17777777777777773
;;=>  0.20317460317460315)

acf-ci

(acf-ci data lags)
(acf-ci data lags alpha)

acf with added confidence interval data.

:cis contains the list of calculated confidence intervals for every lag.

Examples

Usage

(acf-ci (repeatedly 1000 r/grand) 3)
;;=> {:acf
;;=>  (1.0 0.001652674595128032 0.030605610528521364 -0.006874164003129496),
;;=>  :ci 0.06197950323045615,
;;=>  :cis (0.06197950323045615
;;=>        0.06197967251690712
;;=>        0.062037701604327464
;;=>        0.06204062757534195)}
(acf-ci [1 2 3 4 5 4 3 2 1] 3)
;;=> {:acf
;;=>  (1.0 0.5396825396825397 -0.013492063492063475 -0.4666666666666665),
;;=>  :ci 0.653321328180018,
;;=>  :cis (0.653321328180018
;;=>        0.8218653739461048
;;=>        0.8219599072345126
;;=>        0.9281841012727746)}

adjacent-values

(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3)

Lower and upper adjacent values (LAV and UAV).

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is (- Q3 Q1).

LAV is the smallest value greater than or equal to LIF = (- Q1 (* 1.5 IQR)).
UAV is the largest value less than or equal to UIF = (+ Q3 (* 1.5 IQR)).
The third returned value is the median of the samples.

Optional estimation-strategy argument can be set to change the estimation type used for quantile calculations. See estimation-strategies-list.
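
The LAV and UAV definitions can be sketched in plain Clojure as follows, assuming Q1 and Q3 are supplied by the caller. adjacent-values-sketch is a hypothetical helper; it returns only the two adjacent values and keeps the input number types.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn adjacent-values-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))   ; lower inner fence
        uif (+ q3 (* 1.5 iqr))]  ; upper inner fence
    [(apply min (filter #(>= % lif) vs))    ; LAV
     (apply max (filter #(<= % uif) vs))])) ; UAV

(adjacent-values-sketch [1 2 3 -1 -1 2 -1 11 111] -1.0 7.0)
;;=> [-1 11]
```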

Examples

[LAV, UAV, median]

(adjacent-values [1 2 3 -1 -1 2 -1 11 111])
;;=> [-1.0 11.0 2.0]

Gaussian distribution [LAV, UAV, median]

(adjacent-values (repeatedly 1000000 r/grand))
;;=> [-2.698318697864716 2.6996062842439654 0.0016124036257504448]

ameasure

(ameasure group1 group2)

Vargha-Delaney A measure for two groups, group1 and group2.

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (ameasure t c))
;;=> 0.20833333333333334

binary-measures

(binary-measures truth prediction)
(binary-measures truth prediction true-value)

Subset of binary measures. See binary-measures-all.

Following keys are returned: [:tp :tn :fp :fn :accuracy :fdr :f-measure :fall-out :precision :recall :sensitivity :specificity :prevalance]

Examples

Usage

(binary-measures [true false true false true false true false]
                 [true false false true false false false true])
;;=> {:accuracy 0.375,
;;=>  :f-measure 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fp 2,
;;=>  :precision 0.3333333333333333,
;;=>  :recall 0.25,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tp 1}

Treat 1 as true value.

(binary-measures [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:accuracy 0.375,
;;=>  :f-measure 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fp 2,
;;=>  :precision 0.3333333333333333,
;;=>  :recall 0.25,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tp 1}

Treat :a and :b as true values.

(binary-measures [:a :b :c :d :e :f :a :b]
                 [:a :b :a :b :a :f :d :b]
                 {:a true, :b true, :c false})
;;=> {:accuracy 0.5,
;;=>  :f-measure 0.6,
;;=>  :fall-out 0.75,
;;=>  :fdr 0.5,
;;=>  :fn 1,
;;=>  :fp 3,
;;=>  :precision 0.5,
;;=>  :recall 0.75,
;;=>  :sensitivity 0.75,
;;=>  :specificity 0.25,
;;=>  :tn 1,
;;=>  :tp 3}

binary-measures-all

(binary-measures-all truth prediction)
(binary-measures-all truth prediction true-value)

Collection of binary measures.

truth - list of ground truth values
prediction - list of predicted values
true-value - optional, what is true in truth and prediction

true-value can be one of:

nil - values are treated as booleans
any sequence - values from the sequence will be treated as true
map - conversion will be done according to the provided map (if there is no corresponding key, the value is treated as false)

https://en.wikipedia.org/wiki/Precision_and_recall
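
The true-value conversion and the basic counts can be sketched in plain Clojure. This is only an illustration under the rules stated above; truthy-fn-sketch and confusion-sketch are hypothetical names, and only a few of the measures are computed.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn truthy-fn-sketch
  "Builds a predicate from true-value: nil, a map, or a sequence."
  [true-value]
  (cond
    (nil? true-value) boolean
    (map? true-value) #(boolean (get true-value % false))
    :else             (let [s (set true-value)] #(contains? s %))))

(defn confusion-sketch
  [truth prediction true-value]
  (let [t?    (truthy-fn-sketch true-value)
        pairs (map (fn [t p] [(t? t) (t? p)]) truth prediction)
        cnt   (fn [pair] (count (filter #(= pair %) pairs)))
        tp    (cnt [true true])
        fp    (cnt [false true])
        fneg  (cnt [true false])
        tn    (cnt [false false])]
    {:tp tp :fp fp :fn fneg :tn tn
     :precision (/ tp (double (+ tp fp)))
     :recall    (/ tp (double (+ tp fneg)))
     :accuracy  (/ (+ tp tn) (double (+ tp fp fneg tn)))}))

(confusion-sketch [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:tp 1, :fp 2, :fn 3, :tn 2,
;;    :precision 0.3333333333333333, :recall 0.25, :accuracy 0.375}
```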

Examples

Usage

(binary-measures-all [true false true false true false true false]
                     [true false false true false false false true])
;;=> {:accuracy 0.375,
;;=>  :bm -0.25,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 0.3333333333333333,
;;=>  :f-beta #,
;;=>  :f-measure 0.28571428571428575,
;;=>  :f1-score 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fnr 0.75,
;;=>  :for 0.6,
;;=>  :fp 2,
;;=>  :fpr 0.5,
;;=>  :hit-rate 0.25,
;;=>  :lr+ 0.5,
;;=>  :lr- 1.5,
;;=>  :mcc -0.2581988897471611,
;;=>  :miss-rate 0.75,
;;=>  :mk -0.2666666666666666,
;;=>  :npv 0.4,
;;=>  :pcn 5.0,
;;=>  :pcp 3.0,
;;=>  :ppv 0.3333333333333333,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :recall 0.25,
;;=>  :selectivity 0.5,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tnr 0.5,
;;=>  :total 8.0,
;;=>  :tp 1,
;;=>  :tpr 0.25}

Treat 1 as true value.

(binary-measures-all [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:accuracy 0.375,
;;=>  :bm -0.25,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 0.3333333333333333,
;;=>  :f-beta #,
;;=>  :f-measure 0.28571428571428575,
;;=>  :f1-score 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fnr 0.75,
;;=>  :for 0.6,
;;=>  :fp 2,
;;=>  :fpr 0.5,
;;=>  :hit-rate 0.25,
;;=>  :lr+ 0.5,
;;=>  :lr- 1.5,
;;=>  :mcc -0.2581988897471611,
;;=>  :miss-rate 0.75,
;;=>  :mk -0.2666666666666666,
;;=>  :npv 0.4,
;;=>  :pcn 5.0,
;;=>  :pcp 3.0,
;;=>  :ppv 0.3333333333333333,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :recall 0.25,
;;=>  :selectivity 0.5,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tnr 0.5,
;;=>  :total 8.0,
;;=>  :tp 1,
;;=>  :tpr 0.25}

Treat :a and :b as true values.

(binary-measures-all [:a :b :c :d :e :f :a :b]
                     [:a :b :a :b :a :f :d :b]
                     {:a true, :b true, :c false})
;;=> {:accuracy 0.5,
;;=>  :bm 0.0,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 1.0,
;;=>  :f-beta #,
;;=>  :f-measure 0.6,
;;=>  :f1-score 0.6,
;;=>  :fall-out 0.75,
;;=>  :fdr 0.5,
;;=>  :fn 1,
;;=>  :fnr 0.25,
;;=>  :for 0.5,
;;=>  :fp 3,
;;=>  :fpr 0.75,
;;=>  :hit-rate 0.75,
;;=>  :lr+ 1.0,
;;=>  :lr- 1.0,
;;=>  :mcc 0.0,
;;=>  :miss-rate 0.25,
;;=>  :mk 0.0,
;;=>  :npv 0.5,
;;=>  :pcn 2.0,
;;=>  :pcp 6.0,
;;=>  :ppv 0.5,
;;=>  :precision 0.5,
;;=>  :prevalence 0.5,
;;=>  :recall 0.75,
;;=>  :selectivity 0.25,
;;=>  :sensitivity 0.75,
;;=>  :specificity 0.25,
;;=>  :tn 1,
;;=>  :tnr 0.25,
;;=>  :total 8.0,
;;=>  :tp 3,
;;=>  :tpr 0.75}

:f-beta is a function of beta. When beta equals 1.0, it returns the f1-score.

(let [fbeta (:f-beta (binary-measures-all
                      [true false true false true false true false]
                      [true false false true false false false true]))]
  [(fbeta 1.0) (fbeta 2.0) (fbeta 0.5)])
;;=> [0.28571428571428575 0.7142857142857144 0.1785714285714286]

bootstrap

(bootstrap vs)
(bootstrap vs samples)
(bootstrap vs samples size)

Generate a set of samples of the given size from the provided data.

samples (the number of generated sequences) defaults to 50; size (the length of each sequence) defaults to 1000.
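
Conceptually, bootstrapping resamples the data with replacement. A minimal sketch in plain Clojure, assuming uniform resampling; bootstrap-sketch is a hypothetical name, and unlike the examples below it keeps the original number types instead of returning doubles:

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn bootstrap-sketch
  "Returns `samples` sequences, each with `size` elements drawn
  uniformly with replacement from vs."
  [vs samples size]
  (let [v (vec vs)]
    (repeatedly samples
                (fn [] (doall (repeatedly size #(rand-nth v)))))))

;; Two resampled sequences of 20 elements each (contents are random).
(bootstrap-sketch [1 2 3 4 1 2 3 1 2 1] 2 20)
```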

Examples

Usage

(bootstrap [1 2 3 4 1 2 3 1 2 1] 2 20)
;;=> ((1.0
;;=>   1.0
;;=>   2.0
;;=>   1.0
;;=>   4.0
;;=>   3.0
;;=>   2.0
;;=>   2.0
;;=>   1.0
;;=>   4.0
;;=>   2.0
;;=>   1.0
;;=>   4.0
;;=>   2.0
;;=>   1.0
;;=>   3.0
;;=>   1.0
;;=>   3.0
;;=>   3.0
;;=>   2.0)
;;=>  (1.0
;;=>   4.0
;;=>   1.0
;;=>   2.0
;;=>   1.0
;;=>   3.0
;;=>   2.0
;;=>   1.0
;;=>   1.0
;;=>   1.0
;;=>   1.0
;;=>   2.0
;;=>   2.0
;;=>   1.0
;;=>   2.0
;;=>   3.0
;;=>   1.0
;;=>   1.0
;;=>   2.0
;;=>   3.0))
(let [data [1 2 3 4 1 2 3 1 2 1]
      fdata (frequencies data)
      bdata (bootstrap data 5 1000)]
  {:source fdata, :bootstrapped (map frequencies bdata)})
;;=> {:bootstrapped ({1.0 413, 2.0 294, 3.0 204, 4.0 89}
;;=>                 {1.0 415, 2.0 285, 3.0 192, 4.0 108}
;;=>                 {1.0 406, 2.0 297, 3.0 189, 4.0 108}
;;=>                 {1.0 419, 2.0 300, 3.0 199, 4.0 82}
;;=>                 {1.0 401, 2.0 322, 3.0 196, 4.0 81}),
;;=>  :source {1 4, 2 3, 3 2, 4 1}}

bootstrap-ci

(bootstrap-ci vs)
(bootstrap-ci vs alpha)
(bootstrap-ci vs alpha samples)
(bootstrap-ci vs alpha samples stat-fn)

Bootstrap method to calculate confidence interval.

alpha defaults to 0.98 and samples to 1000. The last parameter, stat-fn, is the statistical function to measure (default: mean).

Returns a vector of the lower bound, the upper bound, and the value of stat-fn for vs.

Examples

Usage

(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71])
;;=> [-5.796000000000005 17.8 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.8)
;;=> [2.5999999999999996 15.280000000000005 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.8 100000)
;;=> [2.5999999999999996 15.7 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.98 1000 median)
;;=> [-5.0 2.0 1.5]

ci

(ci vs)
(ci vs alpha)

Confidence interval for given data, based on Student’s t-distribution. Alpha value defaults to 0.98.

Returns a vector of the lower bound, the upper bound, and the mean.

Examples

Usage

(ci [-5 1 1 1 1 2 2 5 11 71])
;;=> [-10.759020390886263 28.759020390886263 9.0]
(ci [-5 1 1 1 1 2 2 5 11 71] 0.8)
;;=> [-0.6855907410547175 18.685590741054718 9.0]

cliffs-delta

(cliffs-delta group1 group2)

Cliff’s delta effect size
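
Cliff’s delta compares every pair of values across the two groups. The sketch below uses the textbook definition (pairs where group1 is larger minus pairs where it is smaller, divided by the number of pairs); cliffs-delta-sketch is a hypothetical name, not the library function.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn cliffs-delta-sketch
  [group1 group2]
  (let [signs (for [x group1, y group2]
                (cond (> x y) 1
                      (< x y) -1
                      :else   0))]
    (/ (reduce + signs) (double (count signs)))))

(cliffs-delta-sketch [10 10 20 20 20 30 30 30 40 50]
                     [10 20 30 40 40 50])
;;=> -0.25
```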

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cliffs-delta t c))
;;=> -0.25

cohens-d

(cohens-d group1 group2)

Cohen’s d effect size for two groups

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cohens-d t c))
;;=> -0.42090943320131763

cohens-d-orig

(cohens-d-orig group1 group2)

Original version of Cohen’s d effect size for two groups

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cohens-d-orig t c))
;;=> -0.39372472247513574

correlation

(correlation vs1 vs2)

Correlation of two sequences.

Examples

Correlation of uniform and gaussian distribution samples.

(correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
             (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> 0.004322084824287007

covariance

(covariance vs1 vs2)

Covariance of two sequences.

Examples

Covariance of uniform and gaussian distribution samples.

(covariance (repeatedly 100000 (partial r/grand 1.0 10.0))
            (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> 0.02860827181203021

covariance-matrix

(covariance-matrix vss)

Generate covariance matrix from seq of seqs. Row order.
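
For reference, a covariance matrix in row order can be sketched in plain Clojure as below. The sketch assumes sample covariance (division by n - 1); all names are hypothetical.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn mean-sketch [xs]
  (/ (reduce + xs) (double (count xs))))

(defn covariance-sketch
  "Sample covariance of two equally long sequences (divides by n - 1)."
  [xs ys]
  (let [mx (mean-sketch xs)
        my (mean-sketch ys)]
    (/ (reduce + (map #(* (- %1 mx) (- %2 my)) xs ys))
       (dec (count xs)))))

(defn covariance-matrix-sketch
  "Entry [i j] is the covariance of row i and row j of vss."
  [vss]
  (for [a vss]
    (mapv #(covariance-sketch a %) vss)))

(covariance-matrix-sketch [[1 2 3 4 5 11] [3 2 3 2 3 4]])
;; approximately ([12.667 1.867] [1.867 0.567])
```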

Examples

Usage

(covariance-matrix [[1 2 3 4 5 11] [3 2 3 2 3 4]])
;;=> ([12.666666666666668 1.8666666666666667]
;;=>  [1.8666666666666667 0.5666666666666667])

demean

(demean vs)

Subtract mean from sequence

Examples

Usage

(demean [-5 1 1 1 1 2 2 5 11 71])
;;=> (-14.0 -8.0 -8.0 -8.0 -8.0 -7.0 -7.0 -4.0 2.0 62.0)

estimate-bins

(estimate-bins vs)
(estimate-bins vs bins-or-estimate-method)

Estimate number of bins for histogram.

Possible methods are: :sqrt :sturges :rice :doane :scott :freedman-diaconis (default).
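
Two of these rules depend only on the sample count n. The sketch below uses the textbook formulas for Sturges and Rice; rounding up with ceil is an assumption, and both function names are hypothetical.

```clojure
;; Hypothetical sketch of two textbook rules, not the fastmath implementation.
(defn sturges-bins-sketch
  "Sturges: ceil(log2 n) + 1."
  [n]
  (long (inc (Math/ceil (/ (Math/log (double n)) (Math/log 2.0))))))

(defn rice-bins-sketch
  "Rice: ceil(2 * cbrt n)."
  [n]
  (long (Math/ceil (* 2.0 (Math/cbrt (double n))))))

(sturges-bins-sketch 1000)
;;=> 11
(rice-bins-sketch 10000)
;;=> 44
```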

Examples

Estimate number of bins for various methods. vs contains 1000 random samples from Log-Normal distribution.

(estimate-bins vs :sqrt)
;;=> 31
(estimate-bins vs :sturges)
;;=> 11
(estimate-bins vs :rice)
;;=> 20
(estimate-bins vs :doane)
;;=> 16
(estimate-bins vs :scott)
;;=> 34
(estimate-bins vs :freedman-diaconis)
;;=> 81

estimation-strategies-list

Examples

List of estimation strategies for percentile

(sort (keys estimation-strategies-list))
;;=> (:legacy :r1 :r2 :r3 :r4 :r5 :r6 :r7 :r8 :r9)

extent

(extent vs)

Return extent (min, max, mean) values from sequence

Examples

min/max and mean of gaussian distribution

(extent (repeatedly 100000 r/grand))
;;=> [-4.433795945695126 4.1230995512178925 -0.0030623577039036527]

glass-delta

(glass-delta group1 group2)

Glass’s delta effect size for two groups
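
A minimal sketch of Glass’s delta in plain Clojure, assuming the difference of means is divided by the sample standard deviation of the second group. This assumption reproduces the example value below; glass-delta-sketch and its helpers are hypothetical names.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn- gd-mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn- gd-stddev
  "Sample standard deviation (divides by n - 1)."
  [xs]
  (let [m (gd-mean xs)]
    (Math/sqrt (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
                  (dec (count xs))))))

(defn glass-delta-sketch
  [group1 group2]
  (/ (- (gd-mean group1) (gd-mean group2))
     (gd-stddev group2)))

(glass-delta-sketch [10 10 20 20 20 30 30 30 40 50]
                    [10 20 30 40 40 50])
;; approximately -0.385
```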

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (glass-delta t c))
;;=> -0.3849741916091626

hedges-g

(hedges-g group1 group2)

Hedges’s g effect size for two groups

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (hedges-g t c))
;;=> -0.3907787841275092

hedges-g*

(hedges-g* group1 group2)

Less biased Hedges’s g effect size for two groups

Examples

Usage

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (hedges-g* t c))
;;=> -0.36946357772055416

histogram

(histogram vs)
(histogram vs bins-or-estimate-method)
(histogram vs bins [mn mx])

Calculate histogram.

Returns map with keys:

:size - number of bins
:step - distance between bins
:bins - list of pairs of range lower value and number of hits
:min - min value
:max - max value
:samples - number of used samples

For estimation methods check estimate-bins.
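
The shape of the returned map can be illustrated with a small equal-width binning sketch in plain Clojure. It assumes a fixed number of bins over [min, max] with the maximum counted in the last bin; histogram-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn histogram-sketch
  [vs bins]
  (let [mn     (double (apply min vs))
        mx     (double (apply max vs))
        step   (/ (- mx mn) bins)
        ;; bin index, clamped so that the maximum lands in the last bin
        idx    (fn [v] (min (dec bins) (long (/ (- v mn) step))))
        counts (frequencies (map idx vs))]
    {:size    bins
     :step    step
     :min     mn
     :max     mx
     :samples (count vs)
     :bins    (mapv (fn [i] [(+ mn (* i step)) (get counts i 0)])
                    (range bins))}))

(:bins (histogram-sketch (range 10) 3))
;;=> [[0.0 3] [3.0 3] [6.0 4]]
```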

Examples

3 bins from uniform distribution.

(histogram (repeatedly 1000 rand) 3)
;;=> {:bins ([3.852964808315207E-5 333]
;;=>         [0.3324743943078295 344]
;;=>         [0.6649102589675758 323]),
;;=>  :max 0.9973461236273221,
;;=>  :min 3.852964808315207E-5,
;;=>  :samples 1000,
;;=>  :size 3,
;;=>  :step 0.33243586465974634}

3 bins from uniform distribution for given range.

(histogram (repeatedly 10000 rand) 3 [0.1 0.5])
;;=> {:bins ([0.1 1325] [0.2333333333333334 1305] [0.3666666666666668 1362]),
;;=>  :max 0.5000000000000001,
;;=>  :min 0.1,
;;=>  :samples 3992,
;;=>  :size 3,
;;=>  :step 0.1333333333333334}

5 bins from normal distribution.

(histogram (repeatedly 10000 r/grand) 5)
;;=> {:bins ([-3.6207687808508426 156]
;;=>         [-2.189509070713405 2083]
;;=>         [-0.7582493605759675 5213]
;;=>         [0.6730103495614701 2348]
;;=>         [2.1042700596989077 200]),
;;=>  :max 3.535529769836345,
;;=>  :min -3.6207687808508426,
;;=>  :samples 10000,
;;=>  :size 5,
;;=>  :step 1.4312597101374376}

Estimate number of bins

(:size (histogram (repeatedly 10000 r/grand)))
;;=> 63

Estimate number of bins, Rice rule

(:size (histogram (repeatedly 10000 r/grand) :rice))
;;=> 44

iqr

(iqr vs)
(iqr vs estimation-strategy)

Interquartile range.

Examples

IQR

(iqr (repeatedly 100000 r/grand))
;;=> 1.3440848153319385

jensen-shannon-divergence

(jensen-shannon-divergence vs1 vs2)

Jensen-Shannon divergence of two sequences.

Examples

Jensen-Shannon divergence

(jensen-shannon-divergence (repeatedly 100 (fn* [] (r/irand 100)))
                           (repeatedly 100 (fn* [] (r/irand 100))))
;;=> 569.0495492365783

kendall-correlation

(kendall-correlation vs1 vs2)

Kendall’s correlation of two sequences.

Examples

Kendall’s correlation of uniform and gaussian distribution samples.

(kendall-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                     (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> 0.0013475046750467505

kullback-leibler-divergence

(kullback-leibler-divergence vs1 vs2)

Kullback-Leibler divergence of two sequences.

Examples

Kullback-Leibler divergence.

(kullback-leibler-divergence (repeatedly 100 (fn* [] (r/irand 100)))
                             (repeatedly 100 (fn* [] (r/irand 100))))
;;=> 2974.653698088623

kurtosis

(kurtosis vs)

Calculate kurtosis from sequence.

Examples

Kurtosis

(kurtosis [1 2 3 -1 -1 2 -1 11 111])
;;=> 8.732515263272099

mad-extent

(mad-extent vs)

Return median -/+ median-absolute-deviation and the median, as [(- median mad) (+ median mad) median].

Examples

median absolute deviation from median for gaussian distribution

(mad-extent (repeatedly 100000 r/grand))
;;=> [-0.6674624646061628 0.6788112117843635 0.005674373589100425]

maximum

(maximum vs)

Maximum value from sequence.

Examples

Maximum value

(maximum [1 2 3 -1 -1 2 -1 11 111])
;;=> 111.0

mean

(mean vs)

Calculate mean of vs

Examples

Mean (average value)

(mean [1 2 3 -1 -1 2 -1 11 111])
;;=> 14.111111111111109

median

(median vs)

Calculate median of vs. See median-3.

Examples

Median (percentile 50%).

(median [1 2 3 -1 -1 2 -1 11 111])
;;=> 2.0

For three elements use faster median-3.

(median [7 1 4])
;;=> 4.0

median-3

(median-3 a b c)

Median of three values. See median.

Examples

Median of [7 1 4]

(median-3 7 1 4)
;;=> 4.0

median-absolute-deviation

(median-absolute-deviation vs)

Calculate MAD (median absolute deviation).
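
As a sketch of the definition, MAD is the median of the absolute deviations from the median. The plain Clojure version below assumes the simple median (middle element, or the average of the two middle elements); both names are hypothetical.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn median-sketch [vs]
  (let [s (vec (sort vs))
        n (count s)
        h (quot n 2)]
    (if (odd? n)
      (double (s h))
      (/ (+ (s (dec h)) (s h)) 2.0))))

(defn mad-sketch [vs]
  (let [m (median-sketch vs)]
    (median-sketch (map #(Math/abs (double (- % m))) vs))))

(mad-sketch [1 2 3 -1 -1 2 -1 11 111])
;;=> 3.0
```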

Examples

MAD

(median-absolute-deviation [1 2 3 -1 -1 2 -1 11 111])
;;=> 3.0

minimum

(minimum vs)

Minimum value from sequence.

Examples

Minimum value

(minimum [1 2 3 -1 -1 2 -1 11 111])
;;=> -1.0

mode

(mode vs)

Find the value that appears most often in a dataset vs.

Examples

Example

(mode [1 2 3 -1 -1 2 -1 11 111])
;;=> -1.0

Returns lowest value when every element appears equally.

(mode [5 1 2 3 4])
;;=> 1.0

modes

(modes vs)

Find the values that appear most often in a dataset vs.

Returns a sequence of all the most frequent values in increasing order.
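
The same idea can be sketched with clojure.core/frequencies. modes-sketch and mode-sketch are hypothetical names; unlike the examples below, they return values in their original number type rather than as doubles.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn modes-sketch
  "All values with the highest count, in increasing order."
  [vs]
  (let [freqs (frequencies vs)
        top   (apply max (vals freqs))]
    (sort (for [[v c] freqs :when (= c top)] v))))

(defn mode-sketch
  "Lowest of the most frequent values."
  [vs]
  (first (modes-sketch vs)))

(modes-sketch [5 5 1 1 2 3 4 4])
;;=> (1 4 5)
(mode-sketch [5 1 2 3 4])
;;=> 1
```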

Examples

Example

(modes [1 2 3 -1 -1 2 -1 11 111])
;;=> (-1.0)

Returns all tied values in increasing order when several values appear equally often.

(modes [5 5 1 1 2 3 4 4])
;;=> (1.0 4.0 5.0)

moment

(moment vs)
(moment vs order)
(moment vs order {:keys [absolute? center mean?], :or {absolute? false, center nil, mean? true}})

Calculate moment (central or/and absolute) of given order (default: 2).

Additional parameters as a map:

:absolute? - calculate sum of absolute values (default: false)
:mean? - return the mean (proper moment) when true, or just the sum of powered differences when false (default: true)
:center - value to center on (default: nil = mean)
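
How the three options interact can be sketched in plain Clojure. moment-sketch is a hypothetical re-implementation written from the option descriptions above, not the library code.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn moment-sketch
  ([vs] (moment-sketch vs 2.0 {}))
  ([vs order] (moment-sketch vs order {}))
  ([vs order {:keys [absolute? center mean?]
              :or   {absolute? false, center nil, mean? true}}]
   (let [n     (count vs)
         c     (if (nil? center) (/ (reduce + vs) (double n)) center)
         term  (fn [v]
                 (let [d (double (- v c))]
                   (Math/pow (if absolute? (Math/abs d) d) order)))
         total (reduce + (map term vs))]
     ;; :mean? true divides the sum by the sample count
     (if mean? (/ total n) total))))

(moment-sketch [3 7 5 9 -8])
;; approximately 35.36
(moment-sketch [3 7 5 9 -8] 3.0 {:center 0.0})
;; approximately 142.4
```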

Examples

Usage

(moment [3 7 5 9 -8])
;;=> 35.36
(moment [3 7 5 9 -8] 1.0)
;;=> 0.0
(moment [3 7 5 9 -8] 4.0)
;;=> 3417.171199999999
(moment [3 7 5 9 -8] 3.0)
;;=> -229.82399999999993
(moment [3 7 5 9 -8] 3.0 {:center 0.0})
;;=> 142.4
(moment [3 7 5 9 -8] 3.0 {:mean? false})
;;=> -1149.1199999999997
(moment [3 7 5 9 -8] 3.0 {:absolute? true})
;;=> 332.15039999999993
(moment [3 7 5 9 -8] 3.0 {:center -3.0})
;;=> 666.2
(moment [3 7 5 9 -8] 0.5 {:absolute? true})
;;=> 1.8986344545712772

outliers

(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)

Find outliers, defined as values outside the inner fences.

Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is (- Q3 Q1).

LIF (Lower Inner Fence) equals (- Q1 (* 1.5 IQR)).
UIF (Upper Inner Fence) equals (+ Q3 (* 1.5 IQR)).

Returns a sequence.

Optional estimation-strategy argument can be set to change the estimation type used for quantile calculations. See estimation-strategies-list.
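
Using the 1.5 * IQR fences given above, the filtering step can be sketched in plain Clojure. The sketch assumes Q1 and Q3 are supplied by the caller; outliers-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn outliers-sketch
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lo  (- q1 (* 1.5 iqr))
        hi  (+ q3 (* 1.5 iqr))]
    (filter #(or (< % lo) (> % hi)) vs)))

(outliers-sketch [1 2 3 -1 -1 2 -1 11 111] -1.0 7.0)
;;=> (111)
```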

Examples

Outliers

(outliers [1 2 3 -1 -1 2 -1 11 111])
;;=> (111.0)

Gaussian distribution outliers

(count (outliers (repeatedly 3000000 r/grand)))
;;=> 20845

pacf

(pacf data)
(pacf data lags)

Calculate pacf (partial autocorrelation function) for a given number of lags.

If lags is omitted, the function returns values for the maximum possible number of lags.

pacf also returns lag 0 (which is 0.0).
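
One standard way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion, sketched below in plain Clojure. Whether fastmath uses this recursion is an assumption; pacf-from-acf-sketch is a hypothetical name and takes acf values (with lag 0 equal to 1.0) as input.

```clojure
;; Hypothetical sketch of the Durbin-Levinson recursion,
;; not the fastmath implementation.
(defn pacf-from-acf-sketch
  "r holds autocorrelations for lags 0..k. Returns partial
  autocorrelations for lags 0..k, with 0.0 at lag 0."
  [r]
  (let [r (vec r)
        k (dec (count r))]
    (loop [lag 1
           phi []        ; coefficients phi(lag-1, 1..lag-1)
           out [0.0]]
      (if (> lag k)
        out
        (let [m        (count phi)
              num      (- (r lag)
                          (reduce + (map #(* (phi %) (r (- lag 1 %))) (range m))))
              den      (- 1.0
                          (reduce + (map #(* (phi %) (r (inc %))) (range m))))
              pkk      (/ num den)
              next-phi (conj (mapv #(- (phi %) (* pkk (phi (- m 1 %)))) (range m))
                             pkk)]
          (recur (inc lag) next-phi (conj out pkk)))))))

(pacf-from-acf-sketch [1.0 0.5396825396825397 -0.013492063492063475 -0.4666666666666665])
;; approximately [0.0 0.5397 -0.4300 -0.3881]
```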

Examples

Usage

(pacf (repeatedly 1000 r/grand) 10)
;;=> (0.0
;;=>  -0.026736190605719863
;;=>  0.010877194279486118
;;=>  0.032203194789526435
;;=>  -0.035726254389457764
;;=>  -0.002736273319801345
;;=>  -0.05322100355959638
;;=>  0.016135531529402884
;;=>  0.008998546289145027
;;=>  -0.0614874305488179
;;=>  0.010613174279058944)
(pacf [1 2 3 4 5 4 3 2 1])
;;=> (0.0
;;=>  0.5396825396825397
;;=>  -0.4299857803057234
;;=>  -0.388084834596935
;;=>  -0.2792571208141194
;;=>  0.17585056996358742
;;=>  -0.2652225487589841
;;=>  -0.17978918763554708
;;=>  -0.10771973872263883)

pacf-ci

(pacf-ci data lags)
(pacf-ci data lags alpha)

pacf with added confidence interval data.

Examples

Usage

(pacf-ci (repeatedly 1000 r/grand) 3)
;;=> {:ci 0.06197950323045615,
;;=>  :pacf
;;=>  (0.0 -0.003069469436674989 0.019682898818580288 0.008559579552033444)}
(pacf-ci [1 2 3 4 5 4 3 2 1] 3)
;;=> {:ci 0.653321328180018,
;;=>  :pacf (0.0 0.5396825396825397 -0.4299857803057234 -0.388084834596935)}

pearson-correlation

(pearson-correlation vs1 vs2)

Pearson’s correlation of two sequences.

Examples

Pearson’s correlation of uniform and gaussian distribution samples.

(pearson-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                     (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -0.0013362357862912368

percentile

(percentile vs p)
(percentile vs p estimation-strategy)

Calculate percentile of vs.

Percentile p is in the range 0-100.

See docs.

Optionally, provide estimation-strategy to change the interpolation method used for selecting values. Default is :legacy. See estimation-strategies-list for the available strategies.
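
To make the idea of an interpolation strategy concrete, here is a sketch of one of them in plain Clojure: linear interpolation between order statistics at position (n - 1) * p / 100, which is the usual definition of the R7 type. This is not the default :legacy strategy, and percentile-r7-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch of R7-style interpolation,
;; not the fastmath implementation and not the :legacy default.
(defn percentile-r7-sketch
  [vs p]
  (let [s    (vec (sort vs))
        h    (* (dec (count s)) (/ p 100.0))   ; fractional 0-based position
        lo   (long (Math/floor h))
        hi   (min (inc lo) (dec (count s)))
        frac (- h lo)]
    (+ (s lo) (* frac (- (s hi) (s lo))))))

(percentile-r7-sketch [1 2 3 -1 -1 2 -1 11 111] 85.0)
;; approximately 9.4, in line with the :r7 example below
```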

Examples

Percentile 25%

(percentile [1 2 3 -1 -1 2 -1 11 111] 25.0)
;;=> -1.0

Percentile 50% (median)

(percentile [1 2 3 -1 -1 2 -1 11 111] 50.0)
;;=> 2.0

Percentile 75%

(percentile [1 2 3 -1 -1 2 -1 11 111] 75.0)
;;=> 7.0

Percentile 90%

(percentile [1 2 3 -1 -1 2 -1 11 111] 90.0)
;;=> 111.0

Various estimation strategies.

(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :legacy)
;;=> 61.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r1)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r2)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r3)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r4)
;;=> 8.199999999999996
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r5)
;;=> 25.999999999999858
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r6)
;;=> 61.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r7)
;;=> 9.399999999999999
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r8)
;;=> 37.66666666666675
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r9)
;;=> 34.75000000000007

percentile-extent

(percentile-extent vs)
(percentile-extent vs p)
(percentile-extent vs p1 p2)
(percentile-extent vs p1 p2 estimation-strategy)

Return percentile range and median.

p - calculates extent of p and 100-p (default: p=25)

Examples

for samples from gaussian distribution

(percentile-extent (repeatedly 100000 r/grand))
;;=> [-0.6757379614999778 0.6743169986727071 -6.525193090059774E-4]
(percentile-extent (repeatedly 100000 r/grand) 10)
;;=> [-1.2857354257915978 1.2809543542935327 -0.005038343134417929]
(percentile-extent (repeatedly 100000 r/grand) 30 70)
;;=> [-0.5245075255533502 0.522954632601382 -6.971662739497372E-4]

percentiles

(percentiles vs ps)
(percentiles vs ps estimation-strategy)

Calculate percentiles of vs.

ps is a sequence of percentiles, each in the range 0-100.

See docs.

Optionally, provide estimation-strategy to change the interpolation method used for selecting values. Default is :legacy. See estimation-strategies-list for the available strategies.

Examples

Usage

(percentiles [1 2 3 -1 -1 2 -1 11 111] [25 50 75 90])
;;=> [-1.0 2.0 7.0 111.0]

population-stddev

(population-stddev vs)
(population-stddev vs u)

Calculate population standard deviation of vs.

See stddev.

Examples

Population standard deviation.

(population-stddev [1 2 3 -1 -1 2 -1 11 111])
;;=> 34.4333315406403

population-variance

(population-variance vs)
(population-variance vs u)

Calculate population variance of vs.

See variance.
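
The difference between population-variance and variance is the divisor. A minimal sketch in plain Clojure, assuming the usual definitions (divide by n for the population variance, by n - 1 for the sample variance); variance-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn variance-sketch
  "population? true divides by n, otherwise by n - 1."
  [vs population?]
  (let [n  (count vs)
        m  (/ (reduce + vs) (double n))
        ss (reduce + (map #(let [d (- % m)] (* d d)) vs))]
    (/ ss (if population? n (dec n)))))

(variance-sketch [1 2 3 -1 -1 2 -1 11 111] true)
;; approximately 1185.654
(variance-sketch [1 2 3 -1 -1 2 -1 11 111] false)
;; approximately 1333.861
```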

Examples

Population variance

(population-variance [1 2 3 -1 -1 2 -1 11 111])
;;=> 1185.6543209876543

quantile

(quantile vs q)
(quantile vs q estimation-strategy)

Calculate quantile of vs.

Quantile q is in the range 0.0-1.0.

See docs for interpolation strategy.

Optionally, provide estimation-strategy to change the interpolation method used for selecting values. Default is :legacy. See estimation-strategies-list for the available strategies.

Examples

Quantile 0.25

(quantile [1 2 3 -1 -1 2 -1 11 111] 0.25)
;;=> -1.0

Quantile 0.5 (median)

(quantile [1 2 3 -1 -1 2 -1 11 111] 0.5)
;;=> 2.0

Quantile 0.75

(quantile [1 2 3 -1 -1 2 -1 11 111] 0.75)
;;=> 7.0

Quantile 0.9

(quantile [1 2 3 -1 -1 2 -1 11 111] 0.9)
;;=> 111.0

Various estimation strategies.

(quantile [1 11 111 1111] 0.7 :legacy)
;;=> 611.0
(quantile [1 11 111 1111] 0.7 :r1)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r2)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r3)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r4)
;;=> 90.99999999999999
(quantile [1 11 111 1111] 0.7 :r5)
;;=> 410.99999999999983
(quantile [1 11 111 1111] 0.7 :r6)
;;=> 611.0
(quantile [1 11 111 1111] 0.7 :r7)
;;=> 210.99999999999966
(quantile [1 11 111 1111] 0.7 :r8)
;;=> 477.66666666666623
(quantile [1 11 111 1111] 0.7 :r9)
;;=> 460.99999999999966

quantiles

(quantiles vs qs)
(quantiles vs qs estimation-strategy)

Calculate quantiles of vs.

qs is a sequence of quantiles, each in the range 0.0-1.0.

See docs for interpolation strategy.

Optionally, provide estimation-strategy to change the interpolation method used for selecting values. Default is :legacy. See estimation-strategies-list for the available strategies.

Examples

Usage

(quantiles [1 2 3 -1 -1 2 -1 11 111] [0.25 0.5 0.75 0.9])
;;=> [-1.0 2.0 7.0 111.0]

second-moment

Deprecated. Use the `moment` function instead.

sem

(sem vs)

Standard error of mean
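
The standard error of the mean is the sample standard deviation divided by the square root of the sample count. A minimal sketch in plain Clojure under that definition; sem-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn sem-sketch [vs]
  (let [n        (count vs)
        m        (/ (reduce + vs) (double n))
        variance (/ (reduce + (map #(let [d (- % m)] (* d d)) vs))
                    (dec n))]          ; sample variance, n - 1 divisor
    (/ (Math/sqrt variance) (Math/sqrt n))))

(sem-sketch [1 2 3 -1 -1 2 -1 11 111])
;; approximately 12.174
```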

Examples

SEM

(sem [1 2 3 -1 -1 2 -1 11 111])
;;=> 12.174021115615695

sem-extent

(sem-extent vs)

Return mean -/+ sem and the mean, as [(- mean sem) (+ mean sem) mean].

Examples

standard error of mean and mean for gaussian distribution

(sem-extent (repeatedly 100000 r/grand))
;;=> [-0.0041852769369816285 0.002121789728373999 -0.001031743604303815]

skewness

(skewness vs)

Calculate skewness from sequence.

Examples

Skewness

(skewness [1 2 3 -1 -1 2 -1 11 111])
;;=> 2.94268445417954

spearman-correlation

(spearman-correlation vs1 vs2)

Spearman’s correlation of two sequences.

Examples

Spearman’s correlation of uniform and gaussian distribution samples.

(spearman-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                      (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -2.2283008578278884E-4

standardize

(standardize vs)

Normalize samples to have mean = 0 and stddev = 1.
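
In plain Clojure, standardization can be sketched as subtracting the mean and dividing by the sample standard deviation. The n - 1 divisor is an assumption that matches the example output below; standardize-sketch is a hypothetical name.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn standardize-sketch [vs]
  (let [n  (count vs)
        m  (/ (reduce + vs) (double n))
        sd (Math/sqrt (/ (reduce + (map #(let [d (- % m)] (* d d)) vs))
                         (dec n)))]
    (map #(/ (- % m) sd) vs)))

(first (standardize-sketch [1 2 3 -1 -1 2 -1 11 111]))
;; approximately -0.359
```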

Examples

Standardize

(standardize [1 2 3 -1 -1 2 -1 11 111])
;;=> (-0.3589915220998317
;;=>  -0.33161081278713267
;;=>  -0.30423010347443363
;;=>  -0.4137529407252298
;;=>  -0.4137529407252298
;;=>  -0.33161081278713267
;;=>  -0.4137529407252298
;;=>  -0.08518442897284138
;;=>  2.652886502297062)

stats-map

(stats-map vs)
(stats-map vs estimation-strategy)

Calculate several statistics of vs and return them as a map.

Optional estimation-strategy argument can be set to change the estimation type used for quantile calculations. See estimation-strategies-list.

Examples

Stats

(stats-map [1 2 3 -1 -1 2 -1 11 111])
;;=> {:IQR 8.0,
;;=>  :Kurtosis 8.732515263272099,
;;=>  :LAV -1.0,
;;=>  :LIF -13.0,
;;=>  :LOF -25.0,
;;=>  :MAD 3.0,
;;=>  :Max 111.0,
;;=>  :Mean 14.11111111111111,
;;=>  :Median 2.0,
;;=>  :Min -1.0,
;;=>  :Mode -1.0,
;;=>  :Outliers (111.0),
;;=>  :Q1 -1.0,
;;=>  :Q3 7.0,
;;=>  :Range 112.0,
;;=>  :SD 36.522063346847084,
;;=>  :SEM 12.174021115615695,
;;=>  :Size 9,
;;=>  :Skewness 2.94268445417954,
;;=>  :Total 127.0,
;;=>  :UAV 11.0,
;;=>  :UIF 19.0,
;;=>  :UOF 31.0,
;;=>  :Variance 1333.8611111111113}

stddev

(stddev vs)
(stddev vs u)

Calculate standard deviation of vs.

See population-stddev.

Examples

Standard deviation.

(stddev [1 2 3 -1 -1 2 -1 11 111])
;;=> 36.522063346847084

stddev-extent

(stddev-extent vs)

Return mean -/+ stddev and the mean, as [(- mean stddev) (+ mean stddev) mean].

Examples

standard deviation from mean and mean for gaussian distribution

(stddev-extent (repeatedly 100000 r/grand))
;;=> [-1.0008742722153459 0.994022466044159 -0.0034259030855934678]

sum

(sum vs)

Sum of all vs values.

Examples

Sum

(sum [1 2 3 -1 -1 2 -1 11 111])
;;=> 127.0

ttest-one-sample

(ttest-one-sample xs)
(ttest-one-sample xs {:keys [alpha sides mu], :or {alpha 0.05, sides :two-sided, mu 0.0}})

One-sample Student’s t-test

alpha - significance level (default: 0.05)
sides - one of: :two-sided, :one-sided-less (short: :one-sided) or :one-sided-greater (default: :two-sided)
mu - hypothesized mean to test against (default: 0.0)
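
The :t and :df entries of the result follow from the textbook one-sample formula, t = (mean - mu) / (stddev / sqrt n) with n - 1 degrees of freedom. A minimal sketch in plain Clojure; t-statistic-sketch is a hypothetical name, and the p-value and confidence interval are left out because they need the t-distribution.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn t-statistic-sketch
  [xs mu]
  (let [n   (count xs)
        m   (/ (reduce + xs) (double n))
        v   (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
               (dec n))                 ; sample variance
        sem (Math/sqrt (/ v n))]
    {:t (/ (- m mu) sem)
     :df (dec n)
     :estimated-mu m}))

(t-statistic-sketch [1 2 3 4 5 6 7 8 9 10] 0.0)
;; :t is approximately 5.745, :df is 9, :estimated-mu is 5.5
```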

Examples

Usage

(ttest-one-sample [1 2 3 4 5 6 7 8 9 10])
;;=> {:confidence-intervals [3.3341494103317983 7.665850589668201],
;;=>  :df 9,
;;=>  :estimated-mu 5.5,
;;=>  :p-value 2.781960110481859E-4,
;;=>  :t 5.744562646538029,
;;=>  :test-type :two-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:alpha 0.2})
;;=> {:confidence-intervals [4.175850795053416 6.824149204946584],
;;=>  :df 9,
;;=>  :estimated-mu 5.5,
;;=>  :p-value 2.781960110481859E-4,
;;=>  :t 5.744562646538029,
;;=>  :test-type :two-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:sides :one-sided})
;;=> {:confidence-intervals [##-Inf 7.255072013309326],
;;=>  :df 9,
;;=>  :estimated-mu 5.5,
;;=>  :p-value 0.9998609019944759,
;;=>  :t 5.744562646538029,
;;=>  :test-type :one-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:mu 5.0})
;;=> {:confidence-intervals [3.334149410331798 7.665850589668201],
;;=>  :df 9,
;;=>  :estimated-mu 5.5,
;;=>  :p-value 0.6141172548083933,
;;=>  :t 0.5222329678670935,
;;=>  :test-type :two-sided}

ttest-two-samples

(ttest-two-samples xs ys)

(ttest-two-samples xs ys {:keys [alpha sides mu paired? equal-variances?], :or {alpha 0.05, sides :two-sided, mu 0.0, paired? false, equal-variances? false}, :as params})

Two-sample Student’s t-test

alpha - significance level (default: 0.05)
sides - one of: :two-sided, :one-sided-less (short: :one-sided) or :one-sided-greater
mu - mean (default: 0.0)
paired? - unpaired or paired test, boolean (default: false)
equal-variances? - unequal or equal variances, boolean (default: false)
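
With the defaults (unpaired, unequal variances) the test corresponds to Welch’s t-test. The sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom in plain Clojure; welch-sketch and its helper are hypothetical names, and the p-value and confidence interval are omitted.

```clojure
;; Hypothetical sketch, not the fastmath implementation.
(defn- count-mean-variance [xs]
  (let [n (count xs)
        m (/ (reduce + xs) (double n))]
    [n m (/ (reduce + (map #(let [d (- % m)] (* d d)) xs)) (dec n))]))

(defn welch-sketch
  [xs ys]
  (let [[n1 m1 v1] (count-mean-variance xs)
        [n2 m2 v2] (count-mean-variance ys)
        a (/ v1 n1)
        b (/ v2 n2)]
    {:t  (/ (- m1 m2) (Math/sqrt (+ a b)))
     :df (/ (* (+ a b) (+ a b))
            (+ (/ (* a a) (dec n1))
               (/ (* b b) (dec n2))))}))

(welch-sketch [1 2 3 4 5 6 7 8 9 10]
              [7 8 9 10 11 12 13 14 15 16 17 18 19 20])
;; :t is approximately -5.435, :df is approximately 21.98
```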

Examples

Usage

(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20])
;;=> {:confidence-intervals [-11.052801725158163 -4.9471982748418375],
;;=>  :df 21.982212340188994,
;;=>  :estimated-mu [5.5 13.5],
;;=>  :p-value 1.8552818325118146E-5,
;;=>  :paired? false,
;;=>  :t -5.4349297638940595,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20 200])
;;=> {:confidence-intervals [-47.242899887102105 6.376233220435439],
;;=>  :df 14.164598953012467,
;;=>  :estimated-mu [5.5 25.93333333333333],
;;=>  :p-value 0.12451349808974498,
;;=>  :paired? false,
;;=>  :t -1.632902633201205,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20]
                   {:equal-variances? true})
;;=> {:confidence-intervals [-11.22324472988163 -4.77675527011837],
;;=>  :df 22.0,
;;=>  :estimated-mu [5.5 13.5],
;;=>  :p-value 3.690577215911943E-5,
;;=>  :paired? false,
;;=>  :t -5.147292847304685,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [200 11 200 11 200 11 200 11 200 11]
                   {:paired? true})
;;=> {:confidence-intervals [-171.66671936335894 -28.333280636641092],
;;=>  :df 9,
;;=>  :estimated-mu -100.0,
;;=>  :p-value 0.011615504295919215,
;;=>  :paired? true,
;;=>  :t -3.156496045715208,
;;=>  :test-type :two-sided}

variance

(variance vs)
(variance vs u)

Calculate variance of vs.

See population-variance.

Examples

Variance.

(variance [1 2 3 -1 -1 2 -1 11 111])
;;=> 1333.861111111111

Generated by Codox with RDash UI theme

Fastmath 1.5.3-SNAPSHOT
