fastmath.stats
Statistics functions.
- Descriptive statistics.
- Correlation / covariance
- Outliers
- Confidence intervals
- Extents
- Effect size
- Tests
- Histogram
- ACF/PACF
- Bootstrap (see fastmath.stats.bootstrap)
- Binary measures
Functions are backed by Apache Commons Math or SMILE libraries. All work with Clojure sequences.
Descriptive statistics
The all-in-one function stats-map returns a map containing:
- :Size - size of the samples, (count ...)
- :Min - minimum value
- :Max - maximum value
- :Range - range of values
- :Mean - mean/average
- :Median - median, see also: median-3
- :Mode - mode, see also: modes
- :Q1 - first quartile, use: percentile, quantile
- :Q3 - third quartile, use: percentile, quantile
- :Total - sum of all samples
- :SD - sample standard deviation
- :Variance - variance
- :MAD - median-absolute-deviation
- :SEM - standard error of mean
- :LAV - lower adjacent value, use: adjacent-values
- :UAV - upper adjacent value, use: adjacent-values
- :IQR - interquartile range, (- q3 q1)
- :LOF - lower outer fence, (- q1 (* 3.0 iqr))
- :UOF - upper outer fence, (+ q3 (* 3.0 iqr))
- :LIF - lower inner fence, (- q1 (* 1.5 iqr))
- :UIF - upper inner fence, (+ q3 (* 1.5 iqr))
- :Outliers - list of outliers, samples which are outside outer fences
- :Kurtosis - kurtosis
- :Skewness - skewness
Note: percentile and quantile support 10 different estimation strategies; see estimation-strategies-list.
Categories
Other vars: ->confusion-matrix L0 L1 L2 L2sq LInf acf acf-ci ad-test-one-sample adjacent-values ameasure binary-measures binary-measures-all binomial-ci binomial-ci-methods binomial-test bootstrap bootstrap-ci brown-forsythe-test chisq-test ci cliffs-delta coefficient-matrix cohens-d cohens-d-corrected cohens-f cohens-f2 cohens-kappa cohens-q cohens-u2 cohens-u3 cohens-w contingency-2x2-measures contingency-2x2-measures-all contingency-table contingency-table->marginals correlation correlation-matrix count= covariance covariance-matrix cramers-c cramers-v cramers-v-corrected cressie-read-test demean dissimilarity epsilon-sq estimate-bins estimation-strategies-list eta-sq extent f-test fligner-killeen-test freeman-tukey-test geomean glass-delta harmean hedges-g hedges-g* hedges-g-corrected histogram hpdi-extent inner-fence-extent iqr jensen-shannon-divergence kendall-correlation kruskal-test ks-test-one-sample ks-test-two-samples kullback-leibler-divergence kurtosis levene-test mad mad-extent mae mape maximum mcc me mean mean-absolute-deviation means-ratio means-ratio-corrected median median-3 median-absolute-deviation minimum minimum-discrimination-information-test mode modes moment mse multinomial-likelihood-ratio-test neyman-modified-chisq-test omega-sq one-way-anova-test outer-fence-extent outliers p-overlap p-value pacf pacf-ci pearson-correlation pearson-r percentile percentile-bc-extent percentile-bca-extent percentile-extent percentiles pi pi-extent pooled-stddev pooled-variance population-stddev population-variance power-divergence-test powmean psnr quantile quantile-extent quantiles r2 r2-determination rank-epsilon-sq rank-eta-sq rescale rmse robust-standardize rows->contingency-table rss second-moment sem sem-extent similarity skewness span spearman-correlation standardize stats-map stddev stddev-extent sum t-test-one-sample t-test-two-samples trim tschuprows-t ttest-one-sample ttest-two-samples variance variation weighted-kappa winsor wmean wmedian 
wmw-odds wquantile wquantiles z-test-one-sample z-test-two-samples
->confusion-matrix
(->confusion-matrix tp fn fp tn)
(->confusion-matrix confusion-matrix)
(->confusion-matrix actual prediction)
(->confusion-matrix actual prediction encode-true)
Convert input to confusion matrix
acf
(acf data)
(acf data lags)
Examples
Usage
(acf (repeatedly 1000 r/grand) 5)
;;=> (1.0
;;=>  0.024818272833702658
;;=>  -0.016643111047625263
;;=>  -0.04387141411591324
;;=>  -0.017525857903339097
;;=>  -0.02298214632194487)
(acf (repeatedly 1000 r/grand) [10 20 100 500])
;;=> (-0.0376606395291497
;;=>  -0.034034408698036596
;;=>  -0.010242012560395726
;;=>  0.0045013783824463)
(acf [1 2 3 4 5 4 3 2 1])
;;=> (1.0
;;=>  0.5396825396825397
;;=>  -0.013492063492063475
;;=>  -0.4666666666666665
;;=>  -0.6269841269841269
;;=>  -0.3015873015873015
;;=>  -0.011904761904761935
;;=>  0.17777777777777773
;;=>  0.20317460317460315)
acf-ci
(acf-ci data)
(acf-ci data lags)
(acf-ci data lags alpha)
acf with added confidence interval data.
:cis contains a list of confidence intervals calculated for every lag.
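As a rough illustration of the values that acf and acf-ci report, the sketch below uses the textbook sample autocorrelation estimator (lagged products of mean-centred data divided by the sum of squared deviations). The name acf-sketch is hypothetical and not part of fastmath; the library implementation may differ in details.

```clojure
(defn acf-sketch
  "Sample autocorrelation of xs for lags 0..max-lag (textbook estimator)."
  [xs max-lag]
  (let [xs    (mapv double xs)
        n     (count xs)
        m     (/ (reduce + xs) n)
        ds    (mapv #(- % m) xs)                 ; mean-centred data
        denom (reduce + (map * ds ds))]          ; sum of squared deviations
    (for [k (range (inc max-lag))]
      ;; pair every d_t with d_(t+k); map stops at the shorter sequence
      (/ (reduce + (map * ds (drop k ds))) denom))))

(acf-sketch [1 2 3 4 5 4 3 2 1] 2)
```

Lag 0 is always 1.0, since the numerator and denominator coincide.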
Examples
Usage
(acf-ci (repeatedly 1000 r/grand) 3)
;;=> {:acf
;;=>  (1.0 8.400697641833316E-4 0.0448042665012578 0.009808923241870098),
;;=>  :ci 0.06197950323045615,
;;=>  :cis (0.06197950323045615
;;=>        0.06197954697044274
;;=>        0.062103841288912796
;;=>        0.062109792420910005)}
(acf-ci [1 2 3 4 5 4 3 2 1] 3)
;;=> {:acf
;;=>  (1.0 0.5396825396825397 -0.013492063492063475 -0.4666666666666665),
;;=>  :ci 0.653321328180018,
;;=>  :cis (0.653321328180018
;;=>        0.8218653739461048
;;=>        0.8219599072345126
;;=>        0.9281841012727746)}
ad-test-one-sample
(ad-test-one-sample xs)
(ad-test-one-sample xs distribution-or-ys)
(ad-test-one-sample xs distribution-or-ys {:keys [sides kernel bandwidth], :or {sides :one-sided-greater, kernel :gaussian}})
Anderson-Darling test
adjacent-values
(adjacent-values vs)
(adjacent-values vs estimation-strategy)
(adjacent-values vs q1 q3 m)
Lower and upper adjacent values (LAV and UAV).
Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is (- Q3 Q1).
- LAV is the smallest value which is greater than or equal to the LIF = (- Q1 (* 1.5 IQR)).
- UAV is the largest value which is lower than or equal to the UIF = (+ Q3 (* 1.5 IQR)).
- the third value is the median of the samples
The optional estimation-strategy argument changes the estimation type used for quantile calculations. See estimation-strategies-list.
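The fence rule above can be sketched in plain Clojure. This is a minimal illustration, not the library code: adjacent-values-sketch is a hypothetical name, the quartiles are passed in by hand instead of being estimated, and it assumes at least one sample lies inside the inner fences.

```clojure
(defn adjacent-values-sketch
  "Returns [LAV UAV] for samples vs, given precomputed quartiles q1 and q3."
  [vs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))   ; lower inner fence
        uif (+ q3 (* 1.5 iqr))]  ; upper inner fence
    [(apply min (filter #(>= % lif) vs))
     (apply max (filter #(<= % uif) vs))]))

;; Quartiles are supplied by hand here (illustrative values, not estimated).
(adjacent-values-sketch [1 2 3 4 5 6 7 8 100] 2.5 7.5)
;;=> [1 8]
```

With these inputs the fences are -5.0 and 15.0, so 100 falls outside and 8 becomes the upper adjacent value.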
Examples
(adjacent-values [1 2 3 -1 -1 2 -1 11 111])
;;=> [-1.0 11.0 2.0]
Gaussian distribution LAV, UAV
(adjacent-values (repeatedly 1000000 r/grand))
;;=> [-2.7024365882934935 2.7023587415571386 5.342937622886728E-4]
ameasure
(ameasure [group1 group2])
(ameasure group1 group2)
Vargha-Delaney A measure for two populations a and b
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (ameasure t c))
;;=> 0.375
binary-measures
(binary-measures tp fn fp tn)
(binary-measures confusion-matrix)
(binary-measures actual prediction)
(binary-measures actual prediction true-value)
Subset of binary measures. See binary-measures-all.
The following keys are returned: [:tp :tn :fp :fn :accuracy :fdr :f-measure :fall-out :precision :recall :sensitivity :specificity :prevalence]
Examples
Usage
(binary-measures [true false true false true false true false]
                 [true false false true false false false true])
;;=> {:accuracy 0.375,
;;=>  :f-measure 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fp 2,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :recall 0.25,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tp 1}
Treat 1 as true value.
(binary-measures [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:accuracy 0.375,
;;=>  :f-measure 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fn 3,
;;=>  :fp 2,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :recall 0.25,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tp 1}
Treat :a and :b as true value.
(binary-measures [:a :b :c :d :e :f :a :b]
                 [:a :b :a :b :a :f :d :b]
                 {:a true, :b true, :c false})
;;=> {:accuracy 0.5,
;;=>  :f-measure 0.6,
;;=>  :fall-out 0.75,
;;=>  :fdr 0.5,
;;=>  :fn 1,
;;=>  :fp 3,
;;=>  :precision 0.5,
;;=>  :prevalence 0.5,
;;=>  :recall 0.75,
;;=>  :sensitivity 0.75,
;;=>  :specificity 0.25,
;;=>  :tn 1,
;;=>  :tp 3}
binary-measures-all
(binary-measures-all tp fn fp tn)
(binary-measures-all confusion-matrix)
(binary-measures-all actual prediction)
(binary-measures-all actual prediction true-value)
Collection of binary measures.
Arguments:
- confusion-matrix - either a map or a sequence with [:tp :fn :fp :tn] values
or
- actual - list of ground truth values
- prediction - list of predicted values
- true-value - optional, true/false encoding: what counts as true in actual and prediction
true-value can be one of:
- nil - values are treated as booleans
- any sequence - values from the sequence will be treated as true
- map - conversion will be done according to the provided map (if there is no corresponding key, the value is treated as false)
- any predicate
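To make the true-value encodings concrete, here is a small sketch of how actual/prediction pairs could be turned into confusion-matrix counts. The names ->truth-pred and confusion-counts are hypothetical helpers written for this illustration; they are not fastmath functions, and the library may handle edge cases differently.

```clojure
(defn ->truth-pred
  "Turns a true-value spec (nil, sequence, map or predicate) into a predicate."
  [true-value]
  (cond
    (nil? true-value)        boolean
    (map? true-value)        #(boolean (get true-value %)) ; missing key => false
    (sequential? true-value) (let [s (set true-value)] #(contains? s %))
    :else                    #(boolean (true-value %))))

(defn confusion-counts
  "Counts :tp, :fn, :fp and :tn for paired actual and predicted values."
  [actual prediction true-value]
  (let [t? (->truth-pred true-value)]
    (merge {:tp 0 :fn 0 :fp 0 :tn 0}
           (frequencies
            (map (fn [a p]
                   (let [a? (t? a) p? (t? p)]
                     (cond (and a? p?) :tp
                           a?          :fn
                           p?          :fp
                           :else       :tn)))
                 actual prediction)))))

(confusion-counts [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:tp 1, :fn 3, :fp 2, :tn 2}
```

Note that with a nil true-value the number 0 would count as true in Clojure, which is why numeric labels need an explicit encoding such as [1].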
Examples
Usage
(binary-measures-all [true false true false true false true false]
                     [true false false true false false false true])
;;=> {:accuracy 0.375,
;;=>  :ba 0.375,
;;=>  :bm -0.25,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 0.3333333333333333,
;;=>  :f-beta #,
;;=>  :f-measure 0.28571428571428575,
;;=>  :f1-score 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fm 0.28867513459481287,
;;=>  :fn 3,
;;=>  :fnr 0.75,
;;=>  :for 0.6,
;;=>  :fp 2,
;;=>  :fpr 0.5,
;;=>  :hit-rate 0.25,
;;=>  :jaccard 0.16666666666666666,
;;=>  :kappa -0.25,
;;=>  :lr+ 0.5,
;;=>  :lr- 1.5,
;;=>  :mcc -0.2581988897471611,
;;=>  :miss-rate 0.75,
;;=>  :mk -0.2666666666666666,
;;=>  :n 4.0,
;;=>  :npv 0.4,
;;=>  :p 4.0,
;;=>  :pcn 5.0,
;;=>  :pcp 3.0,
;;=>  :phi -0.2581988897471611,
;;=>  :pn 5.0,
;;=>  :pp 3.0,
;;=>  :ppv 0.3333333333333333,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :pt 0.5857864376269049,
;;=>  :recall 0.25,
;;=>  :selectivity 0.5,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tnr 0.5,
;;=>  :total 8.0,
;;=>  :tp 1,
;;=>  :tpr 0.25,
;;=>  :ts 0.16666666666666666}
Treat 1 as true value.
(binary-measures-all [1 0 1 0 1 0 1 0] [1 0 0 1 0 0 0 1] [1])
;;=> {:accuracy 0.375,
;;=>  :ba 0.375,
;;=>  :bm -0.25,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 0.3333333333333333,
;;=>  :f-beta #,
;;=>  :f-measure 0.28571428571428575,
;;=>  :f1-score 0.28571428571428575,
;;=>  :fall-out 0.5,
;;=>  :fdr 0.6666666666666667,
;;=>  :fm 0.28867513459481287,
;;=>  :fn 3,
;;=>  :fnr 0.75,
;;=>  :for 0.6,
;;=>  :fp 2,
;;=>  :fpr 0.5,
;;=>  :hit-rate 0.25,
;;=>  :jaccard 0.16666666666666666,
;;=>  :kappa -0.25,
;;=>  :lr+ 0.5,
;;=>  :lr- 1.5,
;;=>  :mcc -0.2581988897471611,
;;=>  :miss-rate 0.75,
;;=>  :mk -0.2666666666666666,
;;=>  :n 4.0,
;;=>  :npv 0.4,
;;=>  :p 4.0,
;;=>  :pcn 5.0,
;;=>  :pcp 3.0,
;;=>  :phi -0.2581988897471611,
;;=>  :pn 5.0,
;;=>  :pp 3.0,
;;=>  :ppv 0.3333333333333333,
;;=>  :precision 0.3333333333333333,
;;=>  :prevalence 0.5,
;;=>  :pt 0.5857864376269049,
;;=>  :recall 0.25,
;;=>  :selectivity 0.5,
;;=>  :sensitivity 0.25,
;;=>  :specificity 0.5,
;;=>  :tn 2,
;;=>  :tnr 0.5,
;;=>  :total 8.0,
;;=>  :tp 1,
;;=>  :tpr 0.25,
;;=>  :ts 0.16666666666666666}
Treat :a and :b as true value.
(binary-measures-all [:a :b :c :d :e :f :a :b]
                     [:a :b :a :b :a :f :d :b]
                     {:a true, :b true, :c false})
;;=> {:accuracy 0.5,
;;=>  :ba 0.5,
;;=>  :bm 0.0,
;;=>  :cn 4.0,
;;=>  :cp 4.0,
;;=>  :dor 1.0,
;;=>  :f-beta #,
;;=>  :f-measure 0.6,
;;=>  :f1-score 0.6,
;;=>  :fall-out 0.75,
;;=>  :fdr 0.5,
;;=>  :fm 0.6123724356957945,
;;=>  :fn 1,
;;=>  :fnr 0.25,
;;=>  :for 0.5,
;;=>  :fp 3,
;;=>  :fpr 0.75,
;;=>  :hit-rate 0.75,
;;=>  :jaccard 0.42857142857142855,
;;=>  :kappa 0.0,
;;=>  :lr+ 1.0,
;;=>  :lr- 1.0,
;;=>  :mcc 0.0,
;;=>  :miss-rate 0.25,
;;=>  :mk 0.0,
;;=>  :n 4.0,
;;=>  :npv 0.5,
;;=>  :p 4.0,
;;=>  :pcn 2.0,
;;=>  :pcp 6.0,
;;=>  :phi 0.0,
;;=>  :pn 2.0,
;;=>  :pp 6.0,
;;=>  :ppv 0.5,
;;=>  :precision 0.5,
;;=>  :prevalence 0.5,
;;=>  :pt ##NaN,
;;=>  :recall 0.75,
;;=>  :selectivity 0.25,
;;=>  :sensitivity 0.75,
;;=>  :specificity 0.25,
;;=>  :tn 1,
;;=>  :tnr 0.25,
;;=>  :total 8.0,
;;=>  :tp 3,
;;=>  :tpr 0.75,
;;=>  :ts 0.42857142857142855}
F-beta is a function. When beta is equal to 1.0, you get f1-score.
(let [fbeta (:f-beta (binary-measures-all
                      [true false true false true false true false]
                      [true false false true false false false true]))]
  [(fbeta 1.0) (fbeta 2.0) (fbeta 0.5)])
;;=> [0.28571428571428575 0.2631578947368421 0.3125]
binomial-ci
(binomial-ci number-of-successes number-of-trials)
(binomial-ci number-of-successes number-of-trials method)
(binomial-ci number-of-successes number-of-trials method alpha)
Return confidence interval for a binomial distribution.
Possible methods are:
- :asymptotic (normal approximation, based on central limit theorem), default
- :agresti-coull
- :clopper-pearson
- :wilson
- :prop.test - one sample proportion test
- :cloglog
- :logit
- :probit
- :arcsine
- :all - apply all methods and return a map of triplets
Default alpha is 0.05.
Returns a triple: lower ci, upper ci, p=successes/trials.
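For orientation, two of the listed methods can be sketched with their textbook formulas. This is an illustration under stated assumptions, not the library code: asymptotic-ci and wilson-ci are hypothetical names, alpha is fixed at 0.05 through a hard-coded normal quantile, and the library's variants may apply corrections not shown here.

```clojure
;; Standard normal quantile for a two-sided interval at alpha = 0.05.
(def z-975 1.959963984540054)

(defn asymptotic-ci
  "Normal-approximation (Wald) interval: p +/- z * sqrt(p(1-p)/n)."
  [successes trials]
  (let [n (double trials)
        p (/ successes n)
        h (* z-975 (Math/sqrt (/ (* p (- 1.0 p)) n)))]
    [(- p h) (+ p h) p]))

(defn wilson-ci
  "Wilson score interval (no continuity correction)."
  [successes trials]
  (let [n  (double trials)
        p  (/ successes n)
        z2 (* z-975 z-975)
        d  (+ 1.0 (/ z2 n))
        c  (/ (+ p (/ z2 (* 2.0 n))) d)          ; shifted centre
        h  (* (/ z-975 d)
              (Math/sqrt (+ (/ (* p (- 1.0 p)) n)
                            (/ z2 (* 4.0 n n)))))]
    [(- c h) (+ c h) p]))

(asymptotic-ci 10 100)
(wilson-ci 10 100)
```

Both functions return the same triple shape as described above: lower bound, upper bound and the observed proportion.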
binomial-test
(binomial-test xs)
(binomial-test xs maybe-params)
(binomial-test number-of-successes number-of-trials {:keys [alpha p ci-method sides], :or {alpha 0.05, p 0.5, ci-method :asymptotic, sides :two-sided}})
Binomial test
- alpha - significance level (default: 0.05)
- sides - one of: :two-sided (default), :one-sided-less (short: :one-sided) or :one-sided-greater
- ci-method - see binomial-ci-methods
- p - tested probability
bootstrap
Deprecated. Please use fastmath.stats.bootstrap/bootstrap instead.
(bootstrap vs)
(bootstrap vs samples)
(bootstrap vs samples size)
Generate a set of samples of given size from provided data.
The default number of samples is 200; size defaults to the size of the provided data.
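The idea of resampling with replacement can be sketched as follows. bootstrap-sketch is a hypothetical name and this is only one plausible way to draw the samples (uniformly from the original values); the library function may sample differently, for example from an empirical distribution.

```clojure
(defn bootstrap-sketch
  "Returns `samples` sequences, each of `size` values drawn with replacement from vs."
  ([vs] (bootstrap-sketch vs 200))
  ([vs samples] (bootstrap-sketch vs samples (count vs)))
  ([vs samples size]
   (let [v (vec vs)]
     (repeatedly samples
                 #(repeatedly size (fn [] (rand-nth v)))))))

;; Two resamples of 20 values each; the values themselves are random.
(bootstrap-sketch [1 2 3 4 1 2 3 1 2 1] 2 20)
```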
Examples
Usage
(bootstrap [1 2 3 4 1 2 3 1 2 1] 2 20)
;;=> ((2.0
;;=>   3.0
;;=>   1.0
;;=>   2.0
;;=>   3.0
;;=>   2.0
;;=>   3.0
;;=>   2.0
;;=>   2.0
;;=>   2.0
;;=>   2.0
;;=>   2.0
;;=>   2.0
;;=>   1.0
;;=>   2.0
;;=>   2.0
;;=>   3.0
;;=>   2.0
;;=>   1.0
;;=>   2.0)
;;=>  (2.0
;;=>   2.0
;;=>   1.0
;;=>   2.0
;;=>   2.0
;;=>   1.0
;;=>   2.0
;;=>   2.0
;;=>   3.0
;;=>   4.0
;;=>   1.0
;;=>   1.0
;;=>   4.0
;;=>   3.0
;;=>   3.0
;;=>   2.0
;;=>   4.0
;;=>   2.0
;;=>   1.0
;;=>   1.0))
(let [data [1 2 3 4 1 2 3 1 2 1]
      fdata (frequencies data)
      bdata (bootstrap data 5 1000)]
  {:source fdata, :bootstrapped (map frequencies bdata)})
;;=> {:bootstrapped ({1.0 392, 2.0 295, 3.0 216, 4.0 97}
;;=>                 {1.0 388, 2.0 286, 3.0 228, 4.0 98}
;;=>                 {1.0 378, 2.0 305, 3.0 223, 4.0 94}
;;=>                 {1.0 423, 2.0 298, 3.0 190, 4.0 89}
;;=>                 {1.0 406, 2.0 294, 3.0 202, 4.0 98}),
;;=>  :source {1 4, 2 3, 3 2, 4 1}}
bootstrap-ci
Deprecated. Please use fastmath.stats.bootstrap/ci-basic instead.
(bootstrap-ci vs)
(bootstrap-ci vs alpha)
(bootstrap-ci vs alpha samples)
(bootstrap-ci vs alpha samples stat-fn)
Bootstrap method to calculate confidence interval.
Alpha defaults to 0.98, samples to 1000. The last parameter is the statistical function used as the measure (default: mean).
Returns the confidence interval and the value of the statistical function.
Examples
Usage
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71])
;;=> [-5.498000000000001 17.700000000000003 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.8)
;;=> [2.5999999999999996 15.7 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.8 100000)
;;=> [2.5999999999999996 15.7 9.0]
(bootstrap-ci [-5 1 1 1 1 2 2 5 11 71] 0.98 1000 median)
;;=> [-20.5 3.0 1.5]
chisq-test
(chisq-test contingency-table-or-xs)
(chisq-test contingency-table-or-xs params)
Chi square test, a power divergence test for lambda 1.0
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda - test type:
  - 1.0 - chisq-test
  - 0.0 - multinomial-likelihood-ratio-test
  - -1.0 - minimum-discrimination-information-test
  - -2.0 - neyman-modified-chisq-test
  - -0.5 - freeman-tukey-test
  - 2/3 - cressie-read-test - default
- :p - probabilities, weights or distribution object.
- :alpha - significance level (default: 0.05)
- :ci-sides - confidence interval sides (default: :two-sided)
- :sides - p-value sides (:two-sided, :one-sided-greater - default, :one-sided-less)
- :bootstrap-samples - number of samples to estimate confidence intervals (default: 1000)
- :ddof - delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins - number of bins or estimator name for histogram
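To show how the :lambda values relate, the sketch below computes the Cressie-Read power divergence statistic for the goodness-of-fit case (observed counts against expected probabilities). power-divergence-stat is a hypothetical name; the code assumes all counts and probabilities are positive and covers only the statistic, not p-values or confidence intervals.

```clojure
(defn power-divergence-stat
  "Power divergence statistic for observed counts vs expected probabilities.
  lambda 1.0 gives Pearson's chi-square, lambda 0.0 the likelihood-ratio (G) statistic."
  [observed probs lambda]
  (let [lambda   (double lambda)
        n        (double (reduce + observed))
        expected (map #(* n %) probs)
        sum-by   (fn [f] (reduce + (map f observed expected)))]
    (cond
      ;; limit for lambda -> 0: 2 * sum(O * ln(O / E))
      (zero? lambda)
      (* 2.0 (sum-by (fn [o e] (* o (Math/log (/ o e))))))

      ;; limit for lambda -> -1: 2 * sum(E * ln(E / O))
      (== lambda -1.0)
      (* 2.0 (sum-by (fn [o e] (* e (Math/log (/ e o))))))

      ;; general case: 2 / (lambda (lambda + 1)) * sum(O * ((O / E)^lambda - 1))
      :else
      (* (/ 2.0 (* lambda (+ lambda 1.0)))
         (sum-by (fn [o e] (* o (- (Math/pow (/ o e) lambda) 1.0))))))))

;; Pearson's chi-square for counts 10, 20, 30 against equal probabilities.
(power-divergence-stat [10 20 30] [1/3 1/3 1/3] 1.0)
```

For lambda 1.0 the general formula reduces to the familiar sum of (O - E)^2 / E, which is 10 for this input (up to floating-point rounding).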
ci
(ci vs)
(ci vs alpha)
Student’s t-based confidence interval for given data. Alpha value defaults to 0.05.
Last value is mean.
Examples
Usage
(ci [-5 1 1 1 1 2 2 5 11 71])
;;=> [-6.842279963189242 24.84227996318924 9.0]
(ci [-5 1 1 1 1 2 2 5 11 71] 0.8)
;;=> [7.172484402810223 10.827515597189777 9.0]
cliffs-delta
(cliffs-delta [group1 group2])
(cliffs-delta group1 group2)
Cliff’s delta effect size for ordinal data.
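Cliff's delta is commonly defined by comparing every value of one group with every value of the other. The sketch below follows that definition and also derives the Vargha-Delaney A measure from the same counts. The names cliffs-delta-sketch and vargha-delaney-a-sketch are hypothetical; compare is used so that any mutually comparable (ordinal) values work.

```clojure
(defn- pair-signs
  "Sign of (compare x y) for every pair x from group1, y from group2."
  [group1 group2]
  (for [x group1, y group2] (compare x y)))

(defn cliffs-delta-sketch
  "(#pairs with x > y - #pairs with x < y) / #pairs."
  [group1 group2]
  (let [signs (pair-signs group1 group2)
        gt    (count (filter pos? signs))
        lt    (count (filter neg? signs))]
    (/ (double (- gt lt)) (count signs))))

(defn vargha-delaney-a-sketch
  "(#pairs with x > y + half of the ties) / #pairs."
  [group1 group2]
  (let [signs (pair-signs group1 group2)
        gt    (count (filter pos? signs))
        ties  (count (filter zero? signs))]
    (/ (+ gt (* 0.5 ties)) (count signs))))

(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  [(cliffs-delta-sketch t c) (vargha-delaney-a-sketch t c)])
;;=> [-0.25 0.375]
```

The two measures carry the same information: delta equals 2A - 1.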
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cliffs-delta t c))
;;=> -0.25
(let [t [:a :b :c :D :e :f] c [:a :z :X :y :x :y]] (cliffs-delta t c))
;;=> -0.4722222222222222
coefficient-matrix
(coefficient-matrix vss)
(coefficient-matrix vss measure-fn)
(coefficient-matrix vss measure-fn symmetric?)
Generate coefficient (correlation, covariance, any two arg function) matrix from seq of seqs. Row order.
Default method: pearson-correlation
cohens-d
(cohens-d [group1 group2])
(cohens-d group1 group2)
(cohens-d group1 group2 method)
Cohen’s d effect size for two groups
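As a point of reference, the usual definition of Cohen's d divides the difference of group means by the pooled sample standard deviation. The sketch below implements that definition only; cohens-d-sketch and its helpers are hypothetical names, and the optional method argument of the library function is not modelled.

```clojure
(defn- mean-of [xs]
  (/ (reduce + xs) (double (count xs))))

(defn- sample-variance [xs]
  (let [m (mean-of xs)]
    (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
       (dec (count xs)))))

(defn cohens-d-sketch
  "Difference of means divided by the pooled sample standard deviation."
  [group1 group2]
  (let [n1     (count group1)
        n2     (count group2)
        pooled (/ (+ (* (dec n1) (sample-variance group1))
                     (* (dec n2) (sample-variance group2)))
                  (+ n1 n2 -2))]
    (/ (- (mean-of group1) (mean-of group2))
       (Math/sqrt pooled))))

(cohens-d-sketch [10 10 20 20 20 30 30 30 40 50]
                 [10 20 30 40 40 50])
```

For this data the result is approximately -0.4221, a negative value because the first group has the smaller mean.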
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cohens-d t c))
;;=> -0.42208932839491475
cohens-d-corrected
(cohens-d-corrected [group1 group2])
(cohens-d-corrected group1 group2)
(cohens-d-corrected group1 group2 method)
Cohen’s d corrected for small group size
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (cohens-d-corrected t c))
;;=> -0.3990662741188285
cohens-f
(cohens-f [group1 group2])
(cohens-f group1 group2)
(cohens-f group1 group2 type)
Cohen’s f, the square root of Cohen’s f2.
Possible type values are: :eta (default), :omega and :epsilon.
cohens-f2
(cohens-f2 [group1 group2])
(cohens-f2 group1 group2)
(cohens-f2 group1 group2 type)
Cohen’s f2, by default based on eta-sq.
Possible type values are: :eta (default), :omega and :epsilon.
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  {:default (cohens-f2 t c),
   :eta (cohens-f2 t c :eta),
   :omega (cohens-f2 t c :omega),
   :epsilon (cohens-f2 t c :epsilon)})
;;=> {:default 0.06779661016949151,
;;=>  :epsilon -0.050847457627118633,
;;=>  :eta 0.06779661016949151,
;;=>  :omega -0.04576271186440677}
cohens-q
(cohens-q r1 r2)
(cohens-q group1 group2a group2b)
(cohens-q group1a group2a group1b group2b)
Comparison of two correlations.
Arity:
- 2 - compare two correlation values
- 3 - compare correlation of group1 and group2a with correlation of group1 and group2b
- 4 - compare correlation of first two arguments with correlation of last two arguments
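For the two-argument arity, Cohen's q is conventionally the difference between the Fisher z-transformed correlations. A minimal sketch of that definition follows; fisher-z and cohens-q-sketch are hypothetical names and the multi-group arities are not covered.

```clojure
(defn fisher-z
  "Fisher z-transformation, i.e. atanh(r)."
  [r]
  (* 0.5 (Math/log (/ (+ 1.0 r) (- 1.0 r)))))

(defn cohens-q-sketch
  "Difference of Fisher z-transformed correlation values."
  [r1 r2]
  (- (fisher-z r1) (fisher-z r2)))

(cohens-q-sketch 0.5 -0.25)
```

For 0.5 and -0.25 this gives roughly 0.8047.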
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]
      d [5 2 3 4 4 5 4 2 3 -1]
      e (range 10)]
  {:arity-2 (cohens-q 0.5 -0.25),
   :arity-3 (cohens-q t c d),
   :arity-4 (cohens-q t c d e)})
;;=> {:arity-2 0.8047189562170503,
;;=>  :arity-3 0.9030140869391835,
;;=>  :arity-4 0.8369271764963739}
cohens-u2
(cohens-u2 [group1 group2])
(cohens-u2 group1 group2)
(cohens-u2 group1 group2 estimation-strategy)
Cohen’s U2, the proportion of one of the groups that exceeds the same proportion in the other group.
cohens-u3
(cohens-u3 [group1 group2])
(cohens-u3 group1 group2)
(cohens-u3 group1 group2 estimation-strategy)
Cohen’s U3, the proportion of the second group that is smaller than the median of the first group.
cohens-w
(cohens-w group1 group2)
(cohens-w contingency-table)
Cohen’s W effect size for discrete data.
Examples
Usage
(let [a [:a :a :b :b :f :a :a :b :b :c :a :a :b :b :c :a :a :b :b :c]
      b [:b :f :a :a :b :b :y :z :c :b :b :c :a :a :b :b :c :a :a :b]]
  (cohens-w a b))
;;=> 1.0408329997330665
contingency-2x2-measures-all
(contingency-2x2-measures-all a b c d)
(contingency-2x2-measures-all map-or-seq)
(contingency-2x2-measures-all [a b] [c d])
contingency-table
(contingency-table & seqs)
Returns frequencies map of tuples built from seqs.
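A frequencies map of tuples can be built with clojure.core alone. The sketch below shows one plausible reading of the description; contingency-table-sketch is a hypothetical name and the library function may differ in key order or representation.

```clojure
(defn contingency-table-sketch
  "Counts how often each tuple [x1 x2 ...] occurs across the given sequences."
  [& seqs]
  (frequencies (apply map vector seqs)))

(contingency-table-sketch [:a :a :b] [:x :x :y])
;;=> {[:a :x] 2, [:b :y] 1}
```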
correlation
(correlation [vs1 vs2])
(correlation vs1 vs2)
Correlation of two sequences.
Examples
Correlation of uniform and gaussian distribution samples.
(correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
             (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -0.003303068749082862
correlation-matrix
(correlation-matrix vss)
(correlation-matrix vss measure)
Generate correlation matrix from seq of seqs. Row order.
Possible measures: :pearson (default), :kendall, :spearman.
covariance
(covariance [vs1 vs2])
(covariance vs1 vs2)
Covariance of two sequences.
Examples
Covariance of uniform and gaussian distribution samples.
(covariance (repeatedly 100000 (partial r/grand 1.0 10.0))
            (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> 0.04673681719722958
covariance-matrix
(covariance-matrix vss)
Generate covariance matrix from seq of seqs. Row order.
Examples
Usage
(covariance-matrix [[1 2 3 4 5 11] [3 2 3 2 3 4]])
;;=> ([12.666666666666668 1.8666666666666667]
;;=>  [1.8666666666666667 0.5666666666666667])
cramers-c
(cramers-c group1 group2)
(cramers-c contingency-table)
Cramer’s C effect size for discrete data.
cramers-v
(cramers-v group1 group2)
(cramers-v contingency-table)
Cramer’s V effect size for discrete data.
Examples
Usage
(let [a [:a :a :b :b :f :a :a :b :b :c :a :a :b :b :c :a :a :b :b :c]
      b [:b :f :a :a :b :b :y :z :c :b :b :c :a :a :b :b :c :a :a :b]]
  (cramers-v a b))
;;=> 0.6009252125773316
cramers-v-corrected
(cramers-v-corrected group1 group2)
(cramers-v-corrected contingency-table)
Corrected Cramer’s V
Examples
Usage
(let [a [:a :a :b :b :f :a :a :b :b :c :a :a :b :b :c :a :a :b :b :c]
      b [:b :f :a :a :b :b :y :z :c :b :b :c :a :a :b :b :c :a :a :b]]
  (cramers-v-corrected a b))
;;=> 0.3410563654946855
cressie-read-test
(cressie-read-test contingency-table-or-xs)
(cressie-read-test contingency-table-or-xs params)
Cressie-Read test, a power divergence test for lambda 2/3
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda - test type:
  - 1.0 - chisq-test
  - 0.0 - multinomial-likelihood-ratio-test
  - -1.0 - minimum-discrimination-information-test
  - -2.0 - neyman-modified-chisq-test
  - -0.5 - freeman-tukey-test
  - 2/3 - cressie-read-test - default
- :p - probabilities, weights or distribution object.
- :alpha - significance level (default: 0.05)
- :ci-sides - confidence interval sides (default: :two-sided)
- :sides - p-value sides (:two-sided, :one-sided-greater - default, :one-sided-less)
- :bootstrap-samples - number of samples to estimate confidence intervals (default: 1000)
- :ddof - delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins - number of bins or estimator name for histogram
demean
(demean vs)
Subtract mean from sequence
Examples
Usage
(demean [-5 1 1 1 1 2 2 5 11 71])
;;=> (-14.0 -8.0 -8.0 -8.0 -8.0 -7.0 -7.0 -4.0 2.0 62.0)
dissimilarity
(dissimilarity method P-observed Q-expected)
(dissimilarity method P-observed Q-expected {:keys [bins probabilities? epsilon log-base power], :or {probabilities? true, epsilon 1.0E-6, log-base m/E, power 2.0}})
Various PDF distances between two histograms (frequencies) or probabilities.
Q can be a distribution object; in that case a histogram is created from P.
Arguments:
- method - distance method
- P-observed - frequencies, probabilities or actual data (when Q is a distribution)
- Q-expected - frequencies, probabilities or distribution object (when P is data)
Options:
- :probabilities? - should P/Q be converted to probabilities, default: true.
- :epsilon - small number which replaces 0.0 when division or logarithm is used
- :log-base - base for logarithms, default: e
- :power - exponent for :minkowski distance, default: 2.0
- :bins - number of bins or bins estimation method, see histogram.
The list of methods: :euclidean, :city-block, :manhattan, :chebyshev, :minkowski, :sorensen, :gower, :soergel, :kulczynski, :canberra, :lorentzian, :non-intersection, :wave-hedges, :czekanowski, :motyka, :tanimoto, :jaccard, :dice, :bhattacharyya, :hellinger, :matusita, :squared-chord, :euclidean-sq, :squared-euclidean, :pearson-chisq, :chisq, :neyman-chisq, :squared-chisq, :symmetric-chisq, :divergence, :clark, :additive-symmetric-chisq, :kullback-leibler, :jeffreys, :k-divergence, :topsoe, :jensen-shannon, :jensen-difference, :taneja, :kumar-johnson, :avg
See more: Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions by Sung-Hyuk Cha
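As an illustration of one listed method, the sketch below computes a Kullback-Leibler distance between two frequency vectors after normalizing them to probabilities. The names are hypothetical, and the way epsilon is applied here (replacing zeros before dividing or taking a logarithm) is an assumption based on the option description, not a statement about the library's internals.

```clojure
(defn ->probabilities
  "Normalizes frequencies so that they sum to 1.0."
  [freqs]
  (let [total (double (reduce + freqs))]
    (map #(/ % total) freqs)))

(defn kullback-leibler-sketch
  "Sum of p * ln(p / q) over normalized P and Q.
  Zeros are replaced by epsilon (assumed behaviour, see lead-in)."
  ([p-observed q-expected]
   (kullback-leibler-sketch p-observed q-expected 1.0e-6))
  ([p-observed q-expected epsilon]
   (let [fix (fn [x] (if (zero? x) epsilon x))]
     (reduce +
             (map (fn [p q]
                    (let [p (fix p) q (fix q)]
                      (* p (Math/log (/ p q)))))
                  (->probabilities p-observed)
                  (->probabilities q-expected))))))

;; Frequencies [1 1] vs [1 3] become probabilities [0.5 0.5] vs [0.25 0.75].
(kullback-leibler-sketch [1 1] [1 3])
```

Identical inputs give 0.0, and the measure is not symmetric in P and Q.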
epsilon-sq
(epsilon-sq [group1 group2])
(epsilon-sq group1 group2)
Less biased R2
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  (epsilon-sq t c))
;;=> -0.05357142857142856
estimate-bins
(estimate-bins vs)
(estimate-bins vs bins-or-estimate-method)
Estimate number of bins for histogram.
Possible methods are: :sqrt :sturges :rice :doane :scott :freedman-diaconis (default).
The number returned is not higher than number of samples.
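Two of the listed rules depend only on the number of samples and are easy to sketch from their textbook definitions. sturges-bins and rice-bins are hypothetical names; rounding up and the cap at the sample count are assumptions for this sketch, and the remaining methods (which need spread estimates of the data) are not shown.

```clojure
(defn sturges-bins
  "Sturges' rule: ceil(log2 n) + 1, capped at n."
  [n]
  (min n (long (inc (Math/ceil (/ (Math/log n) (Math/log 2.0)))))))

(defn rice-bins
  "Rice rule: ceil(2 * n^(1/3)), capped at n."
  [n]
  (min n (long (Math/ceil (* 2.0 (Math/cbrt n))))))

(sturges-bins 1000)
;;=> 11
(rice-bins 10000)
;;=> 44
```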
Examples
Estimate number of bins for various methods.
vs contains 1000 random samples from Log-Normal distribution.
(estimate-bins vs :sqrt)
;;=> 31
(estimate-bins vs :sturges)
;;=> 11
(estimate-bins vs :rice)
;;=> 20
(estimate-bins vs :doane)
;;=> 18
(estimate-bins vs :scott)
;;=> 66
(estimate-bins vs :freedman-diaconis)
;;=> 201
estimation-strategies-list
List of estimation strategies for percentile/quantile functions.
Examples
List of estimation strategies for percentile
(sort (keys estimation-strategies-list))
;;=> (:legacy :r1 :r2 :r3 :r4 :r5 :r6 :r7 :r8 :r9)
eta-sq
(eta-sq [group1 group2])
(eta-sq group1 group2)
R2, coefficient of determination
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  (eta-sq t c))
;;=> 0.06349206349206347
extent
(extent vs)
Return extent (min, max, mean) values from sequence
Examples
min/max and mean of gaussian distribution
(extent (repeatedly 100000 r/grand))
;;=> [-4.754024082311232 4.226469292832746 0.0036117613630082942]
f-test
(f-test xs ys)
(f-test xs ys {:keys [sides alpha], :or {sides :two-sided, alpha 0.05}})
Variance F-test of two samples.
- alpha - significance level (default: 0.05)
- sides - one of: :two-sided (default), :one-sided-less (short: :one-sided) or :one-sided-greater
fligner-killeen-test
(fligner-killeen-test xss)
(fligner-killeen-test xss {:keys [sides], :or {sides :one-sided-greater}})
freeman-tukey-test
(freeman-tukey-test contingency-table-or-xs)
(freeman-tukey-test contingency-table-or-xs params)
Freeman-Tukey test, a power divergence test for lambda -0.5
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda - test type:
  - 1.0 - chisq-test
  - 0.0 - multinomial-likelihood-ratio-test
  - -1.0 - minimum-discrimination-information-test
  - -2.0 - neyman-modified-chisq-test
  - -0.5 - freeman-tukey-test
  - 2/3 - cressie-read-test - default
- :p - probabilities, weights or distribution object.
- :alpha - significance level (default: 0.05)
- :ci-sides - confidence interval sides (default: :two-sided)
- :sides - p-value sides (:two-sided, :one-sided-greater - default, :one-sided-less)
- :bootstrap-samples - number of samples to estimate confidence intervals (default: 1000)
- :ddof - delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins - number of bins or estimator name for histogram
geomean
(geomean vs)
Geometric mean for positive values only
Examples
Geometric mean
(geomean [1 2 3 1 1 2 1 11 111])
;;=> 2.903203203730772
glass-delta
(glass-delta [group1 group2])
(glass-delta group1 group2)
Glass’s delta effect size for two groups
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (glass-delta t c))
;;=> -0.3849741916091626
harmean
(harmean vs)
Harmonic mean
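The geometric and harmonic means have short closed forms, sketched below with hypothetical names (geomean-sketch, harmean-sketch). The geometric mean is computed through logarithms, which is why it only makes sense for positive values; the harmonic mean assumes no value is zero. The library may use a different but equivalent computation.

```clojure
(defn geomean-sketch
  "exp of the mean of logarithms; values must be positive."
  [vs]
  (Math/exp (/ (reduce + (map #(Math/log %) vs))
               (count vs))))

(defn harmean-sketch
  "Number of values divided by the sum of reciprocals; values must be non-zero."
  [vs]
  (/ (count vs)
     (reduce + (map #(/ 1.0 %) vs))))

(geomean-sketch [1 2 3 1 1 2 1 11 111])
(harmean-sketch [1 2 3 -1 -1 2 -1 11 111])
```

For these inputs the results are approximately 2.9032 and -15.8801.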
Examples
Harmonic mean
(harmean [1 2 3 -1 -1 2 -1 11 111])
;;=> -15.880057803468203
hedges-g
(hedges-g [group1 group2])
(hedges-g group1 group2)
Hedges’s g effect size for two groups
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (hedges-g t c))
;;=> -0.42208932839491475
hedges-g*
(hedges-g* [group1 group2])
(hedges-g* group1 group2)
Less biased Hedges’s g effect size for two groups, J term correction.
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (hedges-g* t c))
;;=> -0.3989958628030656
hedges-g-corrected
(hedges-g-corrected [group1 group2])
(hedges-g-corrected group1 group2)
Cohen’s d corrected for small group size
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [10 20 30 40 40 50]]
  (hedges-g-corrected t c))
;;=> -0.3990662741188285
histogram
(histogram vs)
(histogram vs bins-or-estimate-method)
(histogram vs bins-or-estimate-method [mn mx])
Calculate histogram.
Returns map with keys:
- :size - number of bins
- :step - distance between bins
- :bins - list of pairs of range lower value and number of hits
- :min - min value
- :max - max value
- :samples - number of used samples
For estimation methods check estimate-bins.
If difference between min and max values is 0, number of bins is set to 1.
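The shape of the returned map can be illustrated with a simple equal-width binning sketch. histogram-sketch is a hypothetical name; it takes an explicit bin count only (no estimation methods, no custom range) and places the maximum value into the last bin, which is an assumption about edge handling.

```clojure
(defn histogram-sketch
  "Equal-width histogram of vs with the given number of bins."
  [vs bins]
  (let [mn    (double (apply min vs))
        mx    (double (apply max vs))
        diff  (- mx mn)
        size  (if (zero? diff) 1 bins)   ; a single bin when all values are equal
        step  (/ diff size)
        index (fn [v]
                (if (zero? diff)
                  0
                  ;; clamp so that the maximum lands in the last bin
                  (min (dec size) (long (Math/floor (/ (- v mn) step))))))
        hits  (frequencies (map index vs))]
    {:size    size
     :step    step
     :bins    (for [i (range size)] [(+ mn (* i step)) (get hits i 0)])
     :min     mn
     :max     mx
     :samples (count vs)}))

(:bins (histogram-sketch (range 10) 3))
;;=> ([0.0 3] [3.0 3] [6.0 4])
```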
Examples
3 bins from uniform distribution.
(histogram (repeatedly 1000 rand) 3)
;;=> {:bins ([0.0011357582159668977 340]
;;=>         [0.3340889989822621 341]
;;=>         [0.6670422397485573 319]),
;;=>  :max 0.9999954805148524,
;;=>  :min 0.0011357582159668977,
;;=>  :samples 1000,
;;=>  :size 3,
;;=>  :step 0.3329532407662952}
3 bins from uniform distribution for given range.
(histogram (repeatedly 10000 rand) 3 [0.1 0.5])
;;=> {:bins
;;=>  ([0.1 1336] [0.23333333333333334 1370] [0.3666666666666667 1333]),
;;=>  :max 0.5,
;;=>  :min 0.1,
;;=>  :samples 4039,
;;=>  :size 3,
;;=>  :step 0.13333333333333333}
5 bins from normal distribution.
(histogram (repeatedly 10000 r/grand) 5)
;;=> {:bins ([-3.4805958085123243 188]
;;=>         [-2.0755018370184155 2289]
;;=>         [-0.6704078655245067 5299]
;;=>         [0.7346861059694016 2069]
;;=>         [2.139780077463311 155]),
;;=>  :max 3.544874048957219,
;;=>  :min -3.4805958085123243,
;;=>  :samples 10000,
;;=>  :size 5,
;;=>  :step 1.4050939714939088}
Estimate number of bins
(:size (histogram (repeatedly 10000 r/grand)))
;;=> 64
Estimate number of bins, Rice rule
(:size (histogram (repeatedly 10000 r/grand) :rice))
;;=> 44
hpdi-extent
(hpdi-extent vs)
(hpdi-extent vs size)
Highest Posterior Density interval + median.
The size parameter is the target probability content of the interval.
inner-fence-extent
(inner-fence-extent vs)
(inner-fence-extent vs estimation-strategy)
Returns LIF, UIF and median
iqr
(iqr vs)
(iqr vs estimation-strategy)
Interquartile range.
Examples
IQR
(iqr (repeatedly 100000 r/grand))
;;=> 1.3535813918712551
jensen-shannon-divergence
Deprecated. Use dissimilarity.
(jensen-shannon-divergence [vs1 vs2])
(jensen-shannon-divergence vs1 vs2)
Jensen-Shannon divergence of two sequences.
Examples
Jensen-Shannon divergence
(jensen-shannon-divergence (repeatedly 100 #(r/irand 100))
                           (repeatedly 100 #(r/irand 100)))
;;=> 439.86198157468397
kendall-correlation
(kendall-correlation [vs1 vs2])
(kendall-correlation vs1 vs2)
Kendall’s correlation of two sequences.
Examples
Kendall’s correlation of uniform and gaussian distribution samples.
(kendall-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                     (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -6.012940129401294E-4
kruskal-test
(kruskal-test xss)
(kruskal-test xss {:keys [sides], :or {sides :right}})
Kruskal-Wallis rank sum test.
ks-test-one-sample
(ks-test-one-sample xs)
(ks-test-one-sample xs distribution-or-ys)
(ks-test-one-sample xs distribution-or-ys {:keys [sides kernel bandwidth distinct?], :or {sides :two-sided, kernel :gaussian, distinct? true}})
One sample Kolmogorov-Smirnov test
ks-test-two-samples
(ks-test-two-samples xs ys)
(ks-test-two-samples xs ys {:keys [sides distinct?], :or {sides :two-sided, distinct? true}})
Two samples Kolmogorov-Smirnov test
kullback-leibler-divergence
Deprecated. Use dissimilarity.
(kullback-leibler-divergence [vs1 vs2])
(kullback-leibler-divergence vs1 vs2)
Kullback-Leibler divergence of two sequences.
Examples
Kullback-Leibler divergence.
(kullback-leibler-divergence (repeatedly 100 #(r/irand 100))
                             (repeatedly 100 #(r/irand 100)))
;;=> 1511.2029846076014
kurtosis
(kurtosis vs)
(kurtosis vs typ)
Calculate kurtosis from sequence.
Possible values of typ: :G2 (default), :g2 (or :excess), :geary or :kurt.
Examples
Kurtosis
(kurtosis [1 2 3 -1 -1 2 -1 11 111])
;;=> 8.732515263272099
(kurtosis [1 2 3 -1 -1 2 -1 11 111] :G2)
;;=> 8.732515263272099
(kurtosis [1 2 3 -1 -1 2 -1 11 111] :g2)
;;=> 3.9845705132178515
(kurtosis [1 2 3 -1 -1 2 -1 11 111] :excess)
;;=> 3.9845705132178515
(kurtosis [1 2 3 -1 -1 2 -1 11 111] :kurt)
;;=> 6.984570513217852
levene-test
(levene-test xss)
(levene-test xss {:keys [sides statistic scorediff], :or {sides :one-sided-greater, statistic mean, scorediff abs}})
mad-extent
(mad-extent vs)
-/+ median-absolute-deviation and median
Examples
median absolute deviation from median for gaussian distribution
(mad-extent (repeatedly 100000 r/grand))
;;=> [-0.6730878283010525 0.6797538719277473 0.003333021813347429]
maximum
(maximum vs)
Maximum value from sequence.
Examples
Maximum value
(maximum [1 2 3 -1 -1 2 -1 11 111])
;;=> 111.0
mcc
(mcc group1 group2)
(mcc ct)
Matthews correlation coefficient, also known as phi coefficient.
mean
(mean vs)
Calculate mean of vs
Examples
Mean (average value)
(mean [1 2 3 -1 -1 2 -1 11 111])
;;=> 14.111111111111109
mean-absolute-deviation
(mean-absolute-deviation vs)
(mean-absolute-deviation vs center)
Calculate mean absolute deviation
means-ratio
(means-ratio [group1 group2])
(means-ratio group1 group2)
(means-ratio group1 group2 adjusted?)
Means ratio
means-ratio-corrected
(means-ratio-corrected [group1 group2])
(means-ratio-corrected group1 group2)
Bias-corrected means ratio
median
(median vs estimation-strategy)
(median vs)
Calculate median of vs. See median-3.
Examples
Median (percentile 50%).
(median [1 2 3 -1 -1 2 -1 11 111])
;;=> 2.0
For three elements use faster median-3.
(median [7 1 4])
;;=> 4.0
median-3
(median-3 a b c)
Median of three values. See median.
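One way to see why a dedicated three-value median can be faster than the general case is that it needs no sorting. The sketch below uses only `min` and `max`; `median-of-three` is a hypothetical name and the library may compute the value differently.

```clojure
(defn median-of-three
  "Hypothetical helper: median of three numbers using only min and max."
  [a b c]
  ;; the median is the larger of (min a b) and the smaller of (max a b) and c
  (max (min a b) (min (max a b) c)))

(median-of-three 7 1 4)
```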
Examples
Median of 7 1 4
(median-3 7 1 4)
;;=> 4.0
median-absolute-deviation
(median-absolute-deviation vs)
(median-absolute-deviation vs center)
(median-absolute-deviation vs center estimation-strategy)
Calculate MAD
Examples
MAD
(median-absolute-deviation [1 2 3 -1 -1 2 -1 11 111])
;;=> 3.0
minimum
(minimum vs)
Minimum value from sequence.
Examples
Minimum value
(minimum [1 2 3 -1 -1 2 -1 11 111])
;;=> -1.0
minimum-discrimination-information-test
(minimum-discrimination-information-test contingency-table-or-xs)
(minimum-discrimination-information-test contingency-table-or-xs params)
Minimum discrimination information test, a power divergence test for lambda -1.0
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda- test type:
  - 1.0- chisq-test
  - 0.0- multinomial-likelihood-ratio-test
  - -1.0- minimum-discrimination-information-test
  - -2.0- neyman-modified-chisq-test
  - -0.5- freeman-tukey-test
  - 2/3- cressie-read-test - default
- :p- probabilities, weights or distribution object.
- :alpha- significance level (default: 0.05)
- :ci-sides- confidence interval sides (default:- :two-sided)
- :sides- p-value sides (- :two-sided,- :one-sided-greater- default,- :one-sided-less)
- :bootstrap-samples- number of samples to estimate confidence intervals (default: 1000)
- :ddof- delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins- number of bins or estimator name for histogram
mode
(mode vs method)
(mode vs method opts)
(mode vs)
Find the value that appears most often in a dataset vs.
For a sample from a continuous distribution, three algorithms are available in addition to the default discrete mode:
- :histogram - calculated from histogram
- :kde - calculated from KDE
- :pearson - mode = mean-3(median-mean)
- :default - discrete mode
Histogram accepts optional :bins (see histogram). KDE method accepts :kde for kernel name (default :gaussian) and :bandwidth (auto). Pearson can accept :estimation-strategy for median.
See also modes.
Examples
Example
(mode [1 2 3 -1 -1 2 -1 11 111])
;;=> -1.0
Returns lowest value when every element appears equally.
(mode [5 1 2 3 4])
;;=> 1.0
modes
(modes vs method)
(modes vs method opts)
(modes vs)
Find the values that appear most often in a dataset vs.
Returns a sequence of all the most frequent values in increasing order.
See also mode.
Examples
Example
(modes [1 2 3 -1 -1 2 -1 11 111])
;;=> (-1.0)
Returns all tied values, in increasing order, when several values appear equally often.
(modes [5 5 1 1 2 3 4 4])
;;=> (1.0 4.0 5.0)
moment
(moment vs)
(moment vs order)
(moment vs order {:keys [absolute? center mean? normalize?], :or {mean? true}})
Calculate moment (central and/or absolute) of given order (default: 2).
Additional parameters as a map:
- :absolute?- calculate sum as absolute values (default:- false)
- :mean?- returns mean (proper moment) or just sum of differences (default:- true)
- :center- value of center (default:- nil= mean)
- :normalize?- apply normalization by standard deviation to the order power
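How the options combine can be sketched in plain Clojure. `simple-moment` is a hypothetical helper that covers :absolute?, :center and :mean? only (no :normalize?); it is meant as an illustration of the definitions above, not as the library's code.

```clojure
(defn simple-moment
  "Hypothetical helper: moment of the given order around center (default: mean)."
  [xs order {:keys [absolute? center mean?] :or {mean? true}}]
  (let [n (count xs)
        ;; nil center means: use the arithmetic mean
        c (double (or center (/ (reduce + xs) n)))
        term (fn [x]
               (let [d (double (- x c))]
                 (Math/pow (if absolute? (Math/abs d) d) order)))
        s (reduce + (map term xs))]
    ;; :mean? true divides the sum by the sample size
    (if mean? (/ s n) s)))

(simple-moment [3 7 5 9 -8] 2.0 {})              ; second central moment
(simple-moment [3 7 5 9 -8] 3.0 {:center 0.0})   ; third raw moment
```

Note that a non-integer order combined with negative deviations yields NaN in this sketch unless :absolute? is set.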
Examples
Usage
(moment [3 7 5 9 -8])
;;=> 35.36
(moment [3 7 5 9 -8] 1.0)
;;=> 0.0
(moment [3 7 5 9 -8] 4.0)
;;=> 3417.171199999999
(moment [3 7 5 9 -8] 3.0)
;;=> -229.82399999999993
(moment [3 7 5 9 -8] 3.0 {:center 0.0})
;;=> 142.4
(moment [3 7 5 9 -8] 3.0 {:mean? false})
;;=> -1149.1199999999997
(moment [3 7 5 9 -8] 3.0 {:absolute? true})
;;=> 332.15039999999993
(moment [3 7 5 9 -8] 3.0 {:center -3.0})
;;=> 666.2
(moment [3 7 5 9 -8] 0.5 {:absolute? true})
;;=> 1.8986344545712772
multinomial-likelihood-ratio-test
(multinomial-likelihood-ratio-test contingency-table-or-xs)
(multinomial-likelihood-ratio-test contingency-table-or-xs params)
Multinomial likelihood ratio test, a power divergence test for lambda 0.0
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda- test type:
  - 1.0- chisq-test
  - 0.0- multinomial-likelihood-ratio-test
  - -1.0- minimum-discrimination-information-test
  - -2.0- neyman-modified-chisq-test
  - -0.5- freeman-tukey-test
  - 2/3- cressie-read-test - default
- :p- probabilities, weights or distribution object.
- :alpha- significance level (default: 0.05)
- :ci-sides- confidence interval sides (default:- :two-sided)
- :sides- p-value sides (- :two-sided,- :one-sided-greater- default,- :one-sided-less)
- :bootstrap-samples- number of samples to estimate confidence intervals (default: 1000)
- :ddof- delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins- number of bins or estimator name for histogram
neyman-modified-chisq-test
(neyman-modified-chisq-test contingency-table-or-xs)
(neyman-modified-chisq-test contingency-table-or-xs params)
Neyman modified chi square test, a power divergence test for lambda -2.0
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda- test type:
  - 1.0- chisq-test
  - 0.0- multinomial-likelihood-ratio-test
  - -1.0- minimum-discrimination-information-test
  - -2.0- neyman-modified-chisq-test
  - -0.5- freeman-tukey-test
  - 2/3- cressie-read-test - default
- :p- probabilities, weights or distribution object.
- :alpha- significance level (default: 0.05)
- :ci-sides- confidence interval sides (default:- :two-sided)
- :sides- p-value sides (- :two-sided,- :one-sided-greater- default,- :one-sided-less)
- :bootstrap-samples- number of samples to estimate confidence intervals (default: 1000)
- :ddof- delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins- number of bins or estimator name for histogram
omega-sq
(omega-sq [group1 group2])
(omega-sq group1 group2)
Adjusted R2
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  (omega-sq t c))
;;=> -0.04795737122557726
one-way-anova-test
(one-way-anova-test xss)
(one-way-anova-test xss {:keys [sides], :or {sides :one-sided-greater}})
outer-fence-extent
(outer-fence-extent vs)
(outer-fence-extent vs estimation-strategy)
Returns LOF, UOF and median
outliers
(outliers vs)
(outliers vs estimation-strategy)
(outliers vs q1 q3)
Find outliers defined as values outside inner fences.
Let Q1 be the 25th percentile and Q3 the 75th percentile. IQR is (- Q3 Q1).
- LIF (Lower Inner Fence) equals (- Q1 (* 1.5 IQR)).
- UIF (Upper Inner Fence) equals (+ Q3 (* 1.5 IQR)).
Returns sequence.
Optional estimation-strategy argument can be set to change quantile calculations estimation type. See estimation-strategies.
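The fence rule can be written out directly for the arity that takes precomputed quartiles. `fence-outliers` is a hypothetical helper in plain Clojure; it returns the input values unchanged, whereas the library returns doubles.

```clojure
(defn fence-outliers
  "Hypothetical helper: values outside the inner fences, given Q1 and Q3."
  [xs q1 q3]
  (let [iqr (- q3 q1)
        lif (- q1 (* 1.5 iqr))   ; lower inner fence
        uif (+ q3 (* 1.5 iqr))]  ; upper inner fence
    (filter (fn [x] (or (< x lif) (> x uif))) xs)))

;; Q1 = -1.0 and Q3 = 7.0 are the quartiles reported by stats-map for this data,
;; which gives fences at -13.0 and 19.0
(fence-outliers [1 2 3 -1 -1 2 -1 11 111] -1.0 7.0)
```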
Examples
Outliers
(outliers [1 2 3 -1 -1 2 -1 11 111])
;;=> (111.0)
Gaussian distribution outliers
(count (outliers (repeatedly 3000000 r/grand)))
;;=> 20801
p-overlap
(p-overlap [group1 group2])
(p-overlap group1 group2)
(p-overlap group1 group2 {:keys [kde bandwidth min-iterations steps], :or {kde :gaussian, min-iterations 3, steps 500}})
Overlapping index, kernel density approximation
p-value
(p-value stat)
(p-value distribution stat)
(p-value distribution stat sides)
Calculate p-value for given distribution (default: N(0,1)), stat and sides (one of :two-sided, :one-sided-greater or :one-sided-less/:one-sided).
pacf
(pacf data)
(pacf data lags)
Examples
Usage
(pacf (repeatedly 1000 r/grand) 10)
;;=> (0.0
;;=>  -0.007493264913704033
;;=>  0.03950940288978749
;;=>  -0.027119682527161276
;;=>  -8.127647576342593E-4
;;=>  0.023184353076001057
;;=>  -0.07492493089581809
;;=>  -0.00415801721495847
;;=>  -0.029857518595778
;;=>  -0.025366182150073028
;;=>  0.02293649431732736)
(pacf [1 2 3 4 5 4 3 2 1])
;;=> (0.0
;;=>  0.5396825396825397
;;=>  -0.4299857803057234
;;=>  -0.388084834596935
;;=>  -0.2792571208141194
;;=>  0.17585056996358742
;;=>  -0.2652225487589841
;;=>  -0.17978918763554708
;;=>  -0.10771973872263883)
pacf-ci
(pacf-ci data)
(pacf-ci data lags)
(pacf-ci data lags alpha)
pacf with added confidence interval data.
Examples
Usage
(pacf-ci (repeatedly 1000 r/grand) 3)
;;=> {:ci 0.06197950323045615,
;;=>  :pacf
;;=>  (0.0 -0.04930794881410651 -0.005276481462760803 -0.03643226574541986)}
(pacf-ci [1 2 3 4 5 4 3 2 1] 3)
;;=> {:ci 0.653321328180018,
;;=>  :pacf (0.0 0.5396825396825397 -0.4299857803057234 -0.388084834596935)}
pearson-correlation
(pearson-correlation [vs1 vs2])
(pearson-correlation vs1 vs2)
Pearson’s correlation of two sequences.
Examples
Pearson’s correlation of uniform and gaussian distribution samples.
(pearson-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                     (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -0.002465297102056219
pearson-r
(pearson-r [group1 group2])
(pearson-r group1 group2)
Pearson r correlation coefficient
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  (pearson-r t c))
;;=> 0.2519763153394848
percentile
(percentile vs p)
(percentile vs p estimation-strategy)
Examples
Percentile 25%
(percentile [1 2 3 -1 -1 2 -1 11 111] 25.0)
;;=> -1.0
Percentile 50% (median)
(percentile [1 2 3 -1 -1 2 -1 11 111] 50.0)
;;=> 2.0
Percentile 75%
(percentile [1 2 3 -1 -1 2 -1 11 111] 75.0)
;;=> 7.0
Percentile 90%
(percentile [1 2 3 -1 -1 2 -1 11 111] 90.0)
;;=> 111.0
Various estimation strategies.
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :legacy)
;;=> 61.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r1)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r2)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r3)
;;=> 11.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r4)
;;=> 8.199999999999996
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r5)
;;=> 25.999999999999858
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r6)
;;=> 61.0
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r7)
;;=> 9.399999999999999
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r8)
;;=> 37.66666666666675
(percentile [1 2 3 -1 -1 2 -1 11 111] 85.0 :r9)
;;=> 34.75000000000007
percentile-bc-extent
(percentile-bc-extent vs)
(percentile-bc-extent vs p)
(percentile-bc-extent vs p1 p2)
(percentile-bc-extent vs p1 p2 estimation-strategy)
Return bias corrected percentile range and mean for bootstrap samples. See https://projecteuclid.org/euclid.ss/1032280214
p - calculates extent of bias corrected p and 100-p (default: p=2.5)
Set estimation-strategy to :r7 to get the same result as in R coxed::bca.
Examples
for samples from gaussian distribution
(percentile-bc-extent (repeatedly 100000 r/grand))
;;=> [-1.9686998976773415 1.9488455054357283 -0.001753000836894031]
(percentile-bc-extent (repeatedly 100000 r/grand) 10)
;;=> [-1.2644568505959903 1.2992608668913583 0.006974183450094526]
(percentile-bc-extent (repeatedly 100000 r/grand) 30 70)
;;=> [-0.5390348516759447 0.5110256349062376 7.918286580104579E-4]
percentile-bca-extent
(percentile-bca-extent vs)
(percentile-bca-extent vs p)
(percentile-bca-extent vs p1 p2)
(percentile-bca-extent vs p1 p2 estimation-strategy)
(percentile-bca-extent vs p1 p2 accel estimation-strategy)
Return bias corrected percentile range and mean for bootstrap samples. Also accounts for variance variations through the acceleration parameter. See https://projecteuclid.org/euclid.ss/1032280214
p - calculates extent of bias corrected p and 100-p (default: p=2.5)
Set estimation-strategy to :r7 to get the same result as in R coxed::bca.
Examples
for samples from gaussian distribution
(percentile-bca-extent (repeatedly 100000 r/grand))
;;=> [-1.9611115428631776 1.9395600880977821 -0.0031832554346387315]
(percentile-bca-extent (repeatedly 100000 r/grand) 10)
;;=> [-1.2666928768037038 1.2838206326649084 0.0011417437187627867]
(percentile-bca-extent (repeatedly 100000 r/grand) 30 70)
;;=> [-0.5316035303243614 0.5241767534807082 0.001668106379069707]
percentile-extent
(percentile-extent vs)
(percentile-extent vs p)
(percentile-extent vs p1 p2)
(percentile-extent vs p1 p2 estimation-strategy)
Return percentile range and median.
p - calculates extent of p and 100-p (default: p=25)
Examples
for samples from gaussian distribution
(percentile-extent (repeatedly 100000 r/grand))
;;=> [-0.6811300265122471 0.67725176529625 0.002258912209196587]
(percentile-extent (repeatedly 100000 r/grand) 10)
;;=> [-1.2775454067052694 1.2710171173948166 7.651774916386039E-4]
(percentile-extent (repeatedly 100000 r/grand) 30 70)
;;=> [-0.5247561462693245 0.5243627229342358 -0.001450827530566729]
percentiles
(percentiles vs)
(percentiles vs ps)
(percentiles vs ps estimation-strategy)
Examples
Usage
(percentiles [1 2 3 -1 -1 2 -1 11 111] [25 50 75 90])
;;=> [-1.0 2.0 7.0 111.0]
pi
(pi vs)
(pi vs size)
(pi vs size estimation-strategy)
Returns PI as a map, quantile intervals based on interval size.
Quantiles are (1-size)/2 and 1-(1-size)/2
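As a small worked example of the formula above, the two quantile levels for a given interval size can be computed as follows. `pi-quantile-levels` is a hypothetical helper used only for illustration.

```clojure
(defn pi-quantile-levels
  "Hypothetical helper: lower and upper quantile levels for an interval of the given size."
  [size]
  (let [lo (/ (- 1.0 size) 2.0)]
    [lo (- 1.0 lo)]))

;; an interval of size 0.5 spans the quartiles
(pi-quantile-levels 0.5)
```

For size 0.95 the levels are approximately 0.025 and 0.975, up to floating point rounding.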
pi-extent
(pi-extent vs)
(pi-extent vs size)
(pi-extent vs size estimation-strategy)
Returns PI extent, quantile intervals based on interval size + median.
Quantiles are (1-size)/2 and 1-(1-size)/2
pooled-stddev
(pooled-stddev groups)
(pooled-stddev groups method)
Calculate pooled standard deviation for samples and method
pooled-variance
(pooled-variance groups)
(pooled-variance groups method)
Calculate pooled variance for samples and method.
Methods:
- :unbiased - sqrt of weighted average of variances (default)
- :biased - biased version of :unbiased
- :avg - sqrt of average of variances
population-stddev
(population-stddev vs)
(population-stddev vs u)
Calculate population standard deviation of vs.
See stddev.
Examples
Population standard deviation.
(population-stddev [1 2 3 -1 -1 2 -1 11 111])
;;=> 34.4333315406403
population-variance
(population-variance vs)
(population-variance vs u)
Calculate population variance of vs.
See variance.
Examples
Population variance
(population-variance [1 2 3 -1 -1 2 -1 11 111])
;;=> 1185.6543209876543
power-divergence-test
(power-divergence-test contingency-table-or-xs)
(power-divergence-test contingency-table-or-xs {:keys [lambda ci-sides sides p alpha bootstrap-samples ddof bins], :or {lambda m/TWO_THIRD, sides :one-sided-greater, ci-sides :two-sided, alpha 0.05, bootstrap-samples 1000, ddof 0}})
Power divergence test.
First argument should be one of:
- contingency table
- sequence of counts (for goodness of fit)
- sequence of data (for goodness of fit against distribution)
For goodness of fit there are two options:
- comparison of observed counts vs expected probabilities or weights (:p)
- comparison of data against a given distribution (:p); in this case a histogram is created from the data and compared to the distribution PDF over the bin ranges. Use the :bins option to control histogram creation.
Options are:
- :lambda- test type:
  - 1.0- chisq-test
  - 0.0- multinomial-likelihood-ratio-test
  - -1.0- minimum-discrimination-information-test
  - -2.0- neyman-modified-chisq-test
  - -0.5- freeman-tukey-test
  - 2/3- cressie-read-test - default
- :p- probabilities, weights or distribution object.
- :alpha- significance level (default: 0.05)
- :ci-sides- confidence interval sides (default:- :two-sided)
- :sides- p-value sides (- :two-sided,- :one-sided-greater- default,- :one-sided-less)
- :bootstrap-samples- number of samples to estimate confidence intervals (default: 1000)
- :ddof- delta degrees of freedom, adjustment for dof (default: 0.0)
- :bins- number of bins or estimator name for histogram
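The role of :lambda can be illustrated with the textbook Cressie-Read statistic, 2/(λ(λ+1)) · Σ O·((O/E)^λ − 1), with the limits λ → 0 and λ → −1 handled separately. The sketch below is plain Clojure; `power-divergence-statistic` is a hypothetical helper, it assumes strictly positive observed and expected counts with equal totals, and whether the library computes the statistic in exactly this form is an assumption.

```clojure
(defn power-divergence-statistic
  "Hypothetical helper: Cressie-Read power divergence statistic
  for observed and expected counts (all assumed positive)."
  [observed expected lambda]
  (let [lambda (double lambda)
        sum-by (fn [f] (reduce + (map f observed expected)))]
    (cond
      ;; limit lambda -> 0: likelihood ratio (G) statistic
      (zero? lambda)
      (* 2.0 (sum-by (fn [o e] (* o (Math/log (/ (double o) e))))))

      ;; limit lambda -> -1: minimum discrimination information statistic
      (== lambda -1.0)
      (* 2.0 (sum-by (fn [o e] (* e (Math/log (/ (double e) o))))))

      ;; general case; lambda = 1 reduces to Pearson's chi-square
      :else
      (* (/ 2.0 (* lambda (inc lambda)))
         (sum-by (fn [o e] (* o (dec (Math/pow (/ (double o) e) lambda)))))))))

;; lambda = 1.0: same value as sum of (O - E)^2 / E
(power-divergence-statistic [10 20 30] [20 20 20] 1.0)
```

The sketch returns only the statistic; p-values, confidence intervals and degrees of freedom are outside its scope.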
powmean
(powmean vs power)
Generalized power mean
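The generalized power mean of order p is the p-th root of the mean of x^p, with the geometric mean as the limit for p = 0. A minimal sketch in plain Clojure, assuming positive inputs; `power-mean` is a hypothetical helper, not the library function.

```clojure
(defn power-mean
  "Hypothetical helper: generalized power mean of positive numbers xs with exponent p."
  [xs p]
  (let [n (count xs)]
    (if (zero? p)
      ;; p = 0 is defined as the limit: the geometric mean
      (Math/exp (/ (reduce + (map #(Math/log (double %)) xs)) n))
      (Math/pow (/ (reduce + (map #(Math/pow (double %) p) xs)) n)
                (/ 1.0 p)))))

(power-mean [1 2 3 1 1 2 1 11 111] 1.0)  ; arithmetic mean
(power-mean [1 2 3 1 1 2 1 11 111] 2.0)  ; quadratic mean
```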
Examples
Power mean
(powmean [1 2 3 1 1 2 1 11 111] 0.0)
;;=> 2.903203203730772
(powmean [1 2 3 1 1 2 1 11 111] 0.1)
;;=> 3.2703950036489737
(powmean [1 2 3 1 1 2 1 11 111] 0.5)
;;=> 6.201625343593919
(powmean [1 2 3 1 1 2 1 11 111] 1.0)
;;=> 14.777777777777782
(powmean [1 2 3 1 1 2 1 11 111] 2.0)
;;=> 37.21260240533814
(powmean [1 2 3 1 1 2 1 11 111] 3.0)
;;=> 53.381150691705734
(powmean [1 2 3 1 1 2 1 11 111] 5.5)
;;=> 74.44312203513597
psnr
(psnr [vs1 vs2-or-val])
(psnr vs1 vs2-or-val)
(psnr vs1 vs2-or-val max-value)
Peak signal to noise, max-value is maximum possible value (default: max from vs1 and vs2)
quantile
(quantile vs q)
(quantile vs q estimation-strategy)
Calculate quantile of a vs.
Quantile q is from range 0.0-1.0.
See docs for interpolation strategy.
Optionally you can provide estimation-strategy to change interpolation methods for selecting values. Default is :legacy. See more here
See also percentile.
Examples
Quantile 0.25
(quantile [1 2 3 -1 -1 2 -1 11 111] 0.25)
;;=> -1.0
Quantile 0.5 (median)
(quantile [1 2 3 -1 -1 2 -1 11 111] 0.5)
;;=> 2.0
Quantile 0.75
(quantile [1 2 3 -1 -1 2 -1 11 111] 0.75)
;;=> 7.0
Quantile 0.9
(quantile [1 2 3 -1 -1 2 -1 11 111] 0.9)
;;=> 111.0
Various estimation strategies.
(quantile [1 11 111 1111] 0.7 :legacy)
;;=> 611.0
(quantile [1 11 111 1111] 0.7 :r1)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r2)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r3)
;;=> 111.0
(quantile [1 11 111 1111] 0.7 :r4)
;;=> 90.99999999999999
(quantile [1 11 111 1111] 0.7 :r5)
;;=> 410.99999999999983
(quantile [1 11 111 1111] 0.7 :r6)
;;=> 611.0
(quantile [1 11 111 1111] 0.7 :r7)
;;=> 210.99999999999966
(quantile [1 11 111 1111] 0.7 :r8)
;;=> 477.66666666666623
(quantile [1 11 111 1111] 0.7 :r9)
;;=> 460.99999999999966
quantile-extent
(quantile-extent vs)
(quantile-extent vs q)
(quantile-extent vs q1 q2)
(quantile-extent vs q1 q2 estimation-strategy)
Return quantile range and median.
q - calculates extent of q and 1.0-q (default: q=0.25)
quantiles
(quantiles vs)
(quantiles vs qs)
(quantiles vs qs estimation-strategy)
Calculate quantiles of a vs.
qs is a sequence of quantile values from the range 0.0-1.0.
See docs for interpolation strategy.
Optionally you can provide estimation-strategy to change interpolation methods for selecting values. Default is :legacy. See more here
See also percentiles.
Examples
Usage
(quantiles [1 2 3 -1 -1 2 -1 11 111] [0.25 0.5 0.75 0.9])
;;=> [-1.0 2.0 7.0 111.0]
r2-determination
(r2-determination [group1 group2])
(r2-determination group1 group2)
Coefficient of determination
Examples
Usage
(let [t [10 10 20 20 20 30 30 30 40 50]
      c [-50 20 30 40 40 50 10 20 30 10]]
  (r2-determination t c))
;;=> 0.06349206349206347
robust-standardize
(robust-standardize vs)
(robust-standardize vs q)
Normalize samples to have median = 0 and MAD = 1.
If q argument is used, scaling is done by quantile difference (Q_q, Q_(1-q)). Set 0.25 for IQR.
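The median/MAD variant can be sketched in plain Clojure as (x − median) / MAD. Both helpers below are hypothetical; `simple-median` uses the plain sort-based definition rather than the library's estimation strategies, and the sketch assumes a non-zero MAD.

```clojure
(defn simple-median
  "Hypothetical helper: sort-based median (mean of the two middle values for even sizes)."
  [xs]
  (let [s (vec (sort xs))
        n (count s)
        h (quot n 2)]
    (if (odd? n)
      (double (s h))
      (/ (+ (s (dec h)) (s h)) 2.0))))

(defn robust-standardize-sketch
  "Hypothetical helper: shift by the median and scale by the median absolute deviation."
  [xs]
  (let [m (simple-median xs)
        mad (simple-median (map #(Math/abs (double (- % m))) xs))]
    (map #(/ (- % m) mad) xs)))

;; for this data the median is 2.0 and the MAD is 3.0
(robust-standardize-sketch [1 2 3 -1 -1 2 -1 11 111])
```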
sem
(sem vs)
Standard error of mean
Examples
SEM
(sem [1 2 3 -1 -1 2 -1 11 111])
;;=> 12.174021115615695
sem-extent
(sem-extent vs)
-/+ sem and mean
Examples
standard error of mean and mean for gaussian distribution
(sem-extent (repeatedly 100000 r/grand))
;;=> [6.802850148960229E-4 0.007013149643892707 0.003846717329394365]
similarity
(similarity method P-observed Q-expected)
(similarity method P-observed Q-expected {:keys [bins probabilities? epsilon], :or {probabilities? true, epsilon 1.0E-6}})
Various PDF similarities between two histograms (frequencies) or probabilities.
Q can be a distribution object. Then, histogram will be created out of P.
Arguments:
- method- distance method
- P-observed- frequencies, probabilities or actual data (when Q is a distribution)
- Q-expected- frequencies, probabilities or distribution object (when P is data)
Options:
- :probabilities?- should P/Q be converted to a probabilities, default:- true.
- :epsilon- small number which replaces- 0.0 when division or logarithm is used
- :bins- number of bins or bins estimation method, see histogram.
The list of methods: :intersection, :czekanowski, :motyka, :kulczynski, :ruzicka, :inner-product, :harmonic-mean, :cosine, :jaccard, :dice, :fidelity, :squared-chord
See more: Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions by Sung-Hyuk Cha
skewness
(skewness vs)
(skewness vs typ)
Calculate skewness from sequence.
Possible types: :G1 (default), :g1 (:pearson), :b1, :B1 (:yule), :B3, :skew, :mode or :median.
Examples
Skewness
(skewness [1 2 3 -1 -1 2 -1 11 111])
;;=> 2.94268445417954
(skewness [1 2 3 -1 -1 2 -1 11 111] :G1)
;;=> 2.94268445417954
(skewness [1 2 3 -1 -1 2 -1 11 111] :g1)
;;=> 2.4275908211830184
(skewness [1 2 3 -1 -1 2 -1 11 111] :pearson)
;;=> 2.4275908211830184
(skewness [1 2 3 -1 -1 2 -1 11 111] :b1)
;;=> 2.034448511531534
(skewness [1 2 3 -1 -1 2 -1 11 111] :B1)
;;=> 0.25
(skewness [1 2 3 -1 -1 2 -1 11 111] :yule)
;;=> 0.25
(skewness [1 2 3 -1 -1 2 -1 11 111] :B3)
;;=> 0.8449612403100772
(skewness [1 2 3 -1 -1 2 -1 11 111] :mode)
;;=> 0.4137529407252298
(skewness [1 2 3 -1 -1 2 -1 11 111] :median)
;;=> 0.9948324383613981
(skewness [1 2 3 -1 -1 2 -1 11 111] :skew)
;;=> 0.8091969403943394
spearman-correlation
(spearman-correlation [vs1 vs2])
(spearman-correlation vs1 vs2)
Spearman’s correlation of two sequences.
Examples
Spearman’s correlation of uniform and gaussian distribution samples.
(spearman-correlation (repeatedly 100000 (partial r/grand 1.0 10.0))
                      (repeatedly 100000 (partial r/drand -10.0 -5.0)))
;;=> -0.0020795592460891798
standardize
(standardize vs)
Normalize samples to have mean = 0 and stddev = 1.
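Written out, standardization is (x − mean) / stddev with the sample standard deviation (n − 1 in the denominator). A minimal sketch in plain Clojure; `standardize-sketch` is a hypothetical helper, not the library function.

```clojure
(defn standardize-sketch
  "Hypothetical helper: z-scores using the sample standard deviation."
  [xs]
  (let [n (count xs)
        m (/ (reduce + xs) (double n))
        ;; sample variance: squared deviations divided by n - 1
        sd (Math/sqrt (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
                         (dec n)))]
    (map #(/ (- % m) sd) xs)))

(standardize-sketch [1 2 3 -1 -1 2 -1 11 111])
```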
Examples
Standardize
(standardize [1 2 3 -1 -1 2 -1 11 111])
;;=> (-0.3589915220998317
;;=>  -0.33161081278713267
;;=>  -0.30423010347443363
;;=>  -0.4137529407252298
;;=>  -0.4137529407252298
;;=>  -0.33161081278713267
;;=>  -0.4137529407252298
;;=>  -0.08518442897284138
;;=>  2.652886502297062)
stats-map
(stats-map vs)
(stats-map vs estimation-strategy)
Calculate several statistics of vs and return as map.
Optional estimation-strategy argument can be set to change quantile calculations estimation type. See estimation-strategies.
Examples
Stats
(stats-map [1 2 3 -1 -1 2 -1 11 111])
;;=> {:IQR 8.0,
;;=>  :Kurtosis 8.732515263272099,
;;=>  :LAV -1.0,
;;=>  :LIF -13.0,
;;=>  :LOF -25.0,
;;=>  :MAD 3.0,
;;=>  :Max 111.0,
;;=>  :Mean 14.11111111111111,
;;=>  :Median 2.0,
;;=>  :Min -1.0,
;;=>  :Mode -1.0,
;;=>  :Outliers (111.0),
;;=>  :Q1 -1.0,
;;=>  :Q3 7.0,
;;=>  :Range 112.0,
;;=>  :SD 36.522063346847084,
;;=>  :SEM 12.174021115615695,
;;=>  :Size 9,
;;=>  :Skewness 2.94268445417954,
;;=>  :Total 127.0,
;;=>  :UAV 11.0,
;;=>  :UIF 19.0,
;;=>  :UOF 31.0,
;;=>  :Variance 1333.8611111111113}
stddev
(stddev vs)
(stddev vs u)
Calculate standard deviation of vs.
See population-stddev.
Examples
Standard deviation.
(stddev [1 2 3 -1 -1 2 -1 11 111])
;;=> 36.522063346847084
stddev-extent
(stddev-extent vs)
-/+ stddev and mean
Examples
standard deviation from mean and mean for gaussian distribution
(stddev-extent (repeatedly 100000 r/grand))
;;=> [-0.9996022983939234 0.9951962468655247 -0.0022030257641994337]
t-test-one-sample
(t-test-one-sample xs)
(t-test-one-sample xs m)
One sample Student’s t-test
- alpha- significance level (default:- 0.05)
- sides- one of:- :two-sided,- :one-sided-less (short:- :one-sided) or- :one-sided-greater
- mu- mean (default:- 0.0)
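The statistic behind the one sample test is t = (mean − mu) / (stddev / sqrt(n)) with n − 1 degrees of freedom. The sketch below computes only these quantities in plain Clojure; `t-statistic-one-sample` is a hypothetical helper, and p-values and confidence intervals (which need the t distribution) are left out.

```clojure
(defn t-statistic-one-sample
  "Hypothetical helper: estimate, standard error, degrees of freedom and t statistic."
  [xs mu]
  (let [n (count xs)
        m (/ (reduce + xs) (double n))
        ;; sample variance with n - 1 in the denominator
        variance (/ (reduce + (map #(let [d (- % m)] (* d d)) xs)) (dec n))
        stderr (Math/sqrt (/ variance n))]
    {:estimate m
     :stderr stderr
     :df (dec n)
     :t (/ (- m mu) stderr)}))

(t-statistic-one-sample [1 2 3 4 5 6 7 8 9 10] 0.0)
```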
t-test-two-samples
(t-test-two-samples xs ys)
(t-test-two-samples xs ys {:keys [paired? equal-variances?], :or {paired? false, equal-variances? false}, :as params})
Two samples Student’s t-test
- alpha- significance level (default:- 0.05)
- sides- one of:- :two-sided(default),- :one-sided-less(short:- :one-sided) or- :one-sided-greater
- mu- mean (default:- 0.0)
- paired?- unpaired or paired test, boolean (default:- false)
- equal-variances?- unequal or equal variances, boolean (default:- false)
trim
(trim vs)
(trim vs quantile)
(trim vs quantile estimation-strategy)
(trim vs low high nan)
Return trimmed data. Trimming is done using quantiles; the default quantile is 0.2.
tschuprows-t
(tschuprows-t group1 group2)
(tschuprows-t contingency-table)
Tschuprows T effect size for discrete data
Examples
Usage
(let [a [:a :a :b :b :f :a :a :b :b :c :a :a :b :b :c :a :a :b :b :c]
      b [:b :f :a :a :b :b :y :z :c :b :b :c :a :a :b :b :c :a :a :b]]
  (tschuprows-t a b))
;;=> 0.5288813325243744
ttest-one-sample
Deprecated. Use t-test-one-sample.
Examples
Usage
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10])
;;=> {:alpha 0.05,
;;=>  :confidence-interval [3.3341494103317983 7.665850589668201],
;;=>  :df 9,
;;=>  :estimate 5.5,
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n 10,
;;=>  :p-value 2.7819601104828173E-4,
;;=>  :stat 5.744562646538029,
;;=>  :stderr 0.9574271077563381,
;;=>  :t 5.744562646538029,
;;=>  :test-type :two-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:alpha 0.2})
;;=> {:alpha 0.2,
;;=>  :confidence-interval [4.175850795053416 6.824149204946584],
;;=>  :df 9,
;;=>  :estimate 5.5,
;;=>  :level 0.8,
;;=>  :mu 0.0,
;;=>  :n 10,
;;=>  :p-value 2.7819601104828173E-4,
;;=>  :stat 5.744562646538029,
;;=>  :stderr 0.9574271077563381,
;;=>  :t 5.744562646538029,
;;=>  :test-type :two-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:sides :one-sided})
;;=> {:alpha 0.05,
;;=>  :confidence-interval [##-Inf 7.255072013309326],
;;=>  :df 9,
;;=>  :estimate 5.5,
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n 10,
;;=>  :p-value 0.9998609019944759,
;;=>  :stat 5.744562646538029,
;;=>  :stderr 0.9574271077563381,
;;=>  :t 5.744562646538029,
;;=>  :test-type :one-sided}
(ttest-one-sample [1 2 3 4 5 6 7 8 9 10] {:mu 5.0})
;;=> {:alpha 0.05,
;;=>  :confidence-interval [3.334149410331798 7.665850589668201],
;;=>  :df 9,
;;=>  :estimate 5.5,
;;=>  :level 0.95,
;;=>  :mu 5.0,
;;=>  :n 10,
;;=>  :p-value 0.6141172548083933,
;;=>  :stat 0.5222329678670935,
;;=>  :stderr 0.9574271077563381,
;;=>  :t 0.5222329678670935,
;;=>  :test-type :two-sided}
ttest-two-samples
Deprecated. Use t-test-two-samples.
Examples
Usage
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20])
;;=> {:alpha 0.05,
;;=>  :confidence-interval [-11.052801725158163 -4.9471982748418375],
;;=>  :df 21.982212340188994,
;;=>  :equal-variances? false,
;;=>  :estimate -8.0,
;;=>  :estimated-mu [5.5 13.5],
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n [10 14],
;;=>  :nx 10,
;;=>  :ny 14,
;;=>  :p-value 1.8552818325118146E-5,
;;=>  :paired? false,
;;=>  :sides :two-sided,
;;=>  :stat -5.4349297638940595,
;;=>  :stderr 1.4719601443879746,
;;=>  :t -5.4349297638940595,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20 200])
;;=> {:alpha 0.05,
;;=>  :confidence-interval [-47.242899887102105 6.376233220435439],
;;=>  :df 14.164598953012467,
;;=>  :equal-variances? false,
;;=>  :estimate -20.43333333333333,
;;=>  :estimated-mu [5.5 25.93333333333333],
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n [10 15],
;;=>  :nx 10,
;;=>  :ny 15,
;;=>  :p-value 0.12451349808974498,
;;=>  :paired? false,
;;=>  :sides :two-sided,
;;=>  :stat -1.632902633201205,
;;=>  :stderr 12.51350381698818,
;;=>  :t -1.632902633201205,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [7 8 9 10 11 12 13 14 15 16 17 18 19 20]
                   {:equal-variances? true})
;;=> {:alpha 0.05,
;;=>  :confidence-interval [-11.22324472988163 -4.77675527011837],
;;=>  :df 22.0,
;;=>  :equal-variances? true,
;;=>  :estimate -8.0,
;;=>  :estimated-mu [5.5 13.5],
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n [10 14],
;;=>  :nx 10,
;;=>  :ny 14,
;;=>  :p-value 3.690577215911943E-5,
;;=>  :paired? false,
;;=>  :sides :two-sided,
;;=>  :stat -5.147292847304685,
;;=>  :stderr 1.5542150480497916,
;;=>  :t -5.147292847304685,
;;=>  :test-type :two-sided}
(ttest-two-samples [1 2 3 4 5 6 7 8 9 10]
                   [200 11 200 11 200 11 200 11 200 11]
                   {:paired? true})
;;=> {:alpha 0.05,
;;=>  :confidence-interval [-171.66671936335894 -28.333280636641092],
;;=>  :df 9,
;;=>  :estimate -100.0,
;;=>  :level 0.95,
;;=>  :mu 0.0,
;;=>  :n 10,
;;=>  :p-value 0.011615504295919215,
;;=>  :paired? true,
;;=>  :stat -3.156496045715208,
;;=>  :stderr 31.680698645494967,
;;=>  :t -3.156496045715208,
;;=>  :test-type :two-sided}
variance
(variance vs)
(variance vs u)
Calculate variance of vs.
See population-variance.
Examples
Variance.
(variance [1 2 3 -1 -1 2 -1 11 111])
;;=> 1333.861111111111
weighted-kappa
(weighted-kappa contingency-table)
(weighted-kappa contingency-table weights)
Cohen’s weighted kappa for indexed contingency table
winsor
(winsor vs)
(winsor vs quantile)
(winsor vs quantile estimation-strategy)
(winsor vs low high nan)
Return winsorized data. Winsorizing is done using quantiles; the default quantile is 0.2.
wmedian
(wmedian vs ws)
(wmedian vs ws method)
Weighted median.
Calculation is done using interpolation. There are three methods:
- :linear - linear interpolation, default
- :step - step interpolation
- :average - average of ties
Based on spatstat.geom::weighted.quantile from R.
wquantile
(wquantile vs ws q)
(wquantile vs ws q method)
Weighted quantile.
Calculation is done using interpolation. There are three methods:
- :linear - linear interpolation, default
- :step - step interpolation
- :average - average of ties
Based on spatstat.geom::weighted.quantile from R.
wquantiles
(wquantiles vs ws)
(wquantiles vs ws qs)
(wquantiles vs ws qs method)
Weighted quantiles.
Calculation is done using interpolation. There are three methods:
- :linear - linear interpolation, default
- :step - step interpolation
- :average - average of ties
Based on spatstat.geom::weighted.quantile from R.
z-test-one-sample
(z-test-one-sample xs)
(z-test-one-sample xs m)
One sample z-test
- alpha- significance level (default:- 0.05)
- sides- one of:- :two-sided,- :one-sided-less(short:- :one-sided) or- :one-sided-greater
- mu- mean (default:- 0.0)
z-test-two-samples
(z-test-two-samples xs ys)
(z-test-two-samples xs ys {:keys [paired? equal-variances?], :or {paired? false, equal-variances? false}, :as params})
Two samples z-test
- alpha- significance level (default:- 0.05)
- sides- one of:- :two-sided(default),- :one-sided-less(short:- :one-sided) or- :one-sided-greater
- mu- mean (default:- 0.0)
- paired?- unpaired or paired test, boolean (default:- false)
- equal-variances?- unequal or equal variances, boolean (default:- false)