1 # Created by Octave 3.6.1, Mon Apr 23 21:08:02 2012 UTC <root@brouzouf>
13 # name: <cell-element>
17 BLAND_ALTMAN shows the Bland-Altman plot of two columns of measurements
18 and computes several summary results.
20 bland_altman(m1, m2 [,group])
21 bland_altman(data [, group])
24 m1,m2 are two columns with the same number of elements
25 containing the measurements. m1,m2 can also be combined
26 in a single two column data matrix.
27 group [optional] indicates which measurements belong to the same group
28 This is useful to account for repeated measurements.
32 [1] JM Bland and DG Altman, Measuring agreement in method comparison studies.
33 Statistical Methods in Medical Research, 1999; 8; 135.
34 doi:10.1177/09622802990080204
35 [2] P.S. Myles, Using the Bland–Altman method to measure agreement with repeated measures
36 British Journal of Anaesthesia 99(3):309–11 (2007)
37 doi:10.1093/bja/aem214
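As a rough illustration of the summary results usually read off such a plot, the sketch below computes the bias and the 95% limits of agreement (mean difference plus or minus 1.96 standard deviations) described in [1]. The sample values are made up, and it is an assumption that bland_altman reports these particular quantities; no grouping or repeated-measures correction is applied here.

```octave
% Hypothetical paired measurements from two methods (illustrative values only)
m1 = [10.1; 11.4;  9.8; 12.0; 10.7];
m2 = [10.4; 11.0; 10.1; 12.3; 10.2];

d    = m1 - m2;                         % differences (vertical axis of the plot)
avg  = (m1 + m2) / 2;                   % pairwise means (horizontal axis)
bias = mean(d);                         % mean difference between the methods
loa  = bias + [-1.96, 1.96] * std(d);   % 95% limits of agreement, as in [1]

printf('bias = %.3f, limits of agreement = [%.3f, %.3f]\n', bias, loa(1), loa(2));
```

Plotting `d` against `avg` and drawing horizontal lines at `bias` and `loa` gives the classical Bland-Altman display.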
41 # name: <cell-element>
45 BLAND_ALTMAN shows the Bland-Altman plot of two columns of measurements
50 # name: <cell-element>
57 # name: <cell-element>
61 CAT2BIN converts categorical into binary data
62 each category of each column in C is converted into a logical column
65 [B,BinLabel] = cat2bin(C,Label);
66 [B,BinLabel] = cat2bin(C,Label,MODE)
70 Label description of each column in C
71 BinLabel description of each column in B
72 MODE default [], ignores NaN
73 'notIgnoreNAN' includes binary column for NaN
74 'IgnoreZeros' zeros do not get a separate category
75 'IgnoreZeros+NaN' zeros and NaN are ignored
78 cat2bin([1;2;5;1;5]) results in
87 # name: <cell-element>
91 CAT2BIN converts categorical into binary data
92 each category of each column i
96 # name: <cell-element>
103 # name: <cell-element>
107 CDFPLOT plots empirical cumulative distribution function
111 cdfplot(X, PROPERTY, VALUE,...)
113 [h,stats] = cdfplot(X)
115 X contains the data vector
116 (matrix data is currently changed to a vector, this might change in future)
118 are used for formatting; see HELP PLOT for more details
119 h graphics handle to the cdf curve
121 a struct containing various summary statistics including
122 mean, std, median, min, max.
124 see also: ecdf, median, statistics, hist2res, plot
130 # name: <cell-element>
134 CDFPLOT plots empirical cumulative distribution function
139 # name: <cell-element>
146 # name: <cell-element>
150 CENTER removes the mean
152 [z,mu] = center(x,DIM,W)
153 removes mean x along dimension DIM
159 default or []: first DIMENSION, with more than 1 element
160 W weights to compute weighted mean (default: [], all weights = 1)
161 numel(W) must be equal to size(x,DIM)
164 - can deal with NaN's (missing values)
167 - compatible to Matlab and Octave
169 see also: SUMSKIPNAN, MEAN, STD, DETREND, ZSCORE
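The NaN-aware centering described above can be sketched with core functions only. This is an illustration under the assumption that missing values are simply left out of the mean and stay NaN in the output; it is not the toolbox's own implementation, and weights are not handled.

```octave
x = [1 2; NaN 4; 5 6];              % NaN marks a missing value

valid = ~isnan(x);                  % mask of usable samples
x0 = x;
x0(~valid) = 0;                     % zero-fill so missing values do not affect the sum
mu = sum(x0, 1) ./ sum(valid, 1);   % column means over the valid samples only (DIM = 1)
z  = bsxfun(@minus, x, mu);         % centered data; missing entries remain NaN

disp(mu);
disp(z);
```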
175 # name: <cell-element>
179 CENTER removes the mean
184 # name: <cell-element>
191 # name: <cell-element>
195 CLASSIFY classifies sample data into categories
196 defined by the training data and its group information
198 CLASS = classify(sample, training, group)
199 CLASS = classify(sample, training, group, TYPE)
200 [CLASS,ERR,POSTERIOR,LOGP,COEF] = CLASSIFY(...)
202 CLASS contains the assigned group.
203 ERR is the classification error on the training set weighted by the
204 prior probability of each group.
206 The same classifiers as in TRAIN_SC are supported.
208 ATTENTION: no cross-validation is applied, therefore the
209 classification error is too optimistic (overfitting).
210 Use XVAL instead to obtain cross-validated performance.
212 see also: TRAIN_SC, TEST_SC, XVAL
215 [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
216 John Wiley & Sons, 2001.
220 # name: <cell-element>
224 CLASSIFY classifies sample data into categories
225 defined by the training data
229 # name: <cell-element>
233 coefficient_of_variation
236 # name: <cell-element>
240 COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X)
242 cv=coefficient_of_variation(x [,DIM])
245 see also: SUMSKIPNAN, MEAN, STD
248 http://mathworld.wolfram.com/VariationCoefficient.html
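The ratio STD(X)/MEAN(X) can be written directly with core functions. The data below are made up, and NaN handling and the DIM argument are left out of this sketch; Octave's std normalizes by N-1 by default.

```octave
x  = [2 4 4 4 5 5 7 9];        % illustrative sample with a positive mean
cv = std(x) / mean(x);         % coefficient of variation (std uses N-1 scaling)

printf('cv = %.4f\n', cv);
```

The ratio is only meaningful when the mean is clearly different from zero.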
252 # name: <cell-element>
256 COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X)
258 cv=coefficient_of_variation(
262 # name: <cell-element>
269 # name: <cell-element>
273 COR calculates the correlation matrix
274 X and Y can contain missing values encoded with NaN.
275 NaN's are skipped, NaN do not result in a NaN output.
276 (It is assumed that the occurrence of NaN's is uncorrelated)
277 The output gives NaN only if there are insufficient input data
280 calculates the (auto-)correlation matrix of X
282 calculates the crosscorrelation between X and Y
285 c is the correlation matrix
287 W weights to compute weighted mean (default: [])
288 if W=[], all weights are 1.
289 number of elements in W must match size(x,DIM)
293 # name: <cell-element>
297 COR calculates the correlation matrix
298 X and Y can contain missing values encod
302 # name: <cell-element>
309 # name: <cell-element>
313 CORRCOEF calculates the correlation matrix from pairwise correlations.
314 The input data can contain missing values encoded with NaN.
315 Missing data (NaN's) are handled by pairwise deletion [15].
316 In order to avoid possible pitfalls, use case-wise deletion
317 or check the correlation of NaN's with your data (see below).
318 A significance test for testing the Hypothesis
319 'correlation coefficient R is significantly different to zero'
323 calculates the (auto-)correlation matrix of X
324 [...] = CORRCOEF(X,Y);
325 calculates the crosscorrelation between X and Y
327 [...] = CORRCOEF(..., Mode);
328 Mode='Pearson' or 'parametric' [default]
329 gives the correlation coefficient
330 also known as the 'product-moment coefficient of correlation'
331 or 'Pearson''s correlation' [1]
332 Mode='Spearman' gives 'Spearman''s Rank Correlation Coefficient'
333 This replaces SPEARMAN.M
334 Mode='Rank' gives a nonparametric Rank Correlation Coefficient
335 This is the "Spearman rank correlation with proper handling of ties"
336 This replaces RANKCORR.M
338 [...] = CORRCOEF(..., param1, value1, param2, value2, ... );
340 'Mode' type of correlation
341 'Pearson','parametric'
344 'rows' how to deal with missing values encoded as NaN's.
345 'complete': remove all rows with at least one NaN
346 'pairwise': [default]
347 'alpha' 0.01 : significance level to compute confidence interval
349 [R,p,ci1,ci2,nansig] = CORRCOEF(...);
350 R is the correlation matrix
351 R(i,j) is the correlation coefficient r between X(:,i) and Y(:,j)
352 p gives the significance of R
353 It tests the null hypothesis that the product moment correlation coefficient is zero
354 using Student's t-test on the statistic t = r*sqrt(N-2)/sqrt(1-r^2)
355 where N is the number of samples (Statistics, M. Spiegel, Schaum series).
356 p > alpha: do not reject the Null hypothesis: 'R is zero'.
357 p < alpha: reject the null hypothesis; R differs significantly from zero at level alpha.
358 ci1 lower (1-alpha) confidence interval
359 ci2 upper (1-alpha) confidence interval
360 If no alpha is provided, the default alpha is 0.01. This can be changed with function flag_implicit_significance.
361 nan_sig p-value whether H0: 'NaN''s are not correlated' could be correct
362 if nan_sig < alpha, H1 ('NaNs are correlated') is very likely.
364 The result is only valid if the occurrence of NaN's is uncorrelated. In
365 order to avoid this pitfall, the correlation of NaN's should be checked
366 or case-wise deletion should be applied.
367 Case-wise deletion can be implemented as
368 ix = ~any(isnan([X,Y]),2);
369 [...] = CORRCOEF(X(ix,:),Y(ix,:),...);
371 Correlation (non-random distribution) of NaN's can be checked with
372 [nan_R,nan_sig]=corrcoef(X,isnan(X))
373 or [nan_R,nan_sig]=corrcoef([X,Y],isnan([X,Y]))
374 or [R,p,ci1,ci2] = CORRCOEF(...);
376 Further recommendations related to the correlation coefficient:
377 + LOOK AT THE SCATTERPLOTS to make sure that the relationship is linear
378 + Correlation is not causation because
379 it is not clear which parameter is 'cause' and which is 'effect' and
380 the observed correlation between two variables might be due to the action of other, unobserved variables.
382 see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS,
383 PARTCORRCOEF, flag_implicit_significance
386 on the correlation coefficient
387 [ 1] http://mathworld.wolfram.com/CorrelationCoefficient.html
388 [ 2] http://www.geography.btinternet.co.uk/spearman.htm
389 [ 3] Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 5th ed. New York: Macmillan, pp. 338 and 400, 1995.
390 [ 4] Lehmann, E. L. and D'Abrera, H. J. M. Nonparametrics: Statistical Methods Based on Ranks, rev. ed. Englewood Cliffs, NJ: Prentice-Hall, pp. 292, 300, and 323, 1998.
391 [ 5] Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 634-637, 1992
392 [ 6] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
393 on the significance test of the correlation coefficient
394 [11] http://www.met.rdg.ac.uk/cag/STATS/corr.html
395 [12] http://www.janda.org/c10/Lectures/topic06/L24-significanceR.htm
396 [13] http://faculty.vassar.edu/lowry/ch4apx.html
397 [14] http://davidmlane.com/hyperstat/B134689.html
398 [15] http://www.statsoft.com/textbook/stbasic.html%Correlations
400 [20] http://www.tufts.edu/~gdallal/corr.htm
401 [21] Fisher transformation http://en.wikipedia.org/wiki/Fisher_transformation
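The case-wise deletion and the t statistic t = r*sqrt(N-2)/sqrt(1-r^2) described above can be combined into a short sketch. It uses only core functions; the data are made up, and obtaining the two-sided p-value from the regularized incomplete beta function (betainc) is a choice made for this example, not necessarily how CORRCOEF computes it.

```octave
X = [1; 2; 3; 4; 5; NaN];        % illustrative data with one missing value
Y = [2; 1; 4; 3; 6; 7];

ix = ~any(isnan([X, Y]), 2);     % case-wise deletion: keep complete rows only
x  = X(ix);
y  = Y(ix);
N  = numel(x);

dx = x - mean(x);
dy = y - mean(y);
r  = sum(dx .* dy) / sqrt(sum(dx.^2) * sum(dy.^2));   % Pearson correlation

t  = r * sqrt(N - 2) / sqrt(1 - r^2);                 % test statistic from the text
df = N - 2;                                           % degrees of freedom
p  = betainc(df / (df + t^2), df / 2, 0.5);           % two-sided p-value of Student's t

printf('r = %.4f, t = %.4f, p = %.4f\n', r, t, p);
```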
405 # name: <cell-element>
409 CORRCOEF calculates the correlation matrix from pairwise correlations.
413 # name: <cell-element>
420 # name: <cell-element>
424 COV covariance matrix
425 X and Y can contain missing values encoded with NaN.
426 NaN's are skipped, NaN do not result in a NaN output.
427 The output gives NaN only if there are insufficient input data
428 The mean is removed from the data.
430 Remark: for data containing missing values, the resulting
431 matrix might not be positive definite, and its elements may have magnitudes
432 larger than one. This ill-behavior is more likely for small sample
433 sizes, but there is no guarantee that the result "behaves well" for larger
434 sample sizes. If you want a "well behaved" result (i.e. positive
435 definiteness and magnitude of elements not larger than 1), use CORRCOEF.
436 However, COV is faster than CORRCOEF and might be good enough in some cases.
439 calculates the (auto-)correlation matrix of X
440 C = COV(X,Y [,Mode]);
441 calculates the crosscorrelation between X and Y.
442 C(i,j) is the correlation between the i-th and jth
443 column of X and Y, respectively.
444 NOTE: Octave and Matlab have (in some special cases) incompatible implementations.
445 This implementation follows Octave. If the result could be ambiguous or
446 incompatible, a warning will be presented in Matlab. To avoid this warning use:
447 a) use COV([X(:),Y(:)]) if you want the traditional Matlab result.
448 b) use C = COV([X,Y]), C = C(1:size(X,2),size(X,2)+1:size(C,2)); if you want to be compatible with this software.
450 Mode = 0 [default] scales C by (N-1)
451 Mode = 1 scales C by N.
453 see also: COVM, COR, CORRCOEF, SUMSKIPNAN
456 http://mathworld.wolfram.com/Covariance.html
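Option b) above extracts the cross block from the joint matrix of [X,Y]. The sketch below reproduces that indexing with a hand-computed covariance, so that it does not depend on which COV implementation is on the path. The data are made up and contain no NaN's.

```octave
X = [1 2; 2 1; 3 5; 4 3];              % illustrative data, two columns
Y = [2; 4; 5; 9];                      % one column
N = size(X, 1);

Z  = [X, Y];
Zc = bsxfun(@minus, Z, mean(Z, 1));    % remove the column means
C0 = (Zc' * Zc) / (N - 1);             % joint matrix, Mode = 0 scaling (N-1)
C1 = (Zc' * Zc) / N;                   % joint matrix, Mode = 1 scaling (N)

% cross block between the columns of X and the columns of Y, as in option b)
Cxy = C0(1:size(X,2), size(X,2)+1:size(C0,2));

% the same block computed directly from the centered columns
Cxy_direct = (Zc(:, 1:2)' * Zc(:, 3)) / (N - 1);

disp(Cxy);
```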
460 # name: <cell-element>
464 COV covariance matrix
465 X and Y can contain missing values encoded with NaN.
469 # name: <cell-element>
476 # name: <cell-element>
480 COVM generates covariance matrix
481 X and Y can contain missing values encoded with NaN.
482 NaN's are skipped, NaN do not result in a NaN output.
483 The output gives NaN only if there are insufficient input data
486 calculates the (auto-)correlation matrix of X
488 calculates the crosscorrelation between X and Y
490 weighted crosscorrelation
492 Mode = 'M' minimum or standard mode [default]
493 C = X'*X; or X'*Y correlation matrix
495 Mode = 'E' extended mode
496 C = [1 X]'*[1 X]; % l is a matching column of 1's
497 C is additive, i.e. it can be applied to subsequent blocks and summed up afterwards
498 the mean (or sum) is stored on the 1st row and column of C
500 Mode = 'D' or 'D0' detrended mode
501 the mean of X (and Y) is removed. If combined with extended mode (Mode='DE'),
502 the mean (or sum) is stored in the 1st row and column of C.
503 The default scaling is factor (N-1).
504 Mode = 'D1' is the same as 'D' but uses N for scaling.
507 C is scaled by N in Mode M and by (N-1) in mode D.
509 C is not scaled, provides the scaling factor N
510 C./N gives the scaled version.
512 see also: DECOVM, XCOVF
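The additivity of the extended mode can be checked with a small sketch that builds the unscaled matrix [1 X]'*[1 X] by hand. The helper name `ecm` is hypothetical and the data are made up; missing values and weights are not considered.

```octave
X = [1 2; 3 4; 5 7; 8 6];                 % illustrative data

% hypothetical helper: unscaled extended matrix [1 X]'*[1 X] of a data block
ecm = @(B) [ones(size(B,1),1), B]' * [ones(size(B,1),1), B];

C_all    = ecm(X);                        % whole data set at once
C_blocks = ecm(X(1:2,:)) + ecm(X(3:4,:)); % two blocks, summed afterwards

disp(C_all);
```

Element (1,1) holds the number of samples, and the rest of the first row and column holds the column sums, which is what makes block-wise accumulation possible.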
516 # name: <cell-element>
520 COVM generates covariance matrix
521 X and Y can contain missing values encoded wi
525 # name: <cell-element>
532 # name: <cell-element>
536 CUMSUMSKIPNAN Cumulative sum while skipping NaN's.
537 If DIM is omitted, it defaults to the first non-singleton dimension.
539 Y = cumsumskipnan(x [,DIM])
542 DIM dimension (default: [])
545 see also: CUMSUM, SUMSKIPNAN
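A minimal sketch of a cumulative sum that skips NaN's, using core functions only. It assumes that a skipped NaN contributes zero and that the running total is carried through that position; whether the toolbox reports the position of a NaN differently is not stated above.

```octave
x = [1 NaN 3; 4 5 NaN];     % illustrative data with missing values

x0 = x;
x0(isnan(x0)) = 0;          % a missing value contributes nothing to the sum
y = cumsum(x0, 2);          % cumulative sum along DIM = 2

disp(y);
```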
549 # name: <cell-element>
553 CUMSUMSKIPNAN Cumulative sum while skipping NaN's.
557 # name: <cell-element>
564 # name: <cell-element>
568 decompose extended covariance matrix into mean (mu),
569 standard deviation, the (pure) Covariance (COV),
570 correlation (xc) matrix and the correlation coefficients R2.
571 NaN's are considered as missing values.
572 [mu,sd,COV,xc,N,R2]=decovm(ECM[,NN])
574 ECM is the extended covariance matrix
575 NN is the number of elements, each estimate (in ECM) is based on
577 see also: MDBC, COVM, R2
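The decomposition can be sketched for the simplest case: an unscaled extended matrix built from complete data, so that one sample count applies to all elements. The variable names and the (N-1) scaling of the covariance are assumptions of this example, not a description of decovm's internals.

```octave
X   = [1 2; 3 4; 5 7; 8 6];                   % illustrative data without NaN's
N   = size(X, 1);
ECM = [ones(N,1), X]' * [ones(N,1), X];       % unscaled extended covariance matrix

NN  = ECM(1, 1);                              % number of samples
mu  = ECM(1, 2:end) / NN;                     % means from the first row
C   = (ECM(2:end, 2:end) - NN * (mu' * mu)) / (NN - 1);   % covariance, scaled by N-1
sd  = sqrt(diag(C))';                         % standard deviations
R   = C ./ (sd' * sd);                        % correlation coefficients

disp(mu);
disp(R);
```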
581 # name: <cell-element>
585 decompose extended covariance matrix into mean (mu),
586 standard deviation, the
590 # name: <cell-element>
597 # name: <cell-element>
601 DETREND removes the trend from data, NaN's are considered as missing values
603 DETREND is fully compatible to previous Matlab and Octave DETREND with the following features added:
604 - handles NaN's by assuming that these are missing values
605 - handles unequally spaced data
606 - second output parameter gives the trend of the data
607 - compatible to Matlab and Octave
609 [...]=detrend([t,] X [,p])
610 removes trend for unequally spaced data
611 t represents the time points
612 X(i) is the value at time t(i)
616 [...]=detrend(X,'constant')
620 removes polynomial of order p (default p=1)
622 [...]=detrend(X,1) - default
623 [...]=detrend(X,'linear')
628 X is the detrended data
629 T is the removed trend
631 see also: SUMSKIPNAN, ZSCORE
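Linear detrending of unequally spaced data with a missing value can be sketched with polyfit and polyval. This illustrates the idea only: the fit uses the valid samples, and evaluating the trend at all time points (including the missing one) is an assumption of this example.

```octave
t = [0; 1; 3; 4; 7];            % unequally spaced time points
x = [1; 3; NaN; 9; 15];         % x = 2*t + 1 with one missing value

ok = ~isnan(x);                 % samples usable for the fit
p  = polyfit(t(ok), x(ok), 1);  % first-order polynomial (p = 1)
T  = polyval(p, t);             % trend at all time points
X  = x - T;                     % detrended data; the missing sample stays NaN

disp([X, T]);
```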
635 # name: <cell-element>
639 DETREND removes the trend from data, NaN's are considered as missing values
645 # name: <cell-element>
652 # name: <cell-element>
656 ECDF empirical cumulative function
657 NaN's are considered Missing values and are ignored.
660 calculates empirical cumulative distribution functions (i.e. Kaplan-Meier estimate)
663 without output arguments plots the empirical cdf, in axis gca.
666 must be a vector or matrix, in case Y is a matrix, the ecdf for every column is computed.
668 see also: HISTO2, HISTO3, PERCENTILE, QUANTILE
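For a single vector without censoring, the empirical cumulative distribution function is a step function that rises by 1/n at each sorted sample. The sketch below ignores NaN's as described above; the variable names are illustrative and ties are kept as separate steps.

```octave
y = [3; 1; NaN; 2; 2];      % illustrative data; NaN is ignored

v = sort(y(~isnan(y)));     % valid samples in ascending order
n = numel(v);
F = (1:n)' / n;             % fraction of samples less than or equal to v(k)

disp([v, F]);
```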
672 # name: <cell-element>
676 ECDF empirical cumulative function
677 NaN's are considered Missing values and
681 # name: <cell-element>
688 # name: <cell-element>
692 FLAG_ACCURACY_LEVEL sets and gets accuracy level
693 used in SUMSKIPNAN_MEX and COVM_MEX
694 The error margin of the naive summation is N*eps (N is the number of samples),
695 the error margin is only 2*eps if Kahan's summation is used [1].
697 0: maximum speed [default]
698 accuracy of double (64bit) with naive summation (error = N*2^-52)
699 1: accuracy of extended (80bit) with naive summation (error = N*2^-64)
700 2: accuracy of double (64bit) with Kahan summation (error = 2^-52)
701 3: accuracy of extended (80bit) with Kahan summation (error = 2^-64)
703 Please note, level 3 might be equally accurate but slower than 1 or 2 on
704 some platforms. In order to determine what is good for you, you might want
707 FLAG = flag_accuracy_level()
709 flag_accuracy_level(FLAG)
716 What Every Computer Scientist Should Know About Floating-Point Arithmetic
717 ACM Computing Surveys, Vol 23, No 1, March 1991.
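The difference between naive and Kahan summation mentioned above can be seen in a small interpreted sketch. It only illustrates the algorithm in double precision; it does not call the MEX routines, and the test data are made up.

```octave
x = [1; 1e-16 * ones(10000, 1)];   % one large value followed by many tiny ones

% naive summation: each tiny term is lost when added to 1
naive = 0;
for k = 1:numel(x)
  naive = naive + x(k);
end

% Kahan (compensated) summation: c carries the rounding error of each step
s = 0;
c = 0;
for k = 1:numel(x)
  y = x(k) - c;        % apply the correction from the previous step
  t = s + y;           % low-order bits of y may be lost here
  c = (t - s) - y;     % recover what was lost
  s = t;
end
kahan = s;

exact = 1 + 1e-12;     % reference value of the sum
printf('naive error = %g, Kahan error = %g\n', abs(naive - exact), abs(kahan - exact));
```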
721 # name: <cell-element>
725 FLAG_ACCURACY_LEVEL sets and gets accuracy level
726 used in SUMSKIPNAN_MEX and
730 # name: <cell-element>
734 flag_implicit_significance
737 # name: <cell-element>
741 The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state.
742 flag_implicit_significance might even become obsolete.
744 FLAG_IMPLICIT_SIGNIFICANCE sets and gets default alpha (level) of any significance test
745 The default alpha-level is stored in the global variable FLAG_implicit_significance
746 The idea is that the significance level does not have to be assigned explicitly.
747 This might yield more readable code.
749 Choose alpha low enough, because in alpha*100% of the cases, you will
750 reject the Null hypothesis just by chance. For this reason, the default
753 flag_implicit_significance(0.01)
754 sets the alpha-level for the significance test
756 alpha = flag_implicit_significance()
759 flag_implicit_significance(alpha)
760 sets default alpha-level
762 alpha = flag_implicit_significance(alpha)
766 - compatible to Matlab and Octave
768 see also: CORRCOEF, PARTCORRCOEF
772 # name: <cell-element>
776 The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state.
780 # name: <cell-element>
784 flag_implicit_skip_nan
787 # name: <cell-element>
791 FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs
792 1 skips NaN's (the default mode if no mode is set)
793 0 NaNs are propagated; input NaN's give NaN's at the output
795 FLAG = flag_implicit_skip_nan()
798 flag_implicit_skip_nan(FLAG)
801 prevFLAG = flag_implicit_skip_nan(nextFLAG)
802 gets the previously set FLAG and sets FLAG for the future
803 flag_implicit_skip_nan(prevFLAG)
804 resets FLAG to previous mode
807 SUMSKIPNAN, MEDIAN, QUANTILES, TRIMEAN
808 and affects many other functions like:
809 CENTER, KURTOSIS, MAD, MEAN, MOMENT, RMS, SEM, SKEWNESS,
810 STATISTIC, STD, VAR, ZSCORE etc.
812 The mode is stored in the global variable FLAG_implicit_skip_nan
813 It is recommended to use flag_implicit_skip_nan(1) as default and
814 flag_implicit_skip_nan(0) should be used for exceptional cases only.
815 This feature might disappear without further notice, so you should really not
820 # name: <cell-element>
824 FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs
829 # name: <cell-element>
836 # name: <cell-element>
840 FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm
841 contained any not-a-numbers in the input argument. Because many other
842 functions like mean, std, etc. are also using sumskipnan,
843 these functions can also be checked for NaN's in the input data.
845 A call to FLAG_NANS_OCCURED() also resets the flag indicating whether NaN's occurred.
846 Only sumskipnan or covm can set the flag again.
848 see also: SUMSKIPNAN, COVM
852 # name: <cell-element>
856 FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm
861 # name: <cell-element>
868 # name: <cell-element>
872 FSS - feature subset selection and feature ranking
873 the method is motivated by the max-relevance-min-redundancy (mRMR)
874 approach [1]. However, the default method uses partial correlation,
875 which has been developed from scratch. PCCM [3] describes
876 a similar idea, but is more complicated.
877 An alternative method based on FSDD is implemented, too.
879 [idx,score] = fss(D,cl)
880 [idx,score] = fss(D,cl,MODE)
883 D data - each column represents a feature
885 Mode 'Pearson' [default] correlation
887 'FSDD' feature selection algorithm based on a distance discriminant [2]
888 %%% 'MRMR','MID','MIQ' max-relevance, min redundancy [1] - not supported yet.
890 score score of the feature
891 idx ranking of the feature
892 [tmp,idx]=sort(-score)
894 see also: TRAIN_SC, XVAL, ROW_COL_DELETION
897 [1] Peng, H.C., Long, F., and Ding, C.,
898 Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,
899 IEEE Transactions on Pattern Analysis and Machine Intelligence,
900 Vol. 27, No. 8, pp.1226-1238, 2005.
901 [2] Jianning Liang, Su Yang, Adam Winstanley,
902 Invariant optimal feature selection: A distance discriminant and feature ranking based solution,
903 Pattern Recognition, Volume 41, Issue 5, May 2008, Pages 1429-1439.
904 ISSN 0031-3203, DOI: 10.1016/j.patcog.2007.10.018.
905 [3] K. Raghuraj Rao and S. Lakshminarayanan
906 Partial correlation based variable selection approach for multivariate data classification methods
907 Chemometrics and Intelligent Laboratory Systems
908 Volume 86, Issue 1, 15 March 2007, Pages 68-81
909 http://dx.doi.org/10.1016/j.chemolab.2006.08.007
913 # name: <cell-element>
917 FSS - feature subset selection and feature ranking
918 the method is motivated
922 # name: <cell-element>
929 # name: <cell-element>
933 GEOMEAN calculates the geometric mean of data elements.
935 y = geomean(x [,DIM [,W]]) is the same as
936 y = mean(x,'G' [,DIM])
941 default or []: first DIMENSION, with more than 1 element
942 W weights to compute weighted mean (default: [])
943 if W=[], all weights are 1.
944 number of elements in W must match size(x,DIM)
947 - can deal with NaN's (missing values)
949 - dimension argument also in Octave
950 - compatible to Matlab and Octave
952 see also: SUMSKIPNAN, MEAN, HARMMEAN
954 This program is free software; you can redistribute it and/or modify
955 it under the terms of the GNU General Public License as published by
956 the Free Software Foundation; either version 3 of the License, or
957 (at your option) any later version.
959 This program is distributed in the hope that it will be useful,
960 but WITHOUT ANY WARRANTY; without even the implied warranty of
961 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
962 GNU General Public License for more details.
964 You should have received a copy of the GNU General Public License
965 along with this program; If not, see <http://www.gnu.org/licenses/>.
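The geometric mean of positive data is the exponential of the mean of the logarithms. The sketch below applies that identity after dropping NaN's; it uses core functions, made-up data, and no weights.

```octave
x = [1 2 4 NaN 8];          % illustrative positive data with a missing value

v = x(~isnan(x));           % skip missing values
g = exp(mean(log(v)));      % geometric mean, equal to prod(v)^(1/numel(v))

printf('geometric mean = %.4f\n', g);
```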
969 # name: <cell-element>
973 GEOMEAN calculates the geometric mean of data elements.
977 # name: <cell-element>
984 # name: <cell-element>
988 GSCATTER scatter plot of groups
991 gscatter(x,y,group,clr,sym,siz)
992 gscatter(x,y,group,clr,sym,siz,doleg)
993 gscatter(x,y,group,clr,sym,siz,doleg,xname,yname)
996 x,y, group: vectors with equal length
997 clr: color vector, default 'bgrcmyk'
998 sym: symbol, default '.'
1000 doleg: 'on' (default) shows legend, 'off' turns off legend
1001 xname, yname: name of axis
1004 see also: ecdf, cdfplot
1010 # name: <cell-element>
1014 GSCATTER scatter plot of groups
1019 # name: <cell-element>
1026 # name: <cell-element>
1030 HARMMEAN calculates the harmonic mean of data elements.
1031 The harmonic mean is the inverse of the mean of the inverse elements.
1033 y = harmmean(x [,DIM [,W]]) is the same as
1034 y = mean(x,'H' [,DIM [,W]])
1039 default or []: first DIMENSION, with more than 1 element
1040 W weights to compute weighted mean (default: [])
1041 if W=[], all weights are 1.
1042 number of elements in W must match size(x,DIM)
1045 - can deal with NaN's (missing values)
1047 - dimension argument also in Octave
1048 - compatible to Matlab and Octave
1050 see also: SUMSKIPNAN, MEAN, GEOMEAN
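The definition above, the inverse of the mean of the inverse elements, translates directly into core functions. The data are made up, NaN's are dropped before the computation, and weights are not handled in this sketch.

```octave
x = [1 2 4 NaN];            % illustrative data with a missing value

v = x(~isnan(x));           % skip missing values
h = 1 / mean(1 ./ v);       % inverse of the mean of the inverses

printf('harmonic mean = %.4f\n', h);
```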
1055 # name: <cell-element>
1059 HARMMEAN calculates the harmonic mean of data elements.
1063 # name: <cell-element>
1070 # name: <cell-element>
1074 Evaluates Histogram data
1078 estimates fun-statistic
1081 'std' standard deviation
1083 'sem' standard error of the mean
1084 'rms' root mean square
1085 'meansq' mean of squares
1087 'sumsq' sum of squares
1088 'CM#' central moment of order #
1090 'kurtosis' excess coefficient (Fisher kurtosis)
1092 see also: NaN/statistic
1095 [1] C.L. Nikias and A.P. Petropulu "Higher-Order Spectra Analysis" Prentice Hall, 1993.
1096 [2] C.E. Shannon and W. Weaver "The mathematical theory of communication" University of Illinois Press, Urbana 1949 (reprint 1963).
1097 [3] http://www.itl.nist.gov/
1098 [4] http://mathworld.wolfram.com/
1102 # name: <cell-element>
1106 Evaluates Histogram data
1112 # name: <cell-element>
1119 # name: <cell-element>
1123 IQR calculates the interquartile range
1124 Missing values (encoded as NaN) are ignored.
1128 returns the IQR along dimension DIM of sample array Y.
1131 returns the IQR from the histogram HIS.
1132 HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
1134 see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE
1138 # name: <cell-element>
1142 IQR calculates the interquartile range
1143 Missing values (encoded as NaN) are
1147 # name: <cell-element>
1154 # name: <cell-element>
1158 KAPPA estimates Cohen's kappa coefficient
1159 and related statistics
1161 [...] = kappa(d1,d2);
1162 NaN's are handled as missing values and are ignored
1163 [...] = kappa(d1,d2,'notIgnoreNAN');
1164 NaN's are handled as just another Label.
1165 [kap,sd,H,z,ACC,sACC,MI] = kappa(...);
1171 kap Cohen's kappa coefficient point
1172 se standard error of the kappa estimate
1173 H Concordance matrix, i.e. confusion matrix
1175 ACC overall agreement (accuracy)
1176 sACC specific accuracy
1177 MI Mutual information or transfer information (in [bits])
1178 X is a struct containing all the fields above
1179 For two classes, a number of additional summary statistics including
1180 TPR, FPR, FDR, PPV, NPF, F1, dprime, Matthews Correlation coefficient (MCC) or
1181 Phi coefficient (PHI=MCC), Specificity and Sensitivity
1182 are provided. Note, the positive category must be the larger label (in d and c), otherwise
1183 the confusion matrix becomes transposed and the summary statistics are messed up.
1187 [1] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
1188 [2] J Bortz, GA Lienert (1998) Kurzgefasste Statistik für die klassische Forschung, Springer Berlin - Heidelberg.
1189 Kapitel 6: Uebereinstimmungsmasze fuer subjektive Merkmalsurteile. p. 265-270.
1190 [3] http://www.cmis.csiro.au/Fiona.Evans/personal/msc/html/chapter3.html
1191 [4] Kraemer, H. C. (1982). Kappa coefficient. In S. Kotz and N. L. Johnson (Eds.),
1192 Encyclopedia of Statistical Sciences. New York: John Wiley & Sons.
1193 [5] http://ourworld.compuserve.com/homepages/jsuebersax/kappa.htm
1194 [6] http://en.wikipedia.org/wiki/Receiver_operating_characteristic
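Cohen's kappa [1] compares the observed agreement with the agreement expected by chance. The sketch below computes it from a confusion matrix with made-up counts; the variable names are illustrative, and the standard error and the other outputs listed above are not covered.

```octave
H = [20 5; 10 15];                            % illustrative confusion matrix (rows: d1, columns: d2)

N   = sum(H(:));                              % total number of rated samples
po  = trace(H) / N;                           % observed agreement (overall accuracy)
pe  = sum(sum(H, 2)' .* sum(H, 1)) / N^2;     % agreement expected by chance
kap = (po - pe) / (1 - pe);                   % Cohen's kappa

printf('ACC = %.2f, kappa = %.2f\n', po, kap);
```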
1198 # name: <cell-element>
1202 KAPPA estimates Cohen's kappa coefficient
1203 and related statistics
1208 # name: <cell-element>
1215 # name: <cell-element>
1219 KURTOSIS estimates the kurtosis
1222 calculates kurtosis of x in dimension DIM
1227 default or []: first DIMENSION, with more than 1 element
1230 - can deal with NaN's (missing values)
1231 - dimension argument
1232 - compatible to Matlab and Octave
1234 see also: SUMSKIPNAN, VAR, STD, VAR, SKEWNESS, MOMENT, STATISTIC,
1238 http://mathworld.wolfram.com/
1242 # name: <cell-element>
1246 KURTOSIS estimates the kurtosis
1251 # name: <cell-element>
1258 # name: <cell-element>
1263 loads famous iris data set from Fisher, 1936 [1].
1266 [1] Fisher,R.A. "The use of multiple measurements in taxonomic problems"
1267 Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).
1268 [2] Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
1269 (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
1273 # name: <cell-element>
1278 loads famous iris data set from Fisher, 1936 [1].
1282 # name: <cell-element>
1289 # name: <cell-element>
1293 MAD estimates the Mean Absolute deviation
1294 (note that according to [1,2] this is the mean deviation;
1295 not the mean absolute deviation)
1298 calculates the mean deviation of x in dimension DIM
1303 default or []: first DIMENSION, with more than 1 element
1306 - can deal with NaN's (missing values)
1307 - dimension argument
1308 - compatible to Matlab and Octave
1310 see also: SUMSKIPNAN, VAR, STD,
1313 [1] http://mathworld.wolfram.com/MeanDeviation.html
1314 [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253.
1316 [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html
1317 [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962.
1321 # name: <cell-element>
1325 MAD estimates the Mean Absolute deviation
1326 (note that according to [1,2] this i
1330 # name: <cell-element>
1337 # name: <cell-element>
1341 MAHAL returns the Mahalanobis' D-square distance between the
1342 multivariate samples x and y, which must have the same number
1343 of components (columns), but may have a different number of observations (rows).
1347 d(k) = (X(k,:)-MU)*inv(SIGMA)*(X(k,:)-MU)'
1349 where MU and SIGMA are the mean and the covariance matrix of Y
1352 see also: TRAIN_SC, TEST_SC, COVM
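The formula d(k) = (X(k,:)-MU)*inv(SIGMA)*(X(k,:)-MU)' can be evaluated for all rows at once. In this sketch the data are made up, SIGMA is scaled by N-1 (an assumption), and a linear solve replaces the explicit inverse.

```octave
Y = [1 2; 2 1; 3 4; 4 3; 5 5];        % reference sample (rows are observations)
X = [3 3; 0 0];                       % points whose distance to Y is wanted

MU    = mean(Y, 1);                   % mean of Y
Yc    = bsxfun(@minus, Y, MU);
SIGMA = (Yc' * Yc) / (size(Y, 1) - 1);   % covariance of Y (N-1 scaling assumed)

Xc = bsxfun(@minus, X, MU);
d  = sum((Xc / SIGMA) .* Xc, 2);      % d(k) = Xc(k,:) * inv(SIGMA) * Xc(k,:)'

disp(d');
```

The first point coincides with the mean of Y, so its distance is zero.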
1358 # name: <cell-element>
1362 MAHAL returns the Mahalanobis' D-square distance between the
1367 # name: <cell-element>
1374 # name: <cell-element>
1378 This make.m is used for Matlab under Windows
1382 # name: <cell-element>
1390 # name: <cell-element>
1397 # name: <cell-element>
1401 MEAN calculates the mean of data elements.
1403 y = mean(x [,DIM] [,opt] [, W])
1408 N MEAN of N-th dimension
1409 default or []: first DIMENSION, with more than 1 element
1416 W weights to compute weighted mean (default: [])
1417 if W=[], all weights are 1.
1418 number of elements in W must match size(x,DIM)
1427 mean(x,DIM,opt,W);
1430 - can deal with NaN's (missing values)
1432 - dimension argument also in Octave
1433 - compatible to Matlab and Octave
1435 see also: SUMSKIPNAN, MEAN, GEOMEAN, HARMMEAN
1440 # name: <cell-element>
1444 MEAN calculates the mean of data elements.
1448 # name: <cell-element>
1455 # name: <cell-element>
1459 MEANDEV estimates the Mean deviation
1460 (note that according to [1,2] this is the mean deviation;
1461 not the mean absolute deviation)
1464 calculates the mean deviation of x in dimension DIM
1469 default or []: first DIMENSION, with more than 1 element
1472 - can deal with NaN's (missing values)
1473 - dimension argument
1474 - compatible to Matlab and Octave
1476 see also: SUMSKIPNAN, VAR, STD, MAD
1479 [1] http://mathworld.wolfram.com/MeanDeviation.html
1480 [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253.
1481 [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html
1482 [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962.
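The mean deviation in the sense of [1] is the average absolute distance from the arithmetic mean. The sketch below computes it for a vector after dropping NaN's; the data are made up, and normalizing by the number of valid samples is an assumption of this example.

```octave
x = [2 4 4 4 5 5 7 9 NaN];      % illustrative data with a missing value

v  = x(~isnan(x));              % skip missing values
md = mean(abs(v - mean(v)));    % mean deviation around the arithmetic mean

printf('mean deviation = %.2f\n', md);
```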
1486 # name: <cell-element>
1490 MEANDEV estimates the Mean deviation
1491 (note that according to [1,2] this is the
1495 # name: <cell-element>
1502 # name: <cell-element>
1506 MEANSQ calculates the mean of the squares
1513 N MEANSQ of N-th dimension
1514 default or []: first DIMENSION, with more than 1 element
1515 W weights to compute weighted mean (default: [])
1516 if W=[], all weights are 1.
1517 number of elements in W must match size(x,DIM)
1520 - can deal with NaN's (missing values)
1522 - dimension argument also in Octave
1523 - compatible to Matlab and Octave
1525 see also: SUMSQ, SUMSKIPNAN, MEAN, VAR, STD, RMS
1529 # name: <cell-element>
1533 MEANSQ calculates the mean of the squares
1538 # name: <cell-element>
1545 # name: <cell-element>
1549 medAbsDev calculates the median absolute deviation
1551 Usage: D = medAbsDev(X, DIM)
1552 or: [D, M] = medAbsDev(X, DIM)
1554 DIM: dimension along which mad should be calculated (1=columns, 2=rows)
1555 (optional, default=first dimension with more than 1 element)
1556 Output: D : median absolute deviations
1557 M : medians (optional)
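For a single vector, the median absolute deviation is the median of the absolute distances from the median. The sketch below uses the core median function and made-up data with one outlier; the DIM argument and NaN handling are not covered.

```octave
X = [1; 2; 3; 4; 100];       % illustrative data with one outlier

M = median(X);               % median
D = median(abs(X - M));      % median absolute deviation

printf('M = %g, D = %g\n', M, D);
```

The outlier barely influences D, which is why this measure is often preferred over the standard deviation for contaminated data.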
1561 # name: <cell-element>
1565 medAbsDev calculates the median absolute deviation
1570 # name: <cell-element>
1577 # name: <cell-element>
1581 MEDIAN data elements,
1582 [y]=median(x [,DIM])
1585 1: median of columns
1587 N: median of N-th dimension
1588 default or []: first DIMENSION, with more than 1 element
1591 - can deal with NaN's (missing values)
1592 - accepts a dimension argument in Octave as well, as in Matlab
1593 - compatible with Matlab and Octave
1595 see also: SUMSKIPNAN
1599 # name: <cell-element>
1603 MEDIAN data elements,
1604 [y]=median(x [,DIM])
1609 # name: <cell-element>
1616 # name: <cell-element>
1620 MOMENT estimates the p-th moment
1622 M = moment(x, p [,opt] [,DIM])
1623 M = moment(H, p [,opt])
1624 calculates p-th central moment from data x in dimension DIM
1628 opt 'ac': absolute 'a' and/or central ('c') moment
1629 DEFAULT: '' raw moments are estimated
1633 default or []: first DIMENSION, with more than 1 element
1636 - can deal with NaN's (missing values)
1637 - dimension argument
1638 - compatible with Matlab and Octave
1640 see also: STD, VAR, SKEWNESS, KURTOSIS, STATISTIC,
1643 http://mathworld.wolfram.com/Moment.html
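The effect of the opt flags can be sketched in Python for a 1-D sample. This is an illustration only: the function below is not the toolbox implementation, it normalizes by the number of valid samples, and it handles a single vector instead of a DIM argument.

```python
import math

def moment(x, p, opt=""):
    """p-th moment of x, skipping NaN; 'c' = central, 'a' = absolute."""
    valid = [xi for xi in x if not math.isnan(xi)]
    if not valid:
        return math.nan
    if "c" in opt:                       # central: subtract the mean first
        mu = sum(valid) / len(valid)
        valid = [xi - mu for xi in valid]
    if "a" in opt:                       # absolute: use magnitudes
        valid = [abs(xi) for xi in valid]
    return sum(xi ** p for xi in valid) / len(valid)

x = [1.0, 2.0, 3.0, math.nan]
print(moment(x, 2))          # raw second moment
print(moment(x, 2, "c"))     # central second moment
print(moment(x, 1, "ac"))    # absolute central first moment
```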
1647 # name: <cell-element>
1651 MOMENT estimates the p-th moment
1653 M = moment(x, p [,opt] [,DIM])
1658 # name: <cell-element>
1665 # name: <cell-element>
1669 NANCONV computes the convolution for data with missing values.
1670 X and Y can contain missing values encoded with NaN.
1671 NaNs are skipped; a NaN in the input does not produce a NaN in the output.
1672 The output is NaN only if there is insufficient input data
1674 [...] = NANCONV(X,Y);
1675 calculates 2-dim convolution between X and Y
1678 WARNING: missing values can introduce aliasing - causing unintended results.
1679 Moreover, the behavior of bandpass and highpass filters in case of missing values
1680 is not fully understood, and might contain some pitfalls.
1682 see also: CONV, NANCONV2, NANFFT, NANFILTER
1686 # name: <cell-element>
1690 NANCONV computes the convolution for data with missing values.
1694 # name: <cell-element>
1701 # name: <cell-element>
1705 NANFFT calculates the Fourier-Transform of X for data with missing values.
1706 NANFFT is the same as FFT but X can contain missing values encoded with NaN.
1707 NaNs are skipped; a NaN in the input does not produce a NaN in the output.
1711 Y = NANFFT(X,[],DIM)
1714 returns the number of valid samples N
1717 WARNING: missing values can introduce aliasing - causing unintended results.
1718 Moreover, the behavior of bandpass and highpass filters in case of missing values
1719 is not fully understood, and might contain some pitfalls.
1721 see also: FFT, XCORR, NANCONV, NANFILTER
1725 # name: <cell-element>
1729 NANFFT calculates the Fourier-Transform of X for data with missing values.
1733 # name: <cell-element>
1740 # name: <cell-element>
1744 NANFILTER is able to filter data with missing values encoded as NaN.
1746 [Y,Z] = nanfilter(B,A,X [, Z]);
1748 If X contains no missing data, NANFILTER should behave like FILTER.
1749 NaN-values are handled gracefully.
1751 WARNING: missing values can introduce aliasing - causing unintended results.
1752 Moreover, the behavior of bandpass and highpass filters in case of missing values
1753 is not fully understood, and might contain some pitfalls.
1755 see also: FILTER, SUMSKIPNAN, NANFFT, NANCONV, NANFILTER1UC
1759 # name: <cell-element>
1763 NANFILTER is able to filter data with missing values encoded as NaN.
1767 # name: <cell-element>
1774 # name: <cell-element>
1778 NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN.
1780 [Y,Z] = nanfilter1uc(uc,X [, Z]);
1782 if X contains no missing data, NANFILTER1UC behaves like FILTER(uc,[1,uc-1],X[,Z]).
1784 see also: FILTER, NANFILTER, SUMSKIPNAN
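For NaN-free input, FILTER(uc,[1,uc-1],X) is the first-order recursion y(n) = uc*x(n) + (1-uc)*y(n-1), i.e. exponential smoothing with update coefficient uc. The Python sketch below shows that recursion. Holding the previous output whenever a sample is NaN is an assumption about one plausible way to skip missing values, not a description of the toolbox code, and the state `y_prev` is a simplified stand-in for the filter state Z.

```python
import math

def smooth_skipnan(uc, x, y_prev=0.0):
    """First-order smoother y[n] = uc*x[n] + (1-uc)*y[n-1]; NaN samples are skipped."""
    y = []
    for xi in x:
        if not math.isnan(xi):
            y_prev = uc * xi + (1.0 - uc) * y_prev   # regular update
        y.append(y_prev)                             # on NaN: hold the last output
    return y, y_prev

y, state = smooth_skipnan(0.5, [1.0, 1.0, math.nan, 1.0])
print(y, state)
```

Returning the final state allows a long signal to be processed in consecutive chunks by passing the state back in.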
1788 # name: <cell-element>
1792 NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN.
1796 # name: <cell-element>
1803 # name: <cell-element>
1807 NANINSTTEST checks whether the functions from NaN-toolbox have been
1808 correctly installed.
1814 # name: <cell-element>
1818 NANINSTTEST checks whether the functions from NaN-toolbox have been
1823 # name: <cell-element>
1830 # name: <cell-element>
1834 NANMEAN same as MEAN but ignores NaN's.
1835 NANMEAN is OBSOLETE; use MEAN instead. NANMEAN is included
1836 to provide backward compatibility
1838 Y = nanmean(x [,DIM])
1843 default or []: first DIMENSION with more than 1 element
1847 see also: MEAN, SUMSKIPNAN, NANSUM
1851 # name: <cell-element>
1855 NANMEAN same as MEAN but ignores NaN's.
1859 # name: <cell-element>
1866 # name: <cell-element>
1870 NANSTD same as STD but ignores NaN's.
1871 NANSTD is OBSOLETE; use NaN/STD instead. NANSTD is included
1872 to fix a bug in alternative implementations and to
1873 provide some compatibility.
1875 Y = nanstd(x, FLAG [,DIM])
1878 FLAG 0: [default] normalizes with (N-1), N = sample size
1879 FLAG 1: normalizes with N, N = sample size
1883 default or []: first DIMENSION with more than 1 element
1884 Y resulting standard deviation
1886 see also: SUM, SUMSKIPNAN, NANSUM, STD
1890 # name: <cell-element>
1894 NANSTD same as STD but ignores NaN's.
1898 # name: <cell-element>
1905 # name: <cell-element>
1909 NANSUM same as SUM but ignores NaN's.
1910 NANSUM is OBSOLETE; use SUMSKIPNAN instead. NANSUM is included
1911 to fix a bug in some other versions.
1913 Y = nansum(x [,DIM])
1918 default or []: first DIMENSION with more than 1 element
1922 see also: SUM, SUMSKIPNAN, NANSUM
1926 # name: <cell-element>
1930 NANSUM same as SUM but ignores NaN's.
1934 # name: <cell-element>
1941 # name: <cell-element>
1945 NANTEST checks several mathematical operations and a few
1946 statistical functions for their correctness related to NaN's.
1947 e.g. it checks norminv, normcdf, normpdf, sort, matrix division and multiplication.
1950 see also: NANINSTTEST
1953 [1] W. Kahan (1996) Lecture notes on the Status of "IEEE Standard 754 for
1954 Binary Floating-point Arithmetic".
1959 # name: <cell-element>
1963 NANTEST checks several mathematical operations and a few
1964 statistical function
1968 # name: <cell-element>
1975 # name: <cell-element>
1979 NORMCDF returns normal cumulative distribution function
1981 cdf = normcdf(x,m,s);
1983 Computes the CDF of the normal distribution
1984 with mean m and standard deviation s
1986 x,m,s must be matrices of same size, or any one can be a scalar.
1988 see also: NORMPDF, NORMINV
1992 # name: <cell-element>
1996 NORMCDF returns normal cumulative distribution function
2001 # name: <cell-element>
2008 # name: <cell-element>
2012 NORMINV returns inverse cumulative function of the normal distribution
2016 Computes the quantile (inverse of the CDF) of the normal
2017 cumulative distribution with mean m and standard deviation s
2019 p,m,s must be matrices of same size, or any one can be a scalar.
2021 see also: NORMPDF, NORMCDF
2025 # name: <cell-element>
2029 NORMINV returns inverse cumulative function of the normal distribution
2034 # name: <cell-element>
2041 # name: <cell-element>
2045 NORMPDF returns normal probability density
2047 pdf = normpdf(x,m,s);
2049 Computes the PDF of the normal distribution
2050 with mean m and standard deviation s
2052 x,m,s must be matrices of same size, or any one can be a scalar.
2054 see also: NORMCDF, NORMINV
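The relationship between the density, the CDF and the quantile function of a normal distribution with mean m and standard deviation s can be checked with a short scalar-only Python sketch. The function names mirror the Octave ones for readability only. The array broadcasting described above is not reproduced, and `norminv` here delegates to Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def normpdf(x, m, s):
    """Density of N(m, s^2) at x."""
    z = (x - m) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

def normcdf(x, m, s):
    """Cumulative distribution of N(m, s^2) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))

def norminv(p, m, s):
    """Quantile (inverse CDF) of N(m, s^2) for 0 < p < 1."""
    return NormalDist(m, s).inv_cdf(p)

p = normcdf(2.5, 1.0, 2.0)
print(p, norminv(p, 1.0, 2.0))   # the quantile maps p back to about 2.5
```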
2058 # name: <cell-element>
2062 NORMPDF returns normal probability density
2067 # name: <cell-element>
2074 # name: <cell-element>
2078 PARTCORRCOEF calculates the partial correlation between X and Y
2079 after removing the influence of Z.
2080 X, Y and Z can contain missing values encoded with NaN.
2081 NaNs are skipped; a NaN in the input does not produce a NaN in the output.
2082 (It is assumed that the occurrence of NaNs is uncorrelated.)
2083 The output is NaN only if there is insufficient input data.
2085 The partial correlation is defined as
2086 pcc(xy|z)=(cc(x,y)-cc(x,z)*cc(y,z))/sqrt((1-cc(x,z)^2)*(1-cc(y,z)^2))
2089 PARTCORRCOEF(X [,Mode]);
2090 calculates the (auto-)correlation matrix of X
2091 PARTCORRCOEF(X,Y,Z);
2092 PARTCORRCOEF(X,Y,Z,[]);
2093 PARTCORRCOEF(X,Y,Z,'Pearson');
2094 PARTCORRCOEF(X,Y,Z,'Rank');
2095 PARTCORRCOEF(X,Y,Z,'Spearman');
2098 removes from X and Y the part that can be explained by Z
2099 and computes the correlation of the remaining part.
2100 Ideally, this is equivalent to Mode='Pearson', however, in practice
2101 this is more accurate.
2102 Mode='Pearson' or 'parametric'
2105 computes the partial correlation based on cc(x,y),cc(x,z) and cc(y,z)
2106 with the respective mode.
2108 [R,p,ci1,ci2] = PARTCORRCOEF(...);
2109 R is the partial correlation matrix
2110 R(i,j) is the partial correlation coefficient r between X(:,i) and Y(:,j)
2111 when influence of Z is removed.
2112 p gives the significance of PCC
2113 It tests the null hypothesis that the product moment correlation coefficient is zero
2114 using Student's t-test on the statistic t = r sqrt(N-Nz-2)/sqrt(1-r^2)
2115 where N is the number of samples (Statistics, M. Spiegel, Schaum series).
2116 p > alpha: do not reject the null hypothesis "R is zero".
2117 p < alpha: reject the null hypothesis in favor of the alternative "R is not zero" at significance level alpha.
2118 ci1 lower 0.95 confidence interval
2119 ci2 upper 0.95 confidence interval
2121 see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS, CORRCOEF
2124 on the partial correlation coefficient
2125 [1] http://www.tufts.edu/~gdallal/partial.htm
2126 [2] http://www.nag.co.uk/numeric/fl/manual/pdf/G02/g02byf.pdf
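The two routes to a partial correlation mentioned above (combining pairwise correlations, or correlating what remains after Z has been regressed out) can be compared in a small Python sketch. It uses the standard textbook form of the formula, whose denominator is sqrt((1-cc(x,z)^2)*(1-cc(y,z)^2)), a single conditioning variable z, and NaN-free lists. The helper names are hypothetical and nothing here is taken from the toolbox source.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def partial_corr(x, y, z):
    """Partial correlation of x and y given z, from pairwise correlations."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def residual(a, z):
    """Part of a that a straight-line fit on z cannot explain."""
    n = len(a)
    ma, mz = sum(a) / n, sum(z) / n
    slope = (sum((ai - ma) * (zi - mz) for ai, zi in zip(a, z))
             / sum((zi - mz) ** 2 for zi in z))
    return [(ai - ma) - slope * (zi - mz) for ai, zi in zip(a, z)]

x = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
z = [1.0, 2.0, 3.0, 4.0, 5.0]
print(partial_corr(x, y, z))
print(pearson(residual(x, z), residual(y, z)))  # same value up to rounding
```

In exact arithmetic both numbers agree; any difference in practice comes from rounding.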
2130 # name: <cell-element>
2134 PARTCORRCOEF calculates the partial correlation between X and Y
2139 # name: <cell-element>
2146 # name: <cell-element>
2150 PERCENTILE calculates the percentiles of histograms and sample arrays.
2153 Q = percentile(Y,q,DIM)
2154 returns the q-th percentile along dimension DIM of sample array Y.
2155 size(Q) equals size(Y) except along dimension DIM, where size(Q,DIM)=length(q)
2157 Q = percentile(HIS,q)
2158 returns the q-th percentile from the histogram HIS.
2159 HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2160 If q is a vector, each row of Q returns the q(i)-th percentile
2162 see also: HISTO2, HISTO3, QUANTILE
2166 # name: <cell-element>
2170 PERCENTILE calculates the percentiles of histograms and sample arrays.
2174 # name: <cell-element>
2181 # name: <cell-element>
2185 PRCTILE calculates the percentiles of histograms and sample arrays.
2186 (it is the same as PERCENTILE.M)
2189 Q = prctile(Y,q,DIM)
2190 returns the q-th percentile along dimension DIM of sample array Y.
2191 size(Q) equals size(Y) except along dimension DIM, where size(Q,DIM)=length(q)
2194 returns the q-th percentile from the histogram HIS.
2195 HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2196 If q is a vector, each row of Q returns the q(i)-th percentile
2198 see also: HISTO2, HISTO3, QUANTILE
2202 # name: <cell-element>
2206 PRCTILE calculates the percentiles of histograms and sample arrays.
2210 # name: <cell-element>
2217 # name: <cell-element>
2221 QUANTILE calculates the quantiles of histograms and sample arrays.
2224 Q = quantile(Y,q,DIM)
2225 returns the q-th quantile along dimension DIM of sample array Y.
2226 size(Q) equals size(Y) except along dimension DIM, where size(Q,DIM)=length(q)
2229 returns the q-th quantile from the histogram HIS.
2230 HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2231 If q is a vector, each row of Q returns the q(i)-th quantile
2233 see also: HISTO2, HISTO3, PERCENTILE
2237 # name: <cell-element>
2241 QUANTILE calculates the quantiles of histograms and sample arrays.
2245 # name: <cell-element>
2252 # name: <cell-element>
2256 RANGE calculates the range of Y
2257 Missing values (encoded as NaN) are ignored.
2261 returns the range along dimension DIM of sample array Y.
2264 returns the RANGE from the histogram HIS.
2265 HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2267 see also: IQR, MAD, HISTO2, HISTO3, PERCENTILE, QUANTILE
2271 # name: <cell-element>
2275 RANGE calculates the range of Y
2276 Missing values (encoded as NaN) are ignored.
2280 # name: <cell-element>
2287 # name: <cell-element>
2291 RANKCORR calculates the rank correlation coefficient.
2292 This function is replaced by CORRCOEF.
2293 Significance test and confidence intervals can be obtained from CORRCOEF, too.
2295 R = CORRCOEF(X, [Y, ] 'Rank');
2297 The rank correlation r = corrcoef(ranks(x))
2298 is often confused with Spearman's rank correlation.
2299 Spearman's correlation is defined as
2300 r(x,y) = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1))
2301 The results differ when ties are present. Here, the former version is implemented.
2303 see also: CORRCOEF, SPEARMAN, RANKS
2306 [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
2307 [2] http://mathworld.wolfram.com/CorrelationCoefficient.html
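The difference between the two definitions can be made concrete with a small Python sketch. It is illustrative only: `ranks` is a simple quadratic-time helper with mid-ranks for ties, the inputs are NaN-free lists, and none of the names correspond to toolbox code.

```python
import math

def ranks(x):
    """Mid-ranks: ties receive the mean of the ranks they occupy."""
    return [sum(v < xi for v in x) + (sum(v == xi for v in x) + 1) / 2
            for xi in x]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    return sab / math.sqrt(sum((ai - ma) ** 2 for ai in a)
                           * sum((bi - mb) ** 2 for bi in b))

def corr_of_ranks(x, y):
    """Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))

def spearman_formula(x, y):
    """1 - 6*sum(d^2) / (N*(N^2-1)) with d the rank differences."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

no_ties = ([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
with_ties = ([1, 1, 2, 3], [1, 2, 2, 3])
print(corr_of_ranks(*no_ties), spearman_formula(*no_ties))
print(corr_of_ranks(*with_ties), spearman_formula(*with_ties))
```

Without ties both expressions give the same number; with ties they diverge, because the closed-form expression assumes the ranks are a permutation of 1..N.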
2311 # name: <cell-element>
2315 RANKCORR calculates the rank correlation coefficient.
2319 # name: <cell-element>
2326 # name: <cell-element>
2330 RANKS gives the rank of each element in a vector.
2331 This program uses an advanced algorithm with average effort O(m.n.log(n))
2332 NaN in the input yields NaN in the output.
2335 if X is a vector, return the vector of ranks of X adjusted for ties.
2336 if X is matrix, the rank is calculated along dimension DIM.
2337 if DIM is zero or empty, the lowest dimension with more than 1 element is used.
2338 r = ranks(X,DIM,'traditional')
2339 implements the traditional algorithm with O(n^2) computational
2340 and O(n^2) memory effort
2341 r = ranks(X,DIM,'mtraditional')
2342 implements the traditional algorithm with O(n^2) computational
2343 and O(n) memory effort
2344 r = ranks(X,DIM,'advanced ')
2345 implements an advanced algorithm with O(n*log(n)) computational
2346 and O(n.log(n)) memory effort
2347 r = ranks(X,DIM,'advanced-ties')
2348 implements an advanced algorithm with O(n*log(n)) computational
2349 and O(n.log(n)) memory effort
2350 but without correction for ties
2351 This is the fastest algorithm
2353 see also: CORRCOEF, SPEARMAN, RANKCORR
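A sort-based ranking with tie correction, in the spirit of the O(n*log(n)) variants listed above, might look as follows in Python. This is a sketch for one vector under stated assumptions (ties receive the mean of their positions, NaN inputs stay NaN); it does not reproduce the toolbox algorithm or its DIM handling.

```python
import math

def ranks(x):
    """Ranks of a 1-D list, adjusted for ties; NaN in gives NaN out."""
    r = [math.nan] * len(x)
    # indices of valid samples, ordered by value: O(n log n)
    order = sorted((i for i, v in enumerate(x) if not math.isnan(v)),
                   key=lambda i: x[i])
    k = 0
    while k < len(order):
        j = k
        # extend j to the end of the run of equal values
        while j + 1 < len(order) and x[order[j + 1]] == x[order[k]]:
            j += 1
        avg = (k + j) / 2 + 1           # mean of 1-based positions k+1..j+1
        for t in range(k, j + 1):
            r[order[t]] = avg
        k = j + 1
    return r

print(ranks([10, 20, 10, 30]))
print(ranks([3.0, math.nan, 1.0]))
```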
2360 # name: <cell-element>
2364 RANKS gives the rank of each element in a vector.
2368 # name: <cell-element>
2375 # name: <cell-element>
2379 RMS calculates the root mean square
2380 can deal with complex data.
2387 N RMS of N-th dimension
2388 default or []: first DIMENSION, with more than 1 element
2389 W weights to compute weighted RMS (default: [])
2390 if W=[], all weights are 1.
2391 number of elements in W must match size(x,DIM)
2393 y root mean square
2396 - can deal with NaN's (missing values)
2398 - dimension argument also in Octave
2399 - compatible with Matlab and Octave
2401 see also: SUMSKIPNAN, MEAN
2405 # name: <cell-element>
2409 RMS calculates the root mean square
2410 can deal with complex data.
2414 # name: <cell-element>
2421 # name: <cell-element>
2425 ROW_COL_DELETION selects the rows and columns for removing any missing values.
2426 A heuristic based on maximizing the number of remaining sample values
2427 is used. In other words, if there are more rows than columns, it is
2428 more likely that a row-wise deletion will be applied and vice versa.
2430 [rix,cix] = row_col_deletion(d)
2431 [rix,cix] = row_col_deletion(d,c,w)
2434 d data (each row is a sample, each column a feature)
2435 c classlabels (not really used) [OPTIONAL]
2436 w weight for each sample vector [OPTIONAL]
2438 rix selected samples
2439 cix selected columns
2441 d(rix,cix) does not contain any NaN's i.e. missing values
2443 see also: TRAIN_SC, TEST_SC
2447 # name: <cell-element>
2451 ROW_COL_DELETION selects the rows and columns for removing any missing values.
2455 # name: <cell-element>
2462 # name: <cell-element>
2466 SEM calculates the standard error of the mean
2468 [SE,M] = SEM(x [, DIM [,W]])
2469 calculates the standard error (SE) in dimension DIM
2470 the default DIM is the first non-singleton dimension
2472 Can deal with complex data, too.
2477 N: SEM of N-th dimension
2478 default or []: first DIMENSION, with more than 1 element
2479 W weights to compute weighted mean and s.d. (default: [])
2480 if W=[], all weights are 1.
2481 number of elements in W must match size(x,DIM)
2484 - can deal with NaN's (missing values)
2486 - dimension argument
2487 - compatible with Matlab and Octave
2489 see also: SUMSKIPNAN, MEAN, VAR, STD
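The standard error of the mean is the sample standard deviation divided by the square root of the number of valid samples. A minimal unweighted Python sketch for a 1-D list follows; the (N-1) normalization of the standard deviation and the NaN result for fewer than two valid samples are assumptions of this example.

```python
import math
import statistics

def sem(x):
    """Return (SE, M): standard error of the mean and the mean, skipping NaN."""
    valid = [xi for xi in x if not math.isnan(xi)]
    n = len(valid)
    if n < 2:
        return math.nan, (valid[0] if n == 1 else math.nan)
    m = sum(valid) / n
    se = statistics.stdev(valid) / math.sqrt(n)   # stdev normalizes with N-1
    return se, m

se, m = sem([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0, math.nan])
print(se, m)
```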
2493 # name: <cell-element>
2497 SEM calculates the standard error of the mean
2499 [SE,M] = SEM(x [, DIM [,W]])
2504 # name: <cell-element>
2511 # name: <cell-element>
2515 SKEWNESS estimates the skewness
2518 calculates skewness of x in dimension DIM
2523 default or []: first DIMENSION, with more than 1 element
2526 - can deal with NaN's (missing values)
2527 - dimension argument
2528 - compatible with Matlab and Octave
2530 see also: SUMSKIPNAN, STATISTIC
2533 http://mathworld.wolfram.com/
2537 # name: <cell-element>
2541 SKEWNESS estimates the skewness
2546 # name: <cell-element>
2553 # name: <cell-element>
2557 SPEARMAN Spearman's rank correlation coefficient.
2558 This function is replaced by CORRCOEF.
2559 Significance test and confidence intervals can be obtained from CORRCOEF.
2561 [R,p,ci1,ci2] = CORRCOEF(x, [y, ] 'Rank');
2563 For some (unknown) reason, previous versions computed Spearman's rank correlation as
2564 r = corrcoef(ranks(x)).
2565 But according to [1], Spearman's correlation is defined as
2566 r = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1))
2567 The results differ when ties are present. Here, the latter version is implemented.
2569 see also: CORRCOEF, RANKCORR
2572 [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
2573 [2] http://mathworld.wolfram.com/CorrelationCoefficient.html
2577 # name: <cell-element>
2581 SPEARMAN Spearman's rank correlation coefficient.
2585 # name: <cell-element>
2592 # name: <cell-element>
2596 STATISTIC estimates various statistics at once.
2598 R = STATISTIC(x,DIM)
2599 calculates all statistics (see list of fun) in dimension DIM
2600 R is a struct with all statistics
2602 y = STATISTIC(x,fun)
2603 estimate of fun on dimension DIM
2604 y gives the statistic of fun
2609 N: STATS of N-th dimension
2610 default or []: first DIMENSION, with more than 1 element
2613 'std' standard deviation
2615 'sem' standard error of the mean
2616 'rms' root mean square
2617 'meansq' mean of squares
2619 'sumsq' sum of squares
2620 'CM#' central moment of order #
2622 'kurtosis' excess coefficient (Fisher kurtosis)
2623 'mad' mean absolute deviation
2626 - can deal with NaN's (missing values)
2627 - dimension argument
2628 - compatible with Matlab and Octave
2630 see also: SUMSKIPNAN
2633 [1] http://www.itl.nist.gov/
2634 [2] http://mathworld.wolfram.com/
2638 # name: <cell-element>
2642 STATISTIC estimates various statistics at once.
2646 # name: <cell-element>
2653 # name: <cell-element>
2657 STD calculates the standard deviation.
2659 [y,v] = std(x [, opt[, DIM [, W]]])
2662 0: normalizes with N-1 [default]
2663 provides the square root of best unbiased estimator of the variance
2664 1: normalizes with N,
2665 this provides the square root of the second moment around the mean
2667 best unbiased estimator of the standard deviation (see [1])
2670 N STD of N-th dimension
2671 default or []: first DIMENSION, with more than 1 element
2672 W weights to compute weighted s.d. (default: [])
2673 if W=[], all weights are 1.
2674 number of elements in W must match size(x,DIM)
2676 y estimated standard deviation
2679 - provides an unbiased estimation of the S.D.
2680 - can deal with NaN's (missing values)
2682 - dimension argument also in Octave
2683 - compatible with Matlab and Octave
2685 see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ,
2689 [1] http://mathworld.wolfram.com/StandardDeviationDistribution.html
2693 # name: <cell-element>
2697 STD calculates the standard deviation.
2701 # name: <cell-element>
2708 # name: <cell-element>
2712 SUMSKIPNAN adds all non-NaN values.
2714 All NaN's are skipped; NaN's are considered as missing values.
2715 SUMSKIPNAN of only NaN's gives 0, and the number of valid elements is returned.
2716 SUMSKIPNAN is also the elementary function for calculating
2717 various statistics (e.g. MEAN, STD, VAR, RMS, MEANSQ, SKEWNESS,
2718 KURTOSIS, MOMENT, STATISTIC etc.) from data with missing values.
2719 SUMSKIPNAN implements the DIMENSION-argument for data with missing values.
2720 Also the second output argument returns the number of valid elements (not NaNs)
2722 Y = sumskipnan(x [,DIM])
2723 [Y,N,SSQ] = sumskipnan(x [,DIM])
2724 [...] = sumskipnan(x, DIM, W)
2727 DIM dimension (default: [])
2728 empty DIM sets DIM to first non singleton dimension
2729 W weight vector for weighted sum, numel(W) must fit size(x,DIM)
2731 N number of valid (not missing) elements
2734 the function FLAG_NANS_OCCURED() returns whether any value in x
2735 is a not-a-number (NaN)
2738 - can deal with NaN's (missing values)
2739 - implements dimension argument.
2740 - computes weighted sum
2741 - compatible with Matlab and Octave
2743 see also: FLAG_NANS_OCCURED, SUM, NANSUM, MEAN, STD, VAR, RMS, MEANSQ,
2744 SSQ, MOMENT, SKEWNESS, KURTOSIS, SEM
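The role of the three outputs Y, N and SSQ as building blocks for other statistics can be sketched in Python. The code below handles one unweighted 1-D list and is an illustration, not the toolbox implementation; `mean_and_var` is a hypothetical helper showing how a mean and an (N-1)-normalized variance follow from the three sums.

```python
import math

def sumskipnan(x):
    """Return (Y, N, SSQ): sum, count and sum of squares of the non-NaN values."""
    valid = [xi for xi in x if not math.isnan(xi)]
    y = float(sum(valid))
    n = len(valid)
    ssq = float(sum(xi * xi for xi in valid))
    return y, n, ssq

def mean_and_var(x):
    """Mean and variance derived from the three sums only."""
    y, n, ssq = sumskipnan(x)
    mean = y / n if n > 0 else math.nan
    var = (ssq - y * y / n) / (n - 1) if n > 1 else math.nan
    return mean, var

print(sumskipnan([1.0, math.nan, 2.0, 3.0]))
print(sumskipnan([math.nan, math.nan]))
print(mean_and_var([1.0, math.nan, 2.0, 3.0]))
```

Computing the variance from SSQ in this way is compact but can lose precision when the mean is large relative to the spread.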
2748 # name: <cell-element>
2752 SUMSKIPNAN adds all non-NaN values.
2756 # name: <cell-element>
2763 # name: <cell-element>
2767 SUMSQ calculates the sum of squares.
2769 [y] = sumsq(x [, DIM])
2772 N SUMSQ of N-th dimension
2773 default or []: first DIMENSION, with more than 1 element
2775 y sum of squares
2778 - can deal with NaN's (missing values)
2779 - dimension argument also in Octave
2780 - compatible with Matlab and Octave
2782 see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ,
2789 # name: <cell-element>
2793 SUMSQ calculates the sum of squares.
2797 # name: <cell-element>
2804 # name: <cell-element>
2808 TCDF returns student cumulative distribution function
2812 Computes the CDF of Student's t distribution
2813 with DF degrees of freedom
2814 x,DF must be matrices of same size, or any one can be a scalar.
2816 see also: NORMCDF, TPDF, TINV
2820 # name: <cell-element>
2824 TCDF returns student cumulative distribution function
2829 # name: <cell-element>
2836 # name: <cell-element>
2840 TEST_SC: apply statistical and SVM classifier to test data
2842 R = test_sc(CC,D,TYPE [,target_Classlabel])
2843 R.output output: "signed" distance for each class.
2844 This represents the distances between sample D and the separating hyperplane
2845 The "signed distance" is positive if it matches the target class,
2846 and negative if it lies on the opposite side of the separating hyperplane.
2847 R.classlabel class for output data
2848 The target class is optional. If it is provided, the following values are returned.
2849 R.kappa Cohen's kappa coefficient
2850 R.ACC Classification accuracy
2851 R.H Confusion matrix
2853 The classifier CC is typically obtained by TRAIN_SC. If a statistical
2854 classifier is used, TYPE can be used to modify the classifier.
2855 TYPE = 'MDA' mahalanobis distance based classifier
2856 TYPE = 'MD2' mahalanobis distance based classifier
2857 TYPE = 'MD3' mahalanobis distance based classifier
2858 TYPE = 'GRB' Gaussian radial basis function
2859 TYPE = 'QDA' quadratic discriminant analysis
2860 TYPE = 'LD2' linear discriminant analysis
2861 TYPE = 'LD3', 'LDA', 'FDA', 'FLDA' (Fisher's) linear discriminant analysis
2862 TYPE = 'LD4' linear discriminant analysis
2863 TYPE = 'GDBC' general distance based classifier
2868 [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
2869 John Wiley & Sons, 2001.
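The accuracy and Cohen's kappa reported alongside the confusion matrix follow from standard definitions, shown here as a Python sketch. The function name and the nested-list layout of the matrix are assumptions for illustration; the code does not call or reproduce test_sc.

```python
def acc_and_kappa(H):
    """Accuracy and Cohen's kappa from a square confusion matrix H (list of rows)."""
    k = len(H)
    n = sum(sum(row) for row in H)
    po = sum(H[i][i] for i in range(k)) / n              # observed agreement
    row_tot = [sum(H[i]) for i in range(k)]
    col_tot = [sum(H[r][i] for r in range(k)) for i in range(k)]
    pe = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)  # chance agreement
    kappa = (po - pe) / (1 - pe)
    return po, kappa

acc, kappa = acc_and_kappa([[20, 5], [10, 15]])
print(acc, kappa)
```

Kappa discounts the agreement expected by chance, so it is lower than the accuracy whenever the chance agreement is nonzero.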
2873 # name: <cell-element>
2877 TEST_SC: apply statistical and SVM classifier to test data
2882 # name: <cell-element>
2889 # name: <cell-element>
2893 TIEDRANK computes the rank of samples; the mean value is used in case of ties
2894 this function is just a wrapper for RANKS, and provided for compatibility
2895 with the statistics toolbox of matlab(tm)
2898 computes the rank R of vector X
2904 # name: <cell-element>
2908 TIEDRANK computes the rank of samples; the mean value is used in case of ties
2913 # name: <cell-element>
2920 # name: <cell-element>
2924 TINV returns inverse cumulative function of the student distribution
2928 Computes the quantile (inverse of the CDF) of the student
2929 cumulative distribution with v degrees of freedom
2930 p,v must be matrices of same size, or any one can be a scalar.
2932 see also: TPDF, TCDF, NORMPDF, NORMCDF, NORMINV
2936 # name: <cell-element>
2940 TINV returns inverse cumulative function of the student distribution
2945 # name: <cell-element>
2952 # name: <cell-element>
2956 TPDF returns student probability density
2960 Computes the PDF of the student distribution
2961 with DF degrees of freedom
2962 x,DF must be matrices of same size, or any one can be a scalar.
2964 see also: TINV, TCDF, NORMPDF, NORMCDF, NORMINV
2968 # name: <cell-element>
2972 TPDF returns student probability density
2977 # name: <cell-element>
2984 # name: <cell-element>
2988 Linear Discriminant Analysis for the Small Sample Size Problem as described in
2989 Algorithm 1 of J. Duintjer Tebbens, P. Schlesinger: 'Improving
2990 Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size
2991 Problem', Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 423-437, 2007.
2993 X ...... (sparse) training data matrix
2994 G ...... group coding matrix of the training data
2995 test ...... (sparse) test data matrix
2996 Gtest ...... group coding matrix of the test data
2997 par ...... if par = 0 then classification exploits sparsity too
2998 tol ...... tolerance to distinguish zero eigenvalues
3000 err ...... Wrong classification rate (in %)
3001 trafo ...... LDA transformation vectors
3004 J. Duintjer Tebbens, P. Schlesinger: 'Improving
3005 Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size
3006 Problem', Computational Statistics and Data Analysis, vol. 52, no. 1,
3009 Copyright (C) by J. Duintjer Tebbens, Institute of Computer Science of the Academy of Sciences of the Czech Republic,
3010 Pod Vodarenskou vezi 2, 182 07 Praha 8 Liben, 18.July.2006.
3011 This work was supported by the Program Information Society under project
3015 Modified for the use with Matlab6.5 by A. Schloegl, 22.Aug.2006
3018 This function is part of the NaN-toolbox
3019 http://pub.ist.ac.at/~schloegl/matlab/NaN/
3023 # name: <cell-element>
3027 Linear Discriminant Analysis for the Small Sample Size Problem as described in
3032 # name: <cell-element>
3039 # name: <cell-element>
3043 Train a (statistical) classifier
3045 CC = train_sc(D,classlabel)
3046 CC = train_sc(D,classlabel,MODE)
3047 CC = train_sc(D,classlabel,MODE, W)
3048 weighting D(k,:) with weight W(k) (not all classifiers support weighting)
3050 CC contains the model parameters of a classifier which can be applied
3051 to test data using test_sc.
3052 R = test_sc(CC,D,...)
3054 D training samples (each row is a sample, each column is a feature)
3055 classlabel labels of each sample, must have the same number of rows as D.
3056 Two different encodings are supported:
3057 {-1,1}-encoding (multiple classes with separate columns for each class) or
3059 So [1;2;3;1;4] is equivalent to
3065 Note, samples with classlabel=0 are ignored.
3067 The following classifier types are supported MODE.TYPE
3068 'MDA' mahalanobis distance based classifier [1]
3069 'MD2' mahalanobis distance based classifier [1]
3070 'MD3' mahalanobis distance based classifier [1]
3071 'GRB' Gaussian radial basis function [1]
3072 'QDA' quadratic discriminant analysis [1]
3073 'LD2' linear discriminant analysis (see LDBC2) [1]
3074 MODE.hyperparameter.gamma: regularization parameter [default 0]
3075 'LD3', 'FDA', 'LDA', 'FLDA'
3076 linear discriminant analysis (see LDBC3) [1]
3077 MODE.hyperparameter.gamma: regularization parameter [default 0]
3078 'LD4' linear discriminant analysis (see LDBC4) [1]
3079 MODE.hyperparameter.gamma: regularization parameter [default 0]
3080 'LD5' another LDA (motivated by CSP)
3081 MODE.hyperparameter.gamma: regularization parameter [default 0]
3082 'RDA' regularized discriminant analysis [7]
3083 MODE.hyperparameter.gamma: regularization parameter
3084 MODE.hyperparameter.lambda =
3085 gamma = 0, lambda = 0 : MDA
3086 gamma = 0, lambda = 1 : LDA [default]
3087 Hint: hyperparameters are used only in test_sc.m; testing different
3088 hyperparameters does not need repeated calls to train_sc,
3089 it is sufficient to modify CC.hyperparameter before calling test_sc.
3090 'GDBC' general distance based classifier [1]
3091 '' statistical classifier, requires Mode argument in TEST_SC
3092 '###/DELETION' if the data contains missing values (encoded as NaNs),
3093 a row-wise or column-wise deletion (depending on which method
3094 removes less data values) is applied;
3095 '###/GSVD' GSVD and statistical classifier [2,3],
3096 '###/sparse' sparse [5]
3097 '###' must be 'LDA' or any other classifier
3098 'PLS' (linear) partial least squares regression
3099 'REG' regression analysis;
3100 'WienerHopf' Wiener-Hopf equation
3101 'NBC' Naive Bayesian Classifier [6]
3102 'aNBC' Augmented Naive Bayesian Classifier [6]
3103 'NBPW' Naive Bayesian Parzen Window [9]
3105 'PLA' Perceptron Learning Algorithm [11]
3106 MODE.hyperparameter.alpha = alpha [default: 1]
3107 w = w + alpha * e'*x
3108 'LMS', 'AdaLine' Least mean squares, adaptive line element, Widrow-Hoff, delta rule
3109 MODE.hyperparameter.alpha = alpha [default: 1]
3110 'Winnow2' Winnow2 algorithm [12]
3112 'PSVM' Proximal SVM [8]
3113 MODE.hyperparameter.nu (default: 1.0)
3114 'LPM' Linear Programming Machine
3115 uses and requires train_LPM of the iLog CPLEX optimizer
3116 MODE.hyperparameter.c_value =
3117 'CSP' CommonSpatialPattern is very experimental and just a hack
3118 uses a smoothing window of 50 samples.
3119 'SVM','SVM1r' support vector machines, one-vs-rest
3120 MODE.hyperparameter.c_value =
3121 'SVM11' support vector machines, one-vs-one + voting
3122 MODE.hyperparameter.c_value =
3123 'RBF' Support Vector Machines with RBF Kernel
3124 MODE.hyperparameter.c_value =
3125 MODE.hyperparameter.gamma =
3126 'SVM:LIB' libSVM [default SVM algorithm]
3127 'SVM:bioinfo' uses and requires svmtrain from the bioinfo toolbox
3128 'SVM:OSU' uses and requires mexSVMTrain from the OSU-SVM toolbox
3129 'SVM:LOO' uses and requires svcm_train from the LOO-SVM toolbox
3130 'SVM:Gunn' uses and requires svc-functions from the Gunn-SVM toolbox
3131 'SVM:KM' uses and requires svmclass-function from the KM-SVM toolbox
3132 'SVM:LINz' LibLinear [10] (requires train.mex from LibLinear somewhere in the path)
3133 z=0 (default) LibLinear with -- L2-regularized logistic regression
3134 z=1 LibLinear with -- L2-loss support vector machines (dual)
3135 z=2 LibLinear with -- L2-loss support vector machines (primal)
3136 z=3 LibLinear with -- L1-loss support vector machines (dual)
3137 'SVM:LIN4' LibLinear with -- multi-class support vector machines by Crammer and Singer
3138 'DT' decision tree - not implemented yet.
3140 {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf','LDA/GSVD','MDA/GSVD', 'LDA/sparse','MDA/sparse', 'PLA', 'LMS','LDA/DELETION','MDA/DELETION','NBC/DELETION','RDA/DELETION','REG/DELETION','RDA','GDBC','SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW', 'DT'};
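The update rule w = w + alpha * e'*x listed for the 'PLA' type can be illustrated with a textbook perceptron in Python. This is a generic sketch, not the train_sc implementation: the {-1,+1} label encoding, the bias handled as an extra constant input, the fixed number of passes and all function names are assumptions of this example.

```python
def predict(w, x):
    """Sign of the weighted sum; w[0] is the bias weight."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else -1

def train_perceptron(X, labels, alpha=1.0, epochs=20):
    """Perceptron learning rule w = w + alpha * e * x for labels in {-1, +1}."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(X, labels):
            xa = [1.0] + list(x)             # constant input for the bias
            e = target - predict(w, x)       # 0 if correct, +2 or -2 otherwise
            w = [wi + alpha * e * xi for wi, xi in zip(w, xa)]
    return w

# Linearly separable toy problem (logical AND with -1/+1 labels)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [-1, -1, -1, 1]
w = train_perceptron(X, labels)
print(w, [predict(w, x) for x in X])
```

Weights change only on misclassified samples, so training stops changing once every sample is classified correctly.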
3142 CC contains the model parameters of a classifier. Some time ago,
3143 CC was a statistical classifier containing the mean
3144 and the covariance of the data of each class (encoded in the
3145 so-called "extended covariance matrices"). Nowadays, other
3146 classifiers are supported as well.
3148 see also: TEST_SC, COVM, ROW_COL_DELETION
3151 [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
3152 John Wiley & Sons, 2001.
3153 [2] Peg Howland and Haesun Park,
3154 Generalizing Discriminant Analysis Using the Generalized Singular Value Decomposition
3155 IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 2004.
3156 dx.doi.org/10.1109/TPAMI.2004.46
3157 [3] http://www-static.cc.gatech.edu/~kihwan23/face_recog_gsvd.htm
3158 [4] Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park
3159 A new optimization criterion for generalized discriminant analysis on undersampled problems.
3160 The Third IEEE International Conference on Data Mining, Melbourne, Florida, USA
3161 November 19 - 22, 2003
3162 [5] J.D. Tebbens and P. Schlesinger (2006),
3163 Improving Implementation of Linear Discriminant Analysis for the Small Sample Size Problem
3164 Computational Statistics & Data Analysis, vol 52(1): 423-437, 2007
3165 http://www.cs.cas.cz/mweb/download/publi/JdtSchl2006.pdf
3166 [6] H. Zhang, The optimality of Naive Bayes,
3167 http://www.cs.unb.ca/profs/hzhang/publications/FLAIRS04ZhangH.pdf
3168 [7] J.H. Friedman. Regularized discriminant analysis.
3169 Journal of the American Statistical Association, 84:165–175, 1989.
3170 [8] G. Fung and O.L. Mangasarian, Proximal Support Vector Machine Classifiers, KDD 2001.
3171 Eds. F. Provost and R. Srikant, Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA.
3173 [9] Kai Keng Ang, Zhang Yang Chin, Haihong Zhang, Cuntai Guan.
3174 Filter Bank Common Spatial Pattern (FBCSP) in Brain-Computer Interface.
3175 IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence).
3176 1-8 June 2008 Page(s):2390 - 2397
3177 [10] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin.
3178 LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874.
3179 Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear
3180 [11] http://en.wikipedia.org/wiki/Perceptron#Learning_algorithm
3181 [12] Littlestone, N. (1988)
3182 "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm"
3183 Machine Learning 2(4):285-318
3184 http://en.wikipedia.org/wiki/Winnow_(algorithm)
3188 # name: <cell-element>
3192 Train a (statistical) classifier
3194 CC = train_sc(D,classlabel)
3199 # name: <cell-element>
3206 # name: <cell-element>
3210 TRIMEAN yields the weighted mean of the median and the quartiles
3213 The trimean is m = (Q1+2*MED+Q3)/4
3214 with quartile Q1 and Q3 and median MED
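The formula above can be sketched in a few lines of core Octave. This is only an illustration, not the TRIMEAN implementation: the sample data are made up, and the quartiles come from the core quantile function, whose interpolation rule may differ from the one TRIMEAN uses. For symmetric data such as this, the result does not depend on that rule.

```octave
% Trimean sketch: m = (Q1 + 2*MED + Q3)/4 (illustrative data)
x = [1; 2; 3; 4; 5; 6; 7; 8; 9];          % symmetric sample, column vector
q = quantile(x, [0.25; 0.50; 0.75], 1);   % Q1, median, Q3 along dimension 1
m = (q(1) + 2*q(2) + q(3)) / 4;           % weighted mean of quartiles and median
```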
3216 N-dimensional data is supported
3219 [1] http://mathworld.wolfram.com/Trimean.html
3223 # name: <cell-element>
3227 TRIMEAN yields the weighted mean of the median and the quartiles
3232 # name: <cell-element>
3239 # name: <cell-element>
3243 TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and
3244 p/2 lower samples. Missing values (encoded as NaN) are ignored and not taken into account.
3245 The same number of samples is removed from the upper and the lower end, which is compatible with various
3246 spreadsheet programs including GNumeric [1], LibreOffice, OpenOffice and MS Excel.
3249 Q = trimmean(Y,p,DIM)
3250 returns the TRIMMEAN along dimension DIM of sample array Y.
3251 If p is a vector, the TRIMMEAN for each p is computed.
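A minimal sketch of the trimming rule for a single vector and a scalar p is given here, using core Octave only. The data are made up, and the rounding rule floor(n*p/2) for the number of samples removed from each end is an assumption based on the spreadsheet convention mentioned above; it has not been checked against the TRIMMEAN source.

```octave
% Trimmed mean sketch (illustrative, not the TRIMMEAN implementation)
y = [7; 1; 100; 3; NaN; 5; 2; 6; 4];   % sample with one missing value
p = 0.25;                              % total fraction to remove
v = sort(y(~isnan(y)));                % ignore NaN, sort ascending
n = numel(v);
k = floor(n * p / 2);                  % assumed: samples removed per end
q = mean(v(k+1 : n-k));                % mean of the remaining samples
```

With these values one sample is removed from each end, so the outlier 100 does not influence the result.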
3253 see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE
3256 [1] http://www.fifi.org/doc/gnumeric-doc/html/C/gnumeric-trimmean.html
3260 # name: <cell-element>
3264 TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and
3268 # name: <cell-element>
3275 # name: <cell-element>
3279 TTEST (paired) t-test
3280 For a sample X from a normal distribution with unknown mean and
3281 variance, perform a t-test of the null hypothesis `mean (X) == M'.
3282 Under the null, the test statistic T follows a Student
3283 distribution with `DF = length (X) - 1' degrees of freedom.
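As a rough illustration of the statistic described above, the following sketch computes T, DF and a two-sided p-value for a single sample with core Octave functions only. The data, the variable names and the use of betainc to evaluate the Student distribution are assumptions made for this example; they are not taken from the TTEST source.

```octave
% One-sample t-test sketch (illustrative, not the TTEST implementation)
x = [5.1; 4.9; 5.6; 5.8; 6.0];   % sample
m = 5;                           % hypothesized mean
x = x(~isnan(x));                % missing values are ignored
n = numel(x);
df = n - 1;                      % degrees of freedom
t = (mean(x) - m) / (std(x) / sqrt(n));
% two-sided p-value: P(|T| > |t|) = I_{df/(df+t^2)}(df/2, 1/2)
p = betainc(df / (df + t^2), df/2, 0.5);
h = p < 0.05;                    % reject the null at alpha = 0.05?
```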
3285 TTEST treats NaNs as "missing values" and ignores them.
3288 tests Null-hypothesis that mean of x is m.
3290 size of x and size of y must match; it is tested whether the
3291 difference x-y is significantly different from m=0;
3292 H = ttest(x,y,alpha)
3293 H = ttest(x,y,alpha,tail)
3294 H = ttest(x,y,alpha,tail,DIM)
3295 [H,PVAL] = ttest(...)
3297 H=1 indicates a rejection of the Null-hypothesis at a significance
3298 level of alpha (default alpha = 0.05).
3300 With the optional argument string TAIL, the alternative of interest
3301 can be selected. If TAIL is '!=' or '<>' or 'both', the null is tested
3302 against the two-sided Alternative `mean (X) ~= mean (Y)'. If TAIL
3303 is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used.
3304 Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean
3305 (Y)' is used. The default is the two-sided case.
3307 H returns whether the null hypothesis must be rejected.
3308 The p-value of the test is returned in PVAL.
3310 TTEST works on the first non-singleton dimension or on DIM.
3312 If no output argument is given, the p-value of the test is
3318 # name: <cell-element>
3322 TTEST (paired) t-test
3323 For a sample X from a normal distribution with unkno
3327 # name: <cell-element>
3334 # name: <cell-element>
3338 TTEST2 (unpaired) t-test
3339 For two samples x and y from normal distributions with unknown
3340 means and unknown equal variances, perform a two-sample t-test of
3341 the null hypothesis of equal means. Under the null, the test
3342 statistic T follows a Student distribution with DF degrees of
3345 TTEST2 treats NaNs as "missing values" and ignores them.
3348 H = ttest2([x;y],C,W)
3349 H = ttest2(x,y,alpha)
3350 H = ttest2(x,y,alpha,tail)
3351 H = ttest2(x,y,alpha,tail,vartype)
3352 H = ttest2(x,y,alpha,tail,vartype,DIM)
3353 [H,PVAL] = ttest2(...)
3354 [h,p,ci,stats] = ttest2(...)
3356 H=1 indicates a rejection of the Null-hypothesis at a significance
3357 level of alpha (default alpha = 0.05).
3359 With the optional argument string TAIL, the Alternative of interest
3360 can be selected. If TAIL is '!=' or '<>' or 'both', the null is tested
3361 against the two-sided Alternative `mean (X) ~= mean (Y)'. If TAIL
3362 is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used.
3363 Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean
3364 (Y)' is used. The default is the two-sided case.
3366 vartype supports only 'equal' (default value); the value 'unequal' is not supported.
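For the equal-variance case, the textbook two-sample statistic uses a pooled variance estimate with DF = nx + ny - 2. The sketch below shows that calculation in core Octave; the data and variable names are made up, and the p-value via betainc is an assumption of this example rather than a description of how TTEST2 computes it.

```octave
% Two-sample (pooled variance) t-test sketch, not the TTEST2 implementation
x = [1; 2; 3; 4; 5];
y = [2; 4; 6; 8];
nx = numel(x);
ny = numel(y);
df = nx + ny - 2;                                    % degrees of freedom
sp2 = ((nx - 1)*var(x) + (ny - 1)*var(y)) / df;      % pooled variance
t = (mean(x) - mean(y)) / sqrt(sp2 * (1/nx + 1/ny));
p = betainc(df / (df + t^2), df/2, 0.5);             % two-sided p-value
```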
3368 H returns whether the null hypothesis must be rejected.
3369 The p-value of the test is returned in PVAL.
3371 TTEST2 works on the first non-singleton dimension or on DIM.
3373 If no output argument is given, the p-value of the test is
3379 # name: <cell-element>
3383 TTEST2 (unpaired) t-test
3384 For two samples x and y from normal distributions
3388 # name: <cell-element>
3395 # name: <cell-element>
3399 VAR calculates the variance.
3401 y = var(x [, opt[, DIM]])
3402 calculates the variance in dimension DIM
3403 the default DIM is the first non-singleton dimension
3405 opt 0: normalizes with N-1 [default]
3406 1: normalizes with N
3410 N: VAR of N-th dimension
3411 default or []: first DIMENSION, with more than 1 element
3412 W weights to compute weighted variance (default: [])
3413 if W=[], all weights are 1.
3414 number of elements in W must match size(x,DIM)
3424 - can deal with NaN's (missing values)
3426 - dimension argument
3427 - compatible to Matlab and Octave
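The two normalization options and the handling of missing values can be illustrated with a small unweighted sketch in core Octave. The data are made up, and the code only mirrors the described behavior for a single vector; it is not the VAR implementation.

```octave
% Variance with NaN treated as missing value (illustrative sketch)
x = [2; 4; NaN; 6];
v = x(~isnan(x));             % skip missing values
n = numel(v);                 % number of valid samples
d = v - mean(v);              % deviations from the mean
var0 = sum(d.^2) / (n - 1);   % opt = 0: normalize with N-1
var1 = sum(d.^2) / n;         % opt = 1: normalize with N
```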
3429 see also: MEANSQ, SUMSQ, SUMSKIPNAN, MEAN, RMS, STD,
3433 # name: <cell-element>
3437 VAR calculates the variance.
3441 # name: <cell-element>
3448 # name: <cell-element>
3452 XCOVF generates cross-covariance function.
3453 XCOVF is the same as XCORR except
3454 X and Y can contain missing values encoded with NaN.
3455 NaN's are skipped; they do not result in a NaN output.
3456 The output gives NaN only if there are insufficient input data.
3458 [C,N,LAGS] = xcovf(X,MAXLAG,SCALEOPT);
3459 calculates the (auto-)correlation function of X
3460 [C,N,LAGS] = xcovf(X,Y,MAXLAG,SCALEOPT);
3461 calculates the crosscorrelation function between X and Y
3463 SCALEOPT [character string] specifies the type of scaling applied
3464 to the correlation vector (or matrix). SCALEOPT is one of:
3465 'none' return the unscaled correlation, R,
3466 'biased' return the biased average, R/N,
3467 'unbiased' return the unbiased average, R(k)/(N-|k|),
3468 'coeff' return the correlation coefficient, R/(rms(x).rms(y)),
3469 where "k" is the lag, and "N" is the length of X.
3470 If omitted, the default value is "none".
3471 If Y is supplied but does not have the same length as X,
3472 scale must be "none".
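The idea of skipping NaN's while accumulating lagged products can be sketched as follows for the unscaled autocorrelation at non-negative lags. This is an illustration only: the data, the loop formulation and the per-lag count N of valid products are assumptions made for the example and may differ from what XCOVF actually returns.

```octave
% NaN-skipping lagged products (illustrative, not the XCOVF implementation)
x = [1; 2; NaN; 4];
maxlag = 2;
R = zeros(maxlag + 1, 1);   % unscaled sums, lags 0..maxlag
N = zeros(maxlag + 1, 1);   % number of valid (non-NaN) products per lag
for k = 0:maxlag
  prod_k = x(1+k:end) .* x(1:end-k);   % products at lag k
  valid = ~isnan(prod_k);              % pairs without missing values
  R(k+1) = sum(prod_k(valid));
  N(k+1) = sum(valid);
end
```

A lag with N = 0 has no valid data; that is the situation in which a NaN output would be expected.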
3475 see also: COVM, XCORR
3479 # name: <cell-element>
3483 XCOVF generates cross-covariance function.
3487 # name: <cell-element>
3494 # name: <cell-element>
3498 XPTOPEN reads several file formats and writes the SAS Transport Format (*.xpt)
3499 Supported are ARFF, SAS-XPT and STATA files.
3500 XPTOPEN is a mex-file and must be compiled before use.
3501 More detailed help can be obtained by the command
3503 without an additional argument
3505 X = xptopen(filename)
3506 X = xptopen(filename,'r')
3507 read file with filename and return variables in struct X
3509 X = xptopen(filename,'w',X)
3510 save fields of struct X in filename.
3512 The fields of X must be column vectors of equal length.
3513 Each vector is either a numeric vector or a cell array of strings.
3514 The SAS-XPT format stores Date/Time as numeric value counting the number of days since 1960-01-01.
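A day count of this kind can be converted to a calendar date with the core functions datenum and datestr. The sketch below assumes the values are whole days since 1960-01-01, as stated above; the variable names and sample values are made up.

```octave
% Convert SAS day counts to calendar dates (illustrative sketch)
sas_days = [0; 366; 18993];              % days since 1960-01-01
dn = sas_days + datenum(1960, 1, 1);     % Octave serial date numbers
s = datestr(dn, 'yyyy-mm-dd');           % one date string per row
```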
3518 # name: <cell-element>
3522 XPTOPEN read of several file formats and writing of the SAS Transport Format (*
3526 # name: <cell-element>
3533 # name: <cell-element>
3537 XVAL is used for crossvalidation
3539 [R,CC] = xval(D,classlabel)
3540 .. = xval(D,classlabel,CLASSIFIER)
3541 .. = xval(D,classlabel,CLASSIFIER,type)
3542 .. = xval(D,{classlabel,W},CLASSIFIER)
3543 .. = xval(D,{classlabel,W,NG},CLASSIFIER)
3546 load_fisheriris; %builtin iris dataset
3548 K = 5; NG = ceil([1:length(C)]'*K/length(C));
3549 [R,CC] = xval(meas,{C,[],NG},'NBC');
3552 D: data features (one feature per column, one sample per row)
3553 classlabel labels of each sample, must have the same number of rows as D.
3554 Two different encodings are supported:
3555 {-1,1}-encoding (multiple classes with separate columns for each class) or
3557 So [1;2;3;1;4] is equivalent to
3563 Note, samples with classlabel=0 are ignored.
3565 CLASSIFIER can be any classifier supported by train_sc (default='LDA')
3566 {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf', 'RDA','GDBC',
3567 'SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW'}
3568 these can be modified by ###/GSVD, ###/sparse and ###/DELETION.
3569 /DELETION removes, in case of NaN's, either the rows or the columns containing any NaN (whichever removes fewer data values)
3570 /sparse and /GSVD preprocess the data and reduce it to some lower-dimensional space.
3571 Hyperparameters (like alpha for PLA, gamma/lambda for RDA, c_value for SVM, etc) can be defined as
3572 CLASSIFIER.hyperparameter.alpha, etc. and
3573 CLASSIFIER.TYPE = 'PLA' (as listed above).
3574 See train_sc for details.
3575 W: weights for each sample (row) in D.
3576 default: [] (i.e. all weights are 1)
3577 number of elements in W must match the number of rows of D
3578 NG: used to define the type of cross-validation
3579 Leave-One-Out-Method (LOOM): NG = [1:length(classlabel)]' (default)
3580 Leave-K-Out-Method: NG = ceil([1:length(classlabel)]'/K)
3581 K-fold XV: NG = ceil([1:length(classlabel)]'*K/length(classlabel))
3582 group-wise XV (if samples are not independent) can also be defined here
3583 samples from the same group (dependent samples) get the same identifier
3584 samples from different groups get different identifiers
3585 TYPE: defines the type of cross-validation procedure if NG is not specified
3586 'LOOM' leave-one-out-method
3587 k k-fold crossvalidation
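The NG expressions listed above can be tried out without any classifier. The following sketch builds the three group vectors for a hypothetical set of 10 samples and shows how a held-out group could be selected; the variable names are made up for the example.

```octave
% Group identifiers for different cross-validation schemes (sketch)
n = 10;                             % hypothetical number of samples
K = 5;
ng_loo   = (1:n)';                  % leave-one-out: every sample is its own group
ng_lko   = ceil((1:n)' / K);        % leave-K-out: groups of K consecutive samples
ng_kfold = ceil((1:n)' * K / n);    % K-fold: K groups of about n/K samples
test_idx = (ng_kfold == 2);         % samples held out when group 2 is tested
```

Here ng_lko yields 2 groups of 5 samples, while ng_kfold yields 5 groups of 2 samples.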
3590 R contains the resulting performance metric
3591 CC contains the classifier
3593 plota(R) shows the confusion matrix of the results
3595 see also: TRAIN_SC, TEST_SC, CLASSIFY, PLOTA
3598 [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
3599 John Wiley & Sons, 2001.
3600 [2] A. Schlögl, J. Kronegg, J.E. Huggins, S. G. Mason;
3601 Evaluation criteria in BCI research.
3602 (Eds.) G. Dornhege, J.R. Millan, T. Hinterberger, D.J. McFarland, K.-R.Müller;
3603 Towards Brain-Computer Interfacing, MIT Press, 2007, p.327-342
3607 # name: <cell-element>
3611 XVAL is used for crossvalidation
3616 # name: <cell-element>
3623 # name: <cell-element>
3627 zScoreMedian removes the median and standardizes by the 1.483*median absolute deviation
3629 Usage: Z = zScoreMedian(X, DIM)
3631 DIM: dimension along which z-score should be calculated (1=columns, 2=rows)
3632 (optional, default=first dimension with more than 1 element)
3633 Output: Z : z-scores
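The robust standardization described above can be sketched for a single column vector with core Octave functions. The data are made up, and the median absolute deviation is written out explicitly as median(abs(x - median(x))); this is an illustration of the formula, not the zScoreMedian source.

```octave
% Median-based z-score sketch (illustrative)
x = [1; 2; 3; 4; 100];            % sample with one outlier
med = median(x);                  % location estimate
madv = median(abs(x - med));      % median absolute deviation
z = (x - med) / (1.483 * madv);   % robust z-scores
```

The outlier receives a large score while the scaling itself is hardly affected by it.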
3637 # name: <cell-element>
3641 zScoreMedian removes the median and standardizes by the 1.
3645 # name: <cell-element>
3652 # name: <cell-element>
3656 ZSCORE removes the mean and normalizes the data
3657 to a variance of 1. Can be used for Pre-Whitening of the data, too.
3659 [z,r,m] = zscore(x,DIM)
3660 z z-score of x along dimension DIM
3661 r is the inverse of the standard deviation
3664 The data x can be reconstructed with
3665 x = z*diag(1./r) + repmat(m,size(z)./size(m))
3666 z = x*diag(r) - repmat(m.*r,size(z)./size(m))
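The reconstruction relation can be checked numerically with a small column-wise example in core Octave. The data are made up, and m and r are computed here with mean and std directly instead of calling ZSCORE, so this is a sketch of the relation and not of the function itself.

```octave
% Column-wise z-score and reconstruction check (illustrative sketch)
x = [1 10; 2 20; 3 30; 4 40];
m = mean(x, 1);                                    % column means
r = 1 ./ std(x, 0, 1);                             % inverse standard deviations
z = (x - repmat(m, size(x, 1), 1)) * diag(r);      % standardized data
x2 = z * diag(1 ./ r) + repmat(m, size(z) ./ size(m));   % reconstructed data
err = max(abs(x2(:) - x(:)));                      % should be close to zero
```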
3671 default or []: first DIMENSION, with more than 1 element
3673 see also: SUMSKIPNAN, MEAN, STD, DETREND
3676 [1] http://mathworld.wolfram.com/z-Score.html
3680 # name: <cell-element>
3684 ZSCORE removes the mean and normalizes the data