X-Git-Url: https://git.creatis.insa-lyon.fr/pubgit/?p=CreaPhase.git;a=blobdiff_plain;f=octave_packages%2Fnan-2.5.5%2Fdoc-cache;fp=octave_packages%2Fnan-2.5.5%2Fdoc-cache;h=d7ed2d4ef28fd543f560eeda180e04a91962c020;hp=0000000000000000000000000000000000000000;hb=f5f7a74bd8a4900f0b797da6783be80e11a68d86;hpb=1705066eceaaea976f010f669ce8e972f3734b05 diff --git a/octave_packages/nan-2.5.5/doc-cache b/octave_packages/nan-2.5.5/doc-cache new file mode 100644 index 0000000..d7ed2d4 --- /dev/null +++ b/octave_packages/nan-2.5.5/doc-cache @@ -0,0 +1,3690 @@ +# Created by Octave 3.6.1, Mon Apr 23 21:08:02 2012 UTC +# name: cache +# type: cell +# rows: 3 +# columns: 81 +# name: +# type: sq_string +# elements: 1 +# length: 12 +bland_altman + + +# name: +# type: sq_string +# elements: 1 +# length: 865 + BLAND_ALTMANN shows the Bland-Altman plot of two columns of measurements + and computes several summary results. + + bland_altman(m1, m2 [,group]) + bland_altman(data [, group]) + R = bland_altman(...) + + m1,m2 are two colums with the same number of elements + containing the measurements. m1,m2 can be also combined + in a single two column data matrix. + group [optional] indicates which measurements belong to the same group + This is useful to account for repeated measurements. + + + References: + [1] JM Bland and DG Altman, Measuring agreement in method comparison studies. + Statistical Methods in Medical Research, 1999; 8; 135. + doi:10.1177/09622802990080204 + [2] P.S. 
Myles, Using the Bland– Altman method to measure agreement with repeated measures + British Journal of Anaesthesia 99(3):309–11 (2007) + doi:10.1093/bja/aem214 + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + BLAND_ALTMANN shows the Bland-Altman plot of two columns of measurements + and + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +cat2bin + + +# name: +# type: sq_string +# elements: 1 +# length: 755 + CAT2BIN converts categorial into binary data + each category of each column in D is converted into a logical column + + B = cat2bin(C); + [B,BinLabel] = cat2bin(C,Label); + [B,BinLabel] = cat2bin(C,Label,MODE) + + C categorial data + B binary data + Label description of each column in C + BinLabel description of each column in B + MODE default [], ignores NaN + 'notIgnoreNAN' includes binary column for NaN + 'IgnoreZeros' zeros do not get a separate category + 'IgnoreZeros+NaN' zeros and NaN are ignored + + example: + cat2bin([1;2;5;1;5]) results in + 1 0 0 + 0 1 0 + 0 0 1 + 1 0 0 + 0 0 1 + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + CAT2BIN converts categorial into binary data + each category of each column i + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +cdfplot + + +# name: +# type: sq_string +# elements: 1 +# length: 565 + CDFPLOT plots empirical commulative distribution function + + cdfplot(X) + cdfplot(X, FMT) + cdfplot(X, PROPERTY, VALUE,...) + h = cdfplot(...) + [h,stats] = cdfplot(X) + + X contains the data vector + (matrix data is currently changed to a vector, this might change in future) + FMT,PROPERTY,VALUE + are used for formating; see HELP PLOT for more details + h graphics handle to the cdf curve + stats + a struct containing various summary statistics including + mean, std, median, min, max. 
+ + see also: ecdf, median, statistics, hist2res, plot + + References: + + + +# name: +# type: sq_string +# elements: 1 +# length: 59 + CDFPLOT plots empirical commulative distribution function + + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +center + + +# name: +# type: sq_string +# elements: 1 +# length: 505 + CENTER removes the mean + + [z,mu] = center(x,DIM,W) + removes mean x along dimension DIM + + x input data + DIM dimension + 1: column + 2: row + default or []: first DIMENSION, with more than 1 element + W weights to computed weighted mean (default: [], all weights = 1) + numel(W) must be equal to size(x,DIM) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN, STD, DETREND, ZSCORE + + REFERENCE(S): + + + +# name: +# type: sq_string +# elements: 1 +# length: 26 + CENTER removes the mean + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +classify + + +# name: +# type: sq_string +# elements: 1 +# length: 792 + CLASSIFY classifies sample data into categories + defined by the training data and its group information + + CLASS = classify(sample, training, group) + CLASS = classify(sample, training, group, TYPE) + [CLASS,ERR,POSTERIOR,LOGP,COEF] = CLASSIFY(...) + + CLASS contains the assigned group. + ERR is the classification error on the training set weighted by the + prior probability of each group. + + The same classifier as in TRAIN_SC are supported. + + ATTENTION: no cross-validation is applied, therefore the + classification error is too optimistic (overfitting). + Use XVAL instead to obtain cross-validated performance. + + see also: TRAIN_SC, TEST_SC, XVAL + + References: + [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed. + John Wiley & Sons, 2001. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + CLASSIFY classifies sample data into categories + defined by the training data + + + +# name: +# type: sq_string +# elements: 1 +# length: 24 +coefficient_of_variation + + +# name: +# type: sq_string +# elements: 1 +# length: 221 + COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X) + + cv=coefficient_of_variation(x [,DIM]) + cv=std(x)/mean(x) + + see also: SUMSKIPNAN, MEAN, STD + + REFERENCE(S): + http://mathworld.wolfram.com/VariationCoefficient.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X) + + cv=coefficient_of_variation( + + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +cor + + +# name: +# type: sq_string +# elements: 1 +# length: 576 + COR calculates the correlation matrix + X and Y can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + (Its assumed that the occurence of NaN's is uncorrelated) + The output gives NaN only if there are insufficient input data + + COR(X); + calculates the (auto-)correlation matrix of X + COR(X,Y); + calculates the crosscorrelation between X and Y + + c = COR(...); + c is the correlation matrix + + W weights to compute weighted mean (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + COR calculates the correlation matrix + X and Y can contain missing values encod + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +corrcoef + + +# name: +# type: sq_string +# elements: 1 +# length: 4692 + CORRCOEF calculates the correlation matrix from pairwise correlations. + The input data can contain missing values encoded with NaN. + Missing data (NaN's) are handled by pairwise deletion [15]. + In order to avoid possible pitfalls, use case-wise deletion or + or check the correlation of NaN's with your data (see below). 
+ A significance test for testing the Hypothesis + 'correlation coefficient R is significantly different to zero' + is included. + + [...] = CORRCOEF(X); + calculates the (auto-)correlation matrix of X + [...] = CORRCOEF(X,Y); + calculates the crosscorrelation between X and Y + + [...] = CORRCOEF(..., Mode); + Mode='Pearson' or 'parametric' [default] + gives the correlation coefficient + also known as the 'product-moment coefficient of correlation' + or 'Pearson''s correlation' [1] + Mode='Spearman' gives 'Spearman''s Rank Correlation Coefficient' + This replaces SPEARMAN.M + Mode='Rank' gives a nonparametric Rank Correlation Coefficient + This is the "Spearman rank correlation with proper handling of ties" + This replaces RANKCORR.M + + [...] = CORRCOEF(..., param1, value1, param2, value2, ... ); + param value + 'Mode' type of correlation + 'Pearson','parametric' + 'Spearman' + 'rank' + 'rows' how do deal with missing values encoded as NaN's. + 'complete': remove all rows with at least one NaN + 'pairwise': [default] + 'alpha' 0.01 : significance level to compute confidence interval + + [R,p,ci1,ci2,nansig] = CORRCOEF(...); + R is the correlation matrix + R(i,j) is the correlation coefficient r between X(:,i) and Y(:,j) + p gives the significance of R + It tests the null hypothesis that the product moment correlation coefficient is zero + using Student's t-test on the statistic t = r*sqrt(N-2)/sqrt(1-r^2) + where N is the number of samples (Statistics, M. Spiegel, Schaum series). + p > alpha: do not reject the Null hypothesis: 'R is zero'. + p < alpha: The alternative hypothesis 'R is larger than zero' is true with probability (1-alpha). + ci1 lower (1-alpha) confidence interval + ci2 upper (1-alpha) confidence interval + If no alpha is provided, the default alpha is 0.01. This can be changed with function flag_implicit_significance. 
+ nan_sig p-value whether H0: 'NaN''s are not correlated' could be correct + if nan_sig < alpha, H1 ('NaNs are correlated') is very likely. + + The result is only valid if the occurence of NaN's is uncorrelated. In + order to avoid this pitfall, the correlation of NaN's should be checked + or case-wise deletion should be applied. + Case-Wise deletion can be implemented + ix = ~any(isnan([X,Y]),2); + [...] = CORRCOEF(X(ix,:),Y(ix,:),...); + + Correlation (non-random distribution) of NaN's can be checked with + [nan_R,nan_sig]=corrcoef(X,isnan(X)) + or [nan_R,nan_sig]=corrcoef([X,Y],isnan([X,Y])) + or [R,p,ci1,ci2] = CORRCOEF(...); + + Further recommandation related to the correlation coefficient: + + LOOK AT THE SCATTERPLOTS to make sure that the relationship is linear + + Correlation is not causation because + it is not clear which parameter is 'cause' and which is 'effect' and + the observed correlation between two variables might be due to the action of other, unobserved variables. + + see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS, + PARTCORRCOEF, flag_implicit_significance + + REFERENCES: + on the correlation coefficient + [ 1] http://mathworld.wolfram.com/CorrelationCoefficient.html + [ 2] http://www.geography.btinternet.co.uk/spearman.htm + [ 3] Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 5th ed. New York: Macmillan, pp. 338 and 400, 1995. + [ 4] Lehmann, E. L. and D'Abrera, H. J. M. Nonparametrics: Statistical Methods Based on Ranks, rev. ed. Englewood Cliffs, NJ: Prentice-Hall, pp. 292, 300, and 323, 1998. + [ 5] Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 
634-637, 1992 + [ 6] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html + on the significance test of the correlation coefficient + [11] http://www.met.rdg.ac.uk/cag/STATS/corr.html + [12] http://www.janda.org/c10/Lectures/topic06/L24-significanceR.htm + [13] http://faculty.vassar.edu/lowry/ch4apx.html + [14] http://davidmlane.com/hyperstat/B134689.html + [15] http://www.statsoft.com/textbook/stbasic.html%Correlations + others + [20] http://www.tufts.edu/~gdallal/corr.htm + [21] Fisher transformation http://en.wikipedia.org/wiki/Fisher_transformation + + + +# name: +# type: sq_string +# elements: 1 +# length: 71 + CORRCOEF calculates the correlation matrix from pairwise correlations. + + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +cov + + +# name: +# type: sq_string +# elements: 1 +# length: 1606 + COV covariance matrix + X and Y can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + The output gives NaN only if there are insufficient input data + The mean is removed from the data. + + Remark: for data contains missing values, the resulting + matrix might not be positiv definite, and its elements have magnitudes + larger than one. This ill-behavior is more likely for small sample + sizes, but there is no garantee that the result "behaves well" for larger + sample sizes. If you want the a "well behaved" result (i.e. positive + definiteness and magnitude of elements not larger than 1), use CORRCOEF. + However, COV is faster than CORRCOEF and might be good enough in some cases. + + C = COV(X [,Mode]); + calculates the (auto-)correlation matrix of X + C = COV(X,Y [,Mode]); + calculates the crosscorrelation between X and Y. + C(i,j) is the correlation between the i-th and jth + column of X and Y, respectively. + NOTE: Octave and Matlab have (in some special cases) incompatible implemenations. + This implementation follows Octave. 
If the result could be ambigous or + incompatible, a warning will be presented in Matlab. To avoid this warning use: + a) use COV([X(:),Y(:)]) if you want the traditional Matlab result. + b) use C = COV([X,Y]), C = C(1:size(X,2),size(X,2)+1:size(C,2)); if you want to be compatible with this software. + + Mode = 0 [default] scales C by (N-1) + Mode = 1 scales C by N. + + see also: COVM, COR, CORRCOEF, SUMSKIPNAN + + REFERENCES: + http://mathworld.wolfram.com/Covariance.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 76 + COV covariance matrix + X and Y can contain missing values encoded with NaN. + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +covm + + +# name: +# type: sq_string +# elements: 1 +# length: 1182 + COVM generates covariance matrix + X and Y can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + The output gives NaN only if there are insufficient input data + + COVM(X,Mode); + calculates the (auto-)correlation matrix of X + COVM(X,Y,Mode); + calculates the crosscorrelation between X and Y + COVM(...,W); + weighted crosscorrelation + + Mode = 'M' minimum or standard mode [default] + C = X'*X; or X'*Y correlation matrix + + Mode = 'E' extended mode + C = [1 X]'*[1 X]; % l is a matching column of 1's + C is additive, i.e. it can be applied to subsequent blocks and summed up afterwards + the mean (or sum) is stored on the 1st row and column of C + + Mode = 'D' or 'D0' detrended mode + the mean of X (and Y) is removed. If combined with extended mode (Mode='DE'), + the mean (or sum) is stored in the 1st row and column of C. + The default scaling is factor (N-1). + Mode = 'D1' is the same as 'D' but uses N for scaling. + + C = covm(...); + C is the scaled by N in Mode M and by (N-1) in mode D. + [C,N] = covm(...); + C is not scaled, provides the scaling factor N + C./N gives the scaled version. 
+ + see also: DECOVM, XCOVF + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + COVM generates covariance matrix + X and Y can contain missing values encoded wi + + + +# name: +# type: sq_string +# elements: 1 +# length: 13 +cumsumskipnan + + +# name: +# type: sq_string +# elements: 1 +# length: 249 + CUMSUMSKIPNAN Cumulative sum while skiping NaN's. + If DIM is omitted, it defaults to the first non-singleton dimension. + + Y = cumsumskipnan(x [,DIM]) + + x input data + DIM dimension (default: []) + y resulting sum + + see also: CUMSUM, SUMSKIPNAN + + + +# name: +# type: sq_string +# elements: 1 +# length: 51 + CUMSUMSKIPNAN Cumulative sum while skiping NaN's. + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +decovm + + +# name: +# type: sq_string +# elements: 1 +# length: 384 + decompose extended covariance matrix into mean (mu), + standard deviation, the (pure) Covariance (COV), + correlation (xc) matrix and the correlation coefficients R2. + NaN's are condsidered as missing values. 
+ [mu,sd,COV,xc,N,R2]=decovm(ECM[,NN]) + + ECM is the extended covariance matrix + NN is the number of elements, each estimate (in ECM) is based on + + see also: MDBC, COVM, R2 + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + decompose extended covariance matrix into mean (mu), + standard deviation, the + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +detrend + + +# name: +# type: sq_string +# elements: 1 +# length: 837 + DETREND removes the trend from data, NaN's are considered as missing values + + DETREND is fully compatible to previous Matlab and Octave DETREND with the following features added: + - handles NaN's by assuming that these are missing values + - handles unequally spaced data + - second output parameter gives the trend of the data + - compatible to Matlab and Octave + + [...]=detrend([t,] X [,p]) + removes trend for unequally spaced data + t represents the time points + X(i) is the value at time t(i) + p must be a scalar + + [...]=detrend(X,0) + [...]=detrend(X,'constant') + removes the mean + + [...]=detrend(X,p) + removes polynomial of order p (default p=1) + + [...]=detrend(X,1) - default + [...]=detrend(X,'linear') + removes linear trend + + [X,T]=detrend(...) + + X is the detrended data + T is the removed trend + + see also: SUMSKIPNAN, ZSCORE + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + DETREND removes the trend from data, NaN's are considered as missing values + + + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +ecdf + + +# name: +# type: sq_string +# elements: 1 +# length: 443 + ECDF empirical cumulative function + NaN's are considered Missing values and are ignored. + + [F,X] = ecdf(Y) + calculates empirical cumulative distribution functions (i.e Kaplan-Meier estimate) + ecdf(Y) + ecdf(gca,Y) + without output arguments plots the empirical cdf, in axis gca. + + Y input data + must be a vector or matrix, in case Y is a matrix, the ecdf for every column is computed. 
+ + see also: HISTO2, HISTO3, PERCENTILE, QUANTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + ECDF empirical cumulative function + NaN's are considered Missing values and + + + +# name: +# type: sq_string +# elements: 1 +# length: 19 +flag_accuracy_level + + +# name: +# type: sq_string +# elements: 1 +# length: 1033 + FLAG_ACCURACY_LEVEL sets and gets accuracy level + used in SUMSKIPNAN_MEX and COVM_MEX + The error margin of the naive summation is N*eps (N is the number of samples), + the error margin is only 2*eps if Kahan's summation is used [1]. + + 0: maximum speed [default] + accuracy of double (64bit) with naive summation (error = N*2^-52) + 1: accuracy of extended (80bit) with naive summation (error = N*2^-64) + 2: accuracy of double (64bit) with Kahan summation (error = 2^-52) + 3: accuracy of extended (80bit) with Kahan summation (error = 2^-64) + + Please note, level 3 might be equally accurate but slower than 1 or 2 on + some platforms. In order to determine what is good for you, you might want + to run ACCTEST. + + FLAG = flag_accuracy_level() + gets current level + flag_accuracy_level(FLAG) + sets accuracy level + + see also: ACCTEST + + Reference: + [1] David Goldberg, + What Every Computer Scientist Should Know About Floating-Point Arithmetic + ACM Computing Surveys, Vol 23, No 1, March 1991. + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + FLAG_ACCURACY_LEVEL sets and gets accuracy level + used in SUMSKIPNAN_MEX and + + + +# name: +# type: sq_string +# elements: 1 +# length: 26 +flag_implicit_significance + + +# name: +# type: sq_string +# elements: 1 +# length: 928 + The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state. + flag_implicit_significance might even become obsolete. 
+ + FLAG_IMPLICIT_SIGNIFICANCE sets and gets default alpha (level) of any significance test + The default alpha-level is stored in the global variable FLAG_implicit_significance + The idea is that the significance must not be assigned explicitely. + This might yield more readable code. + + Choose alpha low enough, because in alpha*100% of the cases, you will + reject the Null hypothesis just by chance. For this reason, the default + alpha is 0.01. + + flag_implicit_significance(0.01) + sets the alpha-level for the significance test + + alpha = flag_implicit_significance() + gets default alpha + + flag_implicit_significance(alpha) + sets default alpha-level + + alpha = flag_implicit_significance(alpha) + gets and sets alpha + + features: + - compatible to Matlab and Octave + + see also: CORRCOEF, PARTCORRCOEF + + + +# name: +# type: sq_string +# elements: 1 +# length: 64 + The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state. + + + +# name: +# type: sq_string +# elements: 1 +# length: 22 +flag_implicit_skip_nan + + +# name: +# type: sq_string +# elements: 1 +# length: 934 + FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs + 1 skips NaN's (the default mode if no mode is set) + 0 NaNs are propagated; input NaN's give NaN's at the output + + FLAG = flag_implicit_skip_nan() + gets current mode + + flag_implicit_skip_nan(FLAG) + sets mode + + prevFLAG = flag_implicit_skip_nan(nextFLAG) + gets previous set FLAG and sets FLAG for the future + flag_implicit_skip_nan(prevFLAG) + resets FLAG to previous mode + + It is used in: + SUMSKIPNAN, MEDIAN, QUANTILES, TRIMEAN + and affects many other functions like: + CENTER, KURTOSIS, MAD, MEAN, MOMENT, RMS, SEM, SKEWNESS, + STATISTIC, STD, VAR, ZSCORE etc. + + The mode is stored in the global variable FLAG_implicit_skip_nan + It is recommended to use flag_implicit_skip_nan(1) as default and + flag_implicit_skip_nan(0) should be used for exceptional cases only. 
+ This feature might disappear without further notice, so you should really not + rely on it. + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs + 1 skips Na + + + +# name: +# type: sq_string +# elements: 1 +# length: 17 +flag_nans_occured + + +# name: +# type: sq_string +# elements: 1 +# length: 430 + FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm + contained any not-a-numbers in the input argument. Because many other + functions like mean, std, etc. are also using sumskipnan, + also these functions can be checked for NaN's in the input data. + + A call to FLAG_NANS_OCCURED() resets also the flag whether NaN's occured. + Only sumskipnan or covm can set the flag again. + + see also: SUMSKIPNAN, COVM + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm + conta + + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +fss + + +# name: +# type: sq_string +# elements: 1 +# length: 1739 + FSS - feature subset selection and feature ranking + the method is motivated by the max-relevance-min-redundancy (mRMR) + approach [1]. However, the default method uses partial correlation, + which has been developed from scratch. PCCM [3] describes + a similar idea, but is more complicated. + An alternative method based on FSDD is implemented, too. + + [idx,score] = fss(D,cl) + [idx,score] = fss(D,cl,MODE) + [idx,score] = fss(D,cl,MODE) + + D data - each column represents a feature + cl classlabel + Mode 'Pearson' [default] correlation + 'rank' correlation + 'FSDD' feature selection algorithm based on a distance discriminant [2] + %%% 'MRMR','MID','MIQ' max-relevance, min redundancy [1] - not supported yet. 
+ + score score of the feature + idx ranking of the feature + [tmp,idx]=sort(-score) + + see also: TRAIN_SC, XVAL, ROW_COL_DELETION + + REFERENCES: + [1] Peng, H.C., Long, F., and Ding, C., + Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, + IEEE Transactions on Pattern Analysis and Machine Intelligence, + Vol. 27, No. 8, pp.1226-1238, 2005. + [2] Jianning Liang, Su Yang, Adam Winstanley, + Invariant optimal feature selection: A distance discriminant and feature ranking based solution, + Pattern Recognition, Volume 41, Issue 5, May 2008, Pages 1429-1439. + ISSN 0031-3203, DOI: 10.1016/j.patcog.2007.10.018. + [3] K. Raghuraj Rao and S. Lakshminarayanan + Partial correlation based variable selection approach for multivariate data classification methods + Chemometrics and Intelligent Laboratory Systems + Volume 86, Issue 1, 15 March 2007, Pages 68-81 + http://dx.doi.org/10.1016/j.chemolab.2006.08.007 + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + FSS - feature subset selection and feature ranking + the method is motivated + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +geomean + + +# name: +# type: sq_string +# elements: 1 +# length: 1207 + GEOMEAN calculates the geomentric mean of data elements. + + y = geomean(x [,DIM [,W]]) is the same as + y = mean(x,'G' [,DIM]) + + DIM dimension + 1 STD of columns + 2 STD of rows + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted mean (default: []) + if W=[], all weights are 1. 
+ number of elements in W must match size(x,DIM) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN, HARMMEAN + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; If not, see . + + + +# name: +# type: sq_string +# elements: 1 +# length: 57 + GEOMEAN calculates the geomentric mean of data elements. + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +gscatter + + +# name: +# type: sq_string +# elements: 1 +# length: 471 + GSCATTER scatter plot of groups + + gscatter(x,y,group) + gscatter(x,y,group,clr,sym,siz) + gscatter(x,y,group,clr,sym,siz,doleg) + gscatter(x,y,group,clr,sym,siz,doleg,xname,yname) + h = gscatter(...) + + x,y, group: vectors with equal length + clf: color vector, default 'bgrcmyk' + sym: symbol, default '.' + siz: size of Marker + doleg: 'on' (default) shows legend, 'off' turns of legend + xname, yname: name of axis + + + see also: ecdf, cdfplot + + References: + + + +# name: +# type: sq_string +# elements: 1 +# length: 34 + GSCATTER scatter plot of groups + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +harmmean + + +# name: +# type: sq_string +# elements: 1 +# length: 629 + HARMMEAN calculates the harmonic mean of data elements. + The harmonic mean is the inverse of the mean of the inverse elements. 
+ + y = harmmean(x [,DIM [,W]]) is the same as + y = mean(x,'H' [,DIM [,W]]) + + DIM dimension + 1 STD of columns + 2 STD of rows + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted mean (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN, GEOMEAN + + + + +# name: +# type: sq_string +# elements: 1 +# length: 56 + HARMMEAN calculates the harmonic mean of data elements. + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +hist2res + + +# name: +# type: sq_string +# elements: 1 +# length: 700 + Evaluates Histogram data + [R]=hist2res(H) + + [y]=hist2res(H,fun) + estimates fun-statistic + + fun 'mean' mean + 'std' standard deviation + 'var' variance + 'sem' standard error of the mean + 'rms' root mean square + 'meansq' mean of squares + 'sum' sum + 'sumsq' sum of squares + 'CM#' central moment of order # + 'skewness' skewness + 'kurtosis' excess coefficient (Fisher kurtosis) + + see also: NaN/statistic + + REFERENCES: + [1] C.L. Nikias and A.P. Petropulu "Higher-Order Spectra Analysis" Prentice Hall, 1993. + [2] C.E. Shannon and W. Weaver "The mathematical theory of communication" University of Illinois Press, Urbana 1949 (reprint 1963). + [3] http://www.itl.nist.gov/ + [4] http://mathworld.wolfram.com/ + + + +# name: +# type: sq_string +# elements: 1 +# length: 43 + Evaluates Histogram data + [R]=hist2res(H) + + + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +iqr + + +# name: +# type: sq_string +# elements: 1 +# length: 372 + IQR calculates the interquartile range + Missing values (encoded as NaN) are ignored. + + Q = iqr(Y) + Q = iqr(Y,DIM) + returns the IQR along dimension DIM of sample array Y. + + Q = iqr(HIS) + returns the IQR from the histogram HIS. 
+ HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3. + + see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + IQR calculates the interquartile range + Missing values (encoded as NaN) are + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +kappa + + +# name: +# type: sq_string +# elements: 1 +# length: 1760 + KAPPA estimates Cohen's kappa coefficient + and related statistics + + [...] = kappa(d1,d2); + NaN's are handled as missing values and are ignored + [...] = kappa(d1,d2,'notIgnoreNAN'); + NaN's are handled as just another Label. + [kap,sd,H,z,ACC,sACC,MI] = kappa(...); + X = kappa(...); + + d1 data of scorer 1 + d2 data of scorer 2 + + kap Cohen's kappa coefficient point + se standard error of the kappa estimate + H Concordance matrix, i.e. confusion matrix + z z-score + ACC overall agreement (accuracy) + sACC specific accuracy + MI Mutual information or transfer information (in [bits]) + X is a struct containing all the fields above + For two classes, a number of additional summary statistics including + TPR, FPR, FDR, PPV, NPF, F1, dprime, Matthews Correlation coefficient (MCC) or + Phi coefficient (PHI=MCC), Specificity and Sensitivity + are provided. Note, the positive category must the larger label (in d and c), otherwise + the confusion matrix becomes transposed and the summary statistics are messed up. + + + Reference(s): + [1] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46. + [2] J Bortz, GA Lienert (1998) Kurzgefasste Statistik f|r die klassische Forschung, Springer Berlin - Heidelberg. + Kapitel 6: Uebereinstimmungsmasze fuer subjektive Merkmalsurteile. p. 265-270. + [3] http://www.cmis.csiro.au/Fiona.Evans/personal/msc/html/chapter3.html + [4] Kraemer, H. C. (1982). Kappa coefficient. In S. Kotz and N. L. Johnson (Eds.), + Encyclopedia of Statistical Sciences. New York: John Wiley & Sons. 
+ [5] http://ourworld.compuserve.com/homepages/jsuebersax/kappa.htm + [6] http://en.wikipedia.org/wiki/Receiver_operating_characteristic + + + +# name: +# type: sq_string +# elements: 1 +# length: 70 + KAPPA estimates Cohen's kappa coefficient + and related statistics + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +kurtosis + + +# name: +# type: sq_string +# elements: 1 +# length: 461 + KURTOSIS estimates the kurtosis + + y = kurtosis(x,DIM) + calculates kurtosis of x in dimension DIM + + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, VAR, STD, VAR, SKEWNESS, MOMENT, STATISTIC, + IMPLICIT_SKIP_NAN + + REFERENCE(S): + http://mathworld.wolfram.com/ + + + +# name: +# type: sq_string +# elements: 1 +# length: 33 + KURTOSIS estimates the kurtosis + + + + +# name: +# type: sq_string +# elements: 1 +# length: 15 +load_fisheriris + + +# name: +# type: sq_string +# elements: 1 +# length: 446 + LOAD_FISHERIRIS + loads famous iris data set from Fisher, 1936 [1]. + + References: + [1] Fisher,R.A. "The use of multiple measurements in taxonomic problems" + Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950). + [2] Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis. + (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218. + + + +# name: +# type: sq_string +# elements: 1 +# length: 75 + LOAD_FISHERIRIS + loads famous iris data set from Fisher, 1936 [1]. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +mad + + +# name: +# type: sq_string +# elements: 1 +# length: 855 + MAD estimates the Mean Absolute deviation + (note that according to [1,2] this is the mean deviation; + not the mean absolute deviation) + + y = mad(x,DIM) + calculates the mean deviation of x in dimension DIM + + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, VAR, STD, + + REFERENCE(S): + [1] http://mathworld.wolfram.com/MeanDeviation.html + [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253. + + [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html + [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962. + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + MAD estimates the Mean Absolute deviation + (note that according to [1,2] this i + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +mahal + + +# name: +# type: sq_string +# elements: 1 +# length: 395 + MAHAL return the Mahalanobis' D-square distance between the + multivariate samples x and y, which must have the same number + of components (columns), but may have a different number of observations (rows). 
+ + d = mahal(X,Y) + + d(k) = (X(k,:)-MU)*inv(SIGMA)*(X(k,:)-MU)' + + where MU and SIGMA are the mean and the covariance matrix of Y + + + see also: TRAIN_SC, TEST_SC, COVM + + References: + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + MAHAL return the Mahalanobis' D-square distance between the + multivariate samp + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +make + + +# name: +# type: sq_string +# elements: 1 +# length: 46 + This make.m is used for Matlab under Windows + + + +# name: +# type: sq_string +# elements: 1 +# length: 11 + This make. + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +mean + + +# name: +# type: sq_string +# elements: 1 +# length: 735 + MEAN calculates the mean of data elements. + + y = mean(x [,DIM] [,opt] [, W]) + + DIM dimension + 1 MEAN of columns + 2 MEAN of rows + N MEAN of N-th dimension + default or []: first DIMENSION, with more than 1 element + + opt options + 'A' arithmetic mean + 'G' geometric mean + 'H' harmonic mean + + W weights to compute weighted mean (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + usage: + mean(x) + mean(x,DIM) + mean(x,opt) + mean(x,opt,DIM) + mean(x,DIM,opt) + mean(x,DIM,W) + mean(x,DIM,opt,W); ' + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN, GEOMEAN, HARMMEAN + + + + +# name: +# type: sq_string +# elements: 1 +# length: 43 + MEAN calculates the mean of data elements. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +meandev + + +# name: +# type: sq_string +# elements: 1 +# length: 856 + MEANDEV estimates the Mean deviation + (note that according to [1,2] this is the mean deviation; + not the mean absolute deviation) + + y = meandev(x,DIM) + calculates the mean deviation of x in dimension DIM + + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, VAR, STD, MAD + + REFERENCE(S): + [1] http://mathworld.wolfram.com/MeanDeviation.html + [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253. + [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html + [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962. + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + MEANDEV estimates the Mean deviation + (note that according to [1,2] this is the + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +meansq + + +# name: +# type: sq_string +# elements: 1 +# length: 527 + MEANSQ calculates the mean of the squares + + y = meansq(x,DIM,W) + + DIM dimension + 1 STD of columns + 2 STD of rows + N STD of N-th dimension + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted mean (default: []) + if W=[], all weights are 1. 
+ number of elements in W must match size(x,DIM) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: SUMSQ, SUMSKIPNAN, MEAN, VAR, STD, RMS + + + +# name: +# type: sq_string +# elements: 1 +# length: 43 + MEANSQ calculates the mean of the squares + + + + +# name: +# type: sq_string +# elements: 1 +# length: 9 +medAbsDev + + +# name: +# type: sq_string +# elements: 1 +# length: 373 + medAbsDev calculates the median absolute deviation + + Usage: D = medAbsDev(X, DIM) + or: [D, M] = medAbsDev(X, DIM) + Input: X : data + DIM: dimension along which mad should be calculated (1=columns, 2=rows) + (optional, default=first dimension with more than 1 element + Output: D : median absolute deviations + M : medians (optional) + + + +# name: +# type: sq_string +# elements: 1 +# length: 53 + medAbsDev calculates the median absolute deviation + + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +median + + +# name: +# type: sq_string +# elements: 1 +# length: 366 + MEDIAN data elements, + [y]=median(x [,DIM]) + + DIM dimension + 1: median of columns + 2: median of rows + N: median of N-th dimension + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - accepts dimension argument like in Matlab in Octave, too. 
+ - compatible to Matlab and Octave + + see also: SUMSKIPNAN + + + +# name: +# type: sq_string +# elements: 1 +# length: 46 + MEDIAN data elements, + [y]=median(x [,DIM]) + + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +moment + + +# name: +# type: sq_string +# elements: 1 +# length: 627 + MOMENT estimates the p-th moment + + M = moment(x, p [,opt] [,DIM]) + M = moment(H, p [,opt]) + calculates p-th central moment from data x in dimension DIM + of from Histogram H + + p moment of order p + opt 'ac': absolute 'a' and/or central ('c') moment + DEFAULT: '' raw moments are estimated + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: STD, VAR, SKEWNESS, KURTOSIS, STATISTIC, + + REFERENCE(S): + http://mathworld.wolfram.com/Moment.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + MOMENT estimates the p-th moment + + M = moment(x, p [,opt] [,DIM]) + M = moment + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +nanconv + + +# name: +# type: sq_string +# elements: 1 +# length: 616 + NANCONV computes the convolution for data with missing values. + X and Y can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + The output gives NaN only if there are insufficient input data + + [...] = NANCONV(X,Y); + calculates 2-dim convolution between X and Y + [C] = NANCONV(X,Y); + + WARNING: missing values can introduce aliasing - causing unintended results. + Moreover, the behavior of bandpass and highpass filters in case of missing values + is not fully understood, and might contain some pitfalls. + + see also: CONV, NANCONV2, NANFFT, NANFILTER + + + +# name: +# type: sq_string +# elements: 1 +# length: 63 + NANCONV computes the convolution for data with missing values. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +nanfft + + +# name: +# type: sq_string +# elements: 1 +# length: 618 + NANFFT calculates the Fourier-Transform of X for data with missing values. + NANFFT is the same as FFT but X can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + + Y = NANFFT(X) + Y = NANFFT(X,N) + Y = NANFFT(X,[],DIM) + + [Y,N] = NANFFT(...) + returns the number of valid samples N + + + WARNING: missing values can introduce aliasing - causing unintended results. + Moreover, the behavior of bandpass and highpass filters in case of missing values + is not fully understood, and might contain some pitfalls. + + see also: FFT, XCORR, NANCONV, NANFILTER + + + +# name: +# type: sq_string +# elements: 1 +# length: 75 + NANFFT calculates the Fourier-Transform of X for data with missing values. + + + +# name: +# type: sq_string +# elements: 1 +# length: 9 +nanfilter + + +# name: +# type: sq_string +# elements: 1 +# length: 519 + NANFILTER is able to filter data with missing values encoded as NaN. + + [Y,Z] = nanfilter(B,A,X [, Z]); + + If X contains no missing data, NANFILTER should behave like FILTER. + NaN-values are handled gracefully. + + WARNING: missing values can introduce aliasing - causing unintended results. + Moreover, the behavior of bandpass and highpass filters in case of missing values + is not fully understood, and might contain some pitfalls. + + see also: FILTER, SUMSKIPNAN, NANFFT, NANCONV, NANFILTER1UC + + + +# name: +# type: sq_string +# elements: 1 +# length: 69 + NANFILTER is able to filter data with missing values encoded as NaN. + + + +# name: +# type: sq_string +# elements: 1 +# length: 12 +nanfilter1uc + + +# name: +# type: sq_string +# elements: 1 +# length: 257 + NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN. + + [Y,Z] = nanfilter1uc(uc,X [, Z]); + + if X contains no missing data, NANFILTER behaves like FILTER(uc,[1,uc-1],X[,Z]). 
+ + see also: FILTER, NANFILTER, SUMSKIPNAN + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN. + + + +# name: +# type: sq_string +# elements: 1 +# length: 11 +naninsttest + + +# name: +# type: sq_string +# elements: 1 +# length: 112 + NANINSTTEST checks whether the functions from NaN-toolbox have been + correctly installed. + + see also: NANTEST + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + NANINSTTEST checks whether the functions from NaN-toolbox have been + correctly + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +nanmean + + +# name: +# type: sq_string +# elements: 1 +# length: 330 + NANMEAN same as SUM but ignores NaN's. + NANMEAN is OBSOLETE; use MEAN instead. NANMEAN is included + to provide backward compatibility + + Y = nanmean(x [,DIM]) + + DIM dimension + 1 sum of columns + 2 sum of rows + default or []: first DIMENSION with more than 1 element + Y resulting mean + + + see also: MEAN, SUMSKIPNAN, NANSUM + + + +# name: +# type: sq_string +# elements: 1 +# length: 39 + NANMEAN same as SUM but ignores NaN's. + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +nanstd + + +# name: +# type: sq_string +# elements: 1 +# length: 518 + NANSTD same as STD but ignores NaN's. + NANSTD is OBSOLETE; use NaN/STD instead. NANSTD is included + to fix a bug in alternative implementations and to + provide some compatibility. + + Y = nanstd(x, FLAG, [,DIM]) + + x data + FLAG 0: [default] normalizes with (N-1), N = sample size + FLAG 1: normalizes with N, N = sample size + DIM dimension + 1 sum of columns + 2 sum of rows + default or []: first DIMENSION with more than 1 element + Y resulting standard deviation + + see also: SUM, SUMSKIPNAN, NANSUM, STD + + + +# name: +# type: sq_string +# elements: 1 +# length: 38 + NANSTD same as STD but ignores NaN's. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +nansum + + +# name: +# type: sq_string +# elements: 1 +# length: 333 + NANSUM same as SUM but ignores NaN's. + NANSUM is OBSOLETE; use SUMSKIPNAN instead. NANSUM is included + to fix a bug in some other versions. + + Y = nansum(x [,DIM]) + + DIM dimension + 1 sum of columns + 2 sum of rows + default or []: first DIMENSION with more than 1 element + Y resulting sum + + + see also: SUM, SUMSKIPNAN, NANSUM + + + +# name: +# type: sq_string +# elements: 1 +# length: 38 + NANSUM same as SUM but ignores NaN's. + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +nantest + + +# name: +# type: sq_string +# elements: 1 +# length: 366 + NANTEST checks several mathematical operations and a few + statistical functions for their correctness related to NaN's. + e.g. it checks norminv, normcdf, normpdf, sort, matrix division and multiplication. + + + see also: NANINSTTEST + + REFERENCE(S): + [1] W. Kahan (1996) Lecture notes on the Status of "IEEE Standard 754 for + Binary Floating-point Arithmetic. + + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + NANTEST checks several mathematical operations and a few + statistical function + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +normcdf + + +# name: +# type: sq_string +# elements: 1 +# length: 290 + NORMCDF returns normal cumulative distribtion function + + cdf = normcdf(x,m,s); + + Computes the CDF of a the normal distribution + with mean m and standard deviation s + default: m=0; s=1; + x,m,s must be matrices of same size, or any one can be a scalar. 
+ + see also: NORMPDF, NORMINV + + + +# name: +# type: sq_string +# elements: 1 +# length: 56 + NORMCDF returns normal cumulative distribtion function + + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +norminv + + +# name: +# type: sq_string +# elements: 1 +# length: 341 + NORMINV returns inverse cumulative function of the normal distribution + + x = norminv(p,m,s); + + Computes the quantile (inverse of the CDF) of a the normal + cumulative distribution with mean m and standard deviation s + default: m=0; s=1; + p,m,s must be matrices of same size, or any one can be a scalar. + + see also: NORMPDF, NORMCDF + + + +# name: +# type: sq_string +# elements: 1 +# length: 72 + NORMINV returns inverse cumulative function of the normal distribution + + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +normpdf + + +# name: +# type: sq_string +# elements: 1 +# length: 279 + NORMPDF returns normal probability density + + pdf = normpdf(x,m,s); + + Computes the PDF of a the normal distribution + with mean m and standard deviation s + default: m=0; s=1; + x,m,s must be matrices of same size, or any one can be a scalar. + + see also: NORMCDF, NORMINV + + + +# name: +# type: sq_string +# elements: 1 +# length: 45 + NORMPDF returns normal probability density + + + + +# name: +# type: sq_string +# elements: 1 +# length: 12 +partcorrcoef + + +# name: +# type: sq_string +# elements: 1 +# length: 2015 + PARTCORRCOEF calculates the partial correlation between X and Y + after removing the influence of Z. + X, Y and Z can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + (Its assumed that the occurence of NaN's is uncorrelated) + The output gives NaN, only if there are insufficient input data. 
+ + The partial correlation is defined as + pcc(xy|z)=(cc(x,y)-cc(x,z)*cc(y,z))/sqrt((1-cc(x,z)^2)*(1-cc(y,z)^2)) + + + PARTCORRCOEF(X [,Mode]); + calculates the (auto-)correlation matrix of X + PARTCORRCOEF(X,Y,Z); + PARTCORRCOEF(X,Y,Z,[]); + PARTCORRCOEF(X,Y,Z,'Pearson'); + PARTCORRCOEF(X,Y,Z,'Rank'); + PARTCORRCOEF(X,Y,Z,'Spearman'); + + Mode=[] [default] + removes from X and Y the part that can be explained by Z + and computes the correlation of the remaining part. + Ideally, this is equivalent to Mode='Pearson', however, in practice + this is more accurate. + Mode='Pearson' or 'parametric' + Mode='Spearman' + Mode='Rank' + computes the partial correlation based on cc(x,y),cc(x,z) and cc(y,z) + with the respective mode. + + [R,p,ci1,ci2] = PARTCORRCOEF(...); + r is the partial correlation matrix + r(i,j) is the partial correlation coefficient r between X(:,i) and Y(:,j) + when influence of Z is removed. + p gives the significance of PCC + It tests the null hypothesis that the product moment correlation coefficient is zero + using Student's t-test on the statistic t = r sqrt(N-Nz-2)/sqrt(1-r^2) + where N is the number of samples (Statistics, M. Spiegel, Schaum series). + p > alpha: do not reject the Null hypothesis: "R is zero". + p < alpha: The alternative hypothesis "R2 is larger than zero" is true with probability (1-alpha). 
+ ci1 lower 0.95 confidence interval + ci2 upper 0.95 confidence interval + + see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS, CORRCOEF + + REFERENCES: + on the partial correlation coefficient + [1] http://www.tufts.edu/~gdallal/partial.htm + [2] http://www.nag.co.uk/numeric/fl/manual/pdf/G02/g02byf.pdf + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + PARTCORRCOEF calculates the partial correlation between X and Y + after removing + + + +# name: +# type: sq_string +# elements: 1 +# length: 10 +percentile + + +# name: +# type: sq_string +# elements: 1 +# length: 554 + PERCENTILE calculates the percentiles of histograms and sample arrays. + + Q = percentile(Y,q) + Q = percentile(Y,q,DIM) + returns the q-th percentile along dimension DIM of sample array Y. + size(Q) is equal size(Y) except for dimension DIM which is size(Q,DIM)=length(Q) + + Q = percentile(HIS,q) + returns the q-th percentile from the histogram HIS. + HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3. + If q is a vector, the each row of Q returns the q(i)-th percentile + + see also: HISTO2, HISTO3, QUANTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 71 + PERCENTILE calculates the percentiles of histograms and sample arrays. + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +prctile + + +# name: +# type: sq_string +# elements: 1 +# length: 576 + PRCTILE calculates the percentiles of histograms and sample arrays. + (its the same than PERCENTILE.M) + + Q = prctile(Y,q) + Q = prctile(Y,q,DIM) + returns the q-th percentile along dimension DIM of sample array Y. + size(Q) is equal size(Y) except for dimension DIM which is size(Q,DIM)=length(Q) + + Q = prctile(HIS,q) + returns the q-th percentile from the histogram HIS. + HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3. 
+ If q is a vector, the each row of Q returns the q(i)-th percentile + + see also: HISTO2, HISTO3, QUANTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 68 + PRCTILE calculates the percentiles of histograms and sample arrays. + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +quantile + + +# name: +# type: sq_string +# elements: 1 +# length: 528 + QUANTILE calculates the quantiles of histograms and sample arrays. + + Q = quantile(Y,q) + Q = quantile(Y,q,DIM) + returns the q-th quantile along dimension DIM of sample array Y. + size(Q) is equal size(Y) except for dimension DIM which is size(Q,DIM)=length(Q) + + Q = quantile(HIS,q) + returns the q-th quantile from the histogram HIS. + HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3. + If q is a vector, the each row of Q returns the q(i)-th quantile + + see also: HISTO2, HISTO3, PERCENTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 67 + QUANTILE calculates the quantiles of histograms and sample arrays. + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +range + + +# name: +# type: sq_string +# elements: 1 +# length: 371 + RANGE calculates the range of Y + Missing values (encoded as NaN) are ignored. + + Q = range(Y) + Q = range(Y,DIM) + returns the range along dimension DIM of sample array Y. + + Q = range(HIS) + returns the RANGE from the histogram HIS. + HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3. + + see also: IQR, MAD, HISTO2, HISTO3, PERCENTILE, QUANTILE + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + RANGE calculates the range of Y + Missing values (encoded as NaN) are ignored. + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +rankcorr + + +# name: +# type: sq_string +# elements: 1 +# length: 668 + RANKCORR calculated the rank correlation coefficient. + This function is replaced by CORRCOEF. + Significance test and confidence intervals can be obtained from CORRCOEF, too. 
+ + R = CORRCOEF(X, [Y, ] 'Rank'); + + The rank correlation r = corrcoef(ranks(x)). + is often confused with Spearman's rank correlation. + Spearman's correlation is defined as + r(x,y) = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1)) + The results are different. Here, the former version is implemented. + + see also: CORRCOEF, SPEARMAN, RANKS + + REFERENCES: + [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html + [2] http://mathworld.wolfram.com/CorrelationCoefficient.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 54 + RANKCORR calculated the rank correlation coefficient. + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +ranks + + +# name: +# type: sq_string +# elements: 1 +# length: 1062 + RANKS gives the rank of each element in a vector. + This program uses an advanced algorithm with averge effort O(m.n.log(n)) + NaN in the input yields NaN in the output. + + r = ranks(X[,DIM]) + if X is a vector, return the vector of ranks of X adjusted for ties. + if X is matrix, the rank is calculated along dimension DIM. + if DIM is zero or empty, the lowest dimension with more then 1 element is used. + r = ranks(X,DIM,'traditional') + implements the traditional algorithm with O(n^2) computational + and O(n^2) memory effort + r = ranks(X,DIM,'mtraditional') + implements the traditional algorithm with O(n^2) computational + and O(n) memory effort + r = ranks(X,DIM,'advanced ') + implements an advanced algorithm with O(n*log(n)) computational + and O(n.log(n)) memory effort + r = ranks(X,DIM,'advanced-ties') + implements an advanced algorithm with O(n*log(n)) computational + and O(n.log(n)) memory effort + but without correction for ties + This is the fastest algorithm + + see also: CORRCOEF, SPEARMAN, RANKCORR + + REFERENCES: + -- + + + +# name: +# type: sq_string +# elements: 1 +# length: 50 + RANKS gives the rank of each element in a vector. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +rms + + +# name: +# type: sq_string +# elements: 1 +# length: 560 + RMS calculates the root mean square + can deal with complex data. + + y = rms(x,DIM,W) + + DIM dimension + 1 STD of columns + 2 STD of rows + N STD of N-th dimension + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted s.d. (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + y estimated standard deviation + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN + + + +# name: +# type: sq_string +# elements: 1 +# length: 67 + RMS calculates the root mean square + can deal with complex data. + + + +# name: +# type: sq_string +# elements: 1 +# length: 16 +row_col_deletion + + +# name: +# type: sq_string +# elements: 1 +# length: 739 + ROW_COL_DELETION selects the rows and columns for removing any missing values. + A heuristic based on maximizing the number of remaining sample values + is used. In other words, if there are more rows than columns, it is + more likely that a row-wise deletion will be applied and vice versa. + + [rix,cix] = row_col_deletion(d) + [rix,cix] = row_col_deletion(d,c,w) + + Input: + d data (each row is a sample, each column a feature) + c classlabels (not really used) [OPTIONAL] + w weight for each sample vector [OPTIONAL] + Output: + rix selected samples + cix selected columns + + d(rix,cix) does not contain any NaN's i.e. missing values + + see also: TRAIN_SC, TEST_SC + + + +# name: +# type: sq_string +# elements: 1 +# length: 79 + ROW_COL_DELETION selects the rows and columns for removing any missing values. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +sem + + +# name: +# type: sq_string +# elements: 1 +# length: 695 + SEM calculates the standard error of the mean + + [SE,M] = SEM(x [, DIM [,W]]) + calculates the standard error (SE) in dimension DIM + the default DIM is the first non-single dimension + M returns the mean. + Can deal with complex data, too. + + DIM dimension + 1: SEM of columns + 2: SEM of rows + N: SEM of N-th dimension + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted mean and s.d. (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, MEAN, VAR, STD + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + SEM calculates the standard error of the mean + + [SE,M] = SEM(x [, DIM [,W]]) + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +skewness + + +# name: +# type: sq_string +# elements: 1 +# length: 405 + SKEWNESS estimates the skewness + + y = skewness(x,DIM) + calculates skewness of x in dimension DIM + + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN, STATISTIC + + REFERENCE(S): + http://mathworld.wolfram.com/ + + + +# name: +# type: sq_string +# elements: 1 +# length: 34 + SKEWNESS estimates the skewness + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +spearman + + +# name: +# type: sq_string +# elements: 1 +# length: 683 + SPEARMAN Spearman's rank correlation coefficient. + This function is replaced by CORRCOEF. + Significance test and confidence intervals can be obtained from CORRCOEF. 
+ + [R,p,ci1,ci2] = CORRCOEF(x, [y, ] 'Rank'); + + For some (unknown) reason, in previous versions Spearman's rank correlation + r = corrcoef(ranks(x)). + But according to [1], Spearman's correlation is defined as + r = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1)) + The results are different. Here, the later version is implemented. + + see also: CORRCOEF, RANKCORR + + REFERENCES: + [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html + [2] http://mathworld.wolfram.com/CorrelationCoefficient.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 50 + SPEARMAN Spearman's rank correlation coefficient. + + + +# name: +# type: sq_string +# elements: 1 +# length: 9 +statistic + + +# name: +# type: sq_string +# elements: 1 +# length: 938 + STATISTIC estimates various statistics at once. + + R = STATISTIC(x,DIM) + calculates all statistic (see list of fun) in dimension DIM + R is a struct with all statistics + + y = STATISTIC(x,fun) + estimate of fun on dimension DIM + y gives the statistic of fun + + DIM dimension + 1: STATS of columns + 2: STATS of rows + N: STATS of N-th dimension + default or []: first DIMENSION, with more than 1 element + + fun 'mean' mean + 'std' standard deviation + 'var' variance + 'sem' standard error of the mean + 'rms' root mean square + 'meansq' mean of squares + 'sum' sum + 'sumsq' sum of squares + 'CM#' central moment of order # + 'skewness' skewness + 'kurtosis' excess coefficient (Fisher kurtosis) + 'mad' mean absolute deviation + + features: + - can deal with NaN's (missing values) + - dimension argument + - compatible to Matlab and Octave + + see also: SUMSKIPNAN + + REFERENCE(S): + [1] http://www.itl.nist.gov/ + [2] http://mathworld.wolfram.com/ + + + +# name: +# type: sq_string +# elements: 1 +# length: 48 + STATISTIC estimates various statistics at once. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +std + + +# name: +# type: sq_string +# elements: 1 +# length: 983 + STD calculates the standard deviation. + + [y,v] = std(x [, opt[, DIM [, W]]]) + + opt option + 0: normalizes with N-1 [default] + provides the square root of best unbiased estimator of the variance + 1: normalizes with N, + this provides the square root of the second moment around the mean + otherwise: + best unbiased estimator of the standard deviation (see [1]) + + DIM dimension + N STD of N-th dimension + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted s.d. (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + y estimated standard deviation + + features: + - provides an unbiased estimation of the S.D. + - can deal with NaN's (missing values) + - weighting of data + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ, + + + References(s): + [1] http://mathworld.wolfram.com/StandardDeviationDistribution.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 39 + STD calculates the standard deviation. + + + +# name: +# type: sq_string +# elements: 1 +# length: 10 +sumskipnan + + +# name: +# type: sq_string +# elements: 1 +# length: 1234 + SUMSKIPNAN adds all non-NaN values. + + All NaN's are skipped; NaN's are considered as missing values. + SUMSKIPNAN of NaN's only gives O; and the number of valid elements is return. + SUMSKIPNAN is also the elementary function for calculating + various statistics (e.g. MEAN, STD, VAR, RMS, MEANSQ, SKEWNESS, + KURTOSIS, MOMENT, STATISTIC etc.) from data with missing values. + SUMSKIPNAN implements the DIMENSION-argument for data with missing values. + Also the second output argument return the number of valid elements (not NaNs) + + Y = sumskipnan(x [,DIM]) + [Y,N,SSQ] = sumskipnan(x [,DIM]) + [...] 
= sumskipnan(x, DIM, W) + + x input data + DIM dimension (default: []) + empty DIM sets DIM to first non singleton dimension + W weight vector for weighted sum, numel(W) must fit size(x,DIM) + Y resulting sum + N number of valid (not missing) elements + SSQ sum of squares + + the function FLAG_NANS_OCCURED() returns whether any value in x + is a not-a-number (NaN) + + features: + - can deal with NaN's (missing values) + - implements dimension argument. + - computes weighted sum + - compatible with Matlab and Octave + + see also: FLAG_NANS_OCCURED, SUM, NANSUM, MEAN, STD, VAR, RMS, MEANSQ, + SSQ, MOMENT, SKEWNESS, KURTOSIS, SEM + + + +# name: +# type: sq_string +# elements: 1 +# length: 36 + SUMSKIPNAN adds all non-NaN values. + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +sumsq + + +# name: +# type: sq_string +# elements: 1 +# length: 391 + SUMSQ calculates the sum of squares. + + [y] = sumsq(x [, DIM]) + + DIM dimension + N STD of N-th dimension + default or []: first DIMENSION, with more than 1 element + + y estimated standard deviation + + features: + - can deal with NaN's (missing values) + - dimension argument also in Octave + - compatible to Matlab and Octave + + see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ, + + + References(s): + + + +# name: +# type: sq_string +# elements: 1 +# length: 37 + SUMSQ calculates the sum of squares. + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +tcdf + + +# name: +# type: sq_string +# elements: 1 +# length: 254 + TCDF returns student cumulative distribtion function + + cdf = tcdf(x,DF); + + Computes the CDF of the students distribution + with DF degrees of freedom + x,DF must be matrices of same size, or any one can be a scalar. 
+ + see also: NORMCDF, TPDF, TINV + + + +# name: +# type: sq_string +# elements: 1 +# length: 54 + TCDF returns student cumulative distribtion function + + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +test_sc + + +# name: +# type: sq_string +# elements: 1 +# length: 1441 + TEST_SC: apply statistical and SVM classifier to test data + + R = test_sc(CC,D,TYPE [,target_Classlabel]) + R.output output: "signed" distance for each class. + This represents the distances between sample D and the separating hyperplane + The "signed distance" is possitive if it matches the target class, and + and negative if it lays on the opposite side of the separating hyperplane. + R.classlabel class for output data + The target class is optional. If it is provided, the following values are returned. + R.kappa Cohen's kappa coefficient + R.ACC Classification accuracy + R.H Confusion matrix + + The classifier CC is typically obtained by TRAIN_SC. If a statistical + classifier is used, TYPE can be used to modify the classifier. + TYPE = 'MDA' mahalanobis distance based classifier + TYPE = 'MD2' mahalanobis distance based classifier + TYPE = 'MD3' mahalanobis distance based classifier + TYPE = 'GRB' Gaussian radial basis function + TYPE = 'QDA' quadratic discriminant analysis + TYPE = 'LD2' linear discriminant analysis + TYPE = 'LD3', 'LDA', 'FDA, 'FLDA' (Fisher's) linear discriminant analysis + TYPE = 'LD4' linear discriminant analysis + TYPE = 'GDBC' general distance based classifier + + see also: TRAIN_SC + + References: + [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed. + John Wiley & Sons, 2001. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 61 + TEST_SC: apply statistical and SVM classifier to test data + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +tiedrank + + +# name: +# type: sq_string +# elements: 1 +# length: 272 + TIEDRANK compute rank of samples, the mean value is used in case of ties + this function is just a wrapper for RANKS, and provided for compatibility + with the statistics toolbox of matlab(tm) + + R = tiedrank(X) + computes the rank R of vector X + + see also: RANKS + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + TIEDRANK compute rank of samples, the mean value is used in case of ties + this + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +tinv + + +# name: +# type: sq_string +# elements: 1 +# length: 306 + TINV returns inverse cumulative function of the student distribution + + x = tinv(p,v); + + Computes the quantile (inverse of the CDF) of the student + distribution with v degrees of freedom + p,v must be matrices of same size, or any one can be a scalar. + + see also: TPDF, TCDF, NORMPDF, NORMCDF, NORMINV + + + +# name: +# type: sq_string +# elements: 1 +# length: 70 + TINV returns inverse cumulative function of the student distribution + + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +tpdf + + +# name: +# type: sq_string +# elements: 1 +# length: 259 + TPDF returns student probability density + + pdf = tpdf(x,DF); + + Computes the PDF of the student distribution + with DF degrees of freedom + x,DF must be matrices of same size, or any one can be a scalar. 
+ + see also: TINV, TCDF, NORMPDF, NORMCDF, NORMINV + + + +# name: +# type: sq_string +# elements: 1 +# length: 43 + TPDF returns student probability density + + + + +# name: +# type: sq_string +# elements: 1 +# length: 16 +train_lda_sparse + + +# name: +# type: sq_string +# elements: 1 +# length: 1689 + Linear Discriminant Analysis for the Small Sample Size Problem as described in + Algorithm 1 of J. Duintjer Tebbens, P. Schlesinger: 'Improving + Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size + Problem', Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 423-437, 2007. + Input: + X ...... (sparse) training data matrix + G ...... group coding matrix of the training data + test ...... (sparse) test data matrix + Gtest ...... group coding matrix of the test data + par ...... if par = 0 then classification exploits sparsity too + tol ...... tolerance to distinguish zero eigenvalues + Output: + err ...... Wrong classification rate (in %) + trafo ...... LDA transformation vectors + + Reference(s): + J. Duintjer Tebbens, P. Schlesinger: 'Improving + Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size + Problem', Computational Statistics and Data Analysis, vol. 52, no. 1, + pp. 423-437, 2007. + + Copyright (C) by J. Duintjer Tebbens, Institute of Computer Science of the Academy of Sciences of the Czech Republic, + Pod Vodarenskou vezi 2, 182 07 Praha 8 Liben, 18.July.2006. + This work was supported by the Program Information Society under project + 1ET400300415. + + + Modified for the use with Matlab6.5 by A. 
Schloegl, 22.Aug.2006 + + $Id$ + This function is part of the NaN-toolbox + http://pub.ist.ac.at/~schloegl/matlab/NaN/ + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + Linear Discriminant Analysis for the Small Sample Size Problem as described in + + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +train_sc + + +# name: +# type: sq_string +# elements: 1 +# length: 7653 + Train a (statistical) classifier + + CC = train_sc(D,classlabel) + CC = train_sc(D,classlabel,MODE) + CC = train_sc(D,classlabel,MODE, W) + weighting D(k,:) with weight W(k) (not all classifiers support weighting) + + CC contains the model parameters of a classifier which can be applied + to test data using test_sc. + R = test_sc(CC,D,...) + + D training samples (each row is a sample, each column is a feature) + classlabel labels of each sample, must have the same number of rows as D. + Two different encodings are supported: + {-1,1}-encoding (multiple classes with separate columns for each class) or + 1..M encoding. + So [1;2;3;1;4] is equivalent to + [+1,-1,-1,-1; + [-1,+1,-1,-1; + [-1,-1,+1,-1; + [+1,-1,-1,-1] + [-1,-1,-1,+1] + Note, samples with classlabel=0 are ignored. 
+ + The following classifier types are supported MODE.TYPE + 'MDA' mahalanobis distance based classifier [1] + 'MD2' mahalanobis distance based classifier [1] + 'MD3' mahalanobis distance based classifier [1] + 'GRB' Gaussian radial basis function [1] + 'QDA' quadratic discriminant analysis [1] + 'LD2' linear discriminant analysis (see LDBC2) [1] + MODE.hyperparameter.gamma: regularization parameter [default 0] + 'LD3', 'FDA', 'LDA', 'FLDA' + linear discriminant analysis (see LDBC3) [1] + MODE.hyperparameter.gamma: regularization parameter [default 0] + 'LD4' linear discriminant analysis (see LDBC4) [1] + MODE.hyperparameter.gamma: regularization parameter [default 0] + 'LD5' another LDA (motivated by CSP) + MODE.hyperparameter.gamma: regularization parameter [default 0] + 'RDA' regularized discriminant analysis [7] + MODE.hyperparameter.gamma: regularization parameter + MODE.hyperparameter.lambda = + gamma = 0, lambda = 0 : MDA + gamma = 0, lambda = 1 : LDA [default] + Hint: hyperparameter are used only in test_sc.m, testing different + the hyperparameters do not need repetitive calls to train_sc, + it is sufficient to modify CC.hyperparameter before calling test_sc. 
+ 'GDBC' general distance based classifier [1] + '' statistical classifier, requires Mode argument in TEST_SC + '###/DELETION' if the data contains missing values (encoded as NaNs), + a row-wise or column-wise deletion (depending on which method + removes less data values) is applied; + '###/GSVD' GSVD and statistical classifier [2,3], + '###/sparse' sparse [5] + '###' must be 'LDA' or any other classifier + 'PLS' (linear) partial least squares regression + 'REG' regression analysis; + 'WienerHopf' Wiener-Hopf equation + 'NBC' Naive Bayesian Classifier [6] + 'aNBC' Augmented Naive Bayesian Classifier [6] + 'NBPW' Naive Bayesian Parzen Window [9] + + 'PLA' Perceptron Learning Algorithm [11] + MODE.hyperparameter.alpha = alpha [default: 1] + w = w + alpha * e'*x + 'LMS', 'AdaLine' Least mean squares, adaptive line element, Widrow-Hoff, delta rule + MODE.hyperparameter.alpha = alpha [default: 1] + 'Winnow2' Winnow2 algorithm [12] + + 'PSVM' Proximal SVM [8] + MODE.hyperparameter.nu (default: 1.0) + 'LPM' Linear Programming Machine + uses and requires train_LPM of the iLog CPLEX optimizer + MODE.hyperparameter.c_value = + 'CSP' CommonSpatialPattern is very experimental and just a hack + uses a smoothing window of 50 samples. 
+ 'SVM','SVM1r' support vector machines, one-vs-rest + MODE.hyperparameter.c_value = + 'SVM11' support vector machines, one-vs-one + voting + MODE.hyperparameter.c_value = + 'RBF' Support Vector Machines with RBF Kernel + MODE.hyperparameter.c_value = + MODE.hyperparameter.gamma = + 'SVM:LIB' libSVM (default SVM algorithm) + 'SVM:bioinfo' uses and requires svmtrain from the bioinfo toolbox + 'SVM:OSU' uses and requires mexSVMTrain from the OSU-SVM toolbox + 'SVM:LOO' uses and requires svcm_train from the LOO-SVM toolbox + 'SVM:Gunn' uses and requires svc-function from the Gunn-SVM toolbox + 'SVM:KM' uses and requires svmclass-function from the KM-SVM toolbox + 'SVM:LINz' LibLinear [10] (requires train.mex from LibLinear somewhere in the path) + z=0 (default) LibLinear with -- L2-regularized logistic regression + z=1 LibLinear with -- L2-loss support vector machines (dual) + z=2 LibLinear with -- L2-loss support vector machines (primal) + z=3 LibLinear with -- L1-loss support vector machines (dual) + 'SVM:LIN4' LibLinear with -- multi-class support vector machines by Crammer and Singer + 'DT' decision tree - not implemented yet. + + {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf','LDA/GSVD','MDA/GSVD', 'LDA/sparse','MDA/sparse', 'PLA', 'LMS','LDA/DELETION','MDA/DELETION','NBC/DELETION','RDA/DELETION','REG/DELETION','RDA','GDBC','SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW', 'DT'}; + + CC contains the model parameters of a classifier. Some time ago, + CC was a statistical classifier containing the mean + and the covariance of the data of each class (encoded in the + so-called "extended covariance matrices"). Nowadays also other + classifiers are supported. + + see also: TEST_SC, COVM, ROW_COL_DELETION + + References: + [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed. + John Wiley & Sons, 2001. 
+ [2] Peg Howland and Haesun Park, + Generalizing Discriminant Analysis Using the Generalized Singular Value Decomposition + IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 2004. + dx.doi.org/10.1109/TPAMI.2004.46 + [3] http://www-static.cc.gatech.edu/~kihwan23/face_recog_gsvd.htm + [4] Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park + A new optimization criterion for generalized discriminant analysis on undersampled problems. + The Third IEEE International Conference on Data Mining, Melbourne, Florida, USA + November 19 - 22, 2003 + [5] J.D. Tebbens and P. Schlesinger (2006), + Improving Implementation of Linear Discriminant Analysis for the Small Sample Size Problem + Computational Statistics & Data Analysis, vol 52(1): 423-437, 2007 + http://www.cs.cas.cz/mweb/download/publi/JdtSchl2006.pdf + [6] H. Zhang, The optimality of Naive Bayes, + http://www.cs.unb.ca/profs/hzhang/publications/FLAIRS04ZhangH.pdf + [7] J.H. Friedman. Regularized discriminant analysis. + Journal of the American Statistical Association, 84:165–175, 1989. + [8] G. Fung and O.L. Mangasarian, Proximal Support Vector Machine Classifiers, KDD 2001. + Eds. F. Provost and R. Srikant, Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA. + p. 77-86. + [9] Kai Keng Ang, Zhang Yang Chin, Haihong Zhang, Cuntai Guan. + Filter Bank Common Spatial Pattern (FBCSP) in Brain-Computer Interface. + IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). + 1-8 June 2008 Page(s):2390 - 2397 + [10] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. + LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874. + Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear + [11] http://en.wikipedia.org/wiki/Perceptron#Learning_algorithm + [12] Littlestone, N. 
(1988) + "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm" + Machine Learning 285-318(2) + http://en.wikipedia.org/wiki/Winnow_(algorithm) + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + Train a (statistical) classifier + + CC = train_sc(D,classlabel) + CC = train_s + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +trimean + + +# name: +# type: sq_string +# elements: 1 +# length: 266 + TRIMEAN yields the weighted mean of the median and the quartiles + m = TRIMEAN(y). + + The trimean is m = (Q1+2*MED+Q3)/4 + with quartile Q1 and Q3 and median MED + + N-dimensional data is supported + + REFERENCES: + [1] http://mathworld.wolfram.com/Trimean.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + TRIMEAN yields the weighted mean of the median and the quartiles + m = TRIMEA + + + +# name: +# type: sq_string +# elements: 1 +# length: 8 +trimmean + + +# name: +# type: sq_string +# elements: 1 +# length: 664 + TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and + p/2 lower samples. Missing values (encoded as NaN) are ignored and not taken into account. + The same number from the upper and lower values are removed, and is compatible to various + spreadsheet programs including GNumeric [1], LibreOffice, OpenOffice and MS Excel. + + Q = trimmean(Y,p) + Q = trimmean(Y,p,DIM) + returns the TRIMMEAN along dimension DIM of sample array Y. + If p is a vector, the TRIMMEAN for each p is computed. 
+ + see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE + + References: + [1] http://www.fifi.org/doc/gnumeric-doc/html/C/gnumeric-trimmean.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +ttest + + +# name: +# type: sq_string +# elements: 1 +# length: 1474 + TTEST (paired) t-test + For a sample X from a normal distribution with unknown mean and + variance, perform a t-test of the null hypothesis `mean (X) == M'. + Under the null, the test statistic T follows a Student + distribution with `DF = length (X) - 1' degrees of freedom. + + TTEST treats NaNs as "Missing values" and ignores these. + + H = ttest(x,m) + tests Null-hypothesis that mean of x is m. + H = ttest(x,y) + size of x and size of y must match, it is tested whether the + difference x-y is significantly different to m=0; + H = ttest(x,y,alpha) + H = ttest(x,y,alpha,tail) + H = ttest(x,y,alpha,tail,DIM) + [H,PVAL] = ttest(...) + + H=1 indicates a rejection of the Null-hypothesis at a significance + level of alpha (default alpha = 0.05). + + With the optional argument string TAIL, the alternative of interest + can be selected. If TAIL is '!=' or '<>' or 'both', the null is tested + against the two-sided Alternative `mean (X) ~= mean (Y)'. If TAIL + is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used. + Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean + (Y)' is used. The default is the two-sided case. + + H returns whether the Null-Hypothesis must be rejected. + The p-value of the test is returned in PVAL. + + TTEST works on the first non-singleton dimension or on DIM. + + If no output argument is given, the p-value of the test is + displayed. 
+ + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + TTEST (paired) t-test + For a sample X from a normal distribution with unkno + + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +ttest2 + + +# name: +# type: sq_string +# elements: 1 +# length: 1515 + TTEST2 (unpaired) t-test + For two samples x and y from normal distributions with unknown + means and unknown equal variances, perform a two-sample t-test of + the null hypothesis of equal means. Under the null, the test + statistic T follows a Student distribution with DF degrees of + freedom. + + TTEST2 treats NaNs as "Missing values" and ignores these. + + H = ttest2(x,y) + H = ttest2([x;y],C,W) + H = ttest2(x,y,alpha) + H = ttest2(x,y,alpha,tail) + H = ttest2(x,y,alpha,tail,vartype) + H = ttest2(x,y,alpha,tail,vartype,DIM) + [H,PVAL] = ttest2(...) + [h,p,ci,stats] = ttest2(...) + + H=1 indicates a rejection of the Null-hypothesis at a significance + level of alpha (default alpha = 0.05). + + With the optional argument string TAIL, the Alternative of interest + can be selected. If TAIL is '!=' or '<>' or 'both', the null is tested + against the two-sided Alternative `mean (X) ~= mean (Y)'. If TAIL + is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used. + Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean + (Y)' is used. The default is the two-sided case. + + vartype supports only 'equal' (default value); the value 'unequal' is not supported. + + H returns whether the Null-Hypothesis must be rejected. + The p-value of the test is returned in PVAL. + + TTEST2 works on the first non-singleton dimension or on DIM. + + If no output argument is given, the p-value of the test is + displayed. 
+ + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + TTEST2 (unpaired) t-test + For two samples x and y from normal distributions + + + +# name: +# type: sq_string +# elements: 1 +# length: 3 +var + + +# name: +# type: sq_string +# elements: 1 +# length: 772 + VAR calculates the variance. + + y = var(x [, opt[, DIM]]) + calculates the variance in dimension DIM + the default DIM is the first non-single dimension + + opt 0: normalizes with N-1 [default] + 1: normalizes with N + DIM dimension + 1: VAR of columns + 2: VAR of rows + N: VAR of N-th dimension + default or []: first DIMENSION, with more than 1 element + W weights to compute weighted variance (default: []) + if W=[], all weights are 1. + number of elements in W must match size(x,DIM) + + usage: + var(x) + var(x, opt, DIM) + var(x, [], DIM) + var(x, W, DIM) + var(x, opt, DIM, W) + + features: + - can deal with NaN's (missing values) + - weighting of data + - dimension argument + - compatible to Matlab and Octave + + see also: MEANSQ, SUMSQ, SUMSKIPNAN, MEAN, RMS, STD, + + + +# name: +# type: sq_string +# elements: 1 +# length: 29 + VAR calculates the variance. + + + +# name: +# type: sq_string +# elements: 1 +# length: 5 +xcovf + + +# name: +# type: sq_string +# elements: 1 +# length: 1059 + XCOVF generates cross-covariance function. + XCOVF is the same as XCORR except + X and Y can contain missing values encoded with NaN. + NaN's are skipped, NaN do not result in a NaN output. + The output gives NaN only if there are insufficient input data + + [C,N,LAGS] = xcovf(X,MAXLAG,SCALEOPT); + calculates the (auto-)correlation function of X + [C,N,LAGS] = xcovf(X,Y,MAXLAG,SCALEOPT); + calculates the crosscorrelation function between X and Y + + SCALEOPT [character string] specifies the type of scaling applied + to the correlation vector (or matrix). 
is one of: + 'none' return the unscaled correlation, R, + 'biased' return the biased average, R/N, + 'unbiased' return the unbiased average, R(k)/(N-|k|), + 'coeff' return the correlation coefficient, R/(rms(x).rms(y)), + where "k" is the lag, and "N" is the length of X. + If omitted, the default value is "none". + If Y is supplied but does not have the same length as X, + scale must be "none". + + + see also: COVM, XCORR + + + +# name: +# type: sq_string +# elements: 1 +# length: 43 + XCOVF generates cross-covariance function. + + + +# name: +# type: sq_string +# elements: 1 +# length: 7 +xptopen + + +# name: +# type: sq_string +# elements: 1 +# length: 723 + XPTOPEN read of several file formats and writing of the SAS Transport Format (*.xpt) + Supported are ARFF, SAS-XPT and STATA files. + XPTOPEN is a mex-file and must be compiled before use. + More detailed help can be obtained by the command + xptopen + without an additional argument + + X = xptopen(filename) + X = xptopen(filename,'r') + read file with filename and return variables in struct X + + X = xptopen(filename,'w',X) + save fields of struct X in filename. + + The fields of X must be column vectors of equal length. + Each vector is either a numeric vector or a cell array of strings. + The SAS-XPT format stores Date/Time as numeric value counting the number of days since 1960-01-01. + + + +# name: +# type: sq_string +# elements: 1 +# length: 80 + XPTOPEN read of several file formats and writing of the SAS Transport Format (* + + + +# name: +# type: sq_string +# elements: 1 +# length: 4 +xval + + +# name: +# type: sq_string +# elements: 1 +# length: 2980 + XVAL is used for crossvalidation + + [R,CC] = xval(D,classlabel) + .. = xval(D,classlabel,CLASSIFIER) + .. = xval(D,classlabel,CLASSIFIER,type) + .. = xval(D,{classlabel,W},CLASSIFIER) + .. 
= xval(D,{classlabel,W,NG},CLASSIFIER) + + example: + load_fisheriris; %builtin iris dataset + C = species; + K = 5; NG = [1:length(C)]'*K/length(C); + [R,CC] = xval(meas,{C,[],NG},'NBC'); + + Input: + D: data features (one feature per column, one sample per row) + classlabel labels of each sample, must have the same number of rows as D. + Two different encodings are supported: + {-1,1}-encoding (multiple classes with separate columns for each class) or + 1..M encoding. + So [1;2;3;1;4] is equivalent to + [+1,-1,-1,-1; + [-1,+1,-1,-1; + [-1,-1,+1,-1; + [+1,-1,-1,-1] + [-1,-1,-1,+1] + Note, samples with classlabel=0 are ignored. + + CLASSIFIER can be any classifier supported by train_sc (default='LDA') + {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf', 'RDA','GDBC', + 'SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW'} + these can be modified by ###/GSVD, ###/sparse and ###/DELETION. + /DELETION removes in case of NaN's either the rows or the columns (which removes less data values) with any NaN + /sparse and /GSVD preprocess the data an reduce it to some lower-dimensional space. + Hyperparameters (like alpha for PLA, gamma/lambda for RDA, c_value for SVM, etc) can be defined as + CLASSIFIER.hyperparameter.alpha, etc. and + CLASSIFIER.TYPE = 'PLA' (as listed above). + See train_sc for details. + W: weights for each sample (row) in D. + default: [] (i.e. 
all weights are 1) + number of elements in W must match the number of rows of D + NG: used to define the type of cross-validation + Leave-One-Out-Method (LOOM): NG = [1:length(classlabel)]' (default) + Leave-K-Out-Method: NG = ceil([1:length(classlabel)]'/K) + K-fold XV: NG = ceil([1:length(classlabel)]'*K/length(classlabel)) + group-wise XV (if samples are not independent) can also be defined here + samples from the same group (dependent samples) get the same identifier + samples from different groups get different identifiers + TYPE: defines the type of cross-validation procedure if NG is not specified + 'LOOM' leave-one-out-method + k k-fold crossvalidation + + OUTPUT: + R contains the resulting performance metric + CC contains the classifier + + plota(R) shows the confusion matrix of the results + + see also: TRAIN_SC, TEST_SC, CLASSIFY, PLOTA + + References: + [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed. + John Wiley & Sons, 2001. + [2] A. Schlögl, J. Kronegg, J.E. Huggins, S. G. Mason; + Evaluation criteria in BCI research. + (Eds.) G. Dornhege, J.R. Millan, T. Hinterberger, D.J. McFarland, K.-R.Müller; + Towards Brain-Computer Interfacing, MIT Press, 2007, p.327-342 + + + +# name: +# type: sq_string +# elements: 1 +# length: 35 + XVAL is used for crossvalidation + + + + +# name: +# type: sq_string +# elements: 1 +# length: 12 +zScoreMedian + + +# name: +# type: sq_string +# elements: 1 +# length: 327 + zScoreMedian removes the median and standardizes by the 1.483*median absolute deviation + + Usage: Z = zScoreMedian(X, DIM) + Input: X : data + DIM: dimension along which z-score should be calculated (1=columns, 2=rows) + (optional, default=first dimension with more than 1 element) + Output: Z : z-scores + + + +# name: +# type: sq_string +# elements: 1 +# length: 59 + zScoreMedian removes the median and standardizes by the 1. 
+ + + +# name: +# type: sq_string +# elements: 1 +# length: 6 +zscore + + +# name: +# type: sq_string +# elements: 1 +# length: 623 + ZSCORE removes the mean and normalizes the data + to a variance of 1. Can be used for Pre-Whitening of the data, too. + + [z,r,m] = zscore(x,DIM) + z z-score of x along dimension DIM + r is the inverse of the standard deviation + m is the mean of x + + The data x can be reconstructed with + x = z*diag(1./r) + repmat(m,size(z)./size(m)) + z = x*diag(r) - repmat(m.*v,size(z)./size(m)) + + DIM dimension + 1: STATS of columns + 2: STATS of rows + default or []: first DIMENSION, with more than 1 element + + see also: SUMSKIPNAN, MEAN, STD, DETREND + + REFERENCE(S): + [1] http://mathworld.wolfram.com/z-Score.html + + + +# name: +# type: sq_string +# elements: 1 +# length: 70 + ZSCORE removes the mean and normalizes the data + to a variance of 1. + + + + +