octave_packages/nan-2.5.5/doc-cache

   1 # Created by Octave 3.6.1, Mon Apr 23 21:08:02 2012 UTC <root@brouzouf>
   2 # name: cache
   3 # type: cell
   4 # rows: 3
   5 # columns: 81
   6 # name: <cell-element>
   7 # type: sq_string
   8 # elements: 1
   9 # length: 12
  10 bland_altman
  11
  12
  13 # name: <cell-element>
  14 # type: sq_string
  15 # elements: 1
  16 # length: 865
  17  BLAND_ALTMAN shows the Bland-Altman plot of two columns of measurements
  18    and computes several summary results.
  19
  20    bland_altman(m1, m2 [,group])
  21    bland_altman(data [, group])
  22    R = bland_altman(...)
  23
  24    m1,m2 are two columns with the same number of elements
  25         containing the measurements. m1,m2 can also be combined
  26        in a single two column data matrix.
  27    group [optional] indicates which measurements belong to the same group
  28         This is useful to account for repeated measurements.
  29
  30
  31  References:
  32  [1] JM Bland and DG Altman, Measuring agreement in method comparison studies.
  33        Statistical Methods in Medical Research, 1999; 8; 135.
  34        doi:10.1177/09622802990080204
  35  [2] P.S. Myles, Using the Bland–Altman method to measure agreement with repeated measures.
  36         British Journal of Anaesthesia 99(3):309–11 (2007)
  37         doi:10.1093/bja/aem214
  38
  39
  40
  41 # name: <cell-element>
  42 # type: sq_string
  43 # elements: 1
  44 # length: 79
  45  BLAND_ALTMAN shows the Bland-Altman plot of two columns of measurements
  46    and
  47
  48
  49
  50 # name: <cell-element>
  51 # type: sq_string
  52 # elements: 1
  53 # length: 7
  54 cat2bin
  55
  56
  57 # name: <cell-element>
  58 # type: sq_string
  59 # elements: 1
  60 # length: 757
  61  CAT2BIN converts categorical into binary data
  62    each category of each column in C is converted into a logical column
  63
  64    B = cat2bin(C);
  65    [B,BinLabel] = cat2bin(C,Label);
  66    [B,BinLabel] = cat2bin(C,Label,MODE)
  67
  68   C        categorical data
  69   B        binary data
  70   Label    description of each column in C
  71   BinLabel description of each column in B
  72   MODE     default [], ignores NaN
  73            'notIgnoreNAN' includes binary column for NaN
  74            'IgnoreZeros'  zeros do not get a separate category
  75            'IgnoreZeros+NaN' zeros and NaN are ignored
  76
  77   example:
  78      cat2bin([1;2;5;1;5]) results in
  79              1     0     0
  80              0     1     0
  81              0     0     1
  82              1     0     0
  83              0     0     1
  84
  85
  86
  87 # name: <cell-element>
  88 # type: sq_string
  89 # elements: 1
  90 # length: 81
  91  CAT2BIN converts categorical into binary data
  92    each category of each column i
  93
  94
  95
  96 # name: <cell-element>
  97 # type: sq_string
  98 # elements: 1
  99 # length: 7
 100 cdfplot
 101
 102
 103 # name: <cell-element>
 104 # type: sq_string
 105 # elements: 1
 106 # length: 565
 107  CDFPLOT plots empirical cumulative distribution function
 108
 109    cdfplot(X)
 110    cdfplot(X, FMT)
 111    cdfplot(X, PROPERTY, VALUE,...)
 112    h = cdfplot(...)
 113    [h,stats] = cdfplot(X)
 114
 115   X contains the data vector
 116         (matrix data is currently changed to a vector, this might change in future)
 117   FMT,PROPERTY,VALUE
 118         are used for formatting; see HELP PLOT for more details
 119   h     graphics handle to the cdf curve
 120   stats
 121         a struct containing various summary statistics including
 122         mean, std, median, min, max.
 123
 124  see also: ecdf, median, statistics, hist2res, plot
 125
 126  References:
 127
 128
 129
 130 # name: <cell-element>
 131 # type: sq_string
 132 # elements: 1
 133 # length: 58
 134  CDFPLOT plots empirical cumulative distribution function
 135
 136
 137
 138
 139 # name: <cell-element>
 140 # type: sq_string
 141 # elements: 1
 142 # length: 6
 143 center
 144
 145
 146 # name: <cell-element>
 147 # type: sq_string
 148 # elements: 1
 149 # length: 511
 150  CENTER removes the mean
 151
 152  [z,mu] = center(x,DIM,W)
 153    removes the mean of x along dimension DIM
 154
 155  x      input data
 156  DIM    dimension
 157         1: column
 158         2: row
 159         default or []: first DIMENSION, with more than 1 element
 160  W      weights to compute weighted mean (default: [], all weights = 1)
 161         numel(W) must be equal to size(x,DIM)
 162
 163  features:
 164  - can deal with NaN's (missing values)
 165  - weighting of data
 166  - dimension argument
 167  - compatible to Matlab and Octave
 168
 169  see also: SUMSKIPNAN, MEAN, STD, DETREND, ZSCORE
 170
 171  REFERENCE(S):
 172
 173
 174
 175 # name: <cell-element>
 176 # type: sq_string
 177 # elements: 1
 178 # length: 26
 179  CENTER removes the mean
 180
 181
 182
 183
 184 # name: <cell-element>
 185 # type: sq_string
 186 # elements: 1
 187 # length: 8
 188 classify
 189
 190
 191 # name: <cell-element>
 192 # type: sq_string
 193 # elements: 1
 194 # length: 793
 195  CLASSIFY classifies sample data into categories
 196  defined by the training data and its group information
 197
 198   CLASS = classify(sample, training, group)
 199   CLASS = classify(sample, training, group, TYPE)
 200   [CLASS,ERR,POSTERIOR,LOGP,COEF] = CLASSIFY(...)
 201
 202   CLASS contains the assigned group.
 203   ERR is the classification error on the training set weighted by the
 204         prior probability of each group.
 205
 206   The same classifiers as in TRAIN_SC are supported.
 207
 208  ATTENTION: no cross-validation is applied, therefore the
 209     classification error is too optimistic (overfitting).
 210     Use XVAL instead to obtain cross-validated performance.
 211
 212  see also: TRAIN_SC, TEST_SC, XVAL
 213
 214  References:
 215  [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
 216        John Wiley & Sons, 2001.
 217
 218
 219
 220 # name: <cell-element>
 221 # type: sq_string
 222 # elements: 1
 223 # length: 80
 224  CLASSIFY classifies sample data into categories
 225  defined by the training data
 226
 227
 228
 229 # name: <cell-element>
 230 # type: sq_string
 231 # elements: 1
 232 # length: 24
 233 coefficient_of_variation
 234
 235
 236 # name: <cell-element>
 237 # type: sq_string
 238 # elements: 1
 239 # length: 221
 240  COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X)
 241
 242  cv=coefficient_of_variation(x [,DIM])
 243   cv=std(x)/mean(x)
 244
 245  see also: SUMSKIPNAN, MEAN, STD
 246
 247    REFERENCE(S):
 248    http://mathworld.wolfram.com/VariationCoefficient.html
 249
 250
 251
 252 # name: <cell-element>
 253 # type: sq_string
 254 # elements: 1
 255 # length: 80
 256  COEFFICIENT_OF_VARIATION returns STD(X)/MEAN(X)
 257
 258  cv=coefficient_of_variation(
 259
 260
 261
 262 # name: <cell-element>
 263 # type: sq_string
 264 # elements: 1
 265 # length: 3
 266 cor
 267
 268
 269 # name: <cell-element>
 270 # type: sq_string
 271 # elements: 1
 272 # length: 579
 273  COR calculates the correlation matrix
 274  X and Y can contain missing values encoded with NaN.
 275  NaN's are skipped, NaN do not result in a NaN output.
 276  (It is assumed that the occurrence of NaN's is uncorrelated)
 277  The output gives NaN only if there are insufficient input data
 278
 279  COR(X);
 280       calculates the (auto-)correlation matrix of X
 281  COR(X,Y);
 282       calculates the crosscorrelation between X and Y
 283
 284  c = COR(...);
 285         c is the correlation matrix
 286
 287  W      weights to compute weighted mean (default: [])
 288         if W=[], all weights are 1.
 289         number of elements in W must match size(x,DIM)
 290
 291
 292
 293 # name: <cell-element>
 294 # type: sq_string
 295 # elements: 1
 296 # length: 80
 297  COR calculates the correlation matrix
 298  X and Y can contain missing values encod
 299
 300
 301
 302 # name: <cell-element>
 303 # type: sq_string
 304 # elements: 1
 305 # length: 8
 306 corrcoef
 307
 308
 309 # name: <cell-element>
 310 # type: sq_string
 311 # elements: 1
 312 # length: 4687
 313  CORRCOEF calculates the correlation matrix from pairwise correlations.
 314    The input data can contain missing values encoded with NaN.
 315    Missing data (NaN's) are handled by pairwise deletion [15].
 316    In order to avoid possible pitfalls, use case-wise deletion or
 317    check the correlation of NaN's with your data (see below).
 318    A significance test for testing the Hypothesis
 319    'correlation coefficient R is significantly different to zero'
 320    is included.
 321
 322  [...] = CORRCOEF(X);
 323       calculates the (auto-)correlation matrix of X
 324  [...] = CORRCOEF(X,Y);
 325       calculates the crosscorrelation between X and Y
 326
 327  [...] = CORRCOEF(..., Mode);
 328        Mode='Pearson' or 'parametric' [default]
 329                gives the correlation coefficient
 330                also known as the 'product-moment coefficient of correlation'
 331                or 'Pearson''s correlation' [1]
 332        Mode='Spearman'  gives 'Spearman''s Rank Correlation Coefficient'
 333                This replaces SPEARMAN.M
 334        Mode='Rank'              gives a nonparametric Rank Correlation Coefficient
 335                This is the "Spearman rank correlation with proper handling of ties"
 336                This replaces RANKCORR.M
 337
 338  [...] = CORRCOEF(..., param1, value1, param2, value2, ... );
 339         param           value
 340         'Mode'          type of correlation
 341                 'Pearson','parametric'
 342                 'Spearman'
 343                 'rank'
 344         'rows'          how to deal with missing values encoded as NaN's.
 345                 'complete': remove all rows with at least one NaN
 346                 'pairwise': [default]
 347         'alpha'         0.01    : significance level to compute confidence interval
 348
 349  [R,p,ci1,ci2,nansig] = CORRCOEF(...);
 350         R is the correlation matrix
 351         R(i,j) is the correlation coefficient r between X(:,i) and Y(:,j)
 352   p    gives the significance of R
 353         It tests the null hypothesis that the product moment correlation coefficient is zero
 354        using Student's t-test on the statistic t = r*sqrt(N-2)/sqrt(1-r^2)
 355        where N is the number of samples (Statistics, M. Spiegel, Schaum series).
 356   p > alpha: do not reject the Null hypothesis: 'R is zero'.
 357   p < alpha: reject the Null hypothesis in favor of the alternative 'R is different from zero'.
 358   ci1   lower (1-alpha) confidence interval
 359   ci2   upper (1-alpha) confidence interval
 360         If no alpha is provided, the default alpha is 0.01. This can be changed with function flag_implicit_significance.
 361   nan_sig       p-value whether H0: 'NaN''s are not correlated' could be correct
 362        if nan_sig < alpha, H1 ('NaNs are correlated') is very likely.
 363
 364  The result is only valid if the occurrence of NaN's is uncorrelated. In
 365  order to avoid this pitfall, the correlation of NaN's should be checked
 366  or case-wise deletion should be applied.
 367    Case-Wise deletion can be implemented
 368     ix = ~any(isnan([X,Y]),2);
 369     [...] = CORRCOEF(X(ix,:),Y(ix,:),...);
 370
 371   Correlation (non-random distribution) of NaN's can be checked with
 372        [nan_R,nan_sig]=corrcoef(X,isnan(X))
 373    or  [nan_R,nan_sig]=corrcoef([X,Y],isnan([X,Y]))
 374    or  [R,p,ci1,ci2] = CORRCOEF(...);
 375
 376  Further recommendations related to the correlation coefficient:
 377  + LOOK AT THE SCATTERPLOTS to make sure that the relationship is linear
 378  + Correlation is not causation because
 379         it is not clear which parameter is 'cause' and which is 'effect' and
 380        the observed correlation between two variables might be due to the action of other, unobserved variables.
 381
 382  see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS,
 383        PARTCORRCOEF, flag_implicit_significance
 384
 385  REFERENCES:
 386  on the correlation coefficient
 387  [ 1] http://mathworld.wolfram.com/CorrelationCoefficient.html
 388  [ 2] http://www.geography.btinternet.co.uk/spearman.htm
 389  [ 3] Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 5th ed.  New York: Macmillan, pp. 338 and 400, 1995.
 390  [ 4] Lehmann, E. L. and D'Abrera, H. J. M. Nonparametrics: Statistical Methods Based on Ranks, rev. ed. Englewood Cliffs, NJ: Prentice-Hall, pp. 292, 300, and 323, 1998.
 391  [ 5] Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 634-637, 1992
 392  [ 6] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
 393  on the significance test of the correlation coefficient
 394  [11] http://www.met.rdg.ac.uk/cag/STATS/corr.html
 395  [12] http://www.janda.org/c10/Lectures/topic06/L24-significanceR.htm
 396  [13] http://faculty.vassar.edu/lowry/ch4apx.html
 397  [14] http://davidmlane.com/hyperstat/B134689.html
 398  [15] http://www.statsoft.com/textbook/stbasic.html%Correlations
 399  others
 400  [20] http://www.tufts.edu/~gdallal/corr.htm
 401  [21] Fisher transformation http://en.wikipedia.org/wiki/Fisher_transformation
 402
 403
 404
 405 # name: <cell-element>
 406 # type: sq_string
 407 # elements: 1
 408 # length: 71
 409  CORRCOEF calculates the correlation matrix from pairwise correlations.
 410
 411
 412
 413 # name: <cell-element>
 414 # type: sq_string
 415 # elements: 1
 416 # length: 3
 417 cov
 418
 419
 420 # name: <cell-element>
 421 # type: sq_string
 422 # elements: 1
 423 # length: 1608
 424  COV covariance matrix
 425  X and Y can contain missing values encoded with NaN.
 426  NaN's are skipped, NaN do not result in a NaN output.
 427  The output gives NaN only if there are insufficient input data
 428  The mean is removed from the data.
 429
 430  Remark: for data containing missing values, the resulting
 431  matrix might not be positive definite, and its elements have magnitudes
 432  larger than one. This ill-behavior is more likely for small sample
 433  sizes, but there is no guarantee that the result "behaves well" for larger
 434  sample sizes. If you want a "well behaved" result (i.e. positive
 435  definiteness and magnitude of elements not larger than 1), use CORRCOEF.
 436  However, COV is faster than CORRCOEF and might be good enough in some cases.
 437
 438  C = COV(X [,Mode]);
 439       calculates the (auto-)correlation matrix of X
 440  C = COV(X,Y [,Mode]);
 441       calculates the crosscorrelation between X and Y.
 442       C(i,j) is the correlation between the i-th and jth
 443       column of X and Y, respectively.
 444    NOTE: Octave and Matlab have (in some special cases) incompatible implementations.
 445        This implementation follows Octave. If the result could be ambiguous or
 446        incompatible, a warning will be presented in Matlab. To avoid this warning use:
 447        a) use COV([X(:),Y(:)]) if you want the traditional Matlab result.
 448        b) use C = COV([X,Y]), C = C(1:size(X,2),size(X,2)+1:size(C,2)); if you want to be compatible with this software.
 449
 450  Mode = 0 [default] scales C by (N-1)
 451  Mode = 1 scales C by N.
 452
 453  see also: COVM, COR, CORRCOEF, SUMSKIPNAN
 454
 455  REFERENCES:
 456  http://mathworld.wolfram.com/Covariance.html
 457
 458
 459
 460 # name: <cell-element>
 461 # type: sq_string
 462 # elements: 1
 463 # length: 76
 464  COV covariance matrix
 465  X and Y can contain missing values encoded with NaN.
 466
 467
 468
 469 # name: <cell-element>
 470 # type: sq_string
 471 # elements: 1
 472 # length: 4
 473 covm
 474
 475
 476 # name: <cell-element>
 477 # type: sq_string
 478 # elements: 1
 479 # length: 1178
 480  COVM generates covariance matrix
 481  X and Y can contain missing values encoded with NaN.
 482  NaN's are skipped, NaN do not result in a NaN output.
 483  The output gives NaN only if there are insufficient input data
 484
 485  COVM(X,Mode);
 486       calculates the (auto-)correlation matrix of X
 487  COVM(X,Y,Mode);
 488       calculates the crosscorrelation between X and Y
 489  COVM(...,W);
 490         weighted crosscorrelation
 491
 492  Mode = 'M' minimum or standard mode [default]
 493         C = X'*X; or X'*Y correlation matrix
 494
 495  Mode = 'E' extended mode
 496         C = [1 X]'*[1 X]; % 1 is a matching column of 1's
 497         C is additive, i.e. it can be applied to subsequent blocks and summed up afterwards
 498         the mean (or sum) is stored on the 1st row and column of C
 499
 500  Mode = 'D' or 'D0' detrended mode
 501         the mean of X (and Y) is removed. If combined with extended mode (Mode='DE'),
 502         the mean (or sum) is stored in the 1st row and column of C.
 503         The default scaling is factor (N-1).
 504  Mode = 'D1' is the same as 'D' but uses N for scaling.
 505
 506  C = covm(...);
 507         C is scaled by N in Mode M and by (N-1) in mode D.
 508  [C,N] = covm(...);
 509         C is not scaled, provides the scaling factor N
 510         C./N gives the scaled version.
 511
 512  see also: DECOVM, XCOVF
 513
 514
 515
 516 # name: <cell-element>
 517 # type: sq_string
 518 # elements: 1
 519 # length: 80
 520  COVM generates covariance matrix
 521  X and Y can contain missing values encoded wi
 522
 523
 524
 525 # name: <cell-element>
 526 # type: sq_string
 527 # elements: 1
 528 # length: 13
 529 cumsumskipnan
 530
 531
 532 # name: <cell-element>
 533 # type: sq_string
 534 # elements: 1
 535 # length: 250
 536  CUMSUMSKIPNAN  Cumulative sum while skipping NaN's.
 537  If DIM is omitted, it defaults to the first non-singleton dimension.
 538
 539  Y = cumsumskipnan(x [,DIM])
 540
 541  x      input data
 542  DIM    dimension (default: [])
 543  y      resulting sum
 544
 545  see also: CUMSUM, SUMSKIPNAN
 546
 547
 548
 549 # name: <cell-element>
 550 # type: sq_string
 551 # elements: 1
 552 # length: 52
 553  CUMSUMSKIPNAN  Cumulative sum while skipping NaN's.
 554
 555
 556
 557 # name: <cell-element>
 558 # type: sq_string
 559 # elements: 1
 560 # length: 6
 561 decovm
 562
 563
 564 # name: <cell-element>
 565 # type: sq_string
 566 # elements: 1
 567 # length: 383
 568  decompose extended covariance matrix into mean (mu),
 569  standard deviation, the (pure) Covariance (COV),
 570  correlation (xc) matrix and the correlation coefficients R2.
 571  NaN's are considered as missing values.
 572  [mu,sd,COV,xc,N,R2]=decovm(ECM[,NN])
 573
 574  ECM    is the extended covariance matrix
 575  NN     is the number of elements, each estimate (in ECM) is based on
 576
 577  see also: MDBC, COVM, R2
 578
 579
 580
 581 # name: <cell-element>
 582 # type: sq_string
 583 # elements: 1
 584 # length: 80
 585  decompose extended covariance matrix into mean (mu),
 586  standard deviation, the
 587
 588
 589
 590 # name: <cell-element>
 591 # type: sq_string
 592 # elements: 1
 593 # length: 7
 594 detrend
 595
 596
 597 # name: <cell-element>
 598 # type: sq_string
 599 # elements: 1
 600 # length: 837
 601  DETREND removes the trend from data, NaN's are considered as missing values
 602
 603  DETREND is fully compatible to previous Matlab and Octave DETREND with the following features added:
 604  - handles NaN's by assuming that these are missing values
 605  - handles unequally spaced data
 606  - second output parameter gives the trend of the data
 607  - compatible to Matlab and Octave
 608
 609  [...]=detrend([t,] X [,p])
 610         removes trend for unequally spaced data
 611         t represents the time points
 612         X(i) is the value at time t(i)
 613         p must be a scalar
 614
 615  [...]=detrend(X,0)
 616  [...]=detrend(X,'constant')
 617         removes the mean
 618
 619  [...]=detrend(X,p)
 620         removes polynomial of order p (default p=1)
 621
 622  [...]=detrend(X,1) - default
 623  [...]=detrend(X,'linear')
 624         removes linear trend
 625
 626  [X,T]=detrend(...)
 627
 628  X is the detrended data
 629  T is the removed trend
 630
 631  see also: SUMSKIPNAN, ZSCORE
 632
 633
 634
 635 # name: <cell-element>
 636 # type: sq_string
 637 # elements: 1
 638 # length: 80
 639  DETREND removes the trend from data, NaN's are considered as missing values
 640
 641
 642
 643
 644
 645 # name: <cell-element>
 646 # type: sq_string
 647 # elements: 1
 648 # length: 4
 649 ecdf
 650
 651
 652 # name: <cell-element>
 653 # type: sq_string
 654 # elements: 1
 655 # length: 444
 656  ECDF empirical cumulative function
 657   NaN's are considered Missing values and are ignored.
 658
 659   [F,X] = ecdf(Y)
 660         calculates empirical cumulative distribution functions (i.e. Kaplan-Meier estimate)
 661   ecdf(Y)
 662   ecdf(gca,Y)
 663         without output arguments plots the empirical cdf, in axis gca.
 664
 665  Y      input data
 666         must be a vector or matrix, in case Y is a matrix, the ecdf for every column is computed.
 667
 668  see also: HISTO2, HISTO3, PERCENTILE, QUANTILE
 669
 670
 671
 672 # name: <cell-element>
 673 # type: sq_string
 674 # elements: 1
 675 # length: 80
 676  ECDF empirical cumulative function
 677   NaN's are considered Missing values and
 678
 679
 680
 681 # name: <cell-element>
 682 # type: sq_string
 683 # elements: 1
 684 # length: 19
 685 flag_accuracy_level
 686
 687
 688 # name: <cell-element>
 689 # type: sq_string
 690 # elements: 1
 691 # length: 1033
 692  FLAG_ACCURACY_LEVEL sets and gets accuracy level
 693    used in SUMSKIPNAN_MEX and COVM_MEX
 694    The error margin of the naive summation is N*eps (N is the number of samples),
 695    the error margin is only 2*eps if Kahan's summation is used [1].
 696
 697         0: maximum speed [default]
 698            accuracy of double (64bit) with naive summation (error = N*2^-52)
 699         1: accuracy of extended (80bit) with naive summation (error = N*2^-64)
 700         2: accuracy of double (64bit) with Kahan summation (error = 2^-52)
 701         3: accuracy of extended (80bit) with Kahan summation  (error = 2^-64)
 702
 703    Please note, level 3 might be equally accurate but slower than 1 or 2 on
 704    some platforms. In order to determine what is good for you, you might want
 705    to run ACCTEST.
 706
 707  FLAG = flag_accuracy_level()
 708         gets current level
 709  flag_accuracy_level(FLAG)
 710         sets accuracy level
 711
 712  see also: ACCTEST
 713
 714  Reference:
 715  [1] David Goldberg,
 716        What Every Computer Scientist Should Know About Floating-Point Arithmetic
 717        ACM Computing Surveys, Vol 23, No 1, March 1991.
 718
 719
 720
 721 # name: <cell-element>
 722 # type: sq_string
 723 # elements: 1
 724 # length: 80
 725  FLAG_ACCURACY_LEVEL sets and gets accuracy level
 726    used in SUMSKIPNAN_MEX and
 727
 728
 729
 730 # name: <cell-element>
 731 # type: sq_string
 732 # elements: 1
 733 # length: 26
 734 flag_implicit_significance
 735
 736
 737 # name: <cell-element>
 738 # type: sq_string
 739 # elements: 1
 740 # length: 927
 741  The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state.
 742  flag_implicit_significance might even become obsolete.
 743
 744  FLAG_IMPLICIT_SIGNIFICANCE sets and gets default alpha (level) of any significance test
 745  The default alpha-level is stored in the global variable FLAG_implicit_significance
 746  The idea is that the significance need not be assigned explicitly.
 747  This might yield more readable code.
 748
 749  Choose alpha low enough, because in alpha*100% of the cases, you will
 750  reject the Null hypothesis just by chance. For this reason, the default
 751  alpha is 0.01.
 752
 753    flag_implicit_significance(0.01)
 754         sets the alpha-level for the significance test
 755
 756  alpha = flag_implicit_significance()
 757         gets default alpha
 758
 759  flag_implicit_significance(alpha)
 760         sets default alpha-level
 761
 762  alpha = flag_implicit_significance(alpha)
 763         gets and sets alpha
 764
 765  features:
 766  - compatible to Matlab and Octave
 767
 768  see also: CORRCOEF, PARTCORRCOEF
 769
 770
 771
 772 # name: <cell-element>
 773 # type: sq_string
 774 # elements: 1
 775 # length: 64
 776  The use of FLAG_IMPLICIT_SIGNIFICANCE is in experimental state.
 777
 778
 779
 780 # name: <cell-element>
 781 # type: sq_string
 782 # elements: 1
 783 # length: 22
 784 flag_implicit_skip_nan
 785
 786
 787 # name: <cell-element>
 788 # type: sq_string
 789 # elements: 1
 790 # length: 934
 791  FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs
 792         1 skips NaN's (the default mode if no mode is set)
 793         0 NaNs are propagated; input NaN's give NaN's at the output
 794
 795  FLAG = flag_implicit_skip_nan()
 796         gets current mode
 797
 798  flag_implicit_skip_nan(FLAG)
 799  sets mode
 800
 801  prevFLAG = flag_implicit_skip_nan(nextFLAG)
 802         gets previous set FLAG and sets FLAG for the future
 803  flag_implicit_skip_nan(prevFLAG)
 804         resets FLAG to previous mode
 805
 806  It is used in:
 807         SUMSKIPNAN, MEDIAN, QUANTILES, TRIMEAN
 808  and affects many other functions like:
 809         CENTER, KURTOSIS, MAD, MEAN, MOMENT, RMS, SEM, SKEWNESS,
 810         STATISTIC, STD, VAR, ZSCORE etc.
 811
 812  The mode is stored in the global variable FLAG_implicit_skip_nan
 813  It is recommended to use flag_implicit_skip_nan(1) as default and
 814  flag_implicit_skip_nan(0) should be used for exceptional cases only.
 815  This feature might disappear without further notice, so you should really not
 816  rely on it.
 817
 818
 819
 820 # name: <cell-element>
 821 # type: sq_string
 822 # elements: 1
 823 # length: 80
 824  FLAG_IMPLICIT_SKIP_NAN sets and gets default mode for handling NaNs
 825         1 skips Na
 826
 827
 828
 829 # name: <cell-element>
 830 # type: sq_string
 831 # elements: 1
 832 # length: 17
 833 flag_nans_occured
 834
 835
 836 # name: <cell-element>
 837 # type: sq_string
 838 # elements: 1
 839 # length: 431
 840  FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm
 841  contained any not-a-numbers in the input argument. Because many other
 842  functions like mean, std, etc. are also using sumskipnan,
 843  also these functions can be checked for NaN's in the input data.
 844
 845  A call to FLAG_NANS_OCCURED() resets also the flag whether NaN's occurred.
 846  Only sumskipnan or covm can set the flag again.
 847
 848  see also: SUMSKIPNAN, COVM
 849
 850
 851
 852 # name: <cell-element>
 853 # type: sq_string
 854 # elements: 1
 855 # length: 80
 856  FLAG_NANS_OCCURED checks whether the last call(s) to sumskipnan or covm
 857  conta
 858
 859
 860
 861 # name: <cell-element>
 862 # type: sq_string
 863 # elements: 1
 864 # length: 3
 865 fss
 866
 867
 868 # name: <cell-element>
 869 # type: sq_string
 870 # elements: 1
 871 # length: 1739
 872  FSS - feature subset selection and feature ranking
 873    the method is motivated by the max-relevance-min-redundancy (mRMR)
 874    approach [1]. However, the default method uses partial correlation,
 875    which has been developed from scratch. PCCM [3] describes
 876    a similar idea, but is more complicated.
 877    An alternative method based on FSDD is implemented, too.
 878
 879   [idx,score] = fss(D,cl)
 880   [idx,score] = fss(D,cl,MODE)
 881   [idx,score] = fss(D,cl,MODE)
 882
 883  D      data - each column represents a feature
 884  cl     classlabel
 885  Mode   'Pearson' [default] correlation
 886         'rank' correlation
 887        'FSDD' feature selection algorithm based on a distance discriminant [2]
 888        %%% 'MRMR','MID','MIQ' max-relevance, min redundancy [1] - not supported yet.
 889
 890  score score of the feature
 891  idx    ranking of the feature
 892        [tmp,idx]=sort(-score)
 893
 894  see also: TRAIN_SC, XVAL, ROW_COL_DELETION
 895
 896  REFERENCES:
 897  [1] Peng, H.C., Long, F., and Ding, C.,
 898    Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,
 899    IEEE Transactions on Pattern Analysis and Machine Intelligence,
 900    Vol. 27, No. 8, pp.1226-1238, 2005.
 901  [2] Jianning Liang, Su Yang, Adam Winstanley,
 902    Invariant optimal feature selection: A distance discriminant and feature ranking based solution,
 903    Pattern Recognition, Volume 41, Issue 5, May 2008, Pages 1429-1439.
 904    ISSN 0031-3203, DOI: 10.1016/j.patcog.2007.10.018.
 905  [3] K. Raghuraj Rao and S. Lakshminarayanan
 906    Partial correlation based variable selection approach for multivariate data classification methods
 907    Chemometrics and Intelligent Laboratory Systems
 908    Volume 86, Issue 1, 15 March 2007, Pages 68-81
 909    http://dx.doi.org/10.1016/j.chemolab.2006.08.007
 910
 911
 912
 913 # name: <cell-element>
 914 # type: sq_string
 915 # elements: 1
 916 # length: 80
 917  FSS - feature subset selection and feature ranking
 918    the method is motivated
 919
 920
 921
 922 # name: <cell-element>
 923 # type: sq_string
 924 # elements: 1
 925 # length: 7
 926 geomean
 927
 928
 929 # name: <cell-element>
 930 # type: sq_string
 931 # elements: 1
 932 # length: 1208
 933  GEOMEAN calculates the geometric mean of data elements.
 934
 935         y = geomean(x [,DIM [,W]])   is the same as
 936         y = mean(x,'G' [,DIM])
 937
 938  DIM    dimension
 939         1 mean of columns
 940         2 mean of rows
 941         default or []: first DIMENSION, with more than 1 element
 942  W      weights to compute weighted mean (default: [])
 943         if W=[], all weights are 1.
 944         number of elements in W must match size(x,DIM)
 945
 946  features:
 947  - can deal with NaN's (missing values)
 948  - weighting of data
 949  - dimension argument also in Octave
 950  - compatible to Matlab and Octave
 951
 952  see also: SUMSKIPNAN, MEAN, HARMMEAN
 953
 954     This program is free software; you can redistribute it and/or modify
 955     it under the terms of the GNU General Public License as published by
 956     the Free Software Foundation; either version 3 of the License, or
 957     (at your option) any later version.
 958
 959     This program is distributed in the hope that it will be useful,
 960     but WITHOUT ANY WARRANTY; without even the implied warranty of
 961     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 962     GNU General Public License for more details.
 963
 964     You should have received a copy of the GNU General Public License
 965     along with this program; If not, see <http://www.gnu.org/licenses/>.
 966
 967
 968
 969 # name: <cell-element>
 970 # type: sq_string
 971 # elements: 1
 972 # length: 56
 973  GEOMEAN calculates the geometric mean of data elements.
 974
 975
 976
 977 # name: <cell-element>
 978 # type: sq_string
 979 # elements: 1
 980 # length: 8
 981 gscatter
 982
 983
 984 # name: <cell-element>
 985 # type: sq_string
 986 # elements: 1
 987 # length: 472
 988  GSCATTER scatter plot of groups
 989
 990   gscatter(x,y,group)
 991   gscatter(x,y,group,clr,sym,siz)
 992   gscatter(x,y,group,clr,sym,siz,doleg)
 993   gscatter(x,y,group,clr,sym,siz,doleg,xname,yname)
 994   h = gscatter(...)
 995
 996   x,y, group:   vectors with equal length
 997   clr:  color vector, default 'bgrcmyk'
 998   sym:          symbol, default '.'
 999   siz:  size of Marker
1000   doleg:  'on' (default) shows legend, 'off' turns off legend
1001   xname, yname: name of axis
1002
1003
1004  see also: ecdf, cdfplot
1005
1006  References:
1007
1008
1009
1010 # name: <cell-element>
1011 # type: sq_string
1012 # elements: 1
1013 # length: 34
1014  GSCATTER scatter plot of groups
1015
1016
1017
1018
1019 # name: <cell-element>
1020 # type: sq_string
1021 # elements: 1
1022 # length: 8
1023 harmmean
1024
1025
1026 # name: <cell-element>
1027 # type: sq_string
1028 # elements: 1
1029 # length: 629
1030  HARMMEAN calculates the harmonic mean of data elements.
1031  The harmonic mean is the reciprocal of the mean of the reciprocals of the elements.
1032
1033         y = harmmean(x [,DIM [,W]]) is the same as
1034         y = mean(x,'H' [,DIM [,W]])
1035
1036  DIM    dimension
1037         1 harmonic mean of columns
1038         2 harmonic mean of rows
1039         default or []: first DIMENSION, with more than 1 element
1040  W      weights to compute weighted mean (default: [])
1041         if W=[], all weights are 1.
1042         number of elements in W must match size(x,DIM)
1043
1044  features:
1045  - can deal with NaN's (missing values)
1046  - weighting of data
1047  - dimension argument also in Octave
1048  - compatible to Matlab and Octave
1049
1050  see also: SUMSKIPNAN, MEAN, GEOMEAN
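
 Example (a minimal sketch, not taken from the toolbox source): the
 definition above written with base functions only. The weighted form
 sum(W)/sum(W./x) is the standard weighted harmonic mean and is assumed
 here to match what HARMMEAN computes; the data and W are hypothetical.

```octave
% Harmonic mean of the non-missing elements, using base functions only.
x = [1 2 NaN 4];
v = x(~isnan(x));              % drop missing values (NaN)
h = numel(v) / sum(1 ./ v);    % reciprocal of the mean of the reciprocals

% Weighted variant; W is a hypothetical weight vector for this sketch.
W = [1 1 1 2];
k = ~isnan(x);                 % keep only weights of valid samples
hw = sum(W(k)) / sum(W(k) ./ x(k));
```

 If the toolbox is installed, harmmean(x) and harmmean(x,[],W) would be
 expected to agree with h and hw, but that is not verified here.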
1051
1052
1053
1054
1055 # name: <cell-element>
1056 # type: sq_string
1057 # elements: 1
1058 # length: 56
1059  HARMMEAN calculates the harmonic mean of data elements.
1060
1061
1062
1063 # name: <cell-element>
1064 # type: sq_string
1065 # elements: 1
1066 # length: 8
1067 hist2res
1068
1069
1070 # name: <cell-element>
1071 # type: sq_string
1072 # elements: 1
1073 # length: 700
1074  Evaluates Histogram data
1075  [R]=hist2res(H)
1076
1077  [y]=hist2res(H,fun)
1078         estimates fun-statistic
1079
1080  fun    'mean'  mean
1081         'std'   standard deviation
1082         'var'   variance
1083         'sem'   standard error of the mean
1084         'rms'   root mean square
1085         'meansq' mean of squares
1086         'sum'   sum
1087         'sumsq' sum of squares
1088         'CM#'   central moment of order #
1089         'skewness' skewness
1090         'kurtosis' excess coefficient (Fisher kurtosis)
1091
1092  see also: NaN/statistic
1093
1094  REFERENCES:
1095  [1] C.L. Nikias and A.P. Petropulu "Higher-Order Spectra Analysis" Prentice Hall, 1993.
1096  [2] C.E. Shannon and W. Weaver "The mathematical theory of communication" University of Illinois Press, Urbana 1949 (reprint 1963).
1097  [3] http://www.itl.nist.gov/
1098  [4] http://mathworld.wolfram.com/
1099
1100
1101
1102 # name: <cell-element>
1103 # type: sq_string
1104 # elements: 1
1105 # length: 43
1106  Evaluates Histogram data
1107  [R]=hist2res(H)
1108
1109
1110
1111
1112 # name: <cell-element>
1113 # type: sq_string
1114 # elements: 1
1115 # length: 3
1116 iqr
1117
1118
1119 # name: <cell-element>
1120 # type: sq_string
1121 # elements: 1
1122 # length: 372
1123  IQR calculates the interquartile range
1124   Missing values (encoded as NaN) are ignored.
1125
1126   Q = iqr(Y)
1127   Q = iqr(Y,DIM)
1128      returns the IQR along dimension DIM of sample array Y.
1129
1130   Q = iqr(HIS)
1131      returns the IQR from the histogram HIS.
1132      HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
1133
1134  see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE
1135
1136
1137
1138 # name: <cell-element>
1139 # type: sq_string
1140 # elements: 1
1141 # length: 80
1142  IQR calculates the interquartile range
1143   Missing values (encoded as NaN) are
1144
1145
1146
1147 # name: <cell-element>
1148 # type: sq_string
1149 # elements: 1
1150 # length: 5
1151 kappa
1152
1153
1154 # name: <cell-element>
1155 # type: sq_string
1156 # elements: 1
1157 # length: 1760
1158  KAPPA estimates Cohen's kappa coefficient
1159    and related statistics
1160
1161  [...] = kappa(d1,d2);
1162         NaN's are handled as missing values and are ignored
1163  [...] = kappa(d1,d2,'notIgnoreNAN');
1164         NaN's are handled as just another Label.
1165  [kap,sd,H,z,ACC,sACC,MI] = kappa(...);
1166  X = kappa(...);
1167
1168  d1    data of scorer 1
1169  d2    data of scorer 2
1170
1171  kap    Cohen's kappa coefficient point
1172  se     standard error of the kappa estimate
1173  H      Concordance matrix, i.e. confusion matrix
1174  z      z-score
1175  ACC    overall agreement (accuracy)
1176  sACC   specific accuracy
1177  MI     Mutual information or transfer information (in [bits])
1178  X      is a struct containing all the fields above
1179        For two classes, a number of additional summary statistics including
1180          TPR, FPR, FDR, PPV, NPF, F1, dprime, Matthews Correlation coefficient (MCC) or
1181         Phi coefficient (PHI=MCC), Specificity and Sensitivity
1182        are provided. Note, the positive category must be the larger label (in d and c), otherwise
1183        the confusion matrix becomes transposed and the summary statistics are messed up.
1184
1185
1186  Reference(s):
1187  [1] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
1188  [2] J Bortz, GA Lienert (1998) Kurzgefasste Statistik für die klassische Forschung, Springer Berlin - Heidelberg.
1189         Kapitel 6: Uebereinstimmungsmasze fuer subjektive Merkmalsurteile. p. 265-270.
1190  [3] http://www.cmis.csiro.au/Fiona.Evans/personal/msc/html/chapter3.html
1191  [4] Kraemer, H. C. (1982). Kappa coefficient. In S. Kotz and N. L. Johnson (Eds.),
1192         Encyclopedia of Statistical Sciences. New York: John Wiley & Sons.
1193  [5] http://ourworld.compuserve.com/homepages/jsuebersax/kappa.htm
1194  [6] http://en.wikipedia.org/wiki/Receiver_operating_characteristic
1195
1196
1197
1198 # name: <cell-element>
1199 # type: sq_string
1200 # elements: 1
1201 # length: 70
1202  KAPPA estimates Cohen's kappa coefficient
1203    and related statistics
1204
1205
1206
1207
1208 # name: <cell-element>
1209 # type: sq_string
1210 # elements: 1
1211 # length: 8
1212 kurtosis
1213
1214
1215 # name: <cell-element>
1216 # type: sq_string
1217 # elements: 1
1218 # length: 461
1219  KURTOSIS estimates the kurtosis
1220
1221  y = kurtosis(x,DIM)
1222    calculates kurtosis of x in dimension DIM
1223
1224  DIM    dimension
1225         1: STATS of columns
1226         2: STATS of rows
1227         default or []: first DIMENSION, with more than 1 element
1228
1229  features:
1230  - can deal with NaN's (missing values)
1231  - dimension argument
1232  - compatible to Matlab and Octave
1233
1234  see also: SUMSKIPNAN, VAR, STD, SKEWNESS, MOMENT, STATISTIC,
1235     IMPLICIT_SKIP_NAN
1236
1237  REFERENCE(S):
1238  http://mathworld.wolfram.com/
1239
1240
1241
1242 # name: <cell-element>
1243 # type: sq_string
1244 # elements: 1
1245 # length: 33
1246  KURTOSIS estimates the kurtosis
1247
1248
1249
1250
1251 # name: <cell-element>
1252 # type: sq_string
1253 # elements: 1
1254 # length: 15
1255 load_fisheriris
1256
1257
1258 # name: <cell-element>
1259 # type: sq_string
1260 # elements: 1
1261 # length: 446
1262  LOAD_FISHERIRIS
1263         loads famous iris data set from Fisher, 1936 [1].
1264
1265  References:
1266  [1] Fisher,R.A. "The use of multiple measurements in taxonomic problems"
1267         Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).
1268  [2] Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
1269         (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
1270
1271
1272
1273 # name: <cell-element>
1274 # type: sq_string
1275 # elements: 1
1276 # length: 75
1277  LOAD_FISHERIRIS
1278         loads famous iris data set from Fisher, 1936 [1].
1279
1280
1281
1282 # name: <cell-element>
1283 # type: sq_string
1284 # elements: 1
1285 # length: 3
1286 mad
1287
1288
1289 # name: <cell-element>
1290 # type: sq_string
1291 # elements: 1
1292 # length: 855
1293  MAD estimates the Mean Absolute deviation
1294  (note that according to [1,2] this is the mean deviation;
1295  not the mean absolute deviation)
1296
1297  y = mad(x,DIM)
1298    calculates the mean deviation of x in dimension DIM
1299
1300  DIM    dimension
1301         1: STATS of columns
1302         2: STATS of rows
1303         default or []: first DIMENSION, with more than 1 element
1304
1305  features:
1306  - can deal with NaN's (missing values)
1307  - dimension argument
1308  - compatible to Matlab and Octave
1309
1310  see also: SUMSKIPNAN, VAR, STD
1311
1312  REFERENCE(S):
1313  [1] http://mathworld.wolfram.com/MeanDeviation.html
1314  [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253.
1315
1316  [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html
1317  [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962.
1318
1319
1320
1321 # name: <cell-element>
1322 # type: sq_string
1323 # elements: 1
1324 # length: 80
1325  MAD estimates the Mean Absolute deviation
1326  (note that according to [1,2] this i
1327
1328
1329
1330 # name: <cell-element>
1331 # type: sq_string
1332 # elements: 1
1333 # length: 5
1334 mahal
1335
1336
1337 # name: <cell-element>
1338 # type: sq_string
1339 # elements: 1
1340 # length: 395
1341  MAHAL returns the Mahalanobis' D-square distance between the
1342  multivariate samples X and Y, which must have the same number
1343  of components (columns), but may have a different number of observations (rows).
1344
1345   d = mahal(X,Y)
1346
1347    d(k) = (X(k,:)-MU)*inv(SIGMA)*(X(k,:)-MU)'
1348
1349    where MU and SIGMA are the mean and the covariance matrix of Y
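
 Example (a sketch under stated assumptions): the formula above evaluated
 with base functions. The data are hypothetical, and SIGMA is taken from
 cov(Y) with the usual N-1 normalization, which is assumed rather than
 confirmed to be what MAHAL uses.

```octave
Y = [1 2; 2 1; 3 5; 4 3; 5 4];   % reference sample: 5 observations, 2 components
X = [3 3; 0 0];                  % points to be scored against Y
MU = mean(Y, 1);                 % mean of Y, here [3 3]
SIGMA = cov(Y);                  % covariance matrix of Y (N-1 normalization)
d = zeros(size(X, 1), 1);
for k = 1:size(X, 1)
  dev = X(k, :) - MU;
  d(k) = dev / SIGMA * dev';     % same as dev*inv(SIGMA)*dev' without forming inv
end
```

 The first point coincides with MU, so its distance is zero; the second
 evaluates to 4.5 for this data.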
1350
1351
1352  see also: TRAIN_SC, TEST_SC, COVM
1353
1354  References:
1355
1356
1357
1358 # name: <cell-element>
1359 # type: sq_string
1360 # elements: 1
1361 # length: 80
1362  MAHAL return the Mahalanobis' D-square distance between the
1363  multivariate samp
1364
1365
1366
1367 # name: <cell-element>
1368 # type: sq_string
1369 # elements: 1
1370 # length: 4
1371 make
1372
1373
1374 # name: <cell-element>
1375 # type: sq_string
1376 # elements: 1
1377 # length: 46
1378  This make.m is used for Matlab under Windows
1379
1380
1381
1382 # name: <cell-element>
1383 # type: sq_string
1384 # elements: 1
1385 # length: 11
1386  This make.
1387
1388
1389
1390 # name: <cell-element>
1391 # type: sq_string
1392 # elements: 1
1393 # length: 4
1394 mean
1395
1396
1397 # name: <cell-element>
1398 # type: sq_string
1399 # elements: 1
1400 # length: 735
1401  MEAN calculates the mean of data elements.
1402
1403   y = mean(x [,DIM] [,opt] [, W])
1404
1405  DIM    dimension
1406         1 MEAN of columns
1407         2 MEAN of rows
1408         N MEAN of  N-th dimension
1409         default or []: first DIMENSION, with more than 1 element
1410
1411  opt    options
1412         'A' arithmetic mean
1413         'G' geometric mean
1414         'H' harmonic mean
1415
1416  W      weights to compute weighted mean (default: [])
1417         if W=[], all weights are 1.
1418         number of elements in W must match size(x,DIM)
1419
1420  usage:
1421         mean(x)
1422         mean(x,DIM)
1423         mean(x,opt)
1424         mean(x,opt,DIM)
1425         mean(x,DIM,opt)
1426         mean(x,DIM,W)
1427         mean(x,DIM,opt,W);
1428
1429  features:
1430  - can deal with NaN's (missing values)
1431  - weighting of data
1432  - dimension argument also in Octave
1433  - compatible to Matlab and Octave
1434
1435  see also: SUMSKIPNAN, MEAN, GEOMEAN, HARMMEAN
1436
1437
1438
1439
1440 # name: <cell-element>
1441 # type: sq_string
1442 # elements: 1
1443 # length: 43
1444  MEAN calculates the mean of data elements.
1445
1446
1447
1448 # name: <cell-element>
1449 # type: sq_string
1450 # elements: 1
1451 # length: 7
1452 meandev
1453
1454
1455 # name: <cell-element>
1456 # type: sq_string
1457 # elements: 1
1458 # length: 856
1459  MEANDEV estimates the Mean deviation
1460  (note that according to [1,2] this is the mean deviation;
1461  not the mean absolute deviation)
1462
1463  y = meandev(x,DIM)
1464    calculates the mean deviation of x in dimension DIM
1465
1466  DIM    dimension
1467         1: STATS of columns
1468         2: STATS of rows
1469         default or []: first DIMENSION, with more than 1 element
1470
1471  features:
1472  - can deal with NaN's (missing values)
1473  - dimension argument
1474  - compatible to Matlab and Octave
1475
1476  see also: SUMSKIPNAN, VAR, STD, MAD
1477
1478  REFERENCE(S):
1479  [1] http://mathworld.wolfram.com/MeanDeviation.html
1480  [2] L. Sachs, "Applied Statistics: A Handbook of Techniques", Springer-Verlag, 1984, page 253.
1481  [3] http://mathworld.wolfram.com/MeanAbsoluteDeviation.html
1482  [4] Kenney, J. F. and Keeping, E. S. "Mean Absolute Deviation." §6.4 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 76-77 1962.
1483
1484
1485
1486 # name: <cell-element>
1487 # type: sq_string
1488 # elements: 1
1489 # length: 80
1490  MEANDEV estimates the Mean deviation
1491  (note that according to [1,2] this is the
1492
1493
1494
1495 # name: <cell-element>
1496 # type: sq_string
1497 # elements: 1
1498 # length: 6
1499 meansq
1500
1501
1502 # name: <cell-element>
1503 # type: sq_string
1504 # elements: 1
1505 # length: 527
1506  MEANSQ calculates the mean of the squares
1507
1508  y = meansq(x,DIM,W)
1509
1510  DIM    dimension
1511         1 MEANSQ of columns
1512         2 MEANSQ of rows
1513         N MEANSQ of  N-th dimension
1514         default or []: first DIMENSION, with more than 1 element
1515  W      weights to compute weighted mean (default: [])
1516         if W=[], all weights are 1.
1517         number of elements in W must match size(x,DIM)
1518
1519  features:
1520  - can deal with NaN's (missing values)
1521  - weighting of data
1522  - dimension argument also in Octave
1523  - compatible to Matlab and Octave
1524
1525  see also: SUMSQ, SUMSKIPNAN, MEAN, VAR, STD, RMS
1526
1527
1528
1529 # name: <cell-element>
1530 # type: sq_string
1531 # elements: 1
1532 # length: 43
1533  MEANSQ calculates the mean of the squares
1534
1535
1536
1537
1538 # name: <cell-element>
1539 # type: sq_string
1540 # elements: 1
1541 # length: 9
1542 medAbsDev
1543
1544
1545 # name: <cell-element>
1546 # type: sq_string
1547 # elements: 1
1548 # length: 373
1549  medAbsDev calculates the median absolute deviation
1550
1551  Usage:  D = medAbsDev(X, DIM)
1552     or:  [D, M] = medAbsDev(X, DIM)
1553  Input:  X  : data
1554          DIM: dimension along which mad should be calculated (1=columns, 2=rows)
1555                (optional, default=first dimension with more than 1 element)
1556  Output: D  : median absolute deviations
1557          M  : medians (optional)
1558
1559
1560
1561 # name: <cell-element>
1562 # type: sq_string
1563 # elements: 1
1564 # length: 53
1565  medAbsDev calculates the median absolute deviation
1566
1567
1568
1569
1570 # name: <cell-element>
1571 # type: sq_string
1572 # elements: 1
1573 # length: 6
1574 median
1575
1576
1577 # name: <cell-element>
1578 # type: sq_string
1579 # elements: 1
1580 # length: 366
1581  MEDIAN calculates the median of data elements.
1582  [y]=median(x [,DIM])
1583
1584  DIM    dimension
1585         1: median of columns
1586         2: median of rows
1587         N: median of  N-th dimension
1588         default or []: first DIMENSION, with more than 1 element
1589
1590  features:
1591  - can deal with NaN's (missing values)
1592  - accepts a dimension argument in Octave as well as in Matlab
1593  - compatible to Matlab and Octave
1594
1595  see also: SUMSKIPNAN
1596
1597
1598
1599 # name: <cell-element>
1600 # type: sq_string
1601 # elements: 1
1602 # length: 46
1603  MEDIAN data elements,
1604  [y]=median(x [,DIM])
1605
1606
1607
1608
1609 # name: <cell-element>
1610 # type: sq_string
1611 # elements: 1
1612 # length: 6
1613 moment
1614
1615
1616 # name: <cell-element>
1617 # type: sq_string
1618 # elements: 1
1619 # length: 627
1620  MOMENT estimates the p-th moment
1621
1622  M = moment(x, p [,opt] [,DIM])
1623  M = moment(H, p [,opt])
1624    calculates the p-th moment from data x in dimension DIM
1625         or from histogram H
1626
1627  p      moment of order p
1628  opt   'ac': absolute 'a' and/or central ('c') moment
1629         DEFAULT: '' raw moments are estimated
1630  DIM    dimension
1631         1: STATS of columns
1632         2: STATS of rows
1633         default or []: first DIMENSION, with more than 1 element
1634
1635  features:
1636  - can deal with NaN's (missing values)
1637  - dimension argument
1638  - compatible to Matlab and Octave
1639
1640  see also: STD, VAR, SKEWNESS, KURTOSIS, STATISTIC,
1641
1642  REFERENCE(S):
1643  http://mathworld.wolfram.com/Moment.html
1644
1645
1646
1647 # name: <cell-element>
1648 # type: sq_string
1649 # elements: 1
1650 # length: 80
1651  MOMENT estimates the p-th moment
1652
1653  M = moment(x, p [,opt] [,DIM])
1654  M = moment
1655
1656
1657
1658 # name: <cell-element>
1659 # type: sq_string
1660 # elements: 1
1661 # length: 7
1662 nanconv
1663
1664
1665 # name: <cell-element>
1666 # type: sq_string
1667 # elements: 1
1668 # length: 616
1669  NANCONV computes the convolution for data with missing values.
1670   X and Y can contain missing values encoded with NaN.
1671   NaN's are skipped, NaN do not result in a NaN output.
1672   The output gives NaN only if there are insufficient input data
1673
1674  [...] = NANCONV(X,Y);
1675       calculates 2-dim convolution between X and Y
1676  [C]   = NANCONV(X,Y);
1677
1678  WARNING: missing values can introduce aliasing - causing unintended results.
1679     Moreover, the behavior of bandpass and highpass filters in case of missing values
1680     is not fully understood, and might contain some pitfalls.
1681
1682  see also: CONV, NANCONV2, NANFFT, NANFILTER
1683
1684
1685
1686 # name: <cell-element>
1687 # type: sq_string
1688 # elements: 1
1689 # length: 63
1690  NANCONV computes the convolution for data with missing values.
1691
1692
1693
1694 # name: <cell-element>
1695 # type: sq_string
1696 # elements: 1
1697 # length: 6
1698 nanfft
1699
1700
1701 # name: <cell-element>
1702 # type: sq_string
1703 # elements: 1
1704 # length: 618
1705  NANFFT calculates the Fourier-Transform of X for data with missing values.
1706   NANFFT is the same as FFT but X can contain missing values encoded with NaN.
1707   NaN's are skipped, NaN do not result in a NaN output.
1708
1709    Y = NANFFT(X)
1710    Y = NANFFT(X,N)
1711    Y = NANFFT(X,[],DIM)
1712
1713    [Y,N] = NANFFT(...)
1714        returns the number of valid samples N
1715
1716
1717  WARNING: missing values can introduce aliasing - causing unintended results.
1718     Moreover, the behavior of bandpass and highpass filters in case of missing values
1719     is not fully understood, and might contain some pitfalls.
1720
1721  see also: FFT, XCORR, NANCONV, NANFILTER
1722
1723
1724
1725 # name: <cell-element>
1726 # type: sq_string
1727 # elements: 1
1728 # length: 75
1729  NANFFT calculates the Fourier-Transform of X for data with missing values.
1730
1731
1732
1733 # name: <cell-element>
1734 # type: sq_string
1735 # elements: 1
1736 # length: 9
1737 nanfilter
1738
1739
1740 # name: <cell-element>
1741 # type: sq_string
1742 # elements: 1
1743 # length: 519
1744  NANFILTER is able to filter data with missing values encoded as NaN.
1745
1746       [Y,Z] = nanfilter(B,A,X [, Z]);
1747
1748  If X contains no missing data, NANFILTER should behave like FILTER.
1749  NaN-values are handled gracefully.
1750
1751  WARNING: missing values can introduce aliasing - causing unintended results.
1752     Moreover, the behavior of bandpass and highpass filters in case of missing values
1753     is not fully understood, and might contain some pitfalls.
1754
1755  see also: FILTER, SUMSKIPNAN, NANFFT, NANCONV, NANFILTER1UC
1756
1757
1758
1759 # name: <cell-element>
1760 # type: sq_string
1761 # elements: 1
1762 # length: 69
1763  NANFILTER is able to filter data with missing values encoded as NaN.
1764
1765
1766
1767 # name: <cell-element>
1768 # type: sq_string
1769 # elements: 1
1770 # length: 12
1771 nanfilter1uc
1772
1773
1774 # name: <cell-element>
1775 # type: sq_string
1776 # elements: 1
1777 # length: 257
1778  NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN.
1779
1780       [Y,Z] = nanfilter1uc(uc,X [, Z]);
1781
1782  If X contains no missing data, NANFILTER1UC behaves like FILTER(uc,[1,uc-1],X[,Z]).
1783
1784  see also: FILTER, NANFILTER, SUMSKIPNAN
1785
1786
1787
1788 # name: <cell-element>
1789 # type: sq_string
1790 # elements: 1
1791 # length: 80
1792  NANFILTER1UC is an adaptive filter for data with missing values encoded as NaN.
1793
1794
1795
1796 # name: <cell-element>
1797 # type: sq_string
1798 # elements: 1
1799 # length: 11
1800 naninsttest
1801
1802
1803 # name: <cell-element>
1804 # type: sq_string
1805 # elements: 1
1806 # length: 112
1807  NANINSTTEST checks whether the functions from NaN-toolbox have been
1808  correctly installed.
1809
1810  see also: NANTEST
1811
1812
1813
1814 # name: <cell-element>
1815 # type: sq_string
1816 # elements: 1
1817 # length: 80
1818  NANINSTTEST checks whether the functions from NaN-toolbox have been
1819  correctly
1820
1821
1822
1823 # name: <cell-element>
1824 # type: sq_string
1825 # elements: 1
1826 # length: 7
1827 nanmean
1828
1829
1830 # name: <cell-element>
1831 # type: sq_string
1832 # elements: 1
1833 # length: 330
1834  NANMEAN same as MEAN but ignores NaN's.
1835  NANMEAN is OBSOLETE; use MEAN instead. NANMEAN is included
1836     to provide backward compatibility
1837
1838  Y = nanmean(x [,DIM])
1839
1840  DIM    dimension
1841         1 mean of columns
1842         2 mean of rows
1843         default or []: first DIMENSION with more than 1 element
1844  Y      resulting mean
1845
1846
1847  see also: MEAN, SUMSKIPNAN, NANSUM
1848
1849
1850
1851 # name: <cell-element>
1852 # type: sq_string
1853 # elements: 1
1854 # length: 40
1855  NANMEAN same as MEAN but ignores NaN's.
1856
1857
1858
1859 # name: <cell-element>
1860 # type: sq_string
1861 # elements: 1
1862 # length: 6
1863 nanstd
1864
1865
1866 # name: <cell-element>
1867 # type: sq_string
1868 # elements: 1
1869 # length: 518
1870  NANSTD same as STD but ignores NaN's.
1871  NANSTD is OBSOLETE; use NaN/STD instead. NANSTD is included
1872     to fix a bug in alternative implementations and to
1873     provide some compatibility.
1874
1875  Y = nanstd(x, FLAG [,DIM])
1876
1877  x     data
1878  FLAG  0: [default] normalizes with (N-1), N = sample size
1879  FLAG  1: normalizes with N, N = sample size
1880  DIM    dimension
1881         1 STD of columns
1882         2 STD of rows
1883         default or []: first DIMENSION with more than 1 element
1884  Y      resulting standard deviation
1885
1886  see also: SUM, SUMSKIPNAN, NANSUM, STD
1887
1888
1889
1890 # name: <cell-element>
1891 # type: sq_string
1892 # elements: 1
1893 # length: 38
1894  NANSTD same as STD but ignores NaN's.
1895
1896
1897
1898 # name: <cell-element>
1899 # type: sq_string
1900 # elements: 1
1901 # length: 6
1902 nansum
1903
1904
1905 # name: <cell-element>
1906 # type: sq_string
1907 # elements: 1
1908 # length: 333
1909  NANSUM same as SUM but ignores NaN's.
1910  NANSUM is OBSOLETE; use SUMSKIPNAN instead. NANSUM is included
1911     to fix a bug in some other versions.
1912
1913  Y = nansum(x [,DIM])
1914
1915  DIM    dimension
1916         1 sum of columns
1917         2 sum of rows
1918         default or []: first DIMENSION with more than 1 element
1919  Y      resulting sum
1920
1921
1922  see also: SUM, SUMSKIPNAN, NANSUM
1923
1924
1925
1926 # name: <cell-element>
1927 # type: sq_string
1928 # elements: 1
1929 # length: 38
1930  NANSUM same as SUM but ignores NaN's.
1931
1932
1933
1934 # name: <cell-element>
1935 # type: sq_string
1936 # elements: 1
1937 # length: 7
1938 nantest
1939
1940
1941 # name: <cell-element>
1942 # type: sq_string
1943 # elements: 1
1944 # length: 366
1945  NANTEST checks several mathematical operations and a few
1946  statistical functions for their correctness related to NaN's.
1947  e.g. it checks norminv, normcdf, normpdf, sort, matrix division and multiplication.
1948
1949
1950  see also: NANINSTTEST
1951
1952  REFERENCE(S):
1953  [1] W. Kahan (1996) Lecture notes on the Status of "IEEE Standard 754 for
1954      Binary Floating-point Arithmetic".
1955
1956
1957
1958
1959 # name: <cell-element>
1960 # type: sq_string
1961 # elements: 1
1962 # length: 80
1963  NANTEST checks several mathematical operations and a few
1964  statistical function
1965
1966
1967
1968 # name: <cell-element>
1969 # type: sq_string
1970 # elements: 1
1971 # length: 7
1972 normcdf
1973
1974
1975 # name: <cell-element>
1976 # type: sq_string
1977 # elements: 1
1978 # length: 290
1979  NORMCDF returns normal cumulative distribution function
1980
1981  cdf = normcdf(x,m,s);
1982
1983  Computes the CDF of the normal distribution
1984     with mean m and standard deviation s
1985     default: m=0; s=1;
1986  x,m,s must be matrices of same size, or any one can be a scalar.
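
 Example (an illustrative sketch, not the toolbox implementation): the
 normal CDF can be written with the base function erfc, which also shows
 how a vector x combines with scalar m and s. The values are hypothetical.

```octave
x = [0 1.96];                    % evaluation points
m = 0;                           % mean (scalar, applied to every x)
s = 1;                           % standard deviation (scalar)
p = 0.5 * erfc(-(x - m) ./ (s * sqrt(2)));   % normal CDF via erfc
```

 For these inputs p(1) is 0.5 and p(2) is close to 0.975.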
1987
1988  see also: NORMPDF, NORMINV
1989
1990
1991
1992 # name: <cell-element>
1993 # type: sq_string
1994 # elements: 1
1995 # length: 57
1996  NORMCDF returns normal cumulative distribution function
1997
1998
1999
2000
2001 # name: <cell-element>
2002 # type: sq_string
2003 # elements: 1
2004 # length: 7
2005 norminv
2006
2007
2008 # name: <cell-element>
2009 # type: sq_string
2010 # elements: 1
2011 # length: 341
2012  NORMINV returns inverse cumulative function of the normal distribution
2013
2014  x = norminv(p,m,s);
2015
2016  Computes the quantile (inverse of the CDF) of the normal
2017     cumulative distribution with mean m and standard deviation s
2018     default: m=0; s=1;
2019  p,m,s must be matrices of same size, or any one can be a scalar.
2020
2021  see also: NORMPDF, NORMCDF
2022
2023
2024
2025 # name: <cell-element>
2026 # type: sq_string
2027 # elements: 1
2028 # length: 72
2029  NORMINV returns inverse cumulative function of the normal distribution
2030
2031
2032
2033
2034 # name: <cell-element>
2035 # type: sq_string
2036 # elements: 1
2037 # length: 7
2038 normpdf
2039
2040
2041 # name: <cell-element>
2042 # type: sq_string
2043 # elements: 1
2044 # length: 279
2045  NORMPDF returns normal probability density
2046
2047  pdf = normpdf(x,m,s);
2048
2049  Computes the PDF of the normal distribution
2050     with mean m and standard deviation s
2051     default: m=0; s=1;
2052  x,m,s must be matrices of same size, or any one can be a scalar.
2053
2054  see also: NORMCDF, NORMINV
2055
2056
2057
2058 # name: <cell-element>
2059 # type: sq_string
2060 # elements: 1
2061 # length: 45
2062  NORMPDF returns normal probability density
2063
2064
2065
2066
2067 # name: <cell-element>
2068 # type: sq_string
2069 # elements: 1
2070 # length: 12
2071 partcorrcoef
2072
2073
2074 # name: <cell-element>
2075 # type: sq_string
2076 # elements: 1
2077 # length: 2015
2078  PARTCORRCOEF calculates the partial correlation between X and Y
2079  after removing the influence of Z.
2080  X, Y and Z can contain missing values encoded with NaN.
2081  NaN's are skipped, NaN do not result in a NaN output.
2082  (It is assumed that the occurrence of NaN's is uncorrelated)
2083  The output gives NaN, only if there are insufficient input data.
2084
2085   The partial correlation  is defined as
2086   pcc(xy|z)=(cc(x,y)-cc(x,z)*cc(y,z))/sqrt((1-cc(x,z)^2)*(1-cc(y,z)^2))
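
 Example (a sketch with synthetic data, not the toolbox code): the
 standard first-order partial correlation, with denominator
 sqrt((1-cc(x,z)^2)*(1-cc(y,z)^2)), cross-checked against the correlation
 of the residuals after regressing z out of x and y. All values are
 hypothetical and contain no NaN's.

```octave
% Synthetic data: x and y both depend on z.
z = (1:8)';
x = z + [0.5 -0.3 0.2 -0.4 0.1 0.3 -0.2 -0.1]';
y = 2*z + [-0.2 0.4 -0.1 0.3 -0.5 0.2 0.1 -0.3]';

c = corrcoef([x y z]);           % pairwise Pearson correlations
rxy = c(1,2);  rxz = c(1,3);  ryz = c(2,3);
pcc = (rxy - rxz*ryz) / sqrt((1 - rxz^2) * (1 - ryz^2));

% Cross-check: correlate the residuals after removing z (with intercept).
Z = [ones(size(z)) z];
ex = x - Z*(Z\x);
ey = y - Z*(Z\y);
r = corrcoef([ex ey]);
pcc_resid = r(1,2);
```

 Both routes give the same value up to rounding, which is the sense in
 which the residual-based default mode and the formula-based mode are
 equivalent in exact arithmetic.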
2087
2088
2089  PARTCORRCOEF(X [,Mode]);
2090       calculates the (auto-)correlation matrix of X
2091  PARTCORRCOEF(X,Y,Z);
2092  PARTCORRCOEF(X,Y,Z,[]);
2093  PARTCORRCOEF(X,Y,Z,'Pearson');
2094  PARTCORRCOEF(X,Y,Z,'Rank');
2095  PARTCORRCOEF(X,Y,Z,'Spearman');
2096
2097  Mode=[] [default]
2098         removes from X and Y the part that can be explained by Z
2099         and computes the correlation of the remaining part.
2100         Ideally, this is equivalent to Mode='Pearson', however, in practice
2101         this is more accurate.
2102  Mode='Pearson' or 'parametric'
2103  Mode='Spearman'
2104  Mode='Rank'
2105         computes the partial correlation based on cc(x,y),cc(x,z) and cc(y,z)
2106         with the respective mode.
2107
2108  [R,p,ci1,ci2] = PARTCORRCOEF(...);
2109   R is the partial correlation matrix
2110         R(i,j) is the partial correlation coefficient r between X(:,i) and Y(:,j)
2111         when influence of Z is removed.
2112   p    gives the significance of PCC
2113         It tests the null hypothesis that the product moment correlation coefficient is zero
2114        using Student's t-test on the statistic t = r sqrt(N-Nz-2)/sqrt(1-r^2)
2115        where N is the number of samples (Statistics, M. Spiegel, Schaum series).
2116   p > alpha: do not reject the Null hypothesis: "R is zero".
2117   p < alpha: The alternative hypothesis "R2 is larger than zero" is true with probability (1-alpha).
2118   ci1   lower 0.95 confidence interval
2119   ci2   upper 0.95 confidence interval
2120
2121  see also: SUMSKIPNAN, COVM, COV, COR, SPEARMAN, RANKCORR, RANKS, CORRCOEF
2122
2123  REFERENCES:
2124  on the partial correlation coefficient
2125  [1] http://www.tufts.edu/~gdallal/partial.htm
2126  [2] http://www.nag.co.uk/numeric/fl/manual/pdf/G02/g02byf.pdf
2127
2128
2129
2130 # name: <cell-element>
2131 # type: sq_string
2132 # elements: 1
2133 # length: 80
2134  PARTCORRCOEF calculates the partial correlation between X and Y
2135  after removing
2136
2137
2138
2139 # name: <cell-element>
2140 # type: sq_string
2141 # elements: 1
2142 # length: 10
2143 percentile
2144
2145
2146 # name: <cell-element>
2147 # type: sq_string
2148 # elements: 1
2149 # length: 554
2150  PERCENTILE calculates the percentiles of histograms and sample arrays.
2151
2152   Q = percentile(Y,q)
2153   Q = percentile(Y,q,DIM)
2154      returns the q-th percentile along dimension DIM of sample array Y.
2155      size(Q) is equal to size(Y) except for dimension DIM, where size(Q,DIM)=length(q)
2156
2157   Q = percentile(HIS,q)
2158      returns the q-th percentile from the histogram HIS.
2159      HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2160      If q is a vector, then each row of Q returns the q(i)-th percentile
2161
2162  see also: HISTO2, HISTO3, QUANTILE
2163
2164
2165
2166 # name: <cell-element>
2167 # type: sq_string
2168 # elements: 1
2169 # length: 71
2170  PERCENTILE calculates the percentiles of histograms and sample arrays.
2171
2172
2173
2174 # name: <cell-element>
2175 # type: sq_string
2176 # elements: 1
2177 # length: 7
2178 prctile
2179
2180
2181 # name: <cell-element>
2182 # type: sq_string
2183 # elements: 1
2184 # length: 576
2185  PRCTILE calculates the percentiles of histograms and sample arrays.
2186  (it is the same as PERCENTILE.M)
2187
2188   Q = prctile(Y,q)
2189   Q = prctile(Y,q,DIM)
2190      returns the q-th percentile along dimension DIM of sample array Y.
2191      size(Q) is equal to size(Y) except for dimension DIM, where size(Q,DIM)=length(q)
2192
2193   Q = prctile(HIS,q)
2194      returns the q-th percentile from the histogram HIS.
2195      HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2196      If q is a vector, then each row of Q returns the q(i)-th percentile
2197
2198  see also: HISTO2, HISTO3, QUANTILE
2199
2200
2201
2202 # name: <cell-element>
2203 # type: sq_string
2204 # elements: 1
2205 # length: 68
2206  PRCTILE calculates the percentiles of histograms and sample arrays.
2207
2208
2209
2210 # name: <cell-element>
2211 # type: sq_string
2212 # elements: 1
2213 # length: 8
2214 quantile
2215
2216
2217 # name: <cell-element>
2218 # type: sq_string
2219 # elements: 1
2220 # length: 528
2221  QUANTILE calculates the quantiles of histograms and sample arrays.
2222
2223   Q = quantile(Y,q)
2224   Q = quantile(Y,q,DIM)
2225      returns the q-th quantile along dimension DIM of sample array Y.
2226      size(Q) is equal to size(Y) except for dimension DIM, where size(Q,DIM)=length(q)
2227
2228   Q = quantile(HIS,q)
2229      returns the q-th quantile from the histogram HIS.
2230      HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2231      If q is a vector, the each row of Q returns the q(i)-th quantile
2232
2233  see also: HISTO2, HISTO3, PERCENTILE
2234
2235
2236
2237 # name: <cell-element>
2238 # type: sq_string
2239 # elements: 1
2240 # length: 67
2241  QUANTILE calculates the quantiles of histograms and sample arrays.
2242
2243
2244
2245 # name: <cell-element>
2246 # type: sq_string
2247 # elements: 1
2248 # length: 5
2249 range
2250
2251
2252 # name: <cell-element>
2253 # type: sq_string
2254 # elements: 1
2255 # length: 371
2256  RANGE calculates the range of Y
2257   Missing values (encoded as NaN) are ignored.
2258
2259   Q = range(Y)
2260   Q = range(Y,DIM)
2261      returns the range along dimension DIM of sample array Y.
2262
2263   Q = range(HIS)
2264      returns the RANGE from the histogram HIS.
2265      HIS must be a HISTOGRAM struct as defined in HISTO2 or HISTO3.
2266
2267  see also: IQR, MAD, HISTO2, HISTO3, PERCENTILE, QUANTILE
2268
2269
2270
2271 # name: <cell-element>
2272 # type: sq_string
2273 # elements: 1
2274 # length: 80
2275  RANGE calculates the range of Y
2276   Missing values (encoded as NaN) are ignored.
2277
2278
2279
2280 # name: <cell-element>
2281 # type: sq_string
2282 # elements: 1
2283 # length: 8
2284 rankcorr
2285
2286
2287 # name: <cell-element>
2288 # type: sq_string
2289 # elements: 1
2290 # length: 668
2291  RANKCORR calculates the rank correlation coefficient.
2292  This function is replaced by CORRCOEF.
2293  Significance test and confidence intervals can be obtained from CORRCOEF, too.
2294
2295  R = CORRCOEF(X, [Y, ] 'Rank');
2296
2297  The rank correlation   r = corrcoef(ranks(x)).
2298  is often confused with Spearman's rank correlation.
2299  Spearman's correlation is defined as
2300    r(x,y) = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1))
2301  The results are different. Here, the former version is implemented.
2302
2303  see also: CORRCOEF, SPEARMAN, RANKS
2304
2305  REFERENCES:
2306  [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
2307  [2] http://mathworld.wolfram.com/CorrelationCoefficient.html
2308
2309
2310
2311 # name: <cell-element>
2312 # type: sq_string
2313 # elements: 1
2314 # length: 54
2315  RANKCORR calculates the rank correlation coefficient.
2316
2317
2318
2319 # name: <cell-element>
2320 # type: sq_string
2321 # elements: 1
2322 # length: 5
2323 ranks
2324
2325
2326 # name: <cell-element>
2327 # type: sq_string
2328 # elements: 1
2329 # length: 1065
2330  RANKS gives the rank of each element in a vector.
2331  This program uses an advanced algorithm with average effort O(m.n.log(n))
2332  NaN in the input yields NaN in the output.
2333
2334  r = ranks(X[,DIM])
2335    if X is a vector, return the vector of ranks of X adjusted for ties.
2336    if X is a matrix, the rank is calculated along dimension DIM.
2337    if DIM is zero or empty, the lowest dimension with more than 1 element is used.
2338  r = ranks(X,DIM,'traditional')
2339    implements the traditional algorithm with O(n^2) computational
2340    and O(n^2) memory effort
2341  r = ranks(X,DIM,'mtraditional')
2342    implements the traditional algorithm with O(n^2) computational
2343    and O(n) memory effort
2344  r = ranks(X,DIM,'advanced   ')
2345    implements an advanced algorithm with O(n*log(n)) computational
2346    and O(n.log(n)) memory effort
2347  r = ranks(X,DIM,'advanced-ties')
2348    implements an advanced algorithm with O(n*log(n)) computational
2349    and O(n.log(n)) memory effort
2350    but without correction for ties
2351    This is the fastest algorithm
2352
2353  see also: CORRCOEF, SPEARMAN, RANKCORR
2354
2355  REFERENCES:
2356  --
2357
2358
2359
2360 # name: <cell-element>
2361 # type: sq_string
2362 # elements: 1
2363 # length: 50
2364  RANKS gives the rank of each element in a vector.
2365
2366
2367
2368 # name: <cell-element>
2369 # type: sq_string
2370 # elements: 1
2371 # length: 3
2372 rms
2373
2374
2375 # name: <cell-element>
2376 # type: sq_string
2377 # elements: 1
2378 # length: 547
2379  RMS calculates the root mean square
2380    can deal with complex data.
2381
2382  y = rms(x,DIM,W)
2383
2384  DIM    dimension
2385         1 RMS of columns
2386         2 RMS of rows
2387         N RMS of  N-th dimension
2388         default or []: first DIMENSION, with more than 1 element
2389  W      weights to compute weighted RMS (default: [])
2390         if W=[], all weights are 1.
2391         number of elements in W must match size(x,DIM)
2392
2393  y      root mean square
2394
2395  features:
2396  - can deal with NaN's (missing values)
2397  - weighting of data
2398  - dimension argument also in Octave
2399  - compatible to Matlab and Octave
2400
2401  see also: SUMSKIPNAN, MEAN
2402
2403
2404
2405 # name: <cell-element>
2406 # type: sq_string
2407 # elements: 1
2408 # length: 67
2409  RMS calculates the root mean square
2410    can deal with complex data.
2411
2412
2413
2414 # name: <cell-element>
2415 # type: sq_string
2416 # elements: 1
2417 # length: 16
2418 row_col_deletion
2419
2420
2421 # name: <cell-element>
2422 # type: sq_string
2423 # elements: 1
2424 # length: 739
2425  ROW_COL_DELETION selects the rows and columns for removing any missing values.
2426     A heuristic based on maximizing the number of remaining sample values
2427     is used. In other words, if there are more rows than columns, it is
2428     more likely that a row-wise deletion will be applied and vice versa.
2429
2430     [rix,cix] = row_col_deletion(d)
2431     [rix,cix] = row_col_deletion(d,c,w)
2432
2433  Input:
2434     d        data (each row is a sample, each column a feature)
2435     c        classlabels (not really used) [OPTIONAL]
2436     w        weight for each sample vector [OPTIONAL]
2437  Output:
2438     rix      selected samples
2439     cix      selected columns
2440
2441    d(rix,cix) does not contain any NaN's i.e. missing values
2442
2443  see also: TRAIN_SC, TEST_SC
2444
2445
2446
2447 # name: <cell-element>
2448 # type: sq_string
2449 # elements: 1
2450 # length: 79
2451  ROW_COL_DELETION selects the rows and columns for removing any missing values.
2452
2453
2454
2455 # name: <cell-element>
2456 # type: sq_string
2457 # elements: 1
2458 # length: 3
2459 sem
2460
2461
2462 # name: <cell-element>
2463 # type: sq_string
2464 # elements: 1
2465 # length: 698
2466  SEM calculates the standard error of the mean
2467
2468  [SE,M] = SEM(x [, DIM [,W]])
2469    calculates the standard error (SE) in dimension DIM
2470    the default DIM is the first non-singleton dimension
2471    M returns the mean.
2472    Can deal with complex data, too.
2473
2474  DIM    dimension
2475         1: SEM of columns
2476         2: SEM of rows
2477         N: SEM of  N-th dimension
2478         default or []: first DIMENSION, with more than 1 element
2479  W      weights to compute weighted mean and s.d. (default: [])
2480         if W=[], all weights are 1.
2481         number of elements in W must match size(x,DIM)
2482
2483  features:
2484  - can deal with NaN's (missing values)
2485  - weighting of data
2486  - dimension argument
2487  - compatible to Matlab and Octave
2488
2489  see also: SUMSKIPNAN, MEAN, VAR, STD
2490
2491
2492
2493 # name: <cell-element>
2494 # type: sq_string
2495 # elements: 1
2496 # length: 80
2497  SEM calculates the standard error of the mean
2498
2499  [SE,M] = SEM(x [, DIM [,W]])
2500
2501
2502
2503
2504 # name: <cell-element>
2505 # type: sq_string
2506 # elements: 1
2507 # length: 8
2508 skewness
2509
2510
2511 # name: <cell-element>
2512 # type: sq_string
2513 # elements: 1
2514 # length: 405
2515  SKEWNESS estimates the skewness
2516
2517  y = skewness(x,DIM)
2518    calculates skewness of x in dimension DIM
2519
2520  DIM    dimension
2521         1: STATS of columns
2522         2: STATS of rows
2523         default or []: first DIMENSION, with more than 1 element
2524
2525  features:
2526  - can deal with NaN's (missing values)
2527  - dimension argument
2528  - compatible to Matlab and Octave
2529
2530  see also: SUMSKIPNAN, STATISTIC
2531
2532  REFERENCE(S):
2533  http://mathworld.wolfram.com/
2534
2535
2536
2537 # name: <cell-element>
2538 # type: sq_string
2539 # elements: 1
2540 # length: 34
2541  SKEWNESS estimates the skewness
2542
2543
2544
2545
2546 # name: <cell-element>
2547 # type: sq_string
2548 # elements: 1
2549 # length: 8
2550 spearman
2551
2552
2553 # name: <cell-element>
2554 # type: sq_string
2555 # elements: 1
2556 # length: 693
2557  SPEARMAN Spearman's rank correlation coefficient.
2558  This function is replaced by CORRCOEF.
2559  Significance test and confidence intervals can be obtained from CORRCOEF.
2560
2561  [R,p,ci1,ci2] = CORRCOEF(x, [y, ] 'Rank');
2562
2563  For some (unknown) reason, previous versions computed Spearman's rank correlation as
2564    r = corrcoef(ranks(x)).
2565  But according to [1], Spearman's correlation is defined as
2566    r = 1-6*sum((ranks(x)-ranks(y)).^2)/(N*(N*N-1))
2567  The results are different. Here, the latter version is implemented.
2568
2569  see also: CORRCOEF, RANKCORR
2570
2571  REFERENCES:
2572  [1] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
2573  [2] http://mathworld.wolfram.com/CorrelationCoefficient.html
2574
2575
2576
2577 # name: <cell-element>
2578 # type: sq_string
2579 # elements: 1
2580 # length: 50
2581  SPEARMAN Spearman's rank correlation coefficient.
2582
2583
2584
2585 # name: <cell-element>
2586 # type: sq_string
2587 # elements: 1
2588 # length: 9
2589 statistic
2590
2591
2592 # name: <cell-element>
2593 # type: sq_string
2594 # elements: 1
2595 # length: 939
2596  STATISTIC estimates various statistics at once.
2597
2598  R = STATISTIC(x,DIM)
2599    calculates all statistics (see list of fun) in dimension DIM
2600    R is a struct with all statistics
2601
2602  y = STATISTIC(x,fun)
2603    estimate of fun on dimension DIM
2604    y gives the statistic of fun
2605
2606  DIM    dimension
2607         1: STATS of columns
2608         2: STATS of rows
2609         N: STATS of  N-th dimension
2610         default or []: first DIMENSION, with more than 1 element
2611
2612  fun    'mean'  mean
2613         'std'   standard deviation
2614         'var'   variance
2615         'sem'   standard error of the mean
2616         'rms'   root mean square
2617         'meansq' mean of squares
2618         'sum'   sum
2619         'sumsq' sum of squares
2620         'CM#'   central moment of order #
2621         'skewness' skewness
2622         'kurtosis' excess coefficient (Fisher kurtosis)
2623         'mad'   mean absolute deviation
2624
2625  features:
2626  - can deal with NaN's (missing values)
2627  - dimension argument
2628  - compatible to Matlab and Octave
2629
2630  see also: SUMSKIPNAN
2631
2632  REFERENCE(S):
2633  [1] http://www.itl.nist.gov/
2634  [2] http://mathworld.wolfram.com/
2635
2636
2637
2638 # name: <cell-element>
2639 # type: sq_string
2640 # elements: 1
2641 # length: 48
2642  STATISTIC estimates various statistics at once.
2643
2644
2645
2646 # name: <cell-element>
2647 # type: sq_string
2648 # elements: 1
2649 # length: 3
2650 std
2651
2652
2653 # name: <cell-element>
2654 # type: sq_string
2655 # elements: 1
2656 # length: 983
2657  STD calculates the standard deviation.
2658
2659  [y,v] = std(x [, opt[, DIM [, W]]])
2660
2661  opt   option
2662         0:  normalizes with N-1 [default]
2663                 provides the square root of best unbiased estimator of the variance
2664         1:  normalizes with N,
2665                 this provides the square root of the second moment around the mean
2666         otherwise:
2667                best unbiased estimator of the standard deviation (see [1])
2668
2669  DIM    dimension
2670         N STD of  N-th dimension
2671         default or []: first DIMENSION, with more than 1 element
2672  W      weights to compute weighted s.d. (default: [])
2673         if W=[], all weights are 1.
2674         number of elements in W must match size(x,DIM)
2675
2676  y      estimated standard deviation
2677
2678  features:
2679  - provides an unbiased estimation of the S.D.
2680  - can deal with NaN's (missing values)
2681  - weighting of data
2682  - dimension argument also in Octave
2683  - compatible to Matlab and Octave
2684
2685  see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ,
2686
2687
2688  References(s):
2689  [1] http://mathworld.wolfram.com/StandardDeviationDistribution.html
2690
2691
2692
2693 # name: <cell-element>
2694 # type: sq_string
2695 # elements: 1
2696 # length: 39
2697  STD calculates the standard deviation.
2698
2699
2700
2701 # name: <cell-element>
2702 # type: sq_string
2703 # elements: 1
2704 # length: 10
2705 sumskipnan
2706
2707
2708 # name: <cell-element>
2709 # type: sq_string
2710 # elements: 1
2711 # length: 1237
2712  SUMSKIPNAN adds all non-NaN values.
2713
2714  All NaN's are skipped; NaN's are considered as missing values.
2715  SUMSKIPNAN of NaN's only  gives 0; and the number of valid elements is returned.
2716  SUMSKIPNAN is also the elementary function for calculating
2717  various statistics (e.g. MEAN, STD, VAR, RMS, MEANSQ, SKEWNESS,
2718  KURTOSIS, MOMENT, STATISTIC etc.) from data with missing values.
2719  SUMSKIPNAN implements the DIMENSION-argument for data with missing values.
2720  Also the second output argument returns the number of valid elements (not NaNs)
2721
2722  Y = sumskipnan(x [,DIM])
2723  [Y,N,SSQ] = sumskipnan(x [,DIM])
2724  [...] = sumskipnan(x, DIM, W)
2725
2726  x      input data
2727  DIM    dimension (default: [])
2728         empty DIM sets DIM to first non singleton dimension
2729  W      weight vector for weighted sum, numel(W) must fit size(x,DIM)
2730  Y      resulting sum
2731  N      number of valid (not missing) elements
2732  SSQ    sum of squares
2733
2734  the function FLAG_NANS_OCCURED() returns whether any value in x
2735   is a not-a-number (NaN)
2736
2737  features:
2738  - can deal with NaN's (missing values)
2739  - implements dimension argument.
2740  - computes weighted sum
2741  - compatible with Matlab and Octave
2742
2743  see also: FLAG_NANS_OCCURED, SUM, NANSUM, MEAN, STD, VAR, RMS, MEANSQ,
2744       SSQ, MOMENT, SKEWNESS, KURTOSIS, SEM
2745
2746
2747
2748 # name: <cell-element>
2749 # type: sq_string
2750 # elements: 1
2751 # length: 36
2752  SUMSKIPNAN adds all non-NaN values.
2753
2754
2755
2756 # name: <cell-element>
2757 # type: sq_string
2758 # elements: 1
2759 # length: 5
2760 sumsq
2761
2762
2763 # name: <cell-element>
2764 # type: sq_string
2765 # elements: 1
2766 # length: 379
2767  SUMSQ calculates the sum of squares.
2768
2769  [y] = sumsq(x [,  DIM])
2770
2771  DIM    dimension
2772         N SUMSQ of  N-th dimension
2773         default or []: first DIMENSION, with more than 1 element
2774
2775  y      sum of squares
2776
2777  features:
2778  - can deal with NaN's (missing values)
2779  - dimension argument also in Octave
2780  - compatible to Matlab and Octave
2781
2782  see also: RMS, SUMSKIPNAN, MEAN, VAR, MEANSQ,
2783
2784
2785  References(s):
2786
2787
2788
2789 # name: <cell-element>
2790 # type: sq_string
2791 # elements: 1
2792 # length: 37
2793  SUMSQ calculates the sum of squares.
2794
2795
2796
2797 # name: <cell-element>
2798 # type: sq_string
2799 # elements: 1
2800 # length: 4
2801 tcdf
2802
2803
2804 # name: <cell-element>
2805 # type: sq_string
2806 # elements: 1
2807 # length: 254
2808  TCDF returns student cumulative distribution function
2809
2810  cdf = tcdf(x,DF);
2811
2812  Computes the CDF of the student distribution
2813     with DF degrees of freedom
2814  x,DF must be matrices of same size, or any one can be a scalar.
2815
2816  see also: NORMCDF, TPDF, TINV
2817
2818
2819
2820 # name: <cell-element>
2821 # type: sq_string
2822 # elements: 1
2823 # length: 55
2824  TCDF returns student cumulative distribution function
2825
2826
2827
2828
2829 # name: <cell-element>
2830 # type: sq_string
2831 # elements: 1
2832 # length: 7
2833 test_sc
2834
2835
2836 # name: <cell-element>
2837 # type: sq_string
2838 # elements: 1
2839 # length: 1437
2840  TEST_SC: apply statistical and SVM classifier to test data
2841
2842   R = test_sc(CC,D,TYPE [,target_Classlabel])
2843        R.output         output: "signed" distance for each class.
2844                 This represents the distances between sample D and the separating hyperplane
2845                 The "signed distance" is positive if it matches the target class, and
2846                 negative if it lies on the opposite side of the separating hyperplane.
2847        R.classlabel     class for output data
2848   The target class is optional. If it is provided, the following values are returned.
2849        R.kappa  Cohen's kappa coefficient
2850        R.ACC    Classification accuracy
2851        R.H      Confusion matrix
2852
2853  The classifier CC is typically obtained by TRAIN_SC. If a statistical
2854  classifier is used, TYPE can be used to modify the classifier.
2855     TYPE = 'MDA'    mahalanobis distance based classifier
2856     TYPE = 'MD2'    mahalanobis distance based classifier
2857     TYPE = 'MD3'    mahalanobis distance based classifier
2858     TYPE = 'GRB'    Gaussian radial basis function
2859     TYPE = 'QDA'    quadratic discriminant analysis
2860     TYPE = 'LD2'    linear discriminant analysis
2861     TYPE = 'LD3', 'LDA', 'FDA', 'FLDA'   (Fisher's) linear discriminant analysis
2862     TYPE = 'LD4'    linear discriminant analysis
2863     TYPE = 'GDBC'   general distance based classifier
2864
2865  see also: TRAIN_SC
2866
2867  References:
2868  [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
2869        John Wiley & Sons, 2001.
2870
2871
2872
2873 # name: <cell-element>
2874 # type: sq_string
2875 # elements: 1
2876 # length: 61
2877  TEST_SC: apply statistical and SVM classifier to test data
2878
2879
2880
2881
2882 # name: <cell-element>
2883 # type: sq_string
2884 # elements: 1
2885 # length: 8
2886 tiedrank
2887
2888
2889 # name: <cell-element>
2890 # type: sq_string
2891 # elements: 1
2892 # length: 273
2893  TIEDRANK computes rank of samples, the mean value is used in case of ties
2894   this function is just a wrapper for RANKS, and provided for compatibility
2895   with the statistics toolbox of matlab(tm)
2896
2897     R = tiedrank(X)
2898         computes the rank R of vector X
2899
2900  see also: RANKS
2901
2902
2903
2904 # name: <cell-element>
2905 # type: sq_string
2906 # elements: 1
2907 # length: 80
2908  TIEDRANK compute rank of samples, the mean value is used in case of ties
2909   this
2910
2911
2912
2913 # name: <cell-element>
2914 # type: sq_string
2915 # elements: 1
2916 # length: 4
2917 tinv
2918
2919
2920 # name: <cell-element>
2921 # type: sq_string
2922 # elements: 1
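2923 # length: 306
2924  TINV returns inverse cumulative function of the student distribution
2925
2926  x = tinv(p,v);
2927
2928  Computes the quantile (inverse of the CDF) of the student
2929     distribution with v degrees of freedom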
2928  Computes the quantile (inverse of the CDF) of a the student
2929     cumulative distribution with mean m and standard deviation s
2930  p,v must be matrices of same size, or any one can be a scalar.
2931
2932  see also: TPDF, TCDF, NORMPDF, NORMCDF, NORMINV
2933
2934
2935
2936 # name: <cell-element>
2937 # type: sq_string
2938 # elements: 1
2939 # length: 70
2940  TINV returns inverse cumulative function of the student distribution
2941
2942
2943
2944
2945 # name: <cell-element>
2946 # type: sq_string
2947 # elements: 1
2948 # length: 4
2949 tpdf
2950
2951
2952 # name: <cell-element>
2953 # type: sq_string
2954 # elements: 1
2955 # length: 259
2956  TPDF returns student probability density
2957
2958  pdf = tpdf(x,DF);
2959
2960  Computes the PDF of the student distribution
2961     with DF degrees of freedom
2962  x,DF must be matrices of same size, or any one can be a scalar.
2963
2964  see also: TINV, TCDF, NORMPDF, NORMCDF, NORMINV
2965
2966
2967
2968 # name: <cell-element>
2969 # type: sq_string
2970 # elements: 1
2971 # length: 43
2972  TPDF returns student probability density
2973
2974
2975
2976
2977 # name: <cell-element>
2978 # type: sq_string
2979 # elements: 1
2980 # length: 16
2981 train_lda_sparse
2982
2983
2984 # name: <cell-element>
2985 # type: sq_string
2986 # elements: 1
2987 # length: 1689
2988  Linear Discriminant Analysis for the Small Sample Size Problem as described in
2989  Algorithm 1 of J. Duintjer Tebbens, P. Schlesinger: 'Improving
2990  Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size
2991  Problem', Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 423-437, 2007.
2992  Input:
2993                X                 ......       (sparse) training data matrix
2994                G                 ......       group coding matrix of the training data
2995                test              ......       (sparse) test data matrix
2996                Gtest             ......       group coding matrix of the test data
2997                par               ......       if par = 0 then classification exploits sparsity too
2998                tol               ......       tolerance to distinguish zero eigenvalues
2999  Output:
3000                err               ......       Wrong classification rate (in %)
3001                trafo             ......       LDA transformation vectors
3002
3003  Reference(s):
3004  J. Duintjer Tebbens, P. Schlesinger: 'Improving
3005  Implementation of Linear Discriminant Analysis for the High Dimension/Small Sample Size
3006  Problem', Computational Statistics and Data Analysis, vol. 52, no. 1,
3007  pp. 423-437, 2007.
3008
3009  Copyright (C) by J. Duintjer Tebbens, Institute of Computer Science of the Academy of Sciences of the Czech Republic,
3010  Pod Vodarenskou vezi 2, 182 07 Praha 8 Liben, 18.July.2006.
3011  This work was supported by the Program Information Society under project
3012  1ET400300415.
3013
3014
3015  Modified for the use with Matlab6.5 by A. Schloegl, 22.Aug.2006
3016
3017         $Id$
3018        This function is part of the NaN-toolbox
3019        http://pub.ist.ac.at/~schloegl/matlab/NaN/
3020
3021
3022
3023 # name: <cell-element>
3024 # type: sq_string
3025 # elements: 1
3026 # length: 80
3027  Linear Discriminant Analysis for the Small Sample Size Problem as described in
3028
3029
3030
3031
3032 # name: <cell-element>
3033 # type: sq_string
3034 # elements: 1
3035 # length: 8
3036 train_sc
3037
3038
3039 # name: <cell-element>
3040 # type: sq_string
3041 # elements: 1
3042 # length: 7655
3043  Train a (statistical) classifier
3044
3045   CC = train_sc(D,classlabel)
3046   CC = train_sc(D,classlabel,MODE)
3047   CC = train_sc(D,classlabel,MODE, W)
3048         weighting D(k,:) with weight W(k) (not all classifiers supported weighting)
3049
3050  CC contains the model parameters of a classifier which can be applied
3051    to test data using test_sc.
3052    R = test_sc(CC,D,...)
3053
3054    D            training samples (each row is a sample, each column is a feature)
3055    classlabel   labels of each sample, must have the same number of rows as D.
3056                 Two different encodings are supported:
3057                 {-1,1}-encoding (multiple classes with separate columns for each class) or
3058                 1..M encoding.
3059                 So [1;2;3;1;4] is equivalent to
3060                         [+1,-1,-1,-1;
3061                          -1,+1,-1,-1;
3062                          -1,-1,+1,-1;
3063                          +1,-1,-1,-1;
3064                          -1,-1,-1,+1]
3065                 Note, samples with classlabel=0 are ignored.
3066
3067   The following classifier types are supported MODE.TYPE
3068     'MDA'      mahalanobis distance based classifier [1]
3069     'MD2'      mahalanobis distance based classifier [1]
3070     'MD3'      mahalanobis distance based classifier [1]
3071     'GRB'      Gaussian radial basis function     [1]
3072     'QDA'      quadratic discriminant analysis    [1]
3073     'LD2'      linear discriminant analysis (see LDBC2) [1]
3074                 MODE.hyperparameter.gamma: regularization parameter [default 0]
3075     'LD3', 'FDA', 'LDA', 'FLDA'
3076                linear discriminant analysis (see LDBC3) [1]
3077                 MODE.hyperparameter.gamma: regularization parameter [default 0]
3078     'LD4'      linear discriminant analysis (see LDBC4) [1]
3079                 MODE.hyperparameter.gamma: regularization parameter [default 0]
3080     'LD5'      another LDA (motivated by CSP)
3081                 MODE.hyperparameter.gamma: regularization parameter [default 0]
3082     'RDA'      regularized discriminant analysis [7]
3083                 MODE.hyperparameter.gamma: regularization parameter
3084                 MODE.hyperparameter.lambda =
3085                 gamma = 0, lambda = 0 : MDA
3086                 gamma = 0, lambda = 1 : LDA [default]
3087                 Hint: hyperparameter are used only in test_sc.m, testing different
3088                 the hyperparameters do not need repetitive calls to train_sc,
3089                 it is sufficient to modify CC.hyperparameter before calling test_sc.
3090     'GDBC'     general distance based classifier  [1]
3091     ''         statistical classifier, requires Mode argument in TEST_SC
3092     '###/DELETION'  if the data contains missing values (encoded as NaNs),
3093                 a row-wise or column-wise deletion (depending on which method
3094                 removes less data values) is applied;
3095     '###/GSVD'  GSVD and statistical classifier [2,3],
3096     '###/sparse'  sparse  [5]
3097                 '###' must be 'LDA' or any other classifier
3098     'PLS'       (linear) partial least squares regression
3099     'REG'      regression analysis;
3100     'WienerHopf'        Wiener-Hopf equation
3101     'NBC'       Naive Bayesian Classifier [6]
3102     'aNBC'      Augmented Naive Bayesian Classifier [6]
3103     'NBPW'      Naive Bayesian Parzen Window [9]
3104
3105     'PLA'       Perceptron Learning Algorithm [11]
3106                 MODE.hyperparameter.alpha = alpha [default: 1]
3107                  w = w + alpha * e'*x
3108     'LMS', 'AdaLine'  Least mean squares, adaptive line element, Widrow-Hoff, delta rule
3109                 MODE.hyperparameter.alpha = alpha [default: 1]
3110     'Winnow2'  Winnow2 algorithm [12]
3111
3112     'PSVM'      Proximal SVM [8]
3113                 MODE.hyperparameter.nu  (default: 1.0)
3114     'LPM'      Linear Programming Machine
3115                  uses and requires train_LPM of the iLog CPLEX optimizer
3116                 MODE.hyperparameter.c_value =
3117     'CSP'       CommonSpatialPattern is very experimental and just a hack
3118                 uses a smoothing window of 50 samples.
3119     'SVM','SVM1r'  support vector machines, one-vs-rest
3120                 MODE.hyperparameter.c_value =
3121     'SVM11'    support vector machines, one-vs-one + voting
3122                 MODE.hyperparameter.c_value =
3123     'RBF'      Support Vector Machines with RBF Kernel
3124                 MODE.hyperparameter.c_value =
3125                 MODE.hyperparameter.gamma =
3126     'SVM:LIB'    libSVM (default SVM algorithm)
3127     'SVM:bioinfo' uses and requires svmtrain from the bioinfo toolbox
3128     'SVM:OSU'   uses and requires mexSVMTrain from the OSU-SVM toolbox
3129     'SVM:LOO'   uses and requires svcm_train from the LOO-SVM toolbox
3130     'SVM:Gunn'  uses and requires svc-functios from the Gunn-SVM toolbox
3131     'SVM:KM'    uses and requires svmclass-function from the KM-SVM toolbox
3132     'SVM:LINz'  LibLinear [10] (requires train.mex from LibLinear somewhere in the path)
3133             z=0 (default) LibLinear with -- L2-regularized logistic regression
3134             z=1 LibLinear with -- L2-loss support vector machines (dual)
3135             z=2 LibLinear with -- L2-loss support vector machines (primal)
3136             z=3 LibLinear with -- L1-loss support vector machines (dual)
3137     'SVM:LIN4'  LibLinear with -- multi-class support vector machines by Crammer and Singer
3138     'DT'        decision tree - not implemented yet.
3139
3140  {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf','LDA/GSVD','MDA/GSVD', 'LDA/sparse','MDA/sparse', 'PLA', 'LMS','LDA/DELETION','MDA/DELETION','NBC/DELETION','RDA/DELETION','REG/DELETION','RDA','GDBC','SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW', 'DT'};
3141
3142  CC contains the model parameters of a classifier. Some time ago,
3143  CC was a statistical classifier containing the mean
3144  and the covariance of the data of each class (encoded in the
3145   so-called "extended covariance matrices". Nowadays, also other
3146  classifiers are supported.
3147
3148  see also: TEST_SC, COVM, ROW_COL_DELETION
3149
3150  References:
3151  [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
3152        John Wiley & Sons, 2001.
3153  [2] Peg Howland and Haesun Park,
3154        Generalizing Discriminant Analysis Using the Generalized Singular Value Decomposition
3155        IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 2004.
3156        dx.doi.org/10.1109/TPAMI.2004.46
3157  [3] http://www-static.cc.gatech.edu/~kihwan23/face_recog_gsvd.htm
3158  [4] Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park
3159        A new optimization criterion for generalized discriminant analysis on undersampled problems.
3160        The Third IEEE International Conference on Data Mining, Melbourne, Florida, USA
3161        November 19 - 22, 2003
3162  [5] J.D. Tebbens and P. Schlesinger (2006),
3163        Improving Implementation of Linear Discriminant Analysis for the Small Sample Size Problem
3164         Computational Statistics & Data Analysis, vol 52(1): 423-437, 2007
3165        http://www.cs.cas.cz/mweb/download/publi/JdtSchl2006.pdf
3166  [6] H. Zhang, The optimality of Naive Bayes,
3167          http://www.cs.unb.ca/profs/hzhang/publications/FLAIRS04ZhangH.pdf
3168  [7] J.H. Friedman. Regularized discriminant analysis.
3169         Journal of the American Statistical Association, 84:165–175, 1989.
3170  [8] G. Fung and O.L. Mangasarian, Proximal Support Vector Machine Classifiers, KDD 2001.
3171         Eds. F. Provost and R. Srikant, Proc. KDD-2001: Knowledge Discovery and Data Mining, August 26-29, 2001, San Francisco, CA.
3172         p. 77-86.
3173  [9] Kai Keng Ang, Zhang Yang Chin, Haihong Zhang, Cuntai Guan.
3174         Filter Bank Common Spatial Pattern (FBCSP) in Brain-Computer Interface.
3175         IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence).
3176         1-8 June 2008 Page(s):2390 - 2397
3177  [10] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin.
3178        LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874.
3179        Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear
3180  [11] http://en.wikipedia.org/wiki/Perceptron#Learning_algorithm
3181  [12] Littlestone, N. (1988)
3182        "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm"
3183        Machine Learning 285-318(2)
3184         http://en.wikipedia.org/wiki/Winnow_(algorithm)
3185
3186
3187
3188 # name: <cell-element>
3189 # type: sq_string
3190 # elements: 1
3191 # length: 80
3192  Train a (statistical) classifier
3193
3194   CC = train_sc(D,classlabel)
3195   CC = train_s
3196
3197
3198
3199 # name: <cell-element>
3200 # type: sq_string
3201 # elements: 1
3202 # length: 7
3203 trimean
3204
3205
3206 # name: <cell-element>
3207 # type: sq_string
3208 # elements: 1
3209 # length: 266
3210  TRIMEAN yields the weighted mean of the median and the quartiles
3211     m = TRIMEAN(y).
3212
3213  The trimean is  m = (Q1+2*MED+Q3)/4
3214     with quartile Q1 and Q3 and median MED
3215
3216  N-dimensional data is supported
3217
3218  REFERENCES:
3219  [1] http://mathworld.wolfram.com/Trimean.html
3220
3221
3222
3223 # name: <cell-element>
3224 # type: sq_string
3225 # elements: 1
3226 # length: 80
3227  TRIMEAN yields the weighted mean of the median and the quartiles
3228     m = TRIMEA
3229
3230
3231
3232 # name: <cell-element>
3233 # type: sq_string
3234 # elements: 1
3235 # length: 8
3236 trimmean
3237
3238
3239 # name: <cell-element>
3240 # type: sq_string
3241 # elements: 1
3242 # length: 664
3243  TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and
3244  p/2 lower samples. Missing values (encoded as NaN) are ignored and not taken into account.
3245  The same number of upper and of lower values is removed, which is compatible with various
3246  spreadsheet programs including GNumeric [1], LibreOffice, OpenOffice and MS Excel.
3247
3248   Q = trimmean(Y,p)
3249   Q = trimmean(Y,p,DIM)
3250      returns the TRIMMEAN along dimension DIM of sample array Y.
3251   If p is a vector, the TRIMMEAN for each p is computed.
3252
3253  see also: MAD, RANGE, HISTO2, HISTO3, PERCENTILE, QUANTILE
3254
3255  References:
3256  [1] http://www.fifi.org/doc/gnumeric-doc/html/C/gnumeric-trimmean.html
3257
3258
3259
3260 # name: <cell-element>
3261 # type: sq_string
3262 # elements: 1
3263 # length: 80
3264  TRIMMEAN calculates the trimmed mean by removing the fraction of p/2 upper and
3265
3266
3267
3268 # name: <cell-element>
3269 # type: sq_string
3270 # elements: 1
3271 # length: 5
3272 ttest
3273
3274
3275 # name: <cell-element>
3276 # type: sq_string
3277 # elements: 1
3278 # length: 1474
3279  TTEST (paired) t-test
3280      For a sample X from a normal distribution with unknown mean and
3281      variance, perform a t-test of the null hypothesis `mean (X) == M'.
3282      Under the null, the test statistic T follows a Student
3283      distribution with `DF = length (X) - 1' degrees of freedom.
3284
3285      TTEST treats NaNs as "Missing values" and ignores these.
3286
3287  H = ttest(x,m)
3288         tests Null-hypothesis that mean of x is m.
3289  H = ttest(x,y)
3290         size of x and size of y must match, it is tested whether the
3291         difference x-y is significantly different to m=0;
3292  H = ttest(x,y,alpha)
3293  H = ttest(x,y,alpha,tail)
3294  H = ttest(x,y,alpha,tail,DIM)
3295  [H,PVAL] = ttest(...)
3296
3297      H=1 indicates a rejection of the Null-hypothesis at a significance
3298      level of alpha (default alpha = 0.05).
3299
3300      With the optional argument string TAIL, the alternative of interest
3301      can be selected.  If TAIL is '!=' or '<>' or 'both', the null is tested
3302      against the two-sided Alternative `mean (X) ~= mean (Y)'.  If TAIL
3303      is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used.
3304      Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean
3305      (Y)' is used.  The default is the two-sided case.
3306
3307      H returns whether the Null-hypothesis must be rejected.
3308      The p-value of the test is returned in PVAL.
3309
3310      TTEST works on the first non-singleton dimension or on DIM.
3311
3312      If no output argument is given, the p-value of the test is
3313      displayed.
3314
3315
3316
3317
3318 # name: <cell-element>
3319 # type: sq_string
3320 # elements: 1
3321 # length: 80
3322  TTEST (paired) t-test
3323      For a sample X from a normal distribution with unkno
3324
3325
3326
3327 # name: <cell-element>
3328 # type: sq_string
3329 # elements: 1
3330 # length: 6
3331 ttest2
3332
3333
3334 # name: <cell-element>
3335 # type: sq_string
3336 # elements: 1
3337 # length: 1514
3338  TTEST2 (unpaired) t-test
3339      For two samples x and y from normal distributions with unknown
3340      means and unknown equal variances, perform a two-sample t-test of
3341      the null hypothesis of equal means.  Under the null, the test
3342      statistic T follows a Student distribution with DF degrees of
3343      freedom.
3344
3345      TTEST2 treats NaNs as "Missing values" and ignores these.
3346
3347  H = ttest2(x,y)
3348  H = ttest2([x;y],C,W)
3349  H = ttest2(x,y,alpha)
3350  H = ttest2(x,y,alpha,tail)
3351  H = ttest2(x,y,alpha,tail,vartype)
3352  H = ttest2(x,y,alpha,tail,vartype,DIM)
3353  [H,PVAL] = ttest2(...)
3354  [h,p,ci,stats] = ttest2(...)
3355
3356      H=1 indicates a rejection of the Null-hypothesis at a significance
3357      level of alpha (default alpha = 0.05).
3358
3359      With the optional argument string TAIL, the Alternative of interest
3360      can be selected.  If TAIL is '!=' or '<>' or 'both', the null is tested
3361      against the two-sided Alternative `mean (X) ~= mean (Y)'.  If TAIL
3362      is '>' or 'right', the one-sided Alternative `mean (X) > mean (Y)' is used.
3363      Similarly for '<' or 'left', the one-sided Alternative `mean (X) < mean
3364      (Y)' is used.  The default is the two-sided case.
3365
3366      vartype accepts only 'equal' (default value); the value 'unequal' is not supported.
3367
3368      H returns whether the Null-hypothesis must be rejected.
3369      The p-value of the test is returned in PVAL.
3370
3371      TTEST2 works on the first non-singleton dimension or on DIM.
3372
3373      If no output argument is given, the p-value of the test is
3374      displayed.
3375
3376
3377
3378
3379 # name: <cell-element>
3380 # type: sq_string
3381 # elements: 1
3382 # length: 80
3383  TTEST2 (unpaired) t-test
3384      For two samples x and y from normal distributions
3385
3386
3387
3388 # name: <cell-element>
3389 # type: sq_string
3390 # elements: 1
3391 # length: 3
3392 var
3393
3394
3395 # name: <cell-element>
3396 # type: sq_string
3397 # elements: 1
3398 # length: 772
3399  VAR calculates the variance.
3400
3401  y = var(x [, opt[, DIM]])
3402    calculates the variance in dimension DIM
3403    the default DIM is the first non-single dimension
3404
3405  opt   0: normalizes with N-1 [default]
3406         1: normalizes with N
3407  DIM    dimension
3408         1: VAR of columns
3409         2: VAR of rows
3410         N: VAR of  N-th dimension
3411         default or []: first DIMENSION, with more than 1 element
3412  W      weights to compute weighted variance (default: [])
3413         if W=[], all weights are 1.
3414         number of elements in W must match size(x,DIM)
3415
3416  usage:
3417         var(x)
3418         var(x, opt, DIM)
3419         var(x, [], DIM)
3420         var(x, W, DIM)
3421         var(x, opt, DIM, W)
3422
3423  features:
3424  - can deal with NaN's (missing values)
3425  - weighting of data
3426  - dimension argument
3427  - compatible to Matlab and Octave
3428
3429  see also: MEANSQ, SUMSQ, SUMSKIPNAN, MEAN, RMS, STD,
3430
3431
3432
3433 # name: <cell-element>
3434 # type: sq_string
3435 # elements: 1
3436 # length: 29
3437  VAR calculates the variance.
3438
3439
3440
3441 # name: <cell-element>
3442 # type: sq_string
3443 # elements: 1
3444 # length: 5
3445 xcovf
3446
3447
3448 # name: <cell-element>
3449 # type: sq_string
3450 # elements: 1
3451 # length: 1059
3452  XCOVF generates cross-covariance function.
3453  XCOVF is the same as XCORR except
3454    X and Y can contain missing values encoded with NaN.
3455    NaN's are skipped, NaN do not result in a NaN output.
3456    The output gives NaN only if there are insufficient input data
3457
3458  [C,N,LAGS] = xcovf(X,MAXLAG,SCALEOPT);
3459       calculates the (auto-)correlation function of X
3460  [C,N,LAGS] = xcovf(X,Y,MAXLAG,SCALEOPT);
3461       calculates the crosscorrelation function between X and Y
3462
3463   SCALEOPT   [character string] specifies the type of scaling applied
3464           to the correlation vector (or matrix). is one of:
3465     'none'      return the unscaled correlation, R,
3466     'biased'    return the biased average, R/N,
3467     'unbiased'  return the unbiased average, R(k)/(N-|k|),
3468     'coeff'     return the correlation coefficient, R/(rms(x).rms(y)),
3469           where "k" is the lag, and "N" is the length of X.
3470           If omitted, the default value is "none".
3471           If Y is supplied but does not have the same length as X,
3472           scale must be "none".
3473
3474
3475  see also: COVM, XCORR
3476
3477
3478
3479 # name: <cell-element>
3480 # type: sq_string
3481 # elements: 1
3482 # length: 43
3483  XCOVF generates cross-covariance function.
3484
3485
3486
3487 # name: <cell-element>
3488 # type: sq_string
3489 # elements: 1
3490 # length: 7
3491 xptopen
3492
3493
3494 # name: <cell-element>
3495 # type: sq_string
3496 # elements: 1
3497 # length: 723
3498  XPTOPEN read of several file formats and writing of the SAS Transport Format (*.xpt)
3499    Supported are ARFF, SAS-XPT and STATA files.
3500    XPTOPEN is a mex-file and must be compiled before use.
3501    More detailed help can be obtained by the command
3502      xptopen
3503    without an additional argument
3504
3505      X = xptopen(filename)
3506      X = xptopen(filename,'r')
3507    read file with filename and return variables in struct X
3508
3509    X = xptopen(filename,'w',X)
3510         save fields of struct X in filename.
3511
3512    The fields of X must be column vectors of equal length.
3513    Each vector is either a numeric vector or a cell array of strings.
3514    The SAS-XPT format stores Date/Time as numeric value counting the number of days since 1960-01-01.
3515
3516
3517
3518 # name: <cell-element>
3519 # type: sq_string
3520 # elements: 1
3521 # length: 80
3522  XPTOPEN read of several file formats and writing of the SAS Transport Format (*
3523
3524
3525
3526 # name: <cell-element>
3527 # type: sq_string
3528 # elements: 1
3529 # length: 4
3530 xval
3531
3532
3533 # name: <cell-element>
3534 # type: sq_string
3535 # elements: 1
3536 # length: 2980
3537  XVAL is used for crossvalidation
3538
3539   [R,CC] = xval(D,classlabel)
3540   .. = xval(D,classlabel,CLASSIFIER)
3541   .. = xval(D,classlabel,CLASSIFIER,type)
3542   .. = xval(D,{classlabel,W},CLASSIFIER)
3543   .. = xval(D,{classlabel,W,NG},CLASSIFIER)
3544
3545   example:
3546       load_fisheriris;    %builtin iris dataset
3547       C = species;
3548       K = 5; NG = [1:length(C)]'*K/length(C);
3549       [R,CC] = xval(meas,{C,[],NG},'NBC');
3550
3551  Input:
3552     D:  data features (one feature per column, one sample per row)
3553     classlabel  labels of each sample, must have the same number of rows as D.
3554                 Two different encodings are supported:
3555                 {-1,1}-encoding (multiple classes with separate columns for each class) or
3556                 1..M encoding.
3557                 So [1;2;3;1;4] is equivalent to
3558                         [+1,-1,-1,-1;
3559                          -1,+1,-1,-1;
3560                          -1,-1,+1,-1;
3561                          +1,-1,-1,-1;
3562                          -1,-1,-1,+1]
3563                 Note, samples with classlabel=0 are ignored.
3564
3565     CLASSIFIER can be any classifier supported by train_sc (default='LDA')
3566        {'REG','MDA','MD2','QDA','QDA2','LD2','LD3','LD4','LD5','LD6','NBC','aNBC','WienerHopf', 'RDA','GDBC',
3567          'SVM','RBF','PSVM','SVM11','SVM:LIN4','SVM:LIN0','SVM:LIN1','SVM:LIN2','SVM:LIN3','WINNOW'}
3568        these can be modified by ###/GSVD, ###/sparse and ###/DELETION.
3569            /DELETION removes in case of NaN's either the rows or the columns (which removes less data values) with any NaN
3570            /sparse and /GSVD preprocess the data to reduce it to some lower-dimensional space.
3571        Hyperparameters (like alpha for PLA, gamma/lambda for RDA, c_value for SVM, etc) can be defined as
3572         CLASSIFIER.hyperparameter.alpha, etc. and
3573         CLASSIFIER.TYPE = 'PLA' (as listed above).
3574        See train_sc for details.
3575     W:  weights for each sample (row) in D.
3576         default: [] (i.e. all weights are 1)
3577         number of elements in W must match the number of rows of D
3578     NG: used to define the type of cross-validation
3579         Leave-One-Out-Method (LOOM): NG = [1:length(classlabel)]' (default)
3580         Leave-K-Out-Method: NG = ceil([1:length(classlabel)]'/K)
3581         K-fold XV:  NG = ceil([1:length(classlabel)]'*K/length(classlabel))
3582         group-wise XV (if samples are not independent) can be also defined here
3583         samples from the same group (dependent samples) get the same identifier
3584         samples from different groups get different identifiers
3585     TYPE:  defines the type of cross-validation procedure if NG is not specified
3586         'LOOM'  leave-one-out-method
3587        k        k-fold crossvalidation
3588
3589  OUTPUT:
3590     R contains the resulting performance metric
3591     CC contains the classifier
3592
3593     plota(R) shows the confusion matrix of the results
3594
3595  see also: TRAIN_SC, TEST_SC, CLASSIFY, PLOTA
3596
3597  References:
3598  [1] R. Duda, P. Hart, and D. Stork, Pattern Classification, second ed.
3599        John Wiley & Sons, 2001.
3600  [2] A. Schlögl, J. Kronegg, J.E. Huggins, S. G. Mason;
3601        Evaluation criteria in BCI research.
3602        (Eds.) G. Dornhege, J.R. Millan, T. Hinterberger, D.J. McFarland, K.-R.Müller;
3603        Towards Brain-Computer Interfacing, MIT Press, 2007, p.327-342
3604
3605
3606
3607 # name: <cell-element>
3608 # type: sq_string
3609 # elements: 1
3610 # length: 35
3611  XVAL is used for crossvalidation
3612
3613
3614
3615
3616 # name: <cell-element>
3617 # type: sq_string
3618 # elements: 1
3619 # length: 12
3620 zScoreMedian
3621
3622
3623 # name: <cell-element>
3624 # type: sq_string
3625 # elements: 1
3626 # length: 326
3627  zScoreMedian removes the median and standardizes by the 1.483*median absolute deviation
3628
3629  Usage:  Z = zScoreMedian(X, DIM)
3630  Input:  X  : data
3631          DIM: dimension along which z-score should be calculated (1=columns, 2=rows)
3632               (optional, default=first dimension with 2 or more elements)
3633  Output: Z  : z-scores
3634
3635
3636
3637 # name: <cell-element>
3638 # type: sq_string
3639 # elements: 1
3640 # length: 59
3641  zScoreMedian removes the median and standardizes by the 1.
3642
3643
3644
3645 # name: <cell-element>
3646 # type: sq_string
3647 # elements: 1
3648 # length: 6
3649 zscore
3650
3651
3652 # name: <cell-element>
3653 # type: sq_string
3654 # elements: 1
3655 # length: 622
3656  ZSCORE removes the mean and normalizes the data
3657  to a variance of 1. Can be used for Pre-Whitening of the data, too.
3658
3659  [z,r,m] = zscore(x,DIM)
3660    z   z-score of x along dimension DIM
3661    r   is the inverse of the standard deviation
3662    m   is the mean of x
3663
3664  The data x can be reconstructed via
3665      x = z*diag(1./r) + repmat(m,size(z)./size(m))
3666      z = x*diag(r) - repmat(m.*r,size(z)./size(m))
3667
3668  DIM    dimension
3669         1: STATS of columns
3670         2: STATS of rows
3671         default or []: first DIMENSION, with more than 1 element
3672
3673  see also: SUMSKIPNAN, MEAN, STD, DETREND
3674
3675  REFERENCE(S):
3676  [1] http://mathworld.wolfram.com/z-Score.html
3677
3678
3679
3680 # name: <cell-element>
3681 # type: sq_string
3682 # elements: 1
3683 # length: 70
3684  ZSCORE removes the mean and normalizes the data
3685  to a variance of 1.
3686
3687
3688
3689
3690