1 # Created by Octave 3.6.1, Sun May 13 12:55:35 2012 UTC <root@brouzouf>
13 # name: <cell-element>
17 -- Function File: P = anderson_darling_cdf (A, N)
18 Return the CDF for the given Anderson-Darling coefficient A
19 computed from N values sampled from a distribution. For a vector
20 of random variables X of length N, compute the CDF of the values
21 from the distribution from which they are drawn. You can use
22 these values to compute A as follows:
24 A = -N - sum( (2*i-1) .* (log(X) + log(1 - X(N:-1:1,:))) )/N;
26 From the value A, `anderson_darling_cdf' returns the probability
27 that A could be returned from a set of samples.
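As a minimal sketch of the formula above for a single sample, the following uses only core functions. The five values in `x` are made-up CDF values chosen for illustration, not data from the package.

```octave
% Illustrative CDF values in (0,1); these numbers are an assumption.
x = sort ([0.1; 0.35; 0.5; 0.75; 0.9]);
n = numel (x);
i = (1:n)';
% Anderson-Darling statistic for one sorted column of CDF values
A = -n - sum ((2*i - 1) .* (log (x) + log (1 - x(n:-1:1)))) / n;
```

The resulting `A` is the value that would be passed, together with `n`, to `anderson_darling_cdf'.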
29 The algorithm given in [1] claims to be an approximation for the
30 Anderson-Darling CDF accurate to 6 decimal points.
34 n = 300; reps = 10000;
35 z = randn (n, reps);
36 x = sort ((1 + erf (z/sqrt (2)))/2);
37 i = [1:n]' * ones (1, size (x, 2));
38 A = -n - sum ((2*i-1) .* (log (x) + log (1 - x (n:-1:1, :))))/n;
39 p = anderson_darling_cdf (A, n);
40 hist (100 * p, [1:100] - 0.5);
42 You will see that the histogram is basically flat, which is to say
43 that the probabilities returned by the Anderson-Darling CDF are
44 distributed uniformly.
46 You can easily determine the extreme values of P:
48 [junk, idx] = sort (p);
50 The histograms of various P aren't very informative:
52 histfit (z (:, idx (1)), linspace (-3, 3, 15));
53 histfit (z (:, idx (end/2)), linspace (-3, 3, 15));
54 histfit (z (:, idx (end)), linspace (-3, 3, 15));
56 More telling is the qqplot:
58 qqplot (z (:, idx (1))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
59 qqplot (z (:, idx (end/2))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
60 qqplot (z (:, idx (end))); hold on; plot ([-3, 3], [-3, 3], ';;'); hold off;
62 Try a similar analysis for Z uniform:
64 z = rand (n, reps); x = sort(z);
66 and for Z exponential:
68 z = rande (n, reps); x = sort (1 - exp (-z));
70 [1] Marsaglia, G; Marsaglia JCW; (2004) "Evaluating the Anderson
71 Darling distribution", Journal of Statistical Software, 9(2).
73 See also: anderson_darling_test
79 # name: <cell-element>
83 Return the CDF for the given Anderson-Darling coefficient A computed
88 # name: <cell-element>
95 # name: <cell-element>
99 -- Function File: [Q, ASQ, INFO] = anderson_darling_test (X,
101 Test the hypothesis that X is selected from the given distribution
102 using the Anderson-Darling test. If the returned Q is small,
103 reject the hypothesis at the Q*100% level.
105 The Anderson-Darling A^2 statistic is calculated as follows:
108 A^2_n = -n - SUM (2i-1)/n log(z_i (1 - z_{n-i+1}))
111 where z_i is the ordered position of the X's in the CDF of the
112 distribution. Unlike the Kolmogorov-Smirnov statistic, the
113 Anderson-Darling statistic is sensitive to the tails of the
116 The DISTRIBUTION argument must be either "uniform", "normal", or
119 For "normal" and "exponential" distributions, estimate the
120 distribution parameters from the data, convert the values to CDF
121 values, and compare the result to tabulated critical values. This
122 includes a correction for small N which works well enough for N
123 >= 8, but less so for smaller N. The returned
124 `info.Asq_corrected' contains the adjusted statistic.
126 For "uniform", assume the values are uniformly distributed in
127 (0,1), compute A^2 and return the corresponding p-value from
128 `1-anderson_darling_cdf(A^2,n)'.
130 If you are selecting from a known distribution, convert your
131 values into CDF values for the distribution and use "uniform". Do
132 not use "uniform" if the distribution parameters are estimated
133 from the data itself, as this sharply biases the A^2 statistic
134 toward smaller values.
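A minimal sketch of the conversion described above, assuming a normal distribution whose mean and standard deviation are known in advance. The values of `mu', `sigma' and `x' are made up for illustration.

```octave
% Known (not estimated) parameters; the numbers are assumptions.
mu = 10;
sigma = 2;
x = mu + sigma * [-1.2; -0.3; 0.1; 0.8; 1.5];
% Normal CDF written with the core erf function
z = (1 + erf ((x - mu) / (sigma * sqrt (2)))) / 2;
% z now holds values in (0,1) and could be tested with
% anderson_darling_test (z, "uniform")
```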
136 [1] Stephens, MA; (1986), "Tests based on EDF statistics", in
137 D'Agostino, RB; Stephens, MA; (eds.) Goodness-of-fit Techniques.
140 See also: anderson_darling_cdf
146 # name: <cell-element>
150 Test the hypothesis that X is selected from the given distribution
155 # name: <cell-element>
162 # name: <cell-element>
166 -- Function File: [PVAL, F, DF_B, DF_E] = anovan (DATA, GRPS)
167 -- Function File: [PVAL, F, DF_B, DF_E] = anovan (DATA, GRPS,
169 Perform a multi-way analysis of variance (ANOVA). The goal is to
170 test whether the population means of data taken from K different
171 groups are all equal.
173 Data is a single vector DATA with groups specified by a
174 corresponding matrix of group labels GRPS, where GRPS has the same
175 number of rows as DATA. For example, if DATA = [1.1;1.2]; GRPS=
176 [1,2,1; 1,5,2]; then data point 1.1 was measured under conditions
177 1,2,1 and data point 1.2 was measured under conditions 1,5,2.
178 Note that groups do not need to be sequentially numbered.
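To make the DATA/GRPS layout concrete, here is a small sketch that computes the per-level means of the first factor with core functions only. The four data points and their labels are invented for illustration, and this is not how `anovan' itself is implemented.

```octave
% Four measurements, each labelled by three factors (one column per factor)
data = [1.1; 1.2; 0.9; 1.4];
grps = [1,2,1; 1,5,2; 3,2,1; 3,5,2];
% Map the (non-sequential) labels of factor 1 to indices 1..k
[levels, ~, idx] = unique (grps(:,1));
idx = idx(:);
% Mean of DATA within each level of factor 1
means = accumarray (idx, data) ./ accumarray (idx, 1);
```

Because `unique' maps the labels to consecutive indices, the labels themselves can be any numbers, which matches the note that groups need not be sequentially numbered.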
180 By default, a 'linear' model is used, computing the N main effects
181 with no interactions. This may be modified by the parameter 'model':
183 p = anovan (data, groups, 'model', modeltype)
184 - modeltype = 'linear': compute N main effects
185 - modeltype = 'interaction': compute N effects and N*(N-1) two-factor interactions
186 - modeltype = 'full': compute interactions at all
189 Under the null of constant means, the statistic F follows an F
190 distribution with DF_B and DF_E degrees of freedom.
192 The p-value (1 minus the CDF of this distribution at F) is
195 If no output argument is given, the standard one-way ANOVA table is
198 BUG: DFE is incorrect for modeltypes != full
203 # name: <cell-element>
207 Perform a multi-way analysis of variance (ANOVA).
211 # name: <cell-element>
218 # name: <cell-element>
222 -- Function File: [M, V] = betastat (A, B)
223 Compute mean and variance of the beta distribution.
228 * A is the first parameter of the beta distribution. A must be
231 * B is the second parameter of the beta distribution. B must be
233 A and B must be of common size or one of them must be scalar
238 * M is the mean of the beta distribution
240 * V is the variance of the beta distribution
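For reference, the closed-form moments of the beta distribution can be sketched directly with core functions. The parameter values below are chosen for illustration only.

```octave
a = 1:6;      % first shape parameter (illustrative values)
b = 1.5;      % second shape parameter, scalar expanded against a
m = a ./ (a + b);                                % mean
v = (a .* b) ./ ((a + b).^2 .* (a + b + 1));     % variance
```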
247 [m, v] = betastat (a, b)
249 [m, v] = betastat (a, 1.5)
254 1. Wendy L. Martinez and Angel R. Martinez. `Computational
255 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
256 Chapman & Hall/CRC, 2001.
258 2. Athanasios Papoulis. `Probability, Random Variables, and
259 Stochastic Processes'. McGraw-Hill, New York, second edition,
265 # name: <cell-element>
269 Compute mean and variance of the beta distribution.
273 # name: <cell-element>
280 # name: <cell-element>
284 -- Function File: [M, V] = binostat (N, P)
285 Compute mean and variance of the binomial distribution.
290 * N is the first parameter of the binomial distribution. The
291 elements of N must be natural numbers
293 * P is the second parameter of the binomial distribution. The
294 elements of P must be probabilities
295 N and P must be of common size or one of them must be scalar
300 * M is the mean of the binomial distribution
302 * V is the variance of the binomial distribution
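The standard binomial moments can be sketched as follows; the values of `n' and `p' are illustrative.

```octave
n = [10, 20, 30];   % number of trials (illustrative values)
p = 0.5;            % success probability, scalar expanded against n
m = n .* p;               % mean
v = n .* p .* (1 - p);    % variance
```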
309 [m, v] = binostat (n, p)
311 [m, v] = binostat (n, 0.5)
316 1. Wendy L. Martinez and Angel R. Martinez. `Computational
317 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
318 Chapman & Hall/CRC, 2001.
320 2. Athanasios Papoulis. `Probability, Random Variables, and
321 Stochastic Processes'. McGraw-Hill, New York, second edition,
327 # name: <cell-element>
331 Compute mean and variance of the binomial distribution.
335 # name: <cell-element>
342 # name: <cell-element>
346 -- Function File: S = boxplot (DATA, NOTCHED, SYMBOL, VERTICAL,
348 -- Function File: [... H]= boxplot (...)
351 The box plot is a graphical display that simultaneously describes
352 several important features of a data set, such as center, spread,
353 departure from symmetry, and identification of observations that
354 lie unusually far from the bulk of the data.
356 DATA is a matrix with one column for each data set, or data is a
357 cell vector with one cell for each data set.
359 NOTCHED = 1 produces a notched-box plot. Notches represent a robust
360 estimate of the uncertainty about the median.
362 NOTCHED = 0 (default) produces a rectangular box plot.
364 NOTCHED in (0,1) produces a notch of the specified depth. NOTCHED
365 values outside (0,1) are amusing if not exactly practical.
367 SYMBOL sets the symbol for the outlier values, default symbol for
368 points that lie outside 3 times the interquartile range is 'o',
369 default symbol for points between 1.5 and 3 times the interquartile
372 SYMBOL = '.' points between 1.5 and 3 times the IQR are marked with
373 '.' and points outside 3 times IQR with 'o'.
375 SYMBOL = ['x','*'] points between 1.5 and 3 times the IQR are
376 marked with 'x' and points outside 3 times IQR with '*'.
378 VERTICAL = 0 makes the boxes horizontal, by default VERTICAL = 1.
380 MAXWHISKER defines the length of the whiskers as a function of the
381 IQR (default = 1.5). If MAXWHISKER = 0 then `boxplot' displays all
382 data values outside the box using the plotting symbol for points
383 that lie outside 3 times the IQR.
385 Supplemental arguments are concatenated and passed to plot.
387 The returned matrix S has one column for each data set as follows:
391 3 2nd quartile (median)
394 6 Lower confidence limit for median
395 7 Upper confidence limit for median
397 The returned structure H has handles to the plot elements, allowing
398 customization of the visualization using set/get functions.
402 title ("Grade 3 heights");
404 tics ("x", 1:2, {"girls"; "boys"});
405 boxplot ({randn(10,1)*5+140, randn(13,1)*8+135});
411 # name: <cell-element>
419 # name: <cell-element>
426 # name: <cell-element>
430 -- Function File: NAMES = caseread (FILENAME)
431 Read case names from an ascii file.
433 Essentially, this reads all lines from a file as text and returns
434 them in a string matrix.
436 See also: casewrite, tblread, tblwrite, csv2cell, cell2csv, fopen
442 # name: <cell-element>
446 Read case names from an ascii file.
450 # name: <cell-element>
457 # name: <cell-element>
461 -- Function File: casewrite (STRMAT, FILENAME)
462 Write case names to an ascii file.
464 Essentially, this writes all lines from STRMAT to FILENAME (after
467 See also: caseread, tblread, tblwrite, csv2cell, cell2csv, fopen
473 # name: <cell-element>
477 Write case names to an ascii file.
481 # name: <cell-element>
488 # name: <cell-element>
492 -- Function File: [M, V] = chi2stat (N)
493 Compute mean and variance of the chi-square distribution.
498 * N is the parameter of the chi-square distribution. The
499 elements of N must be positive
504 * M is the mean of the chi-square distribution
506 * V is the variance of the chi-square distribution
512 [m, v] = chi2stat (n)
517 1. Wendy L. Martinez and Angel R. Martinez. `Computational
518 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
519 Chapman & Hall/CRC, 2001.
521 2. Athanasios Papoulis. `Probability, Random Variables, and
522 Stochastic Processes'. McGraw-Hill, New York, second edition,
528 # name: <cell-element>
532 Compute mean and variance of the chi-square distribution.
536 # name: <cell-element>
543 # name: <cell-element>
547 -- Function File: CL = cl_multinom (X, N, B, CALCULATION_TYPE ) -
548 Confidence level of multinomial portions
549 Returns confidence level of multinomial parameters estimated p =
550 x / sum(x) with predefined confidence interval B. Finite
551 population is also considered.
553 This function calculates the level of confidence at which the
554 samples represent the true distribution, given a
555 predefined tolerance (confidence interval). This is the reverse
556 of the typical exercise, in which we want to get the
557 confidence interval given the confidence level (and the estimated
558 parameters of the underlying distribution). But once we accept
559 (let's say at elections) that we have a standard predefined maximal
560 acceptable error rate (e.g. B=0.02) in the estimation and we just
561 want to know how sure we can be that the measured proportions
562 are the same as in the entire population (i.e. the expected value
563 and mean of the samples are roughly the same), we need to use this
569 * X : int vector : sample frequencies bins
571 * N : int : Population size that was sampled by x. If
572 N<sum(x), infinite number assumed
574 * B : real, vector : confidence interval. If
575 vector, it should be the size of x, containing the confidence
576 interval for each cell. If scalar, each cell will
577 have the same value of b unless it is zero or -1.
578 If the value is 0, b=.02 is assumed, which is the standard choice at
579 elections; otherwise it is calculated in a way that
580 one sample in a cell alteration defines the confidence
583 * CALCULATION_TYPE : string : (Optional), described below
584 "bromaghin" (default) - do not change it unless
585 you have a good reason to do so "cochran"
586 "agresti_cull" this is not exactly the solution at reference
587 given below but an adjustment of the solutions above
597 CL = cl_multinom( [27;43;19;11], 10000, 0.05 ) returns 0.69
603 "bromaghin" calculation type (default) is based on the
604 article Jeffrey F. Bromaghin, "Sample Size Determination for Interval
605 Estimation of Multinomial Probabilities", The American Statistician
606 vol 47, 1993, pp 203-206.
608 "cochran" calculation type is based on article Robert T.
609 Tortora, "A Note on Sample Size Estimation for Multinomial
610 Populations", The American Statistician, Vol 32, 1978, pp 100-102.
612 "agresti_cull" calculation type is based on the article in which the
613 Quesenberry-Hurst and Goodman results are combined: A. Agresti and B.A.
614 Coull, "Approximate is better than \"exact\" for interval estimation of
615 binomial proportions", The American Statistician, Vol. 52, 1998, pp 119-126
621 # name: <cell-element>
625 Returns confidence level of multinomial parameters estimated p = x /
630 # name: <cell-element>
637 # name: <cell-element>
641 -- Function File: C = combnk (DATA, K)
642 Return all combinations of K elements in DATA.
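For a numeric vector, the core function `nchoosek' produces the same kind of result, one combination per row. This is only a sketch of the idea: the row order may differ from what `combnk' returns, and the data values are illustrative.

```octave
data = [10, 20, 30, 40];
k = 2;
% Each row of c is one combination of k elements of data
c = nchoosek (data, k);
```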
647 # name: <cell-element>
651 Return all combinations of K elements in DATA.
655 # name: <cell-element>
662 # name: <cell-element>
666 -- Function File: P = copulacdf (FAMILY, X, THETA)
667 -- Function File: copulacdf ('t', X, THETA, NU)
668 Compute the cumulative distribution function of a copula family.
673 * FAMILY is the copula family name. Currently, FAMILY can be
674 `'Gaussian'' for the Gaussian family, `'t'' for the Student's
675 t family, `'Clayton'' for the Clayton family, `'Gumbel'' for
676 the Gumbel-Hougaard family, `'Frank'' for the Frank family,
677 `'AMH'' for the Ali-Mikhail-Haq family, or `'FGM'' for the
678 Farlie-Gumbel-Morgenstern family.
680 * X is the support where each row corresponds to an observation.
682 * THETA is the parameter of the copula. For the Gaussian and
683 Student's t copula, THETA must be a correlation matrix. For
684 bivariate copulas THETA can also be a correlation coefficient.
685 For the Clayton family, the Gumbel-Hougaard family, the Frank
686 family, and the Ali-Mikhail-Haq family, THETA must be a
687 vector with the same number of elements as observations in X
688 or be scalar. For the Farlie-Gumbel-Morgenstern family, THETA
689 must be a matrix of coefficients for the
690 Farlie-Gumbel-Morgenstern polynomial where each row
691 corresponds to one set of coefficients for an observation in
692 X. A single row is expanded. The coefficients are in binary
695 * NU is the degrees of freedom for the Student's t family. NU
696 must be a vector with the same number of elements as
697 observations in X or be scalar.
702 * P is the cumulative distribution of the copula at each row of
703 X and corresponding parameter THETA.
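As an illustration of what such a CDF looks like, the Clayton copula has the closed form C(u) = (sum (u_k^-theta) - d + 1)^(-1/theta) for theta > 0 and dimension d. The sketch below evaluates it row by row with core functions. It assumes strictly positive THETA and relies on automatic broadcasting, and it is not taken from the package source.

```octave
x = [0.2:0.2:0.6; 0.2:0.2:0.6];   % two observations, three dimensions
theta = [1; 2];                   % one positive parameter per observation
d = columns (x);
% Clayton CDF evaluated for each row of x
p = (sum (x .^ (-theta), 2) - d + 1) .^ (-1 ./ theta);
```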
708 x = [0.2:0.2:0.6; 0.2:0.2:0.6];
710 p = copulacdf ("Clayton", x, theta)
712 x = [0.2:0.2:0.6; 0.2:0.1:0.4];
713 theta = [0.2, 0.1, 0.1, 0.05];
714 p = copulacdf ("FGM", x, theta)
719 1. Roger B. Nelsen. `An Introduction to Copulas'. Springer, New
720 York, second edition, 2006.
725 # name: <cell-element>
729 Compute the cumulative distribution function of a copula family.
733 # name: <cell-element>
740 # name: <cell-element>
744 -- Function File: P = copulapdf (FAMILY, X, THETA)
745 Compute the probability density function of a copula family.
750 * FAMILY is the copula family name. Currently, FAMILY can be
751 `'Clayton'' for the Clayton family, `'Gumbel'' for the
752 Gumbel-Hougaard family, `'Frank'' for the Frank family, or
753 `'AMH'' for the Ali-Mikhail-Haq family.
755 * X is the support where each row corresponds to an observation.
757 * THETA is the parameter of the copula. The elements of THETA
758 must be greater than or equal to `-1' for the Clayton family,
759 greater than or equal to `1' for the Gumbel-Hougaard family,
760 arbitrary for the Frank family, and greater than or equal to
761 `-1' and lower than `1' for the Ali-Mikhail-Haq family.
762 Moreover, THETA must be non-negative for dimensions greater
763 than `2'. THETA must be a column vector with the same number
764 of rows as X or be scalar.
769 * P is the probability density of the copula at each row of X
770 and corresponding parameter THETA.
775 x = [0.2:0.2:0.6; 0.2:0.2:0.6];
777 p = copulapdf ("Clayton", x, theta)
779 p = copulapdf ("Gumbel", x, 2)
784 1. Roger B. Nelsen. `An Introduction to Copulas'. Springer, New
785 York, second edition, 2006.
790 # name: <cell-element>
794 Compute the probability density function of a copula family.
798 # name: <cell-element>
805 # name: <cell-element>
809 -- Function File: X = copularnd (FAMILY, THETA, N)
810 -- Function File: copularnd (FAMILY, THETA, N, D)
811 -- Function File: copularnd ('t', THETA, NU, N)
812 Generate random samples from a copula family.
817 * FAMILY is the copula family name. Currently, FAMILY can be
818 `'Gaussian'' for the Gaussian family, `'t'' for the Student's
819 t family, or `'Clayton'' for the Clayton family.
821 * THETA is the parameter of the copula. For the Gaussian and
822 Student's t copula, THETA must be a correlation matrix. For
823 bivariate copulas THETA can also be a correlation
824 coefficient. For the Clayton family, THETA must be a vector
825 with the same number of elements as samples to be generated
828 * NU is the degrees of freedom for the Student's t family. NU
829 must be a vector with the same number of elements as samples
830 to be generated or be scalar.
832 * N is the number of rows of the matrix to be generated. N must
833 be a non-negative integer and corresponds to the number of
834 samples to be generated.
836 * D is the number of columns of the matrix to be generated. D
837 must be a positive integer and corresponds to the dimension
843 * X is a matrix of random samples from the copula with N samples
844 of distribution dimension D.
850 x = copularnd ("Gaussian", theta);
854 x = copularnd ("t", theta, nu);
858 x = copularnd ("Clayton", theta, n);
863 1. Roger B. Nelsen. `An Introduction to Copulas'. Springer, New
864 York, second edition, 2006.
869 # name: <cell-element>
873 Generate random samples from a copula family.
877 # name: <cell-element>
884 # name: <cell-element>
888 -- Function File: [M, V] = expstat (L)
889 Compute mean and variance of the exponential distribution.
894 * L is the parameter of the exponential distribution. The
895 elements of L must be positive
900 * M is the mean of the exponential distribution
902 * V is the variance of the exponential distribution
913 1. Wendy L. Martinez and Angel R. Martinez. `Computational
914 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
915 Chapman & Hall/CRC, 2001.
917 2. Athanasios Papoulis. `Probability, Random Variables, and
918 Stochastic Processes'. McGraw-Hill, New York, second edition,
924 # name: <cell-element>
928 Compute mean and variance of the exponential distribution.
932 # name: <cell-element>
939 # name: <cell-element>
943 -- Function File: ff2n (N)
944 Full-factor design with n binary terms.
952 # name: <cell-element>
956 Full-factor design with n binary terms.
960 # name: <cell-element>
967 # name: <cell-element>
971 -- Function File: [MN, V] = fstat (M, N)
972 Compute mean and variance of the F distribution.
977 * M is the first parameter of the F distribution. The elements
978 of M must be positive
980 * N is the second parameter of the F distribution. The elements
981 of N must be positive
982 M and N must be of common size or one of them must be scalar
987 * MN is the mean of the F distribution. The mean is undefined
988 for N not greater than 2
990 * V is the variance of the F distribution. The variance is
991 undefined for N not greater than 4
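A sketch of the standard F-distribution moments, including the undefined cases described above. Using NaN as the placeholder for an undefined moment is an assumption made here for illustration; the parameter values are also illustrative.

```octave
m = [3, 3, 3];
n = [2, 4, 10];
mn = NaN (size (n));
v = NaN (size (n));
% Mean exists only for n > 2
k = n > 2;
mn(k) = n(k) ./ (n(k) - 2);
% Variance exists only for n > 4
k = n > 4;
v(k) = 2 * n(k).^2 .* (m(k) + n(k) - 2) ./ (m(k) .* (n(k) - 2).^2 .* (n(k) - 4));
```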
998 [mn, v] = fstat (m, n)
1000 [mn, v] = fstat (m, 5)
1005 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1006 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1007 Chapman & Hall/CRC, 2001.
1009 2. Athanasios Papoulis. `Probability, Random Variables, and
1010 Stochastic Processes'. McGraw-Hill, New York, second edition,
1016 # name: <cell-element>
1020 Compute mean and variance of the F distribution.
1024 # name: <cell-element>
1031 # name: <cell-element>
1035 -- Function File: fullfact (N)
1036 Full factorial design.
1038 If N is a scalar, return the full factorial design with N binary
1041 If N is a vector, return the full factorial design with choices 1
1042 through N_I for each factor I.
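The vector case can be sketched for two factors with the core function `ndgrid'. This is an illustration only: the row order produced here is not guaranteed to match `fullfact', and the level counts are made up.

```octave
n = [2, 3];   % factor 1 has 2 levels, factor 2 has 3 levels
[a, b] = ndgrid (1:n(1), 1:n(2));
% One row per combination of levels
design = [a(:), b(:)];
```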
1048 # name: <cell-element>
1052 Full factorial design.
1056 # name: <cell-element>
1063 # name: <cell-element>
1067 -- Function File: [A B] = gamfit (R)
1068 Finds the maximum likelihood estimator for the Gamma distribution
1071 See also: gampdf, gaminv, gamrnd, gamlike
1077 # name: <cell-element>
1081 Finds the maximum likelihood estimator for the Gamma distribution for R
1086 # name: <cell-element>
1093 # name: <cell-element>
1097 -- Function File: X = gamlike ([A B], R)
1098 Calculates the negative log-likelihood function for the Gamma
1099 distribution over vector R, with the given parameters A and B.
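A minimal sketch of a Gamma negative log-likelihood, assuming A is the shape and B the scale parameter (the convention used by `gampdf'). The sample values are illustrative, and whether `gamlike' uses exactly this expression is not confirmed here.

```octave
a = 2;    % shape (assumed parametrization)
b = 3;    % scale (assumed parametrization)
r = [1.5; 4.0; 2.2; 7.1];
% Negative sum of the log density over all observations
nll = -sum ((a - 1) * log (r) - r / b - a * log (b) - gammaln (a));
```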
1101 See also: gampdf, gaminv, gamrnd, gamfit
1107 # name: <cell-element>
1111 Calculates the negative log-likelihood function for the Gamma
1116 # name: <cell-element>
1123 # name: <cell-element>
1127 -- Function File: [M, V] = gamstat (A, B)
1128 Compute mean and variance of the gamma distribution.
1133 * A is the first parameter of the gamma distribution. A must be
1136 * B is the second parameter of the gamma distribution. B must be
1138 A and B must be of common size or one of them must be scalar
1143 * M is the mean of the gamma distribution
1145 * V is the variance of the gamma distribution
1152 [m, v] = gamstat (a, b)
1154 [m, v] = gamstat (a, 1.5)
1159 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1160 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1161 Chapman & Hall/CRC, 2001.
1163 2. Athanasios Papoulis. `Probability, Random Variables, and
1164 Stochastic Processes'. McGraw-Hill, New York, second edition,
1170 # name: <cell-element>
1174 Compute mean and variance of the gamma distribution.
1178 # name: <cell-element>
1185 # name: <cell-element>
1189 -- Function File: geomean (X)
1190 -- Function File: geomean (X, DIM)
1191 Compute the geometric mean.
1193 This function does the same as `mean (x, "g")'.
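For positive data, the geometric mean is the exponential of the mean of the logarithms, which can be sketched with core functions as follows (the data values are illustrative):

```octave
x = [1, 2, 4, 8];
% Geometric mean via logarithms, which avoids overflow in the product
g = exp (mean (log (x)));
```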
1201 # name: <cell-element>
1205 Compute the geometric mean.
1209 # name: <cell-element>
1216 # name: <cell-element>
1220 -- Function File: [M, V] = geostat (P)
1221 Compute mean and variance of the geometric distribution.
1226 * P is the rate parameter of the geometric distribution. The
1227 elements of P must be probabilities
1232 * M is the mean of the geometric distribution
1234 * V is the variance of the geometric distribution
1240 [m, v] = geostat (p)
1245 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1246 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1247 Chapman & Hall/CRC, 2001.
1249 2. Athanasios Papoulis. `Probability, Random Variables, and
1250 Stochastic Processes'. McGraw-Hill, New York, second edition,
1256 # name: <cell-element>
1260 Compute mean and variance of the geometric distribution.
1264 # name: <cell-element>
1271 # name: <cell-element>
1275 -- Function File: harmmean (X)
1276 -- Function File: harmmean (X, DIM)
1277 Compute the harmonic mean.
1279 This function does the same as `mean (x, "h")'.
1287 # name: <cell-element>
1291 Compute the harmonic mean.
1295 # name: <cell-element>
1302 # name: <cell-element>
1306 -- Function File: histfit (DATA, NBINS)
1307 Plot histogram with superimposed fitted normal density.
1309 `histfit (DATA, NBINS)' plots a histogram of the values in the
1310 vector DATA using NBINS bars in the histogram. With one input
1311 argument, NBINS is set to the square root of the number of
1316 histfit (randn (100, 1))
1318 See also: bar, hist, pareto
1324 # name: <cell-element>
1328 Plot histogram with superimposed fitted normal density.
1332 # name: <cell-element>
1339 # name: <cell-element>
1343 -- Function File: [TRANSPROBEST, OUTPROBEST] = hmmestimate (SEQUENCE,
1345 -- Function File: hmmestimate (..., 'statenames', STATENAMES)
1346 -- Function File: hmmestimate (..., 'symbols', SYMBOLS)
1347 -- Function File: hmmestimate (..., 'pseudotransitions',
1349 -- Function File: hmmestimate (..., 'pseudoemissions',
1351 Estimate the matrix of transition probabilities and the matrix of
1352 output probabilities of a given sequence of outputs and states
1353 generated by a hidden Markov model. The model assumes that the
1354 generation starts in state `1' at step `0' but does not include
1355 step `0' in the generated states and sequence.
1360 * SEQUENCE is a vector of a sequence of given outputs. The
1361 outputs must be integers ranging from `1' to the number of
1362 outputs of the hidden Markov model.
1364 * STATES is a vector of the same length as SEQUENCE of given
1365 states. The states must be integers ranging from `1' to the
1366 number of states of the hidden Markov model.
1371 * TRANSPROBEST is the matrix of the estimated transition
1372 probabilities of the states. `transprobest(i, j)' is the
1373 estimated probability of a transition to state `j' given
1376 * OUTPROBEST is the matrix of the estimated output
1377 probabilities. `outprobest(i, j)' is the estimated
1378 probability of generating output `j' given state `i'.
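The estimates described above amount to counting and normalizing. The sketch below does this with `accumarray', assuming that the initial transition out of state `1' at step `0' is counted; the short sequences are made up, and this is not the package's own code.

```octave
sequence = [1, 2, 2, 3, 1, 1];   % observed outputs (illustrative)
states   = [1, 1, 2, 2, 1, 2];   % corresponding states (illustrative)
nstates = 2;
noutputs = 3;
% State at the previous step; step 0 is assumed to be state 1
prev = [1, states(1:end-1)];
% Count transitions (from, to) and emissions (state, output)
trans = accumarray ([prev(:), states(:)], 1, [nstates, nstates]);
emis = accumarray ([states(:), sequence(:)], 1, [nstates, noutputs]);
% Normalize each row to probabilities; a row with no counts would give NaN
transprobest = trans ./ sum (trans, 2);
outprobest = emis ./ sum (emis, 2);
```

A row of zero counts is exactly the situation that pseudo-counts are meant to avoid.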
1380 If `'symbols'' is specified, then SEQUENCE is expected to be a
1381 sequence of the elements of SYMBOLS instead of integers. SYMBOLS can
1384 If `'statenames'' is specified, then STATES is expected to be a
1385 sequence of the elements of STATENAMES instead of integers. STATENAMES
1386 can be a cell array.
1388 If `'pseudotransitions'' is specified then the integer matrix
1389 PSEUDOTRANSITIONS is used as an initial number of counted transitions.
1390 `pseudotransitions(i, j)' is the initial number of counted transitions
1391 from state `i' to state `j'. TRANSPROBEST will have the same size as
1392 PSEUDOTRANSITIONS. Use this if you have transitions that are very
1395 If `'pseudoemissions'' is specified then the integer matrix
1396 PSEUDOEMISSIONS is used as an initial number of counted outputs.
1397 `pseudoemissions(i, j)' is the initial number of counted outputs `j'
1398 given state `i'. If `'pseudotransitions'' is also specified then the
1399 number of rows of PSEUDOEMISSIONS must be the same as the number of
1400 rows of PSEUDOTRANSITIONS. OUTPROBEST will have the same size as
1401 PSEUDOEMISSIONS. Use this if you have outputs or states that are very
1407 transprob = [0.8, 0.2; 0.4, 0.6];
1408 outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
1409 [sequence, states] = hmmgenerate (25, transprob, outprob);
1410 [transprobest, outprobest] = hmmestimate (sequence, states)
1412 symbols = {'A', 'B', 'C'};
1413 statenames = {'One', 'Two'};
1414 [sequence, states] = hmmgenerate (25, transprob, outprob,
1415 'symbols', symbols, 'statenames', statenames);
1416 [transprobest, outprobest] = hmmestimate (sequence, states,
1418 'statenames', statenames)
1420 pseudotransitions = [8, 2; 4, 6];
1421 pseudoemissions = [2, 4, 4; 7, 2, 1];
1422 [sequence, states] = hmmgenerate (25, transprob, outprob);
1423 [transprobest, outprobest] = hmmestimate (sequence, states, 'pseudotransitions', pseudotransitions, 'pseudoemissions', pseudoemissions)
1428 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1429 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1430 Chapman & Hall/CRC, 2001.
1432 2. Lawrence R. Rabiner. A Tutorial on Hidden Markov Models and
1433 Selected Applications in Speech Recognition. `Proceedings of
1434 the IEEE', 77(2), pages 257-286, February 1989.
1439 # name: <cell-element>
1443 Estimate the matrix of transition probabilities and the matrix of output
1448 # name: <cell-element>
1455 # name: <cell-element>
1459 -- Function File: [SEQUENCE, STATES] = hmmgenerate (LEN, TRANSPROB,
1461 -- Function File: hmmgenerate (..., 'symbols', SYMBOLS)
1462 -- Function File: hmmgenerate (..., 'statenames', STATENAMES)
1463 Generate an output sequence and hidden states of a hidden Markov
1464 model. The model starts in state `1' at step `0' but will not
1465 include step `0' in the generated states and sequence.
1470 * LEN is the number of steps to generate. SEQUENCE and STATES
1471 will have LEN entries each.
1473 * TRANSPROB is the matrix of transition probabilities of the
1474 states. `transprob(i, j)' is the probability of a transition
1475 to state `j' given state `i'.
1477 * OUTPROB is the matrix of output probabilities. `outprob(i,
1478 j)' is the probability of generating output `j' given state
1484 * SEQUENCE is a vector of length LEN of the generated outputs.
1485 The outputs are integers ranging from `1' to `columns
1488 * STATES is a vector of length LEN of the generated hidden
1489 states. The states are integers ranging from `1' to `columns
1492 If `'symbols'' is specified, then the elements of SYMBOLS are used
1493 for the output sequence instead of integers ranging from `1' to
1494 `columns (outprob)'. SYMBOLS can be a cell array.
1496 If `'statenames'' is specified, then the elements of STATENAMES
1497 are used for the states instead of integers ranging from `1' to
1498 `columns (transprob)'. STATENAMES can be a cell array.
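The generation process can be sketched as a loop that first draws the next state and then an output from that state. This is an illustrative sampler written with core functions, not the package implementation.

```octave
transprob = [0.8, 0.2; 0.4, 0.6];
outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
len = 25;
states = zeros (1, len);
sequence = zeros (1, len);
state = 1;   % state at step 0, not stored in the result
for k = 1:len
  % Draw the next state from row `state' of transprob
  state = 1 + sum (rand () > cumsum (transprob(state, 1:end-1)));
  states(k) = state;
  % Draw an output from row `state' of outprob
  sequence(k) = 1 + sum (rand () > cumsum (outprob(state, 1:end-1)));
end
```

Comparing against the cumulative sum of all but the last probability keeps the drawn index within range even if a row sums to slightly less than one in floating point.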
1503 transprob = [0.8, 0.2; 0.4, 0.6];
1504 outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
1505 [sequence, states] = hmmgenerate (25, transprob, outprob)
1507 symbols = {'A', 'B', 'C'};
1508 statenames = {'One', 'Two'};
1509 [sequence, states] = hmmgenerate (25, transprob, outprob,
1510 'symbols', symbols, 'statenames', statenames)
1515 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1516 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1517 Chapman & Hall/CRC, 2001.
1519 2. Lawrence R. Rabiner. A Tutorial on Hidden Markov Models and
1520 Selected Applications in Speech Recognition. `Proceedings of
1521 the IEEE', 77(2), pages 257-286, February 1989.
1526 # name: <cell-element>
1530 Generate an output sequence and hidden states of a hidden Markov model.
1534 # name: <cell-element>
1541 # name: <cell-element>
1545 -- Function File: VPATH = hmmviterbi (SEQUENCE, TRANSPROB, OUTPROB)
1546 -- Function File: hmmviterbi (..., 'symbols', SYMBOLS)
1547 -- Function File: hmmviterbi (..., 'statenames', STATENAMES)
1548 Use the Viterbi algorithm to find the Viterbi path of a hidden
1549 Markov model given a sequence of outputs. The model assumes that
1550 the generation starts in state `1' at step `0' but does not
1551 include step `0' in the generated states and sequence.
1556 * SEQUENCE is the vector of length LEN of given outputs. The
1557 outputs must be integers ranging from `1' to `columns
1560 * TRANSPROB is the matrix of transition probabilities of the
1561 states. `transprob(i, j)' is the probability of a transition
1562 to state `j' given state `i'.
1564 * OUTPROB is the matrix of output probabilities. `outprob(i,
1565 j)' is the probability of generating output `j' given state
1571 * VPATH is the vector of the same length as SEQUENCE of the
1572 estimated hidden states. The states are integers ranging from
1573 `1' to `columns (transprob)'.
1575 If `'symbols'' is specified, then SEQUENCE is expected to be a
1576 sequence of the elements of SYMBOLS instead of integers ranging from
1577 `1' to `columns (outprob)'. SYMBOLS can be a cell array.
1579 If `'statenames'' is specified, then the elements of STATENAMES
1580 are used for the states in VPATH instead of integers ranging from `1'
1581 to `columns (transprob)'. STATENAMES can be a cell array.
1586 transprob = [0.8, 0.2; 0.4, 0.6];
1587 outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
1588 [sequence, states] = hmmgenerate (25, transprob, outprob)
1589 vpath = hmmviterbi (sequence, transprob, outprob)
1591 symbols = {'A', 'B', 'C'};
1592 statenames = {'One', 'Two'};
1593 [sequence, states] = hmmgenerate (25, transprob, outprob,
1594 'symbols', symbols, 'statenames', statenames)
1595 vpath = hmmviterbi (sequence, transprob, outprob,
1596 'symbols', symbols, 'statenames', statenames)
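The recursion behind a Viterbi path can be sketched in plain Octave as follows. This is only an illustration under the assumptions stated in the description (start in state `1' at step `0', step `0' not included in the output); the observed sequence and the name vpath_sketch are hypothetical, and the package's own implementation may differ in detail.

```octave
% Minimal Viterbi sketch in log space (illustration only).
transprob = [0.8, 0.2; 0.4, 0.6];
outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
sequence = [1, 1, 3, 2, 1];          % hypothetical observed outputs

nstate = columns (transprob);
len = numel (sequence);
logt = log (transprob);
logo = log (outprob);

score = zeros (nstate, len);         % best log-probability ending in each state
back = zeros (nstate, len);          % best predecessor of each state

% Step 1: the only possible predecessor is state 1 (at step 0).
score(:, 1) = logt(1, :)' + logo(:, sequence(1));
back(:, 1) = 1;

for t = 2:len
  for j = 1:nstate
    [best, arg] = max (score(:, t-1) + logt(:, j));
    score(j, t) = best + logo(j, sequence(t));
    back(j, t) = arg;
  endfor
endfor

% Backtrack from the most probable final state.
vpath_sketch = zeros (1, len);
[~, vpath_sketch(len)] = max (score(:, len));
for t = len-1:-1:1
  vpath_sketch(t) = back(vpath_sketch(t+1), t+1);
endfor
```

Working in log space avoids numerical underflow for long sequences, since products of many probabilities become sums.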
1601 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1602 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1603 Chapman & Hall/CRC, 2001.
1605 2. Lawrence R. Rabiner. A Tutorial on Hidden Markov Models and
1606 Selected Applications in Speech Recognition. `Proceedings of
1607 the IEEE', 77(2), pages 257-286, February 1989.
1612 # name: <cell-element>
1616 Use the Viterbi algorithm to find the Viterbi path of a hidden Markov
1621 # name: <cell-element>
1628 # name: <cell-element>
1632 -- Function File: [MN, V] = hygestat (T, M, N)
1633 Compute mean and variance of the hypergeometric distribution.
1638 * T is the total size of the population of the hypergeometric
1639 distribution. The elements of T must be positive natural
1642 * M is the number of marked items of the hypergeometric
1643 distribution. The elements of M must be natural numbers
1645 * N is the size of the drawn sample of the hypergeometric
1646 distribution. The elements of N must be positive natural
1648 T, M, and N must be of common size or scalar
1653 * MN is the mean of the hypergeometric distribution
1655 * V is the variance of the hypergeometric distribution
1663 [mn, v] = hygestat (t, m, n)
1665 [mn, v] = hygestat (t, m, 2)
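For reference, the standard closed-form mean and variance of the hypergeometric distribution can be written out directly. The sketch below uses hypothetical parameter values and plain element-wise arithmetic; it is not taken from the package source.

```octave
% Closed-form hypergeometric moments (illustration only).
t = [10, 20];     % population sizes (hypothetical values)
m = [5, 4];       % marked items
n = [2, 5];       % sample sizes

p = m ./ t;                                    % fraction of marked items
mn = n .* p;                                   % mean
v = n .* p .* (1 - p) .* (t - n) ./ (t - 1);   % variance
```

The factor (t - n) ./ (t - 1) is the finite-population correction that distinguishes this variance from the binomial one.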
1670 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1671 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1672 Chapman & Hall/CRC, 2001.
1674 2. Athanasios Papoulis. `Probability, Random Variables, and
1675 Stochastic Processes'. McGraw-Hill, New York, second edition,
1681 # name: <cell-element>
1685 Compute mean and variance of the hypergeometric distribution.
1689 # name: <cell-element>
1696 # name: <cell-element>
1700 -- Function File: JACKSTAT = jackknife (E, X, ...)
1701 Compute jackknife estimates of a parameter taking one or more
1702 given samples as parameters. In particular, E is the estimator to
1703 be jackknifed as a function name, handle, or inline function, and
1704 X is the sample for which the estimate is to be taken. The I-th
1705 entry of JACKSTAT will contain the value of the estimator on the
1706 sample X with its I-th row omitted.
1708 jackstat(I) = E(X([1 : I - 1, I + 1 : length(X)]))
1710 Depending on the number of samples to be used, the estimator must
1711 have the appropriate form: If only one sample is used, then the
1712 estimator need not be concerned with cell arrays, for example
1713 jackknifing the standard deviation of a sample can be performed
1714 with `JACKSTAT = jackknife (@std, rand (100, 1))'. If, however,
1715 more than one sample is to be used, the samples must all be of
1716 equal size, and the estimator must address them as elements of a
1717 cell-array, in which they are aggregated in their order of
1720 JACKSTAT = jackknife(@(x) std(x{1})/var(x{2}), rand (100, 1), randn (100, 1))
1722 Suppose that a theoretical value P for the parameter is
1723 already known, N is the sample size, `T = N * E(X) - (N - 1) *
1724 mean(JACKSTAT)', and `V = sumsq(N * E(X) - (N - 1) * JACKSTAT - T)
1725 / (N * (N - 1))'. If all goes well, `(T-P)/sqrt(V)' should then follow a
1726 t-distribution with N-1 degrees of freedom.
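The leave-one-out computation and the quantities T and V can be sketched by hand for a single sample. The sample values below are hypothetical, and the estimator is the mean, for which the jackknife result is easy to check: T equals the sample mean and V equals var(x)/N.

```octave
% Hand-rolled jackknife of the mean (illustration only).
x = [2; 4; 4; 5; 7; 9];          % hypothetical sample, one observation per row
e = @mean;                       % estimator to be jackknifed
n = rows (x);

jackstat = zeros (n, 1);
for i = 1:n
  jackstat(i) = e (x([1:i-1, i+1:n], :));   % estimate with row i omitted
endfor

% Bias-corrected estimate and its variance estimate, as defined above.
t = n * e (x) - (n - 1) * mean (jackstat);
v = sumsq (n * e (x) - (n - 1) * jackstat - t) / (n * (n - 1));
```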
1728 Jackknifing is a well known method to reduce bias; further details
1730 * Rupert G. Miller: The jackknife-a review; Biometrika (1974)
1731 61(1): 1-15; doi:10.1093/biomet/61.1.1
1733 * Rupert G. Miller: Jackknifing Variances; Ann. Math. Statist.
1734 Volume 39, Number 2 (1968), 567-582;
1735 doi:10.1214/aoms/1177698418
1737 * M. H. Quenouille: Notes on Bias in Estimation; Biometrika
1738 Vol. 43, No. 3/4 (Dec., 1956), pp. 353-360;
1739 doi:10.1093/biomet/43.3-4.353
1744 # name: <cell-element>
1748 Compute jackknife estimates of a parameter taking one or more given
1753 # name: <cell-element>
1760 # name: <cell-element>
1764 -- Function File: jsucdf (X, ALPHA1, ALPHA2)
1765 For each element of X, compute the cumulative distribution
1766 function (CDF) at X of the Johnson SU distribution with shape
1767 parameters ALPHA1 and ALPHA2.
1769 Default values are ALPHA1 = 1, ALPHA2 = 1.
1774 # name: <cell-element>
1778 For each element of X, compute the cumulative distribution function
1783 # name: <cell-element>
1790 # name: <cell-element>
1794 -- Function File: jsupdf (X, ALPHA1, ALPHA2)
1795 For each element of X, compute the probability density function
1796 (PDF) at X of the Johnson SU distribution with shape parameters
1799 Default values are ALPHA1 = 1, ALPHA2 = 1.
1804 # name: <cell-element>
1808 For each element of X, compute the probability density function (PDF)
1813 # name: <cell-element>
1820 # name: <cell-element>
1824 -- Function File: [IDX, CENTERS] = kmeans (DATA, K, PARAM1, VALUE1,
1834 # name: <cell-element>
1842 # name: <cell-element>
1849 # name: <cell-element>
1853 -- Function File: Y = linkage (D)
1854 -- Function File: Y = linkage (D, METHOD)
1855 -- Function File: Y = linkage (X, METHOD, METRIC)
1856 -- Function File: Y = linkage (X, METHOD, ARGLIST)
1857 Produce a hierarchical clustering dendrogram
1859 D is the dissimilarity matrix relative to N observations,
1860 formatted as a (n-1)*n/2x1 vector as produced by `pdist'.
1861 Alternatively, X contains data formatted for input to `pdist',
1862 METRIC is a metric for `pdist' and ARGLIST is a cell array
1863 containing arguments that are passed to `pdist'.
1865 `linkage' starts by putting each observation into a singleton
1866 cluster and numbering those from 1 to N. Then it merges two
1867 clusters, chosen according to METHOD, to create a new cluster
1868 numbered N+1, and so on until all observations are grouped into a
1869 single cluster numbered 2*N-1. Row M of the (n-1)x3 output matrix
1870 relates to cluster n+m: the first two columns are the numbers of
1871 the two component clusters and column 3 contains their distance.
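The numbering scheme and the layout of the output matrix can be illustrated with a hand-written single-linkage loop on three points. This is a sketch for tiny inputs only, not the package's algorithm; the data are hypothetical and the two cluster numbers in each row are sorted here purely for readability.

```octave
% Single linkage written out by hand for a tiny data set (illustration only).
x = [0; 1; 5];                   % n = 3 one-dimensional observations
n = rows (x);
d = abs (x - x');                % full n-by-n distance matrix
d(logical (eye (n))) = Inf;      % ignore self-distances
labels = 1:n;                    % cluster number of each remaining row/column
y = zeros (n - 1, 3);

for m = 1:n-1
  [dmin, k] = min (d(:));        % closest pair of current clusters
  [r, c] = ind2sub (size (d), k);
  y(m, :) = [sort(labels([r, c])), dmin];
  % "single": distance to the merged cluster is the smaller of the two
  dnew = min (d(r, :), d(c, :));
  d(r, :) = dnew;
  d(:, r) = dnew';
  d(r, r) = Inf;
  labels(r) = n + m;             % merged cluster gets the next free number
  d(c, :) = [];
  d(:, c) = [];
  labels(c) = [];
endfor
```

Here observations 1 and 2 merge first at distance 1 into cluster 4, and cluster 4 then merges with observation 3 at distance 4 into cluster 5 = 2*n-1.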
1873 METHOD defines the way the distance between two clusters is
1874 computed and how they are recomputed when two clusters are merged:
1876 `"single" (default)'
1877 Distance between two clusters is the minimum distance between
1878 two elements belonging each to one cluster. Produces a
1879 cluster tree known as minimum spanning tree.
1882 Furthest distance between two elements belonging each to one
1886 Unweighted pair group method with averaging (UPGMA). The
1887 mean distance between all pairs of elements each belonging to
1891 Weighted pair group method with averaging (WPGMA). When two
1892 clusters A and B are joined together, the new distance to a
1893 cluster C is the mean between distances A-C and B-C.
1896 Unweighted Pair-Group Method using Centroids (UPGMC).
1897 Assumes Euclidean metric. The distance between cluster
1898 centroids, each centroid being the center of mass of a
1902 Weighted pair-group method using centroids (WPGMC). Assumes
1903 Euclidean metric. Distance between cluster centroids. When
1904 two clusters are joined together, the new centroid is the
1905 midpoint between the joined centroids.
1908 Ward's sum of squared deviations about the group mean (ESS).
1909 Also known as minimum variance or inner squared distance.
1910 Assumes Euclidean metric. How much the moment of inertia of
1911 the merged cluster exceeds the sum of those of the individual
1914 *Reference* Ward, J. H. Hierarchical Grouping to Optimize an
1915 Objective Function. J. Am. Statist. Assoc. 1963, 58, 236-244,
1916 `http://iv.slis.indiana.edu/sw/data/ward.pdf'.
1918 See also: pdist, squareform
1923 # name: <cell-element>
1927 Produce a hierarchical clustering dendrogram
1932 # name: <cell-element>
1939 # name: <cell-element>
1943 -- Function File: [M, V] = lognstat (MU, SIGMA)
1944 Compute mean and variance of the lognormal distribution.
1949 * MU is the first parameter of the lognormal distribution
1951 * SIGMA is the second parameter of the lognormal distribution.
1952 SIGMA must be positive or zero
1953 MU and SIGMA must be of common size or one of them must be scalar
1958 * M is the mean of the lognormal distribution
1960 * V is the variance of the lognormal distribution
1966 sigma = 0.2:0.2:1.2;
1967 [m, v] = lognstat (mu, sigma)
1969 [m, v] = lognstat (0, sigma)
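The standard closed-form moments of the lognormal distribution can be evaluated directly. The sketch below uses hypothetical parameter values and is not taken from the package source.

```octave
% Closed-form lognormal moments (illustration only).
mu = [0, 0.5, 1];        % hypothetical first parameters
sigma = [0.5, 1, 1.5];   % hypothetical second parameters (non-negative)

m = exp (mu + sigma .^ 2 / 2);                            % mean
v = (exp (sigma .^ 2) - 1) .* exp (2 * mu + sigma .^ 2);  % variance
```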
1974 1. Wendy L. Martinez and Angel R. Martinez. `Computational
1975 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
1976 Chapman & Hall/CRC, 2001.
1978 2. Athanasios Papoulis. `Probability, Random Variables, and
1979 Stochastic Processes'. McGraw-Hill, New York, second edition,
1985 # name: <cell-element>
1989 Compute mean and variance of the lognormal distribution.
1993 # name: <cell-element>
2000 # name: <cell-element>
2004 -- Function File: mad (X)
2005 -- Function File: mad (X, FLAG)
2006 -- Function File: mad (X, FLAG, DIM)
2007 Compute the mean/median absolute deviation of X.
2009 The mean absolute deviation is computed as
2011 mean (abs (X - mean (X)))
2013 and the median absolute deviation is computed as
2015 median (abs (X - median (X)))
2017 Elements of X containing NaN or NA values are ignored during
2020 If FLAG is 0, the absolute mean deviation is computed, and if FLAG
2021 is 1, the absolute median deviation is computed. By default FLAG
2024 This is done along the dimension DIM of X. If this variable is not
2025 given, the mean/median absolute deviation is computed along the
2026 smallest dimension of X.
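Both deviations can be checked by hand on a small vector. The sketch below uses hypothetical data, drops NaN entries first, and then applies the two formulas given above.

```octave
% Mean and median absolute deviation by hand (illustration only).
x = [1, 2, 3, 4, 100, NaN];      % hypothetical data with one missing value
xv = x(! isnan (x));             % NaN values are ignored

mean_ad = mean (abs (xv - mean (xv)));        % mean absolute deviation
median_ad = median (abs (xv - median (xv)));  % median absolute deviation
```

The outlier 100 dominates the mean absolute deviation but barely affects the median absolute deviation, which is the usual reason for preferring the latter as a robust measure.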
2034 # name: <cell-element>
2038 Compute the mean/median absolute deviation of X.
2042 # name: <cell-element>
2049 # name: <cell-element>
2053 -- Function File: Y = mnpdf (X, P)
2054 Compute the probability density function of the multinomial
2060 * X is a vector with a single sample of a multinomial
2061 distribution with parameter P or a matrix of random samples
2062 from multinomial distributions. In the latter case, each row
2063 of X is a sample from a multinomial distribution with the
2064 corresponding row of P being its parameter.
2066 * P is a vector with the probabilities of the categories or a
2067 matrix with each row containing the probabilities of a
2073 * Y is a vector of probabilities of the random samples X from the
2074 multinomial distribution with corresponding parameter P. The
2075 parameter N of the multinomial distribution is the sum of the
2076 elements of each row of X. The length of Y is the number of
2077 rows of X. If a row of P does not sum to `1', then the
2078 corresponding element of Y will be `NaN'.
2084 p = [0.2, 0.5, 0.3];
2087 x = [1, 4, 2; 1, 0, 9];
2088 p = [0.2, 0.5, 0.3; 0.1, 0.1, 0.8];
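The multinomial probability of each row can also be evaluated directly from its definition. The sketch below works in log space with `gammaln' and reuses the example values above; it assumes every probability is strictly positive and does not reproduce the package's input checks.

```octave
% Multinomial probabilities from the definition (illustration only).
x = [1, 4, 2; 1, 0, 9];
p = [0.2, 0.5, 0.3; 0.1, 0.1, 0.8];

n = sum (x, 2);                              % number of trials per row
logy = gammaln (n + 1) ...                   % log (n!)
       - sum (gammaln (x + 1), 2) ...        % minus sum of log (x_i!)
       + sum (x .* log (p), 2);              % plus sum of x_i * log (p_i)
y = exp (logy);                              % one probability per row of x
```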
2094 1. Wendy L. Martinez and Angel R. Martinez. `Computational
2095 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
2096 Chapman & Hall/CRC, 2001.
2098 2. Merran Evans, Nicholas Hastings and Brian Peacock.
2099 `Statistical Distributions'. pages 134-136, Wiley, New York,
2100 third edition, 2000.
2105 # name: <cell-element>
2109 Compute the probability density function of the multinomial
2114 # name: <cell-element>
2121 # name: <cell-element>
2125 -- Function File: X = mnrnd (N, P)
2126 -- Function File: X = mnrnd (N, P, S)
2127 Generate random samples from the multinomial distribution.
2132 * N is the first parameter of the multinomial distribution. N
2133 can be scalar or a vector containing the number of trials of
2134 each multinomial sample. The elements of N must be
2135 non-negative integers.
2137 * P is the second parameter of the multinomial distribution. P
2138 can be a vector with the probabilities of the categories or a
2139 matrix with each row containing the probabilities of a
2140 multinomial sample. If P has more than one row and N is
2141 non-scalar, then the number of rows of P must match the
2142 number of elements of N.
2144 * S is the number of multinomial samples to be generated. S must
2145 be a non-negative integer. If S is specified, then N must be
2146 scalar and P must be a vector.
2151 * X is a matrix of random samples from the multinomial
2152 distribution with corresponding parameters N and P. Each row
2153 corresponds to one multinomial sample. The number of columns,
2154 therefore, corresponds to the number of columns of P. If S is
2155 not specified, then the number of rows of X is the maximum of
2156 the number of elements of N and the number of rows of P. If a
2157 row of P does not sum to `1', then the corresponding row of X
2158 will contain only `NaN' values.
2164 p = [0.2, 0.5, 0.3];
2167 n = 10 * ones (3, 1);
2168 p = [0.2, 0.5, 0.3];
2172 p = [0.2, 0.5, 0.3; 0.1, 0.1, 0.8];
2178 1. Wendy L. Martinez and Angel R. Martinez. `Computational
2179 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
2180 Chapman & Hall/CRC, 2001.
2182 2. Merran Evans, Nicholas Hastings and Brian Peacock.
2183 `Statistical Distributions'. pages 134-136, Wiley, New York,
2184 third edition, 2000.
2189 # name: <cell-element>
2193 Generate random samples from the multinomial distribution.
2197 # name: <cell-element>
2204 # name: <cell-element>
2208 -- Function File: YY = monotone_smooth (X, Y, H)
2209 Produce a smooth monotone increasing approximation to a sampled
2210 functional dependence y(x) using a kernel method (an Epanechnikov
2211 smoothing kernel is applied to y(x); this is integrated to yield
2212 the monotone increasing form. See Reference 1 for details.)
2217 * X is a vector of values of the independent variable.
2219 * Y is a vector of values of the dependent variable, of the
2220 same size as X. For best performance, it is recommended that
2221 Y already be fairly smooth, e.g. by applying kernel
2222 smoothing to the original values if they are noisy.
2224 * H is the kernel bandwidth to use. If H is not given, a
2225 "reasonable" value is computed.
2231 * YY is the vector of smooth monotone increasing function
2239 y = (x .^ 2) + 3 * randn(size(x)); % typically non-monotonic from the added noise
2240 ys = ([y(1) y(1:(end-1))] + y + [y(2:end) y(end)])/3; % crudely smoothed via a
2241 % moving average, but still typically non-monotonic
2242 yy = monotone_smooth(x, ys); %yy is monotone increasing in x
2243 plot(x, y, '+', x, ys, x, yy)
2248 1. Holger Dette, Natalie Neumeyer and Kay F. Pilz (2006), A
2249 simple nonparametric estimator of a strictly monotone
2250 regression function, `Bernoulli', 12:469-490
2252 2. Regine Scheder (2007), R Package 'monoProc', Version 1.0-6,
2253 `http://cran.r-project.org/web/packages/monoProc/monoProc.pdf'
2254 (The implementation here is based on the monoProc function
2260 # name: <cell-element>
2264 Produce a smooth monotone increasing approximation to a sampled
2269 # name: <cell-element>
2276 # name: <cell-element>
2280 -- Function File: P = mvncdf (X, MU, SIGMA)
2281 -- Function File: mvncdf (A, X, MU, SIGMA)
2282 -- Function File: [P, ERR] = mvncdf (...)
2283 Compute the cumulative distribution function of the multivariate
2284 normal distribution.
2289 * X is the upper limit for integration where each row
2290 corresponds to an observation.
2294 * SIGMA is the correlation matrix.
2296 * A is the lower limit for integration where each row
2297 corresponds to an observation. A must have the same size as X.
2302 * P is the cumulative distribution at each row of X and A.
2304 * ERR is the estimated error.
2311 sigma = [1.0 0.5; 0.5 1.0];
2312 p = mvncdf (x, mu, sigma)
2315 p = mvncdf (a, x, mu, sigma)
2320 1. Alan Genz and Frank Bretz. Numerical Computation of
2321 Multivariate t-Probabilities with Application to Power
2322 Calculation of Multiple Contrasts. `Journal of Statistical
2323 Computation and Simulation', 63, pages 361-378, 1999.
2328 # name: <cell-element>
2332 Compute the cumulative distribution function of the multivariate normal
2337 # name: <cell-element>
2344 # name: <cell-element>
2348 -- Function File: Y = mvnpdf (X)
2349 -- Function File: Y = mvnpdf (X, MU)
2350 -- Function File: Y = mvnpdf (X, MU, SIGMA)
2351 Compute multivariate normal pdf for X given mean MU and covariance
2352 matrix SIGMA. The dimension of X is D x P, MU is 1 x P and SIGMA
2353 is P x P. The normal pdf is defined as
2355 1/Y^2 = (2 pi)^P |SIGMA| exp { (X-MU)' inv(SIGMA) (X-MU) }
2359 NIST Engineering Statistics Handbook 6.5.4.2
2360 http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc542.htm
2364 Using Cholesky factorization on the positive definite covariance
2369 where R'*R = SIGMA. Because R is upper triangular, its determinant
2370 is trivially the product of its diagonal, and the determinant of
2371 SIGMA is the square of this:
2373 DET = prod (diag (R))^2;
2375 The formula asks for the square root of the determinant, so no
2378 The exponential argument A = X' * inv (SIGMA) * X
2380 A = X' * inv (SIGMA) * X
2381 = X' * inv (R' * R) * X
2382 = X' * inv (R) * inv(R') * X
2384 Given that inv (R') == inv(R)', at least in theory if not
2387 A = (X' / R) * (X'/R)' = sumsq (X'/R)
2389 The interface takes the parameters to the multivariate normal in
2390 columns rather than rows, so we are actually dealing with the
2395 and the final result is:
2398 Y = (2*pi)^(-P/2) * exp (-sumsq ((X-MU)/R, 2)/2) / prod (diag (R))
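The final formula can be tried out in plain Octave. The sketch below uses hypothetical inputs and lower-case variable names, and checks the result against the closed-form density of a bivariate normal; it is an illustration of the derivation, not the package source.

```octave
% Cholesky-based multivariate normal density (illustration only).
x = [0, 0; 1, 1];                % hypothetical observations, one per row
mu = [0, 0];
sigma = [1.0, 0.5; 0.5, 1.0];

p = columns (x);
r = chol (sigma);                % upper triangular, r' * r == sigma
y = (2*pi)^(-p/2) * exp (-sumsq ((x - mu) / r, 2) / 2) / prod (diag (r));
```

Solving against the triangular factor with `/' avoids forming inv (sigma) explicitly, which is both cheaper and numerically safer.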
2400 See also: mvncdf, mvnrnd
2406 # name: <cell-element>
2410 Compute multivariate normal pdf for X given mean MU and covariance
2415 # name: <cell-element>
2422 # name: <cell-element>
2426 -- Function File: S = mvnrnd (MU, SIGMA)
2427 -- Function File: S = mvnrnd (MU, SIGMA, N)
2428 Draw N random D-dimensional vectors from a multivariate Gaussian
2429 distribution with mean MU(NxD) and covariance matrix SIGMA(DxD).
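One common way to draw such samples is to colour standard normal draws with a Cholesky factor of the covariance matrix. The sketch below assumes a positive definite SIGMA and uses hypothetical values; the package may use a different factorization.

```octave
% Sampling a multivariate Gaussian via a Cholesky factor (illustration only).
mu = [1, -1];                    % hypothetical mean, 1 x d
sigma = [2, 0.6; 0.6, 1];        % hypothetical covariance, d x d
n = 4;                           % number of samples

d = columns (sigma);
r = chol (sigma);                % r' * r == sigma
s = randn (n, d) * r + mu;       % each row is one d-dimensional sample
```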
2434 # name: <cell-element>
2438 Draw N random D-dimensional vectors from a multivariate Gaussian
2443 # name: <cell-element>
2450 # name: <cell-element>
2454 -- Function File: P = mvtcdf (X, SIGMA, NU)
2455 -- Function File: mvtcdf (A, X, SIGMA, NU)
2456 -- Function File: [P, ERR] = mvtcdf (...)
2457 Compute the cumulative distribution function of the multivariate
2458 Student's t distribution.
2463 * X is the upper limit for integration where each row
2464 corresponds to an observation.
2466 * SIGMA is the correlation matrix.
2468 * NU is the degrees of freedom.
2470 * A is the lower limit for integration where each row
2471 corresponds to an observation. A must have the same size as X.
2476 * P is the cumulative distribution at each row of X and A.
2478 * ERR is the estimated error.
2484 sigma = [1.0 0.5; 0.5 1.0];
2486 p = mvtcdf (x, sigma, nu)
2489 p = mvtcdf (a, x, sigma, nu)
2494 1. Alan Genz and Frank Bretz. Numerical Computation of
2495 Multivariate t-Probabilities with Application to Power
2496 Calculation of Multiple Contrasts. `Journal of Statistical
2497 Computation and Simulation', 63, pages 361-378, 1999.
2502 # name: <cell-element>
2506 Compute the cumulative distribution function of the multivariate
2511 # name: <cell-element>
2518 # name: <cell-element>
2522 -- Function File: X = mvtrnd (SIGMA, NU)
2523 -- Function File: X = mvtrnd (SIGMA, NU, N)
2524 Generate random samples from the multivariate t-distribution.
2529 * SIGMA is the matrix of correlation coefficients. If there are
2530 any non-unit diagonal elements then SIGMA will be normalized.
2532 * NU is the degrees of freedom for the multivariate
2533 t-distribution. NU must be a vector with the same number of
2534 elements as samples to be generated or be scalar.
2536 * N is the number of rows of the matrix to be generated. N must
2537 be a non-negative integer and corresponds to the number of
2538 samples to be generated.
2543 * X is a matrix of random samples from the multivariate
2544 t-distribution with N row samples.
2549 sigma = [1, 0.5; 0.5, 1];
2552 x = mvtrnd (sigma, nu, n);
2554 sigma = [1, 0.5; 0.5, 1];
2557 x = mvtrnd (sigma, nu, 2);
2562 1. Wendy L. Martinez and Angel R. Martinez. `Computational
2563 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
2564 Chapman & Hall/CRC, 2001.
2566 2. Samuel Kotz and Saralees Nadarajah. `Multivariate t
2567 Distributions and Their Applications'. Cambridge University
2568 Press, Cambridge, 2004.
2573 # name: <cell-element>
2577 Generate random samples from the multivariate t-distribution.
2581 # name: <cell-element>
2588 # name: <cell-element>
2592 -- Function File: [V, IDX] = nanmax (X)
2593 -- Function File: [V, IDX] = nanmax (X, Y)
2594 Find the maximal element while ignoring NaN values.
2596 `nanmax' is identical to the `max' function except that NaN values
2597 are ignored. If all values in a column are NaN, the maximum is
2598 returned as NaN rather than [].
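The column-wise behaviour described above can be imitated with core functions. This is a sketch on hypothetical data, not the package source: NaN entries are replaced by -Inf before calling `max', and all-NaN columns are set back to NaN afterwards.

```octave
% Column-wise maximum that ignores NaN (illustration only).
x = [1, NaN, NaN; NaN, 5, NaN; 3, 2, NaN];   % hypothetical data

y = x;
y(isnan (y)) = -Inf;             % NaN can never win the comparison
[v, idx] = max (y);              % maximum of each column
v(all (isnan (x))) = NaN;        % columns with only NaN yield NaN
```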
2600 See also: max, nansum, nanmin, nanmean, nanmedian
2606 # name: <cell-element>
2610 Find the maximal element while ignoring NaN values.
2614 # name: <cell-element>
2621 # name: <cell-element>
2625 -- Function File: V = nanmean (X)
2626 -- Function File: V = nanmean (X, DIM)
2627 Compute the mean value while ignoring NaN values.
2629 `nanmean' is identical to the `mean' function except that NaN
2630 values are ignored. If all values are NaN, the mean is returned
2633 See also: mean, nanmin, nanmax, nansum, nanmedian
2639 # name: <cell-element>
2643 Compute the mean value while ignoring NaN values.
2647 # name: <cell-element>
2654 # name: <cell-element>
2658 -- Function File: V = nanmedian (X)
2659 -- Function File: V = nanmedian (X, DIM)
2660 Compute the median of data while ignoring NaN values.
2662 This function is identical to the `median' function except that
2663 NaN values are ignored. If all values are NaN, the median is
2666 See also: median, nanmin, nanmax, nansum, nanmean
2672 # name: <cell-element>
2676 Compute the median of data while ignoring NaN values.
2680 # name: <cell-element>
2687 # name: <cell-element>
2691 -- Function File: [V, IDX] = nanmin (X)
2692 -- Function File: [V, IDX] = nanmin (X, Y)
2693 Find the minimal element while ignoring NaN values.
2695 `nanmin' is identical to the `min' function except that NaN values
2696 are ignored. If all values in a column are NaN, the minimum is
2697 returned as NaN rather than [].
2699 See also: min, nansum, nanmax, nanmean, nanmedian
2705 # name: <cell-element>
2709 Find the minimal element while ignoring NaN values.
2713 # name: <cell-element>
2720 # name: <cell-element>
2724 -- Function File: V = nanstd (X)
2725 -- Function File: V = nanstd (X, OPT)
2726 -- Function File: V = nanstd (X, OPT, DIM)
2727 Compute the standard deviation while ignoring NaN values.
2729 `nanstd' is identical to the `std' function except that NaN values
2730 are ignored. If all values are NaN, the standard deviation is
2731 returned as NaN. If there is only a single non-NaN value, the
2732 deviation is returned as 0.
2734 The argument OPT determines the type of normalization to use.
2738 normalizes with N-1, provides the square root of best
2739 unbiased estimator of the variance [default]
2742 normalizes with N, this provides the square root of the
2743 second moment around the mean
2745 The third argument DIM determines the dimension along which the
2746 standard deviation is calculated.
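The rules above can be spelled out per column. The sketch below uses hypothetical data and assumes, as for `std', that OPT = 0 selects normalization with N-1 and OPT = 1 selects normalization with N.

```octave
% Column-wise standard deviation that ignores NaN (illustration only).
x = [1, 2; NaN, 4; 3, NaN; 5, 6];    % hypothetical data
opt = 0;                             % assumed: 0 -> N-1, 1 -> N

v = zeros (1, columns (x));
for j = 1:columns (x)
  col = x(! isnan (x(:, j)), j);     % non-NaN values of column j
  n = numel (col);
  if (n == 0)
    v(j) = NaN;                      % all values are NaN
  elseif (n == 1)
    v(j) = 0;                        % single non-NaN value
  else
    if (opt == 0)
      denom = n - 1;
    else
      denom = n;
    endif
    v(j) = sqrt (sumsq (col - mean (col)) / denom);
  endif
endfor
```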
2748 See also: std, nanmin, nanmax, nansum, nanmedian, nanmean
2754 # name: <cell-element>
2758 Compute the standard deviation while ignoring NaN values.
2762 # name: <cell-element>
2769 # name: <cell-element>
2773 -- Function File: V = nansum (X)
2774 -- Function File: V = nansum (X, DIM)
2775 Compute the sum while ignoring NaN values.
2777 `nansum' is identical to the `sum' function except that NaN values
2778 are treated as 0 and so ignored. If all values are NaN, the sum is
2781 See also: sum, nanmin, nanmax, nanmean, nanmedian
2787 # name: <cell-element>
2791 Compute the sum while ignoring NaN values.
2795 # name: <cell-element>
2802 # name: <cell-element>
2806 -- Function File: nanvar (X)
2807 -- Function File: V = nanvar (X, OPT)
2808 -- Function File: V = nanvar (X, OPT, DIM)
2809 Compute the variance while ignoring NaN values.
2811 For vector arguments, return the (real) variance of the values.
2812 For matrix arguments, return a row vector containing the variance
2815 The argument OPT determines the type of normalization to use.
2819 Normalizes with N-1, provides the best unbiased estimator of
2820 the variance [default].
2823 Normalizes with N, this provides the second moment around the
2826 The third argument DIM determines the dimension along which the
2827 variance is calculated.
2829 See also: var, nanmean, nanstd, nanmax, nanmin
2835 # name: <cell-element>
2839 Compute the variance while ignoring NaN values.
2843 # name: <cell-element>
2850 # name: <cell-element>
2854 -- Function File: [M, V] = nbinstat (N, P)
2855 Compute mean and variance of the negative binomial distribution.
2860 * N is the first parameter of the negative binomial
2861 distribution. The elements of N must be natural numbers
2863 * P is the second parameter of the negative binomial
2864 distribution. The elements of P must be probabilities
2865 N and P must be of common size or one of them must be scalar
2870 * M is the mean of the negative binomial distribution
2872 * V is the variance of the negative binomial distribution
2879 [m, v] = nbinstat (n, p)
2881 [m, v] = nbinstat (n, 0.5)
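For orientation, the moments can be written in closed form. The sketch below assumes the parameterization in which the distribution counts failures before the N-th success with success probability P; the values are hypothetical and the formulas are not copied from the package source.

```octave
% Closed-form negative binomial moments (illustration only).
n = [1, 2, 5];            % hypothetical numbers of successes
p = [0.5, 0.25, 0.5];     % hypothetical success probabilities

q = 1 - p;
m = n .* q ./ p;          % mean number of failures
v = n .* q ./ p .^ 2;     % variance
```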
2886 1. Wendy L. Martinez and Angel R. Martinez. `Computational
2887 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
2888 Chapman & Hall/CRC, 2001.
2890 2. Athanasios Papoulis. `Probability, Random Variables, and
2891 Stochastic Processes'. McGraw-Hill, New York, second edition,
2897 # name: <cell-element>
2901 Compute mean and variance of the negative binomial distribution.
2905 # name: <cell-element>
2909 normalise_distribution
2912 # name: <cell-element>
2916 -- Function File: NORMALISED = normalise_distribution (DATA)
2917 -- Function File: NORMALISED = normalise_distribution (DATA,
2919 -- Function File: NORMALISED = normalise_distribution (DATA,
2920 DISTRIBUTION, DIMENSION)
2921 Transform a set of data so as to be N(0,1) distributed according
2922 to an idea by van Albada and Robinson. This is achieved by first
2923 passing it through its own cumulative distribution function (CDF)
2924 in order to get a uniform distribution, and then mapping the
2925 uniform to a normal distribution. The data must be passed as a
2926 vector or matrix in DATA. If the CDF is unknown, then [] can be
2927 passed in DISTRIBUTION, and in this case the empirical CDF will be
2928 used. Otherwise, if the CDFs for all data are known, they can be
2929 passed in DISTRIBUTION, either in the form of a single function
2930 name as a string, or a single function handle, or a cell array
2931 consisting of either all function names as strings, or all
2932 function handles. In the latter case, the number of CDFs passed
2933 must match the number of rows, or columns respectively, to
2934 normalise. If the data are passed as a matrix, then the
2935 transformation will operate either along the first non-singleton
2936 dimension, or along DIMENSION if present.
2938 Notes: The empirical CDF will map any two sets of data having the
2939 same size and their ties in the same places after sorting to some
2940 permutation of the same normalised data:
2941 `normalise_distribution([1 2 2 3 4])'
2942 => -1.28 0.00 0.00 0.52 1.28
2944 `normalise_distribution([1 10 100 10 1000])'
2945 => -1.28 0.00 0.52 0.00 1.28
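The two examples above can be reproduced with a short sketch. It assumes one particular empirical CDF convention, (number of values <= x minus 0.5) divided by the sample size, which happens to match the quoted results; the package's actual implementation may differ. The normal quantile is obtained from the core function `erfinv'.

```octave
% Empirical-CDF normalisation of a vector (illustration only).
data = [1, 2, 2, 3, 4];

n = numel (data);
counts = sum (data(:) <= data(:)', 1);   % for each value, how many are <= it
u = (counts - 0.5) / n;                  % assumed empirical CDF convention
z = sqrt (2) * erfinv (2 * u - 1);       % map uniform values to N(0,1)
```

Because only the ranks and the positions of ties enter the computation, [1 10 100 10 1000] yields a permutation of the same values.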
2947 Original source: S.J. van Albada, P.A. Robinson "Transformation of
2948 arbitrary distributions to the normal distribution with
2949 application to EEG test-retest reliability" Journal of
2950 Neuroscience Methods, Volume 161, Issue 2, 15 April 2007, Pages
2951 205-211 ISSN 0165-0270, 10.1016/j.jneumeth.2006.11.004.
2952 (http://www.sciencedirect.com/science/article/pii/S0165027006005668)
2957 # name: <cell-element>
2961 Transform a set of data so as to be N(0,1) distributed according to an
2966 # name: <cell-element>
2973 # name: <cell-element>
2977 -- Function File: normplot (X)
2978 Produce a normal probability plot for each column of X.
2980 The line joining the 1st and 3rd quartiles is drawn on the graph. If
2981 the underlying distribution is normal, the points will cluster
2984 Note that this function sets the title, xlabel, ylabel, axis,
2985 grid, tics and hold properties of the graph. These need to be
2986 cleared before subsequent graphs using 'clf'.
2991 # name: <cell-element>
2995 Produce a normal probability plot for each column of X.
2999 # name: <cell-element>
3006 # name: <cell-element>
3010 -- Function File: [MN, V] = normstat (M, S)
3011 Compute mean and variance of the normal distribution.
3016 * M is the mean of the normal distribution
3018 * S is the standard deviation of the normal distribution. S
3020 M and S must be of common size or one of them must be scalar
3025 * MN is the mean of the normal distribution
3027 * V is the variance of the normal distribution
3034 [mn, v] = normstat (m, s)
3036 [mn, v] = normstat (0, s)
3041 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3042 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3043 Chapman & Hall/CRC, 2001.
3045 2. Athanasios Papoulis. `Probability, Random Variables, and
3046 Stochastic Processes'. McGraw-Hill, New York, second edition,
3052 # name: <cell-element>
3056 Compute mean and variance of the normal distribution.
3060 # name: <cell-element>
3067 # name: <cell-element>
3071 -- Function File: Y = pdist (X)
3072 -- Function File: Y = pdist (X, METRIC)
3073 -- Function File: Y = pdist (X, METRIC, METRICARG, ...)
3074 Return the distance between any two rows in X.
3076 X is the NxD matrix representing N row vectors of size D.
3078 The output is a dissimilarity matrix formatted as a row vector Y,
3079 (n-1)*n/2 long, where the distances are in the order [(1, 2) (1,
3080 3) ... (2, 3) ... (n-1, n)]. You can use the `squareform'
3081 function to display the distances between the vectors arranged
3084 `metric' is an optional argument specifying how the distance is
3085 computed. It can be any of the following ones, defaulting to
3086 "euclidean", or a user defined function that takes two arguments X
3087 and Y plus any number of optional arguments, where X is a row
3088 vector and Y is a matrix having the same number of columns as
3089 X. `metric' returns a column vector where row I is the distance
3090 between X and row I of Y. Any additional arguments after the
3091 `metric' are passed as metric (X, Y, METRICARG1, METRICARG2 ...).
3093 Predefined distance functions are:
3096 Euclidean distance (default).
3099 Standardized Euclidean distance. Each coordinate in the sum of
3100 squares is inverse weighted by the sample variance of that
3104 Mahalanobis distance: see the function mahalanobis.
3107 City Block metric, aka Manhattan distance.
3110 Minkowski metric. Accepts a numeric parameter P: for P=1
3111 this is the same as the cityblock metric, with P=2 (default)
3112 it is equal to the euclidean metric.
3115 One minus the cosine of the included angle between rows, seen
3119 One minus the sample correlation between points (treated as
3120 sequences of values).
3123 One minus the sample Spearman's rank correlation between
3124 observations, treated as sequences of values.
3127 Hamming distance: the proportion of coordinates that
3131 One minus the Jaccard coefficient, the proportion of nonzero
3132 coordinates that differ.
3135 Chebychev distance: the maximum coordinate difference.
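The ordering of the output, [(1, 2) (1, 3) ... (2, 3) ... (n-1, n)], can be made concrete with a hand-written double loop. The sketch below computes Euclidean distances for hypothetical data using core functions only.

```octave
% Pairwise Euclidean distances in pdist order (illustration only).
x = [0, 0; 3, 4; 6, 8];          % hypothetical data, one observation per row
n = rows (x);

y = zeros (1, n * (n - 1) / 2);
k = 0;
for i = 1:n-1
  for j = i+1:n
    k = k + 1;                   % pairs (1,2), (1,3), (2,3), ...
    y(k) = sqrt (sumsq (x(i, :) - x(j, :)));
  endfor
endfor
```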
3137 See also: linkage, mahalanobis, squareform
3143 # name: <cell-element>
3147 Return the distance between any two rows in X.
3151 # name: <cell-element>
3158 # name: <cell-element>
3162 -- Function File: [M, V] = poisstat (LAMBDA)
3163 Compute mean and variance of the Poisson distribution.
3168 * LAMBDA is the parameter of the Poisson distribution. The
3169 elements of LAMBDA must be positive
3174 * M is the mean of the Poisson distribution
3176 * V is the variance of the Poisson distribution
3181 lambda = 1 ./ (1:6);
3182 [m, v] = poisstat (lambda)
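For a Poisson distribution both the mean and the variance equal LAMBDA. The following sketch checks this numerically from the probability mass function in core Octave; the value of `lambda' and the truncation point are assumptions chosen for illustration.

```octave
lambda = 2.5;                        % hypothetical rate parameter
k = 0:60;                            % truncated support; the tail beyond 60 is negligible here
pmf = exp (-lambda) .* lambda .^ k ./ factorial (k);
m = sum (k .* pmf);                  % numerical mean
v = sum ((k - m) .^ 2 .* pmf);       % numerical variance
printf ("mean = %g, variance = %g\n", m, v);
```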
3187 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3188 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3189 Chapman & Hall/CRC, 2001.
3191 2. Athanasios Papoulis. `Probability, Random Variables, and
3192 Stochastic Processes'. McGraw-Hill, New York, second edition,
3198 # name: <cell-element>
3202 Compute mean and variance of the Poisson distribution.
3206 # name: <cell-element>
3213 # name: <cell-element>
3217 -- Function File: [PC, Z, W, TSQ] = princomp (X)
3218 Compute principal components of X.
3220 The first output argument PC is the principal components of X.
3221 The second Z is the transformed data, and W is the eigenvalues of
3222 the covariance matrix of X. TSQ is the Hotelling's T^2 statistic
3223 for the transformed data.
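One common way to obtain these four quantities is an eigen-decomposition of the sample covariance matrix. The sketch below uses only core Octave and a hypothetical data set; it is not necessarily how `princomp' is implemented, and the signs of the component vectors are arbitrary, so they may differ from what `princomp' returns.

```octave
X = [2 0; 0 1; -2 0; 0 -1; 1 1; -1 -1];      % hypothetical 6x2 data set
Xc = X - repmat (mean (X), size (X, 1), 1);   % centre each column
[V, D] = eig (cov (X));                       % eigen-decomposition of the covariance
[w, order] = sort (diag (D), "descend");      % eigenvalues, largest first
pc = V(:, order);                             % principal component directions
z = Xc * pc;                                  % data expressed in the new basis
tsq = sum ((z .^ 2) * diag (1 ./ w), 2);      % Hotelling's T^2 for each observation
```

The column variances of z equal the eigenvalues in w, which is a quick consistency check on the decomposition.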
3228 # name: <cell-element>
3232 Compute principal components of X.
3236 # name: <cell-element>
3243 # name: <cell-element>
3247 -- Function File: R = random(NAME, ARG1)
3248 -- Function File: R = random(NAME, ARG1, ARG2)
3249 -- Function File: R = random(NAME, ARG1, ARG2, ARG3)
3250 -- Function File: R = random(NAME, ..., S1, ...)
3251 Generates pseudo-random numbers from a given one-, two-, or
3252 three-parameter distribution.
3254 The variable NAME must be a string that names the distribution from
3255 which to sample. If this distribution is a one-parameter
3256 distribution, ARG1 should be supplied; if it is a two-parameter
3257 distribution, ARG2 must also be supplied; and if it is a
3258 three-parameter distribution, ARG3 must also be present. Any
3259 arguments following the distribution parameters will determine the
3262 As an example, the following code generates a 10 by 20 matrix
3263 containing random numbers from a normal distribution with mean 5
3264 and standard deviation 2.
3265 R = random("normal", 5, 2, [10, 20]);
3267 The variable NAME can be one of the following strings
3271 Samples are drawn from the Beta distribution.
3275 "binomial distribution"
3276 Samples are drawn from the Binomial distribution.
3280 "chi-square distribution"
3281 Samples are drawn from the Chi-Square distribution.
3285 "exponential distribution"
3286 Samples are drawn from the Exponential distribution.
3290 Samples are drawn from the F distribution.
3294 "gamma distribution"
3295 Samples are drawn from the Gamma distribution.
3299 "geometric distribution"
3300 Samples are drawn from the Geometric distribution.
3304 "hypergeometric distribution"
3305 Samples are drawn from the Hypergeometric distribution.
3309 "lognormal distribution"
3310 Samples are drawn from the Log-Normal distribution.
3314 "negative binomial distribution"
3315 Samples are drawn from the Negative Binomial distribution.
3319 "normal distribution"
3320 Samples are drawn from the Normal distribution.
3324 "poisson distribution"
3325 Samples are drawn from the Poisson distribution.
3329 "rayleigh distribution"
3330 Samples are drawn from the Rayleigh distribution.
3334 Samples are drawn from the T distribution.
3338 "uniform distribution"
3339 Samples are drawn from the Uniform distribution.
3343 "discrete uniform distribution"
3344 Samples are drawn from the Uniform Discrete distribution.
3348 "weibull distribution"
3349 Samples are drawn from the Weibull distribution.
3351 See also: rand, betarnd, binornd, chi2rnd, exprnd, frnd, gamrnd,
3352 geornd, hygernd, lognrnd, nbinrnd, normrnd, poissrnd, raylrnd,
3353 trnd, unifrnd, unidrnd, wblrnd
3359 # name: <cell-element>
3363 Generates pseudo-random numbers from a given one-, two-, or
3364 three-parameter dist
3368 # name: <cell-element>
3375 # name: <cell-element>
3379 -- Function File: P = raylcdf (X, SIGMA)
3380 Compute the cumulative distribution function of the Rayleigh
3386 * X is the support. The elements of X must be non-negative.
3388 * SIGMA is the parameter of the Rayleigh distribution. The
3389 elements of SIGMA must be positive.
3390 X and SIGMA must be of common size or one of them must be scalar.
3395 * P is the cumulative distribution of the Rayleigh distribution
3396 at each element of X and corresponding parameter SIGMA.
3403 p = raylcdf (x, sigma)
3405 p = raylcdf (x, 0.5)
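The Rayleigh CDF has the closed form 1 - exp (-x^2 / (2 * sigma^2)) for non-negative x. The sketch below evaluates it directly in core Octave; the values of `x' and `sigma' are hypothetical and illustrate both the common-size case and the scalar-SIGMA case.

```octave
x = 0:0.5:2.5;                                   % hypothetical evaluation points
sigma = 1:6;                                     % hypothetical parameters, same size as x
p = 1 - exp (-x .^ 2 ./ (2 * sigma .^ 2));       % element-wise CDF
p_scalar = 1 - exp (-x .^ 2 ./ (2 * 0.5 ^ 2));   % scalar SIGMA applied to every x
```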
3410 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3411 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3412 Chapman & Hall/CRC, 2001.
3414 2. Athanasios Papoulis. `Probability, Random Variables, and
3415 Stochastic Processes'. pages 104 and 148, McGraw-Hill, New
3416 York, second edition, 1984.
3421 # name: <cell-element>
3425 Compute the cumulative distribution function of the Rayleigh
3430 # name: <cell-element>
3437 # name: <cell-element>
3441 -- Function File: X = raylinv (P, SIGMA)
3442 Compute the quantile of the Rayleigh distribution. The quantile is
3443 the inverse of the cumulative distribution function.
3448 * P is the cumulative distribution. The elements of P must be
3451 * SIGMA is the parameter of the Rayleigh distribution. The
3452 elements of SIGMA must be positive.
3453 P and SIGMA must be of common size or one of them must be scalar.
3458 * X is the quantile of the Rayleigh distribution at each
3459 element of P and corresponding parameter SIGMA.
3466 x = raylinv (p, sigma)
3468 x = raylinv (p, 0.5)
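Inverting the Rayleigh CDF gives the quantile sigma * sqrt (-2 * log (1 - p)). The sketch below, with hypothetical values for `p' and `sigma', computes it in core Octave and checks the round trip through the CDF.

```octave
p = 0:0.2:0.8;                                   % hypothetical probabilities in [0, 1)
sigma = 0.5;
x = sigma .* sqrt (-2 .* log (1 - p));           % quantile function
p_back = 1 - exp (-x .^ 2 ./ (2 * sigma ^ 2));   % applying the CDF recovers p
```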
3473 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3474 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3475 Chapman & Hall/CRC, 2001.
3477 2. Athanasios Papoulis. `Probability, Random Variables, and
3478 Stochastic Processes'. pages 104 and 148, McGraw-Hill, New
3479 York, second edition, 1984.
3484 # name: <cell-element>
3488 Compute the quantile of the Rayleigh distribution.
3492 # name: <cell-element>
3499 # name: <cell-element>
3503 -- Function File: Y = raylpdf (X, SIGMA)
3504 Compute the probability density function of the Rayleigh
3510 * X is the support. The elements of X must be non-negative.
3512 * SIGMA is the parameter of the Rayleigh distribution. The
3513 elements of SIGMA must be positive.
3514 X and SIGMA must be of common size or one of them must be scalar.
3519 * Y is the probability density of the Rayleigh distribution at
3520 each element of X and corresponding parameter SIGMA.
3527 y = raylpdf (x, sigma)
3529 y = raylpdf (x, 0.5)
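The Rayleigh density is x / sigma^2 * exp (-x^2 / (2 * sigma^2)) for non-negative x. As a sanity check, the sketch below evaluates it on a hypothetical grid in core Octave, integrates it numerically, and locates its maximum, which should lie near x = sigma.

```octave
sigma = 0.5;
x = linspace (0, 5, 2001);                       % hypothetical grid covering most of the mass
y = x ./ sigma ^ 2 .* exp (-x .^ 2 ./ (2 * sigma ^ 2));
area = trapz (x, y);                             % should be close to 1
[ymax, imax] = max (y);                          % x(imax) should be close to sigma
```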
3534 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3535 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3536 Chapman & Hall/CRC, 2001.
3538 2. Athanasios Papoulis. `Probability, Random Variables, and
3539 Stochastic Processes'. pages 104 and 148, McGraw-Hill, New
3540 York, second edition, 1984.
3545 # name: <cell-element>
3549 Compute the probability density function of the Rayleigh distribution.
3553 # name: <cell-element>
3560 # name: <cell-element>
3564 -- Function File: X = raylrnd (SIGMA)
3565 -- Function File: X = raylrnd (SIGMA, SZ)
3566 -- Function File: X = raylrnd (SIGMA, R, C)
3567 Generate a matrix of random samples from the Rayleigh distribution.
3572 * SIGMA is the parameter of the Rayleigh distribution. The
3573 elements of SIGMA must be positive.
3575 * SZ is the size of the matrix to be generated. SZ must be a
3576 vector of non-negative integers.
3578 * R is the number of rows of the matrix to be generated. R must
3579 be a non-negative integer.
3581 * C is the number of columns of the matrix to be generated. C
3582 must be a non-negative integer.
3587 * X is a matrix of random samples from the Rayleigh
3588 distribution with corresponding parameter SIGMA. If neither
3589 SZ nor R and C are specified, then X is of the same size as
3599 x = raylrnd (0.5, sz)
3603 x = raylrnd (0.5, r, c)
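Rayleigh samples can be generated by inverse transform sampling from uniform random numbers. The sketch below shows that technique in core Octave with hypothetical values; it is an illustration of the idea, not necessarily the method `raylrnd' uses internally.

```octave
sigma = 0.5;
sz = [2, 3];                            % hypothetical output size
u = rand (sz);                          % uniform samples on the open interval (0, 1)
x = sigma .* sqrt (-2 .* log (u));      % Rayleigh samples; u and 1 - u are equally distributed
```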
3608 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3609 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3610 Chapman & Hall/CRC, 2001.
3612 2. Athanasios Papoulis. `Probability, Random Variables, and
3613 Stochastic Processes'. pages 104 and 148, McGraw-Hill, New
3614 York, second edition, 1984.
3619 # name: <cell-element>
3623 Generate a matrix of random samples from the Rayleigh distribution.
3627 # name: <cell-element>
3634 # name: <cell-element>
3638 -- Function File: [M, V] = raylstat (SIGMA)
3639 Compute mean and variance of the Rayleigh distribution.
3644 * SIGMA is the parameter of the Rayleigh distribution. The
3645 elements of SIGMA must be positive.
3650 * M is the mean of the Rayleigh distribution.
3652 * V is the variance of the Rayleigh distribution.
3658 [m, v] = raylstat (sigma)
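The Rayleigh mean is sigma * sqrt (pi / 2) and the variance is (4 - pi) / 2 * sigma^2. A minimal sketch of these closed forms in core Octave, using a hypothetical `sigma' vector:

```octave
sigma = 1:6;                            % hypothetical parameters
m = sigma .* sqrt (pi / 2);             % mean for each sigma
v = (4 - pi) / 2 .* sigma .^ 2;         % variance for each sigma
```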
3663 1. Wendy L. Martinez and Angel R. Martinez. `Computational
3664 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
3665 Chapman & Hall/CRC, 2001.
3667 2. Athanasios Papoulis. `Probability, Random Variables, and
3668 Stochastic Processes'. McGraw-Hill, New York, second edition,
3674 # name: <cell-element>
3678 Compute mean and variance of the Rayleigh distribution.
3682 # name: <cell-element>
3689 # name: <cell-element>
3693 -- Function File: [B, BINT, R, RINT, STATS] = regress (Y, X, [ALPHA])
3694 Multiple Linear Regression using Least Squares Fit of Y on X with
3695 the model `y = X * beta + e'.
3699 * `y' is a column vector of observed values
3701 * `X' is a matrix of regressors, with the first column filled
3702 with the constant value 1
3704 * `beta' is a column vector of regression parameters
3706 * `e' is a column vector of random errors
3710 * Y is the `y' in the model
3712 * X is the `X' in the model
3714 * ALPHA is the significance level used to calculate the
3715 confidence intervals BINT and RINT (see `Return values'
3716 below). If not specified, ALPHA defaults to 0.05
3720 * B is the `beta' in the model
3722 * BINT is the confidence interval for B
3724 * R is a column vector of residuals
3726 * RINT is the confidence interval for R
3728 * STATS is a row vector containing:
3734 * The p value for the full model
3736 * The estimated error variance
3738 R and RINT can be passed to `rcoplot' to visualize the residual
3739 intervals and identify outliers.
3741 NaN values in Y and X are removed before calculation begins.
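The point estimate and the residuals of the model can be sketched with the backslash operator in core Octave. The data below are hypothetical and noiseless so that the coefficients are easy to recognise; the NaN handling is an assumed row-wise removal and may not match the details of `regress'. Confidence intervals and STATS are not reproduced here.

```octave
x1 = (1:6)';
X = [ones(6, 1), x1];                 % first column holds the constant 1
y = 3 + 2 * x1;                       % hypothetical observations, beta = [3; 2]
y(2) = NaN;                           % simulate a missing observation
keep = ~any (isnan ([y, X]), 2);      % rows without NaN in y or X
b = X(keep, :) \ y(keep);             % least-squares estimate of beta
r = y(keep) - X(keep, :) * b;         % residuals for the retained rows
```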
3747 # name: <cell-element>
3751 Multiple Linear Regression using Least Squares Fit of Y on X with the
3756 # name: <cell-element>
3763 # name: <cell-element>
3767 -- Function File: [PVAL, TABLE, ST] = repanova (X, COND)
3768 -- Function File: [PVAL, TABLE, ST] = repanova (X, COND, ['string' |
3770 Perform a repeated measures analysis of variance (Repeated ANOVA).
3771 X is formatted such that each row is a subject and each column is a
3774 condition is typically a point in time, say t=1 then t=2, etc.
3775 Conditions can also be thought of as groups.
3777 The optional flag can be either 'cell' or 'string' and reflects
3778 the format of the table returned. Cell is the default.
3780 NaNs are ignored using nanmean and nanstd.
3782 This function does not currently support multiple columns of the
3788 # name: <cell-element>
3792 Perform a repeated measures analysis of variance (Repeated ANOVA).
3796 # name: <cell-element>
3803 # name: <cell-element>
3807 -- Function File: Y = squareform (X)
3808 -- Function File: Y = squareform (X, "tovector")
3809 -- Function File: Y = squareform (X, "tomatrix")
3810 Convert a vector from the pdist function into a square matrix or
3811 from a square matrix back to the vector form.
3813 The second argument is used to specify the output type in case
3814 there is a single element.
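The two layouts are related as follows: the vector lists the pairs in the order (1, 2), (1, 3), ..., (2, 3), ..., which is the column-wise order of the strict lower triangle of the square matrix. The sketch below performs both conversions by hand in core Octave on a hypothetical three-point distance vector; it does not call `squareform'.

```octave
y = [1 2 3];                               % hypothetical distances for pairs (1,2), (1,3), (2,3)
n = (1 + sqrt (1 + 8 * numel (y))) / 2;    % solve n*(n-1)/2 = numel (y) for n
mask = tril (ones (n), -1) > 0;            % strict lower triangle
M = zeros (n);
M(mask) = y;                               % fills (2,1), (3,1), (3,2) in that order
M = M + M';                                % mirror to obtain the symmetric matrix
v = M(mask)';                              % back to the vector form
```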
3822 # name: <cell-element>
3826 Convert a vector from the pdist function into a square matrix or from a
3831 # name: <cell-element>
3838 # name: <cell-element>
3842 -- Function File: TABLE = tabulate (DATA, EDGES)
3843 Compute a frequency table.
3845 For vector data, the function counts the number of values in data
3846 that fall between the elements in the edges vector (which must
3847 contain monotonically non-decreasing values). TABLE is a matrix.
3848 The first column of TABLE is the bin number, the second is the
3849 number of instances in each class (absolute frequency). The third
3850 column contains the percentage of each value (relative frequency)
3851 and the fourth column contains the cumulative frequency.
3853 If EDGES is omitted, the width of each class is one; if EDGES is
3854 a scalar, it gives the number of classes; otherwise EDGES defines
3855 the bin edges. TABLE(K, 2) will count the value DATA (I)
3856 if EDGES (K) <= DATA (I) < EDGES (K+1). The last bin will count
3857 the value of DATA (I) if EDGES(K) <= DATA (I) <= EDGES (K+1).
3858 Values outside the values in EDGES are not counted. Use -inf and
3859 inf in EDGES to include all values. Tabulate with no output
3860 arguments returns a formatted table in the command window.
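The counting rule can be sketched in core Octave: every bin is closed on the left and open on the right, except the last bin, which is closed on both sides. The data and edges below are hypothetical integers, chosen to avoid floating-point ambiguity at the bin boundaries; the loop is an illustration of the rule, not the package code.

```octave
data = [1 2 2 3 4 4 4 5];            % hypothetical observations
edges = [1 3 5];                     % two bins: [1, 3) and [3, 5]
nb = numel (edges) - 1;
counts = zeros (nb, 1);
for k = 1:nb
  if k < nb
    in_bin = data >= edges(k) & data < edges(k+1);    % half-open bin
  else
    in_bin = data >= edges(k) & data <= edges(k+1);   % last bin includes its upper edge
  end
  counts(k) = sum (in_bin);          % absolute frequency
end
percent = 100 * counts / sum (counts);   % relative frequency in percent
```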
3864 sphere_radius = [1:0.05:2.5];
3865 tabulate (sphere_radius)
3867 Tabulate returns 2 bins: the first contains the spheres with radius
3868 from 1 mm up to, but excluding, 2 mm, and the second contains the
3869 spheres with radius between 2 and 3 mm.
3871 tabulate (sphere_radius, 10)
3873 Tabulate returns ten bins.
3875 tabulate (sphere_radius, [1, 1.5, 2, 2.5])
3877 Tabulate returns three bins: the first contains the spheres with
3878 radius from 1 mm up to, but excluding, 1.5 mm; the second contains
3879 the spheres with radius from 1.5 mm up to, but excluding, 2 mm; and
3880 the third contains the spheres with radius between 2 and 2.5 mm.
3882 bar (table (:, 1), table (:, 2))
3886 See also: bar, pareto
3892 # name: <cell-element>
3896 Compute a frequency table.
3900 # name: <cell-element>
3907 # name: <cell-element>
3911 -- Function File: [DATA, VARNAMES, CASENAMES] = tblread (FILENAME)
3912 -- Function File: [DATA, VARNAMES, CASENAMES] = tblread (FILENAME,
3914 Read tabular data from an ascii file.
3916 DATA is read from an ascii data file named FILENAME with an
3917 optional DELIMETER. The delimiter may be any single character or
3918 * "space" " " (default)
3928 The DATA is read starting at cell (2,2) where the VARNAMES form a
3929 char matrix from the first row (starting at (1,2)) vertically
3930 concatenated, and the CASENAMES form a char matrix read from the
3931 first column (starting at (2,1)) vertically concatenated.
3933 See also: tblwrite, csv2cell, cell2csv
3939 # name: <cell-element>
3943 Read tabular data from an ascii file.
3947 # name: <cell-element>
3954 # name: <cell-element>
3958 -- Function File: tblwrite (DATA, VARNAMES, CASENAMES, FILENAME)
3959 -- Function File: tblwrite (DATA, VARNAMES, CASENAMES, FILENAME,
3961 Write tabular data to an ascii file.
3963 DATA is written to an ascii data file named FILENAME with an
3964 optional DELIMETER. The delimiter may be any single character or
3965 * "space" " " (default)
3975 The DATA is written starting at cell (2,2) where the VARNAMES are
3976 a char matrix or cell vector written to the first row (starting at
3977 (1,2)), and the CASENAMES are a char matrix (or cell vector)
3978 written to the first column (starting at (2,1)).
3980 See also: tblread, csv2cell, cell2csv
3986 # name: <cell-element>
3990 Write tabular data to an ascii file.
3994 # name: <cell-element>
4001 # name: <cell-element>
4005 -- Function File: A = trimmean (X, P)
4006 Compute the trimmed mean.
4008 The trimmed mean of X is defined as the mean of X excluding the
4009 highest and lowest P percent of the data.
4013 mean ([-inf, 1:9, inf])
4017 trimmean ([-inf, 1:9, inf], 10)
4019 excludes the infinite values, which makes the result 5.
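A trimmed mean can be sketched in core Octave by sorting the data and dropping the same number of values from each end. The rounding rule used here (floor of N * P / 100 values per end) is an assumption made for illustration; `trimmean' may round differently for other inputs.

```octave
x = [-inf, 1:9, inf];
p = 10;                           % percentage to drop from each end
xs = sort (x);
n = numel (xs);
k = floor (n * p / 100);          % assumed rule: number of values removed per end
a = mean (xs(k+1 : n-k));         % mean of the remaining central values
```

With eleven values and P = 10, one value is removed from each end, so the two infinite entries are discarded.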
4027 # name: <cell-element>
4031 Compute the trimmed mean.
4035 # name: <cell-element>
4042 # name: <cell-element>
4046 -- Function File: [M, V] = tstat (N)
4047 Compute mean and variance of the t (Student) distribution.
4052 * N is the parameter of the t (Student) distribution. The
4053 elements of N must be positive
4058 * M is the mean of the t (Student) distribution
4060 * V is the variance of the t (Student) distribution
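For N greater than 2 the t distribution has mean 0 and variance N / (N - 2). The sketch below evaluates these closed forms in core Octave for a hypothetical vector of degrees of freedom; the behaviour of `tstat' for N of 2 or less is not covered here.

```octave
n = 3:8;                          % hypothetical degrees of freedom, all greater than 2
m = zeros (size (n));             % the mean is 0 for every n > 1
v = n ./ (n - 2);                 % the variance is finite only for n > 2
```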
4071 1. Wendy L. Martinez and Angel R. Martinez. `Computational
4072 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
4073 Chapman & Hall/CRC, 2001.
4075 2. Athanasios Papoulis. `Probability, Random Variables, and
4076 Stochastic Processes'. McGraw-Hill, New York, second edition,
4082 # name: <cell-element>
4086 Compute mean and variance of the t (Student) distribution.
4090 # name: <cell-element>
4097 # name: <cell-element>
4101 -- Function File: [M, V] = unidstat (N)
4102 Compute mean and variance of the discrete uniform distribution.
4107 * N is the parameter of the discrete uniform distribution. The
4108 elements of N must be positive natural numbers
4113 * M is the mean of the discrete uniform distribution
4115 * V is the variance of the discrete uniform distribution
4121 [m, v] = unidstat (n)
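For the discrete uniform distribution on 1, ..., N the mean is (N + 1) / 2 and the variance is (N^2 - 1) / 12. A minimal sketch in core Octave, with a hypothetical `n' vector and a direct check against the population moments of 1:6:

```octave
n = 1:6;                          % hypothetical parameters
m = (n + 1) / 2;                  % mean for each n
v = (n .^ 2 - 1) / 12;            % variance for each n
m_direct = mean (1:6);            % population mean of the values 1..6
v_direct = var (1:6, 1);          % population variance (normalised by N)
```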
4126 1. Wendy L. Martinez and Angel R. Martinez. `Computational
4127 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
4128 Chapman & Hall/CRC, 2001.
4130 2. Athanasios Papoulis. `Probability, Random Variables, and
4131 Stochastic Processes'. McGraw-Hill, New York, second edition,
4137 # name: <cell-element>
4141 Compute mean and variance of the discrete uniform distribution.
4145 # name: <cell-element>
4152 # name: <cell-element>
4156 -- Function File: [M, V] = unifstat (A, B)
4157 Compute mean and variance of the continuous uniform distribution.
4162 * A is the first parameter of the continuous uniform
4165 * B is the second parameter of the continuous uniform
4167 A and B must be of common size or one of them must be scalar and A
4173 * M is the mean of the continuous uniform distribution
4175 * V is the variance of the continuous uniform distribution
4182 [m, v] = unifstat (a, b)
4184 [m, v] = unifstat (a, 10)
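For the continuous uniform distribution on [A, B] the mean is (A + B) / 2 and the variance is (B - A)^2 / 12. A minimal sketch in core Octave with hypothetical parameter vectors:

```octave
a = 1:6;                          % hypothetical lower bounds
b = 2 * a;                        % hypothetical upper bounds, each greater than a
m = (a + b) / 2;                  % mean for each pair
v = (b - a) .^ 2 / 12;            % variance for each pair
```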
4189 1. Wendy L. Martinez and Angel R. Martinez. `Computational
4190 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
4191 Chapman & Hall/CRC, 2001.
4193 2. Athanasios Papoulis. `Probability, Random Variables, and
4194 Stochastic Processes'. McGraw-Hill, New York, second edition,
4200 # name: <cell-element>
4204 Compute mean and variance of the continuous uniform distribution.
4208 # name: <cell-element>
4215 # name: <cell-element>
4219 -- Function File: THETA = vmpdf (X, MU, K)
4220 Evaluates the Von Mises probability density function.
4222 The Von Mises distribution has probability density function
4223 f (X) = exp (K * cos (X - MU)) / Z ,
4224 where Z is a normalisation constant. By default, MU is 0 and K is
4233 # name: <cell-element>
4237 Evaluates the Von Mises probability density function.
4241 # name: <cell-element>
4248 # name: <cell-element>
4252 -- Function File: THETA = vmrnd (MU, K)
4253 -- Function File: THETA = vmrnd (MU, K, SZ)
4254 Draw random angles from a Von Mises distribution with mean MU and
4257 The Von Mises distribution has probability density function
4258 f (X) = exp (K * cos (X - MU)) / Z ,
4259 where Z is a normalisation constant.
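The normalisation constant of the Von Mises density is Z = 2 * pi * I0 (K), where I0 is the modified Bessel function of the first kind of order zero. The sketch below evaluates the density in core Octave for hypothetical values of `mu' and `k' and checks that it integrates to one over a full period.

```octave
mu = 0;                                   % hypothetical mean direction
k = 1;                                    % hypothetical concentration
x = linspace (-pi, pi, 2001);
Z = 2 * pi * besseli (0, k);              % normalisation constant
f = exp (k * cos (x - mu)) / Z;           % Von Mises density on the grid
area = trapz (x, f);                      % should be close to 1
```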
4261 The output, THETA, is a matrix of size SZ containing random angles
4262 drawn from the given Von Mises distribution. By default, MU is 0
4271 # name: <cell-element>
4275 Draw random angles from a Von Mises distribution with mean MU and
4280 # name: <cell-element>
4287 # name: <cell-element>
4291 -- Function File: [M, V] = wblstat (SCALE, SHAPE)
4292 Compute mean and variance of the Weibull distribution.
4297 * SCALE is the scale parameter of the Weibull distribution.
4298 SCALE must be positive
4300 * SHAPE is the shape parameter of the Weibull distribution.
4301 SHAPE must be positive
4302 SCALE and SHAPE must be of common size or one of them must be
4308 * M is the mean of the Weibull distribution
4310 * V is the variance of the Weibull distribution
4317 [m, v] = wblstat (scale, shape)
4319 [m, v] = wblstat (6, shape)
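The Weibull mean is SCALE * gamma (1 + 1 / SHAPE) and the variance is SCALE^2 * (gamma (1 + 2 / SHAPE) - gamma (1 + 1 / SHAPE)^2). The sketch below evaluates these closed forms in core Octave; the parameter vectors are hypothetical. With SHAPE equal to 1 the distribution reduces to the exponential, so the mean equals SCALE and the variance equals SCALE^2.

```octave
scale = 3:8;                              % hypothetical scale parameters
shape = 1:6;                              % hypothetical shape parameters
g1 = gamma (1 + 1 ./ shape);
g2 = gamma (1 + 2 ./ shape);
m = scale .* g1;                          % mean for each pair
v = scale .^ 2 .* (g2 - g1 .^ 2);         % variance for each pair
```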
4324 1. Wendy L. Martinez and Angel R. Martinez. `Computational
4325 Statistics Handbook with MATLAB'. Appendix E, pages 547-557,
4326 Chapman & Hall/CRC, 2001.
4328 2. Athanasios Papoulis. `Probability, Random Variables, and
4329 Stochastic Processes'. McGraw-Hill, New York, second edition,
4335 # name: <cell-element>
4339 Compute mean and variance of the Weibull distribution.