This page compares the Mann-Whitney-Wilcoxon (MWW) test with two other non-parametric statistical tests: the Kolmogorov-Smirnov (KS) test and Pearson's Chi-square test.
We introduce each test separately in Sections 1 to 3, elaborating most on the chosen test (Mann-Whitney-Wilcoxon, Section 1).
We give a qualitative comparison in Section 4 and demonstrate that the test of our choice is sensitive only to a difference in the two distributions' medians. Unlike the other two tests, it has no sensitivity (undesired for our application) to differences in shape.
The MWW test was first presented by Wilcoxon in 1945 [1] and placed on a more solid mathematical basis two years later by Mann and Whitney [2]. The test assesses whether one of two random variables is stochastically larger than the other, i.e. whether their medians differ.
Let X1 and X2 be two samples drawn from unknown distribution functions. The test of whether the two underlying random variables are identical proceeds in three steps: pool the two samples, assign ranks to all pooled values (using average ranks for ties), and let T be the sum of the ranks of X1. Under the null hypothesis, T has mean μT and variance σT², yielding the normalized statistic z:
μT = n1·(n1 + n2 + 1) / 2
σT² = n1·n2·(n1 + n2 + 1) / 12
z = (T − μT) / σT
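As a minimal sketch of these steps (assuming NumPy; the helper name `mww_z` is ours, not from the original work), the statistic can be computed from the pooled ranks as follows:

```python
import numpy as np

def mww_z(x1, x2):
    """Mann-Whitney-Wilcoxon rank-sum z statistic (minimal sketch)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = len(x1), len(x2)
    pooled = np.concatenate([x1, x2])
    # Rank the pooled sample (1-based); average ranks over tied values.
    order = pooled.argsort()
    ranks = np.empty(len(pooled))
    ranks[order] = np.arange(1, len(pooled) + 1)
    for v in np.unique(pooled):
        tie = pooled == v
        ranks[tie] = ranks[tie].mean()
    T = ranks[:n1].sum()                                # rank sum of X1
    mu_T = n1 * (n1 + n2 + 1) / 2.0                     # null mean of T
    sigma_T = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)   # null std of T
    return (T - mu_T) / sigma_T
```

For two identical samples the rank sum equals its null mean and z is exactly zero; shifting one sample upward makes z positive.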
The two-sample KS test assesses whether two probability distributions differ [3,4]; it is sensitive to both location and shape.
Given two samples X1 and X2 with empirical cumulative distribution functions F1(x) and F2(x), respectively, the test statistic is computed as:
Dn1, n2 = supx|F1(x) - F2(x)|
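This statistic can be sketched as follows (assuming NumPy; the helper name `ks_statistic` is ours). It exploits the fact that the supremum is attained at one of the sample points:

```python
import numpy as np

def ks_statistic(x1, x2):
    """Two-sample KS statistic D = sup_x |F1(x) - F2(x)| (minimal sketch)."""
    x1 = np.sort(np.asarray(x1, dtype=float))
    x2 = np.sort(np.asarray(x2, dtype=float))
    # Evaluate both empirical CDFs at every pooled sample point; the
    # supremum of |F1 - F2| is attained at one of these points.
    grid = np.concatenate([x1, x2])
    f1 = np.searchsorted(x1, grid, side="right") / len(x1)
    f2 = np.searchsorted(x2, grid, side="right") / len(x2)
    return np.abs(f1 - f2).max()
```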
The statistic Dn1,n2 is the largest vertical distance between the two empirical cumulative distribution functions over all x; n1 and n2 are the cardinalities of X1 and X2, respectively. Dn1,n2 can be normalized using precomputed tables [4].

Pearson's Chi-square test assesses whether an observed random variable follows an expected distribution [5].
Let Oi and Ei be the bin counts of the observed and expected distributions, respectively. The Chi-square test statistic is then:
X² = Σi=1..n (Oi − Ei)² / Ei
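A minimal sketch of this statistic (assuming NumPy; the helper name `chi_square` is ours):

```python
import numpy as np

def chi_square(observed, expected):
    """Pearson's Chi-square statistic over n bins (minimal sketch)."""
    O = np.asarray(observed, dtype=float)
    E = np.asarray(expected, dtype=float)
    # Sum of squared bin differences, each normalized by the expected count.
    return ((O - E) ** 2 / E).sum()
```

Because every bin contributes its own squared difference, even shape-only deviations between histograms accumulate into a non-zero statistic.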
If a test statistic is zero, the respective graph is marked with a dashed frame. One sees that the MWW test is non-zero only in the first case, where the medians differ. The KS test measures the difference in shape in the second row; however, it barely measures the difference in shape in the third row, since the cumulative distribution functions are very similar and the test statistic stays close to zero. The Chi-square test also measures the difference in the third row, since it sums the squared differences over every single bin.
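This behavior can be illustrated numerically. The sketch below (assuming NumPy; the synthetic normal samples and their parameters are our own choice, not data from the original experiment) draws two samples with equal medians but different spread: the MWW z statistic is typically small, while the KS statistic D is clearly non-zero.

```python
import numpy as np

rng = np.random.default_rng(0)
narrow = rng.normal(0.0, 0.5, 500)   # same median (0), small spread
wide = rng.normal(0.0, 3.0, 500)     # same median (0), large spread

# MWW z from pooled ranks (continuous draws, so ties are negligible).
n1 = n2 = 500
pooled = np.concatenate([narrow, wide])
ranks = pooled.argsort().argsort() + 1
T = ranks[:n1].sum()
z = (T - n1 * (n1 + n2 + 1) / 2.0) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

# KS D: largest gap between the two empirical CDFs.
grid = np.sort(pooled)
F1 = np.searchsorted(np.sort(narrow), grid, side="right") / n1
F2 = np.searchsorted(np.sort(wide), grid, side="right") / n2
D = np.abs(F1 - F2).max()

print(f"MWW z = {z:.2f}, KS D = {D:.2f}")
```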
The semantic gray-level enhancement and the color transfer are based on tone-mapping curves that adaptively increase or decrease pixel values in different channels, independent of their distribution. Since we do not consider the shape of the distribution as an extra feature, we use the MWW test, which we have found to be more robust and better suited to our application.
[1] F. Wilcoxon, Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol. 1, no. 6, pp. 80-83, 1945
[2] H. B. Mann and D. R. Whitney, On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, The Annals of Mathematical Statistics, vol. 18, no. 1, pp. 50-60, 1947
[3] A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giornale dell'Istituto Italiano degli Attuari, vol. 4, pp. 83-91, 1933
[4] N. Smirnov, Table for Estimating the Goodness of Fit of Empirical Distributions, The Annals of Mathematical Statistics, vol. 19, no. 2, pp. 279-281, 1948
[5] R. L. Plackett, Karl Pearson and the Chi-Squared Test, International Statistical Review, vol. 51, no. 1, pp. 59-72, 1983