smsets: Simple multivariate statistical estimation and tests

Abstract

The document describes the functions implemented in the smsets package, which focuses on the estimation and comparison of means and measures of variation, and one distance measure (Penrose’s distance) as described in Chapters 4 and 5 of the book Multivariate Statistical Methods: A Primer. 5th Edition (MSMAP5) by Manly et al. (2024). Worked examples for each function are presented, all of them characterized by the simple input function arguments, given the simple data layout needed to perform the statistical analyses (ranging from two univariate/multivariate samples to multivariate samples classified by one-single factor with m levels). Multiple two-sample t-tests and Levene tests on more than one response vector can be optionally corrected by any of the significance level adjustment methods for multiple comparisons offered by the p.adjust function. Effects sizes are also computed in these multiple univariate tests. The two-sample comparison of multivariate means is performed by Hotelling’s test while the comparison of multivariate variation in two samples can be executed with two unconventional methods: a Levene’s test based on Hotelling’s \(T^2\), and van Valen’s test. The comparison of multivariate means for a single factor is also available as well as the comparison of variation using Box’s M test using an approximate F-statistic. Following the idea behind Levene’s test for one single variable, robust tests for the comparison of variation in m samples from multivariate data are also included in the package. Finally, a Penrose’s distance calculator has been implemented as an alternative procedure to compare m multivariate populations, using means and variances only.

Comparison of Mean Values for Two Samples

Tests of significance for means and variances can be performed when several variables are measured on the same sample units and the approach to be taken can be either univariate or multivariate. This vignette covers both approaches, using base R- commands and functions implemented in the smsets package to ease the calculations described in Chapter 4 of MSMAP5.

Comparison of Mean Values for Two Samples: The Single-Variable Case

The standard approach for the comparison of means for two univariate samples is the t-test. This can be easily computed using the base function t.test for the case of two (non-paired) samples. The alternative argument is useful for the specification of the type of alternative hypothesis to be considered (either one-sided (less or greater) or two.sided). In addition, the user may choose whether the two population variances are treated as equal (var.equal = TRUE) or not (var.equal = FALSE, the default). See the documentation of t.test for details.

Example

Consider the Bumpus’ sparrows data described in Section 1.1 of MSMAP5. These data were used to exemplify most of the tests of significance in Chapter 4. The corresponding data frame sparrows is included in smsets and can be invoked once the package has been loaded into the R session.

library(smsets)
data("sparrows")
str(sparrows)
'data.frame':   49 obs. of  6 variables:
 $ Survivorship  : Factor w/ 2 levels "NS","S": 2 2 2 2 2 2 2 2 2 2 ...
 $ Total_length  : num  156 154 153 153 155 163 157 155 164 158 ...
 $ Alar_extent   : num  245 240 240 236 243 247 238 239 248 238 ...
 $ L_beak_head   : num  31.6 30.4 31 30.9 31.5 32 30.9 32.8 32.7 31 ...
 $ L_humerus     : num  18.5 17.9 18.4 17.7 18.6 19 18.4 18.6 19.1 18.8 ...
 $ L_keel_sternum: num  20.5 19.6 20.6 20.2 20.3 20.9 20.2 21.2 21.1 22 ...

The data frame sparrows contains the two-level factor Survivorship (with levels S and NS). The R-code giving the list of means and variances for all variables and the univariate t-test for Total_length as shown in Table 4.1 of MSMAP5, are

# Table 4.1
# Means
aggregate(sparrows[, 2:5], by = list(Survivorship = sparrows$Survivorship),
          FUN = mean)
  Survivorship Total_length Alar_extent L_beak_head L_humerus
1           NS     158.4286    241.5714    31.47857  18.44643
2            S     157.3810    241.0000    31.43333  18.50000
# Variances
aggregate(sparrows[, 2:5], by = list(Survivorship = sparrows$Survivorship),
          FUN = var)
  Survivorship Total_length Alar_extent L_beak_head L_humerus
1           NS     15.06878    32.55026   0.7284127 0.4344312
2            S     11.04762    17.50000   0.5313333 0.1760000
# t.test using a formula
t.test(Total_length ~ Survivorship, data = sparrows, var.equal = TRUE)

    Two Sample t-test

data:  Total_length by Survivorship
t = 0.99295, df = 47, p-value = 0.3258
alternative hypothesis: true difference in means between group NS and group S is not equal to 0
95 percent confidence interval:
 -1.074874  3.170113
sample estimates:
mean in group NS  mean in group S 
        158.4286         157.3810 

To produce the corresponding t-tests for any of the remaining four variables, the last expression has to be modified by writing the chosen variable name before the ~ symbol. The smsets package facilitates this task by implementing tests for differences between sample means for all variables.

Simultaneous (Multiple) Univariate Tests on Several variables

Assume that p variables are measured for two independent samples. Function ttests2s.mv in the smsets package extends the t.test function to produce all p univariate t-tests; the function includes a P.adjust argument useful to correct significance levels of multiple t-tests by any of the adjustment methods for multiple comparisons implemented in the function p.stats. The following code executes function ttests2s.mv with Bonferroni’s correction for the five two-sided t-tests shown in table 4.1. Notice that level1 is a character string identifying “Sample 1”. The string is "S" in this case; it is one of the factor levels in group. In addition, all morphological variables in the sparrows data frame are measured in mm, thus the character string for unit is "mm" :

# Two-sample t-tests with p values adjusted by the Bonferroni correction. 
# The default alternatives are two-sided.
ttests.sparrows <- ttests2s.mv(sparrows, group = Survivorship, level1 = "S", 
                               var.equal = TRUE, P.adjust = "bonferroni", unit = "mm")
ttests.sparrows
 Multiple Two Sample t-tests for Multivariate Data

Data:   sparrows 
Group levels: (1) S ;  (2) NS 

Variable:  Total_length 
Sample estimates: 
     Mean of S  Variance of S     Mean of NS Variance of NS 
        157.38          11.05         158.43          15.07 
t = -0.9930 , df = 47 , p-value = 1.0000 
Effect size: Raw = -1.048 mm  ;   Hedges' d =  0.993 

Variable:  Alar_extent 
Sample estimates: 
     Mean of S  Variance of S     Mean of NS Variance of NS 
        241.00          17.50         241.57          32.55 
t = -0.3871 , df = 47 , p-value = 1.0000 
Effect size: Raw = -0.571 mm  ;   Hedges' d =  0.387 

Variable:  L_beak_head 
Sample estimates: 
     Mean of S  Variance of S     Mean of NS Variance of NS 
         31.43           0.53          31.48           0.73 
t = -0.1952 , df = 47 , p-value = 1.0000 
Effect size: Raw = -0.045 mm  ;   Hedges' d =  0.195 

Variable:  L_humerus 
Sample estimates: 
     Mean of S  Variance of S     Mean of NS Variance of NS 
         18.50           0.18          18.45           0.43 
t = 0.3258 , df = 47 , p-value = 1.0000 
Effect size: Raw = 0.054 mm  ;   Hedges' d =  0.326 

Variable:  L_keel_sternum 
Sample estimates: 
     Mean of S  Variance of S     Mean of NS Variance of NS 
         20.81           0.57          20.84           1.32 
t = -0.1029 , df = 47 , p-value = 1.0000 
Effect size: Raw = -0.030 mm  ;   Hedges' d =  0.103 

Alternative hypothesis for all tests: true difference in means is not equal to 0 
P-values adjusted using Bonferroni method

Compare these Bonferroni-corrected p-values with those shown in Table 4.1.

Comparison of Mean Values for Two Samples: The Multivariate Case

In the multivariate case, covariance between all possible pairs of variables are accounted for in the calculation of test statistics developed to test the difference of mean vectors. For the comparison of two multivariate samples, a generalization of the t-test is Hotelling’s \(T^2\) test, introduced in Section 4.3 of MSMAP5.

Hotelling’s \(T^2\) test

The multivariate comparison of mean measurements between survivor and nonsurvivor Bumpus’ sparrows can be obtained with the hotelling.test function from package Hotelling (Curran & Hersh, 2021).

library(Hotelling)  
Loading required package: corpcor
#  Hotelling's T2 test. The result is a list
T2.sparrows <- with(sparrows, hotelling.test(Total_length + Alar_extent + 
                                             L_beak_head + L_humerus + 
                                             L_keel_sternum ~ Survivorship))
# Output of the function hotelling.test is given
T2.sparrows
Test stat:  2.8237 
Numerator df:  5 
Denominator df:  43 
P-value:  0.7622 

The alternative R-function for the \(T^2\) test is provided by the function Hotelling.mat in the smsets package. The syntax of the function is

Hotelling.mat(x, group, level1)

where x is a data frame with p + 1 columns, being p of them numeric response variables, and the remaining column, group, is a two-factor variable, written without quotes. Finally, level1 is a character string specifying the first level of interest in group. In addition to Hotelling’s test statistics, the long = TRUE option in the print method outputs the mean vectors and matrices involved in the calculation of the \(T²\) statistic.

# Hotelling's T2 test. Comparing multivariate means between survivor and 
# nonsurvivor sparrows using function Hotelling.mat
results.T2 <- Hotelling.mat(sparrows, group = Survivorship, level1 = "S")
# Long output
print(results.T2, long = TRUE)
 Hotelling's T2 test for the comparison of two multivariate samples
 (Assuming equal covariance matrices)
 Data:   sparrows 
 Group levels: (1) S ;  (2) NS 

Mean vectors and Covariance Matrices

  Total_length Alar_extent L_beak_head L_humerus L_keel_sternum
S      157.381         241    31.43333      18.5       20.80952

Covariance Matrix:
               Total_length Alar_extent L_beak_head L_humerus L_keel_sternum
Total_length      11.047619        9.10   1.5566667    0.8700      1.2861905
Alar_extent        9.100000       17.50   1.9100000    1.3100      0.8800000
L_beak_head        1.556667        1.91   0.5313333    0.1890      0.2396667
L_humerus          0.870000        1.31   0.1890000    0.1760      0.1325000
L_keel_sternum     1.286190        0.88   0.2396667    0.1325      0.5749048

   Total_length Alar_extent L_beak_head L_humerus L_keel_sternum
NS     158.4286    241.5714    31.47857  18.44643       20.83929

Covariance Matrix:
               Total_length Alar_extent L_beak_head L_humerus L_keel_sternum
Total_length      15.068783   17.190476   2.2428571 1.7460317      2.9306878
Alar_extent       17.190476   32.550265   3.3978836 2.9502646      4.0656085
L_beak_head        2.242857    3.397884   0.7284127 0.4695503      0.5590212
L_humerus          1.746032    2.950265   0.4695503 0.4344312      0.5058862
L_keel_sternum     2.930688    4.065608   0.5590212 0.5058862      1.3209921

Pooled Covariance Matrix:
               Total_length Alar_extent L_beak_head L_humerus L_keel_sternum
Total_length      13.357649   13.747720   1.9508612 1.3732523      2.2309017
Alar_extent       13.747720   26.145897   2.7647416 2.2522796      2.7100304
L_beak_head        1.950861    2.764742   0.6445491 0.3501672      0.4231256
L_humerus          1.373252    2.252280   0.3501672 0.3244605      0.3469985
L_keel_sternum     2.230902    2.710030   0.4231256 0.3469985      1.0035081

Inverse of Covariance Matrix:
               Total_length Alar_extent L_beak_head   L_humerus L_keel_sternum
Total_length     0.20605404 -0.06937533 -0.23946750  0.07848176    -0.19689454
Alar_extent     -0.06937533  0.12335410 -0.03760830 -0.55173264     0.02774227
L_beak_head     -0.23946750 -0.03760830  4.22184744 -3.26236979    -0.01812284
L_humerus        0.07848176 -0.55173264 -3.26236979 11.46092696    -1.27194270
L_keel_sternum  -0.19689454  0.02774227 -0.01812284 -1.27194270     1.80676209

Hotelling's T2 statistic = 2.8237 
F statistic = 0.5167 
Numerator df =   5 
Denominator df =  43 
P-value = 0.7622 

Comparison of Variation for Two Samples

Comparison of Variation for Two Samples: The Single-Variable Case

F-test and Levene’s test

The F-test applied to compare variances in total length for survivor and nonsurvivor sparrows is included here but as indicated in section 4.5, this test should never be used to compare variances, because it is very sensitive to the assumption of normality.

# F-test for Total length (not recommended)
with(sparrows, var.test(Total_length[Survivorship == "S"],
                        Total_length[Survivorship == "NS"]))

    F test to compare two variances

data:  Total_length[Survivorship == "S"] and Total_length[Survivorship == "NS"]
F = 0.73315, num df = 20, denom df = 27, p-value = 0.4788
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.3253692 1.7412766
sample estimates:
ratio of variances 
         0.7331461 

The robust two-sample Levene’s test can be alternatively run, using leveneTest function from the car package (Fox & Weisberg, 2019), to compare again the variation in total length for survivor and nonsurvivor sparrows.

library(car)
Loading required package: carData
leveneTest(Total_length ~ Survivorship, data = sparrows)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1   1.447  0.235
      47               

Notice that leveneTest produces an F statistic, instead of a t statistic, but the degrees of freedom for Survivorship (the factor defining groups) is equal to 1 thus, the relation \(F = t^2\) holds. Thus, for the analysis of the variation for the sparrows data in Section 4.6.1 of MSMAP5, \(t = -1.20\), then \(t^2 = 1.464\). This is not too far from the F value = \(1.447\) produced by R (the difference due to rounding errors). Notice that leveneTest produces a two-sided test. The alternative hypothesis that we are interested in is that the variance for survivors is smaller than the variance for nonsurvivors. This is a lower-tail test thus, the p-value shown in the Levene’s test output, \(0.235\), must be halved:

p.value.lower <- 0.235 / 2
p.value.lower
[1] 0.1175

Similar code can be written to compare the variation between survivor and nonsurvivor sparrows for the remaining variables.

Simultaneous (Multiple) Univariate Tests on Several variables

Similarly to ttests2s.mv, the Levenetests2s.mv function in the smsets package extends two-sample Levene’s tests based on the t-statistic to produce all p univariate Levene’s tests (one-sided alternatives included). Comparisons of variation between survivors and nonsurvivors for all variables, one at a time, are shown below using Benjamini & Hochberg (1995) correction (indicated here as fdr or “false discovery rate” correction), and considering lower-tailed alternatives in all cases, as described in Section 4.6.1 of MSMAP5. Effect sizes are also computed.

fdr.Levene2s.mv <- Levenetests2s.mv(sparrows, Survivorship, "S",
                                alternative = "less", var.equal = TRUE,
                                P.adjust = "fdr", unit = "mm")
fdr.Levene2s.mv
Two Sample Levene's tests
Testing variation using t-tests via absolute deviations from medians

Data:   sparrows 
Group levels: (1) S ;  (2) NS 

Variable:  Total_length 
 Sample estimates: 
  Median of S  Median of NS 
          157           159 
 Mean of absolute deviations from the median: 
  S : 2.571429 , NS : 3.285714 
 Variance of absolute deviations from the median: 
  S : 4.257143 , NS : 4.21164 
 t = -1.2029 , df = 47 , p-value = 0.1514 
 Effect size: Raw = -0.714 mm  ;   Hedges' d =  1.203 

Variable:  Alar_extent 
 Sample estimates: 
  Median of S  Median of NS 
          240           242 
 Mean of absolute deviations from the median: 
  S : 3.571429 , NS : 4.571429 
 Variance of absolute deviations from the median: 
  S : 5.157143 , NS : 11.06878 
 t = -1.1845 , df = 47 , p-value = 0.1514 
 Effect size: Raw = -1.000 mm  ;   Hedges' d =  1.184 

Variable:  L_beak_head 
 Sample estimates: 
  Median of S  Median of NS 
         31.4          31.5 
 Mean of absolute deviations from the median: 
  S : 0.5761905 , NS : 0.6857143 
 Variance of absolute deviations from the median: 
  S : 0.1839048 , NS : 0.2412698 
 t = -0.8147 , df = 47 , p-value = 0.2097 
 Effect size: Raw = -0.110 mm  ;   Hedges' d =  0.815 

Variable:  L_humerus 
 Sample estimates: 
  Median of S  Median of NS 
         18.5          18.5 
 Mean of absolute deviations from the median: 
  S : 0.3142857 , NS : 0.5107143 
 Variance of absolute deviations from the median: 
  S : 0.07228571 , NS : 0.166918 
 t = -1.9120 , df = 47 , p-value = 0.1514 
 Effect size: Raw = -0.196 mm  ;   Hedges' d =  1.912 

Variable:  L_keel_sternum 
 Sample estimates: 
  Median of S  Median of NS 
         20.6          20.7 
 Mean of absolute deviations from the median: 
  S : 0.6380952 , NS : 0.8892857 
 Variance of absolute deviations from the median: 
  S : 0.1934762 , NS : 0.5209921 
 t = -1.4086 , df = 47 , p-value = 0.1514 
 Effect size: Raw = -0.251 mm  ;   Hedges' d =  1.409 

Alternative hypothesis for all tests: true difference in means is less than 0 
P-values adjusted using FDR method

Looking at the p-values obtained with the “false discovery rate” adjustment, it is seen that the variation between survivors and nonsurvivors is non-significant for none of the five morphological variables, which contrasts with the simultaneous uncorrected Levene’s tests reported in Section 4.6.1 of MSMAP5. This latter set of tests are shown below (the P.adjust argument in Levenetests2s.mv has been omitted):

none.Levene2s.mv <- Levenetests2s.mv(sparrows, Survivorship, "S",
                                alternative = "less", var.equal = TRUE, unit = "mm")
none.Levene2s.mv
Two Sample Levene's tests
Testing variation using t-tests via absolute deviations from medians

Data:   sparrows 
Group levels: (1) S ;  (2) NS 

Variable:  Total_length 
 Sample estimates: 
  Median of S  Median of NS 
          157           159 
 Mean of absolute deviations from the median: 
  S : 2.571429 , NS : 3.285714 
 Variance of absolute deviations from the median: 
  S : 4.257143 , NS : 4.21164 
 t = -1.2029 , df = 47 , p-value = 0.1175 
 Effect size: Raw = -0.714 mm  ;   Hedges' d =  1.203 

Variable:  Alar_extent 
 Sample estimates: 
  Median of S  Median of NS 
          240           242 
 Mean of absolute deviations from the median: 
  S : 3.571429 , NS : 4.571429 
 Variance of absolute deviations from the median: 
  S : 5.157143 , NS : 11.06878 
 t = -1.1845 , df = 47 , p-value = 0.1211 
 Effect size: Raw = -1.000 mm  ;   Hedges' d =  1.184 

Variable:  L_beak_head 
 Sample estimates: 
  Median of S  Median of NS 
         31.4          31.5 
 Mean of absolute deviations from the median: 
  S : 0.5761905 , NS : 0.6857143 
 Variance of absolute deviations from the median: 
  S : 0.1839048 , NS : 0.2412698 
 t = -0.8147 , df = 47 , p-value = 0.2097 
 Effect size: Raw = -0.110 mm  ;   Hedges' d =  0.815 

Variable:  L_humerus 
 Sample estimates: 
  Median of S  Median of NS 
         18.5          18.5 
 Mean of absolute deviations from the median: 
  S : 0.3142857 , NS : 0.5107143 
 Variance of absolute deviations from the median: 
  S : 0.07228571 , NS : 0.166918 
 t = -1.9120 , df = 47 , p-value = 0.0310 
 Effect size: Raw = -0.196 mm  ;   Hedges' d =  1.912 

Variable:  L_keel_sternum 
 Sample estimates: 
  Median of S  Median of NS 
         20.6          20.7 
 Mean of absolute deviations from the median: 
  S : 0.6380952 , NS : 0.8892857 
 Variance of absolute deviations from the median: 
  S : 0.1934762 , NS : 0.5209921 
 t = -1.4086 , df = 47 , p-value = 0.0828 
 Effect size: Raw = -0.251 mm  ;   Hedges' d =  1.409 

Alternative hypothesis for all tests: true difference in means is less than 0 
No P-value adjustment made.

As described in Section 4.6.1 of MSMAP5, “only for the length of the humerus is the result significantly low at 5% level”. However, it is recommended here to rely on tests based on adjustments like fdr. Therefore, it is more appropriate to conclude that, on the basis on multiple univariate one-sided Levene’s tests, apparently the five morphological variables for survivor sparrows do not vary less than those for nonsurvivors.

More suitable approaches can be considered for testing variation from a multivariate point of view. Two methods of this sort are described in the next section.

Comparison of Variation for Two Samples: The Multivariate Case

Two-sample Levene’s test based on Hotelling’s \(T^2\) for the comparison of multivariate variation

The idea behind the multivariate version of the two-sample Levene’s test is to compare the mean vectors of absolute deviations from medians or MADs for all variables. More precisely, the variation between the two samples are measured in terms of two sample MADs for all variables and, then, the mean MADs vectors are compared using Hotelling’s \(T^2\) test.

The following code implements function LeveneT2 included in the smsets package to produce a Levene’s test based on Hotelling’s \(T^2\) for the comparison of multivariate variation between survivors and nonsurvivors in the Bumpus’ sparrows data.

# Levene's test based on Hotelling's T2
LeveneT2.sparrows <- LeveneT2(sparrows, group = Survivorship, level1 = "S",
                              var.equal = TRUE)
LeveneT2.sparrows
 Comparison of variation for two multivariate samples (Levene's test)

 Variation is measured as absolute deviations around group medians
 Hotelling's test compares two vectors of mean absolute deviations

 Data:  sparrows 
 Variables:  Total_length Alar_extent L_beak_head L_humerus L_keel_sternum 
 Group levels: (1) S ;  (2) NS 

 Levene's test based on Hotelling's T2
 T2 statistic = 4.7478 
 F = 0.8687 
 Num df = 5 
 Den df = 43 
 p-value = 0.5099 

If a long output is desired (e.g., a display of sub-data frames containing the absolute deviations around medians), long = TRUE can be added as an option to the print method:

print(LeveneT2.sparrows, long = TRUE)

Van Valen’s test

Details about the test of multivariate variation for two samples suggested by van Valen (1978) are found in Section 4.6 of MSMAP5. The test assumes that the level of variation is consistent for all variables, as the test statistic is reduced to a single variation measure (the deviation around medians for all standardized variables), denoted as d. As a consequence, the comparison of multivariate variation is carried out using a simple two-sample t-test of means for the single variable d . The function VanValen in the smsets package facilitates the calculations involved in van Valen’s test. The code for the comparison of multivariate variation between survivor and nonsurvivor sparrows follows, assuming that one is interested to test that the five morphological features for survivors are less variable than the corresponding features for nonsurvivors. The print method in this example includes the option long = TRUE, indicating that a detailed output is wanted, including by-group matrices of standardized variables, standardized medians, absolute deviations from sample medians for each group, and by-group d-values used in Van Valen’s test.

# Van Valen's test. A t-test based on absolute differences around medians from 
# standardized data
res.VanValen <- VanValen(sparrows, group = "Survivorship", level1 = "S", 
                         alternative = "less", var.equal = TRUE)
print(res.VanValen, long = TRUE)
 Comparison of variation for two multivariate samples (Van Valen's test)
 Variation measured as deviations of standardized data around medians

 Data:  sparrows 
 Variables:  Total_length Alar_extent L_beak_head L_humerus L_keel_sternum 
 Group levels: (1) S ;  (2) NS 


Standardized data for group S 
   Total_length Alar_extent L_beak_head   L_humerus L_keel_sternum
1  -0.541719129   0.7248615  0.17718246  0.05424955    -0.32937165
2  -1.089022992  -0.2617555 -1.33272023 -1.00904159    -1.23720227
3  -1.362674923  -0.2617555 -0.57776889 -0.12296564    -0.22850158
4  -1.362674923  -1.0510492 -0.70359411 -1.36347197    -0.63198186
5  -0.815371061   0.3302147  0.05135723  0.23146474    -0.53111179
6   1.373844390   1.1195083  0.68048336  0.94032550     0.07410862
7  -0.268067198  -0.6564024 -0.70359411 -0.12296564    -0.63198186
8  -0.815371061  -0.4590790  1.68708515  0.23146474     0.37671883
9   1.647496321   1.3168318  1.56125993  1.11754069     0.27584876
10  0.005584733  -0.6564024 -0.57776889  0.58589512     1.18367938
11  0.005584733  -0.2617555 -0.20029321  0.23146474     1.18367938
12  0.552888596   0.5275381 -0.45194366  0.23146474    -0.32937165
13  0.826540527   0.9221849  1.05795903  1.47197107     0.98193924
14 -0.268067198   0.7248615  0.68048336  1.11754069    -0.83372199
15 -0.268067198  -1.2483726  0.05135723 -0.65461121    -1.03546213
16 -0.541719129  -0.8537258 -0.70359411 -0.83182640    -0.53111179
17  0.005584733   0.5275381 -0.07446799  0.05424955     0.78019910
18 -1.362674923  -0.6564024 -1.20689501 -0.47739602     0.07410862
19 -0.815371061  -1.0510492 -1.45854546  0.05424955    -0.73285193
20  1.373844390   0.9221849  1.30960948  0.23146474     1.08280931
21  0.279236665  -1.0510492  0.05135723 -0.83182640     0.67932903

Medians of standardized data for group S 
  Total_length    Alar_extent    L_beak_head      L_humerus L_keel_sternum 
   -0.26806720    -0.26175555    -0.07446799     0.05424955    -0.22850158 

Standardized data for group NS 
   Total_length Alar_extent L_beak_head   L_humerus L_keel_sternum
1    -0.8153711  -0.2617555 -0.07446799 -0.83182640    -0.12763152
2    -0.5417191  -0.2617555  0.05135723 -0.47739602    -0.22850158
3     0.5528886   0.1328913  1.43543470  0.58589512     0.88106917
4    -1.6363269  -1.8403429 -1.45854546 -2.24954792    -1.03546213
5     0.5528886   1.7114786  0.30300768  0.58589512     1.68802972
6    -0.8153711  -0.8537258 -0.57776889  0.05424955    -0.83372199
7    -0.2680672   0.7248615  0.93213380  1.82640145     0.57845896
8     1.9211483   0.7248615  2.06456082  2.35804702     1.88976985
9    -1.3626749  -2.0376663 -1.71019591 -2.07233273    -1.03546213
10    1.1001925  -0.4590790 -1.45854546 -0.83182640     2.29325013
11    1.1001925   0.3302147  0.17718246  0.58589512     0.47758890
12    0.2792367   0.7248615  0.42883291  0.05424955     0.88106917
13    0.2792367   1.1195083 -0.70359411 -0.65461121    -1.84242268
14   -0.8153711   0.3302147 -0.70359411  0.05424955     0.47758890
15    1.1001925   2.1061254  0.55465813  1.11754069     1.38541951
16   -1.6363269  -2.2349897 -1.33272023 -2.07233273    -2.24590295
17    0.2792367   0.1328913 -0.82941934 -0.47739602    -0.32937165
18   -0.8153711  -0.6564024 -0.32611844 -1.00904159    -1.53981247
19    1.3738444   1.5141552  2.44203650  1.82640145     1.99063992
20    1.3738444   0.1328913 -0.57776889 -0.65461121    -0.12763152
21   -0.5417191  -0.8537258  0.30300768 -0.47739602    -0.53111179
22    0.2792367  -0.6564024  0.05135723 -0.12296564    -0.53111179
23    0.8265405   0.7248615  0.80630858  1.11754069    -0.02676145
24   -0.8153711  -1.2483726 -0.95524456 -1.36347197    -1.23720227
25    1.1001925   1.1195083  0.55465813  1.11754069    -0.43024172
26   -1.3626749  -0.8537258 -1.08106978  0.23146474    -0.43024172
27    1.1001925   0.7248615  1.30960948  0.05424955     0.27584876
28    1.6474963   1.3168318  1.05795903  0.58589512     0.07410862

Medians of standardized data for group NS 
  Total_length    Alar_extent    L_beak_head      L_humerus L_keel_sternum 
    0.27923666     0.13289128     0.05135723     0.05424955    -0.12763152 

Deviations from sample medians for standardized values in group S 
   Total_length Alar_extent L_beak_head  L_humerus L_keel_sternum
1    -0.2736519   0.9866171   0.2516504  0.0000000     -0.1008701
2    -0.8209558   0.0000000  -1.2582522 -1.0632911     -1.0087007
3    -1.0946077   0.0000000  -0.5033009 -0.1772152      0.0000000
4    -1.0946077  -0.7892937  -0.6291261 -1.4177215     -0.4034803
5    -0.5473039   0.5919702   0.1258252  0.1772152     -0.3026102
6     1.6419116   1.3812639   0.7549513  0.8860759      0.3026102
7     0.0000000  -0.3946468  -0.6291261 -0.1772152     -0.4034803
8    -0.5473039  -0.1973234   1.7615531  0.1772152      0.6052204
9     1.9155635   1.5785873   1.6357279  1.0632911      0.5043503
10    0.2736519  -0.3946468  -0.5033009  0.5316456      1.4121810
11    0.2736519   0.0000000  -0.1258252  0.1772152      1.4121810
12    0.8209558   0.7892937  -0.3774757  0.1772152     -0.1008701
13    1.0946077   1.1839405   1.1324270  1.4177215      1.2104408
14    0.0000000   0.9866171   0.7549513  1.0632911     -0.6052204
15    0.0000000  -0.9866171   0.1258252 -0.7088608     -0.8069605
16   -0.2736519  -0.5919702  -0.6291261 -0.8860759     -0.3026102
17    0.2736519   0.7892937   0.0000000  0.0000000      1.0087007
18   -1.0946077  -0.3946468  -1.1324270 -0.5316456      0.3026102
19   -0.5473039  -0.7892937  -1.3840775  0.0000000     -0.5043503
20    1.6419116   1.1839405   1.3840775  0.1772152      1.3113109
21    0.5473039  -0.7892937   0.1258252 -0.8860759      0.9078306

Deviations from sample medians for standardized values in group NS 
   Total_length Alar_extent L_beak_head  L_humerus L_keel_sternum
1    -1.0946077  -0.3946468  -0.1258252 -0.8860759      0.0000000
2    -0.8209558  -0.3946468   0.0000000 -0.5316456     -0.1008701
3     0.2736519   0.0000000   1.3840775  0.5316456      1.0087007
4    -1.9155635  -1.9732341  -1.5099027 -2.3037975     -0.9078306
5     0.2736519   1.5785873   0.2516504  0.5316456      1.8156612
6    -1.0946077  -0.9866171  -0.6291261  0.0000000     -0.7060905
7    -0.5473039   0.5919702   0.8807766  1.7721519      0.7060905
8     1.6419116   0.5919702   2.0132036  2.3037975      2.0174014
9    -1.6419116  -2.1705575  -1.7615531 -2.1265823     -0.9078306
10    0.8209558  -0.5919702  -1.5099027 -0.8860759      2.4208816
11    0.8209558   0.1973234   0.1258252  0.5316456      0.6052204
12    0.0000000   0.5919702   0.3774757  0.0000000      1.0087007
13    0.0000000   0.9866171  -0.7549513 -0.7088608     -1.7147912
14   -1.0946077   0.1973234  -0.7549513  0.0000000      0.6052204
15    0.8209558   1.9732341   0.5033009  1.0632911      1.5130510
16   -1.9155635  -2.3678810  -1.3840775 -2.1265823     -2.1182714
17    0.0000000   0.0000000  -0.8807766 -0.5316456     -0.2017401
18   -1.0946077  -0.7892937  -0.3774757 -1.0632911     -1.4121810
19    1.0946077   1.3812639   2.3906793  1.7721519      2.1182714
20    1.0946077   0.0000000  -0.6291261 -0.7088608      0.0000000
21   -0.8209558  -0.9866171   0.2516504 -0.5316456     -0.4034803
22    0.0000000  -0.7892937   0.0000000 -0.1772152     -0.4034803
23    0.5473039   0.5919702   0.7549513  1.0632911      0.1008701
24   -1.0946077  -1.3812639  -1.0066018 -1.4177215     -1.1095708
25    0.8209558   0.9866171   0.5033009  1.0632911     -0.3026102
26   -1.6419116  -0.9866171  -1.1324270  0.1772152     -0.3026102
27    0.8209558   0.5919702   1.2582522  0.0000000      0.4034803
28    1.3682597   1.1839405   1.0066018  0.5316456      0.2017401

d's computed from standardized values around the median for group S 
   Survivorship       dij
1             S 1.0591512
2             S 2.0988645
3             S 1.2177369
4             S 2.0951565
5             S 0.8881331
6             S 2.4597599
7             S 0.8635666
8             S 1.9593990
9             S 3.1971682
10            S 1.6615792
11            S 1.4547775
12            S 1.2169720
13            S 2.7124479
14            S 1.7436297
15            S 1.4638696
16            S 1.3030032
17            S 1.3097125
18            S 1.7350859
19            S 1.7585692
20            S 2.7864315
21            S 1.5961344

Mean of d's for group S : 1.741959
Variance of d's for group S : 0.4024835 

d's computed from standardized values around the median for group NS 
   Survivorship       dij
1            NS 1.4679492
2            NS 1.0594981
3            NS 1.8140231
4            NS 3.9968090
5            NS 2.4918716
6            NS 1.7509834
7            NS 2.2505163
8            NS 4.0591480
9            NS 3.9820562
10           NS 3.1543624
11           NS 1.1737443
12           NS 1.2289808
13           NS 2.2330152
14           NS 1.4742272
15           NS 2.8706871
16           NS 4.4945901
17           NS 1.0483861
18           NS 2.2557663
19           NS 4.0557366
20           NS 1.4479121
21           NS 1.4683845
22           NS 0.9039834
23           NS 1.5364520
24           NS 2.7130029
25           NS 1.7671600
26           NS 2.2526997
27           NS 1.6644495
28           NS 2.1471942

Mean of d's for group NS : 2.241557
Variance of d's for group NS : 1.110142 

 Van Valen's test based on a t-test of d-values
 t = -1.9241 , df = 47 , p-value = 0.0302 
 Alternative hypothesis: true difference in means is less than 0 

Comparison of Means for Several Samples

The Single-Variable Case: One-factor ANOVA

The comparison of several samples (classified by a single factor) for a single variable is customarily performed using one-factor or one-way ANOVA, a procedure focused on testing the hypothesis that all samples came from populations with the same mean. The code below gives the one-factor analysis of variance for maximum breadth of Egyptian skulls, applied to the comparison of periods (Period is the single factor here); see Section 4.8.1 of MSMAP5 for more details. The analysis of variance table is obtained using the summary method of aov, the basic function in the stats package useful to fit an analysis of variance model.

# One-factor ANOVA tests: comparing univariate means
# Variable: Maximum_breadth
library("smsets")
skulls.aovMB <- aov(Maximum_breadth ~ Period, data = skulls)
summary(skulls.aovMB)
             Df Sum Sq Mean Sq F value   Pr(>F)    
Period        4  502.8  125.71   5.955 0.000183 ***
Residuals   145 3061.1   21.11                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Similar code can be written to perform ANOVAs for the other three morphological variables.

The Multivariate Case: One-factor MANOVA

For the case of several variables and one single factor determining two or more samples, the procedure called “One-Factor Multivariate Analysis of Variance or One-factor/One-way MANOVA was described in Section 4.7. Four statistics were defined to test the hypothesis that all samples came from populations with the same mean vector: Wilks’ lambda, Roy’s largest root, Pillai’s trace and Lawley-Hotelling trace. The calculation of these statistics in R are made by the manova function, an extension of the aov function with the capacity of handling matrix operations involved in the MANOVA. The summary method for manova determines the particular test given as output, being Pillai’s trace the default test statistic.

The manova and summary functions applied to the comparison of samples of Egyptian skulls using Wilks’ lambda is:

# One-factor MANOVA: comparing multivariate means
skulls.mnv <- manova(as.matrix(skulls[, -1]) ~ Period, data = skulls)
# Approximate F-test after the one-factor MANOVA
summary(skulls.mnv, test="Wilks")
           Df   Wilks approx F num Df den Df   Pr(>F)    
Period      4 0.66359   3.9009     16 434.45 7.01e-07 ***
Residuals 145                                            
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The smsets package implements the convenience function MANOVA.mat which optionally displays extra information to the tests offered by the manova function. The following chunk of code tests the difference between periods for the skulls data with respect to their multivariate means based on Pillai’s trace, by calling MANOVA.mat function. The print method of the object produced by this function, indicates that a long output is wanted.

res.MANOVA <- OnewayMANOVA(skulls, group = Period)
print(res.MANOVA, long = TRUE)
 One-factor Multivariate Analysis of Variance with extra matrix info

 Data: skulls 
 Variables: Maximum_breadth Basibregmatic_height Basialveolar_length Nasal_height 
 Factor: Period 
 Levels: 12th and 13th Dynasty Early predynastic Late predynastic Ptolemaic period Roman period 

Between-Sample Sum of Squares and Cross Products Matrix, B
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth             502.8267           -228.14667           -626.6267
Basibregmatic_height       -228.1467            229.90667            292.2800
Basialveolar_length        -626.6267            292.28000            803.2933
Nasal_height                135.4333            -66.06667           -180.7333
                     Nasal_height
Maximum_breadth         135.43333
Basibregmatic_height    -66.06667
Basialveolar_length    -180.73333
Nasal_height             61.20000

Within-Sample Sum of Squares and Cross Products Matrix, W
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth          3061.066667             5.333333            11.46667
Basibregmatic_height        5.333333          3405.266667           754.00000
Basialveolar_length        11.466667           754.000000          3505.96667
Nasal_height              291.300000           412.533333           164.33333
                     Nasal_height
Maximum_breadth          291.3000
Basibregmatic_height     412.5333
Basialveolar_length      164.3333
Nasal_height            1472.1333

Total Sum of Squares and Cross Products Matrix, T
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth            3563.8933            -222.8133             -615.16
Basibregmatic_height       -222.8133            3635.1733             1046.28
Basialveolar_length        -615.1600            1046.2800             4309.26
Nasal_height                426.7333             346.4667              -16.40
                     Nasal_height
Maximum_breadth          426.7333
Basibregmatic_height     346.4667
Basialveolar_length      -16.4000
Nasal_height            1533.3333

                      One-Way MANOVA
           Df  Pillai approx F num Df den Df    Pr(>F)    
Period      4 0.35331    3.512     16    580 4.675e-06 ***
Residuals 145                                             
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparison of Variation for Several Samples: The Multivariate Case

Testing the equality of several covariance matrices: Box’s M test

Box’s M test was described in Section 4.8 of MSMAP5 as one of the best multivariate method known for comparing the variation in several samples. Box’s M test applied to the Egyptian skulls data, using function boxM from package biotools (da Silva et al., 2017; da Silva, 2025) is shown below. This function produces an approximate chi-square statistic for M.

library(biotools)
Loading required package: MASS
---
biotools version 4.3
groups <- skulls[, 1] # The grouping variable is located in the 1st column 
vars <- skulls[, -1]  # The y-variables are not located in the 1st column
# Producing the chi-square test of homogeneity of variance-covariance matrices
chitest.boxM <- boxM(vars, groups)
chitest.boxM

    Box's M-test for Homogeneity of Covariance Matrices

data:  vars
Chi-Sq (approx.) = 45.667, df = 40, p-value = 0.2483

Alternatively, function BoxM.F in the smsets package can be accessed to perform again Box’s M test but now following the procedure described in Section 4.8 (an F approximation). Covariance matrices are also shown, as a result of the option long = TRUE added to the print method of BoxM.F.

resBoxM.F <- BoxM.F(skulls, Period)
print(resBoxM.F, long = TRUE)
 Box's M-test for Homogeneity of Covariance Matrices (F approximation)

 Data: skulls 
 Variables: Maximum_breadth Basibregmatic_height Basialveolar_length Nasal_height 
 Factor: Period 
 Levels: Early predynastic Late predynastic 12th and 13th Dynasty Ptolemaic period Roman period 

Covariance matrix for each group
Early predynastic 
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth            26.309195            4.1517241           0.4540230
Basibregmatic_height        4.151724           19.9724138          -0.7931034
Basialveolar_length         0.454023           -0.7931034          34.6264368
Nasal_height                7.245977            0.3931034          -1.9195402
                     Nasal_height
Maximum_breadth         7.2459770
Basibregmatic_height    0.3931034
Basialveolar_length    -1.9195402
Nasal_height            7.6367816

Late predynastic 
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth            23.136782             1.010345           4.7678161
Basibregmatic_height        1.010345            21.596552           3.3655172
Basialveolar_length         4.767816             3.365517          18.8919540
Nasal_height                1.842529             5.624138           0.1908046
                     Nasal_height
Maximum_breadth         1.8425287
Basibregmatic_height    5.6241379
Basialveolar_length     0.1908046
Nasal_height            8.7367816

12th and 13th Dynasty 
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth           12.1195402           0.78620690          -0.7747126
Basibregmatic_height       0.7862069          24.78620690           3.5931034
Basialveolar_length       -0.7747126           3.59310345          20.7229885
Nasal_height               0.8988506          -0.08965517           1.6701149
                     Nasal_height
Maximum_breadth        0.89885057
Basibregmatic_height  -0.08965517
Basialveolar_length    1.67011494
Nasal_height          12.59885057

Ptolemaic period 
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth            15.362069            -5.534483           -2.172414
Basibregmatic_height       -5.534483            26.355172            8.110345
Basialveolar_length        -2.172414             8.110345           21.085057
Nasal_height                2.051724             6.148276            5.328736
                     Nasal_height
Maximum_breadth          2.051724
Basibregmatic_height     6.148276
Basialveolar_length      5.328736
Nasal_height             7.964368

Roman period 
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth           28.6264368           -0.2298851          -1.8793103
Basibregmatic_height      -0.2298851           24.7126437          11.7241379
Basialveolar_length       -1.8793103           11.7241379          25.5689655
Nasal_height              -1.9942529            2.1494253           0.3965517
                     Nasal_height
Maximum_breadth        -1.9942529
Basibregmatic_height    2.1494253
Basialveolar_length     0.3965517
Nasal_height           13.8264368

Pooled Covariance Matrix
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth          21.11080460           0.03678161          0.07908046
Basibregmatic_height      0.03678161          23.48459770          5.20000000
Basialveolar_length       0.07908046           5.20000000         24.17908046
Nasal_height              2.00896552           2.84505747          1.13333333
                     Nasal_height
Maximum_breadth          2.008966
Basibregmatic_height     2.845057
Basialveolar_length      1.133333
Nasal_height            10.152644

 Box's M = 2.8725e-11 
 F statistic = 1.1406 , Num df = 40.0 , Den df = 46378.7 , p-value = 0.2498

Differences between the p-values for the two approximations, chi square and F, are negligible.

Robust m-sample Levene’s test, based on absolute deviations around medians

In Section 4.8 in Manly et al. (2024) it is pointed out that Box’s test is sensitive to deviations from normality thus, robust alternatives to Box’s test are recommended. One of the those alternatives is Levene’s test based on absolute deviations from sample medians for the data in m samples. For a single variable, these can be treated as the observations for a one-factor analysis of variance (or one-way ANOVA). This case was illustrated in Section 3.1.1 of this vignette, using leveneTest function from the car package (Fox & Weisberg, 2019). A significant F-ratio in any of the p variables considered individually indicates that populations differ in covariance matrices. When all variables are considered together, the multivariate Levene’s test version can generated with any of the four tests produced by manova or OneWayMANOVA in smsets, and a significant result indicates that the covariance matrix is not constant for the m populations.

The application of Levene’s test to one variable at a time involves the application of p one-way ANOVAs for the absolute deviations around medians. Therefore, the p-values can be subject to corrections for multiple testing, depending on the number of response variables analyzed, p. Levenetestsms.mv allows to specify adjustments to p-values (as described in 1.1.2 and 3.1.2.) with the argument P.adjust. The code shown below produces Levene’s tests, both univariate and multivariate, for the comparison of variation among the m=5 samples from various past ages on the basis of p = 4 measurements on male Egyptian skulls. The p-values of each univariate Levene’s tests have been adjusted using Bonferroni’s correction, and the implicit one-way ANOVAs assume that the variance of absolute deviations around medians are equal. The option format = TRUE in the print method for the object res.Levenems.mv establishes that the ANOVA tables will be displayed using the output format produced by both oneway.test and anova.lm. The print method for Levenetestsms.mv determines that EffectSize = TRUE thus, by default, four different effect size measures among samples for each variable are computed. These measures indicate the magnitude of differences in variation among Periods for each measurement made. The option multivariate = TRUE dictates that the multivariate Levene’s test is requested, using Pillai’s statistic as the (default) test statistic for the MANOVA. Finally, as a consequence of the option long = TRUE, the output listing also contains the matrix of medians for the original measurements, and the matrix of means and variances of absolute deviations around medians.

data(skulls)
res.Levenems.mv <- Levenetestsms.mv(skulls, Period, var.equal = TRUE,
                                    P.adjust = "bonferroni")
print(res.Levenems.mv, format = "both", multivariate = TRUE, long = TRUE)
           m-sample Levene's tests for homogeneity of variance in multivariate data
             Testing variation using F-tests via absolute deviations from medians

Data:  skulls 
Variables:  Maximum_breadth Basibregmatic_height Basialveolar_length Nasal_height 
Group levels:  Early predynastic Late predynastic 12th and 13th Dynasty Ptolemaic period Roman period 

Summary statistics
Matrix of medians for the original variables
                      Maximum_breadth Basibregmatic_height Basialveolar_length
Early predynastic                 131                  134                 100
Late predynastic                  132                  133                  98
12th and 13th Dynasty             136                  134                  96
Ptolemaic period                  135                  132                  94
Roman period                      137                  130                  94
                      Nasal_height
Early predynastic               50
Late predynastic                50
12th and 13th Dynasty           50
Ptolemaic period                52
Roman period                    52
 
Matrix of mean absolute deviations around medians
                      Maximum_breadth Basibregmatic_height Basialveolar_length
Early predynastic                 4.0                  3.3                 4.6
Late predynastic                  4.0                  3.4                 4.7
12th and 13th Dynasty             5.6                  3.3                 5.3
Ptolemaic period                  5.0                  3.7                 6.0
Roman period                      6.3                  4.7                 6.2
                      Nasal_height
Early predynastic              2.1
Late predynastic               2.2
12th and 13th Dynasty          2.1
Ptolemaic period               2.6
Roman period                   2.6
 
Matrix of variances of absolute deviations around medians
                      Maximum_breadth Basibregmatic_height Basialveolar_length
Early predynastic                10.2                  9.1                  14
Late predynastic                  9.9                  8.4                  12
12th and 13th Dynasty            15.7                  8.5                  16
Ptolemaic period                 13.8                  8.7                  20
Roman period                     18.1                 10.2                  22
                      Nasal_height
Early predynastic              3.2
Late predynastic               2.8
12th and 13th Dynasty          3.2
Ptolemaic period               2.9
Roman period                   2.9

Individual Levene's tests, one for each variable.
P-values adjusted using Bonferroni method.

Format: "oneway.test"

    Levene's test for Homogeneity of Variance (center = median)

data:  Maximum_breadth and Period
F = 2.2715, num df = 4, denom df = 145, p-value = 0.2574


    Levene's test for Homogeneity of Variance (center = median)

data:  Basibregmatic_height and Period
F = 1.2357, num df = 4, denom df = 145, p-value = 1


    Levene's test for Homogeneity of Variance (center = median)

data:  Basialveolar_length and Period
F = 0.96275, num df = 4, denom df = 145, p-value = 1


    Levene's test for Homogeneity of Variance (center = median)

data:  Nasal_height and Period
F = 0.62588, num df = 4, denom df = 145, p-value = 1


Format: "anova.lm" 

Levene's test for Homogeneity of Variance (center = median)

Response: Maximum_breadth
           Df  Sum Sq Mean Sq F value Pr(>F)
Period      4  122.83  30.707  2.2715 0.2574
Residuals 145 1960.17  13.518               


Levene's test for Homogeneity of Variance (center = median)

Response: Basibregmatic_height
           Df  Sum Sq Mean Sq F value Pr(>F)
Period      4   44.37 11.0933  1.2357      1
Residuals 145 1301.77  8.9777               


Levene's test for Homogeneity of Variance (center = median)

Response: Basialveolar_length
           Df  Sum Sq Mean Sq F value Pr(>F)
Period      4   64.69  16.173  0.9628      1
Residuals 145 2435.87  16.799               


Levene's test for Homogeneity of Variance (center = median)

Response: Nasal_height
           Df Sum Sq Mean Sq F value Pr(>F)
Period      4   7.49  1.8733  0.6259      1
Residuals 145 434.00  2.9931               



Effect size (E.S.) of variation for each variable.
For one-way between subjects designs, partial eta/omega/epsilon squared are
equivalent to eta/omega/epsilon squared. Returning eta/omega/epsilon squared.

Response: Maximum_breadth 
          E.S. Measure 95%-LCL 95%-UCL
eta^2            0.059       0       1
omega^2          0.033       0       1
epsilon^2        0.033       0       1
Cohen's f        0.250       0     Inf

Response: Basibregmatic_height 
          E.S. Measure 95%-LCL 95%-UCL
eta^2            0.033       0       1
omega^2          0.006       0       1
epsilon^2        0.006       0       1
Cohen's f        0.185       0     Inf

Response: Basialveolar_length 
          E.S. Measure 95%-LCL 95%-UCL
eta^2            0.026       0       1
omega^2          0.000       0       1
epsilon^2        0.000       0       1
Cohen's f        0.163       0     Inf

Response: Nasal_height 
          E.S. Measure 95%-LCL 95%-UCL
eta^2            0.017       0       1
omega^2          0.000       0       1
epsilon^2        0.000       0       1
Cohen's f        0.131       0     Inf


Multivariate Levene's test

           Df  Pillai approx F num Df den Df Pr(>F)
Period      4 0.14555   1.3688     16    580 0.1512
Residuals 145                                      

Extra function: Penrose.dist

Penrose.dist in the smsets package returns Penrose’s (1953) distances between m multivariate populations, when information is available on the means and variances only. This function is described in Chapter 5 of MSMAP5.

Let the mean of \(X_k\) in population i be \(\mu_{ki}\), \(k=1,..., p\); \(i=1,...,m\), and assume that the variance of variable \(X_k\) is \(V_k\). The Penrose (1953) distance \(P_{ij}\) between population i and population j is given by

\[P_{ij} = \sum_{k = 1}^{p}\frac{(\mu_{ki} - \mu_{kj})^2}{p \cdot V_k}\]

A disadvantage of Penrose’s measure is that it does not consider the correlations between the p variables.

Penrose’s distances between Periods for the skulls data are displayed below, along with sample sizes, the mean vector for each Period, the covariance matrix for each Period, and the pooled covariance matrix.

res.Penrose <- Penrose.dist(x = skulls, group = Period)
# Long output
print(res.Penrose, long = TRUE)
                     Calculation of Penrose distances

 Data: skulls 
 Variables: Maximum_breadth Basibregmatic_height Basialveolar_length Nasal_height 
 Factor: Period 
 Levels: Early predynastic Late predynastic 12th and 13th Dynasty Ptolemaic period Roman period 

Population/Sample sizes
Period
    Early predynastic      Late predynastic 12th and 13th Dynasty 
                   30                    30                    30 
     Ptolemaic period          Roman period 
                   30                    30 

Mean vectors
                     Early predynastic Late predynastic 12th and 13th Dynasty
Maximum_breadth                 131.37           132.37                134.47
Basibregmatic_height            133.60           132.70                133.80
Basialveolar_length              99.17            99.07                 96.03
Nasal_height                     50.53            50.23                 50.57
                     Ptolemaic period Roman period
Maximum_breadth                135.50       136.17
Basibregmatic_height           132.30       130.33
Basialveolar_length             94.53        93.50
Nasal_height                    51.97        51.37

Covariance matrices
$`Early predynastic`
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                26.31                 4.15                0.45
Basibregmatic_height            4.15                19.97               -0.79
Basialveolar_length             0.45                -0.79               34.63
Nasal_height                    7.25                 0.39               -1.92
                     Nasal_height
Maximum_breadth              7.25
Basibregmatic_height         0.39
Basialveolar_length         -1.92
Nasal_height                 7.64

$`Late predynastic`
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                23.14                 1.01                4.77
Basibregmatic_height            1.01                21.60                3.37
Basialveolar_length             4.77                 3.37               18.89
Nasal_height                    1.84                 5.62                0.19
                     Nasal_height
Maximum_breadth              1.84
Basibregmatic_height         5.62
Basialveolar_length          0.19
Nasal_height                 8.74

$`12th and 13th Dynasty`
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                12.12                 0.79               -0.77
Basibregmatic_height            0.79                24.79                3.59
Basialveolar_length            -0.77                 3.59               20.72
Nasal_height                    0.90                -0.09                1.67
                     Nasal_height
Maximum_breadth              0.90
Basibregmatic_height        -0.09
Basialveolar_length          1.67
Nasal_height                12.60

$`Ptolemaic period`
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                15.36                -5.53               -2.17
Basibregmatic_height           -5.53                26.36                8.11
Basialveolar_length            -2.17                 8.11               21.09
Nasal_height                    2.05                 6.15                5.33
                     Nasal_height
Maximum_breadth              2.05
Basibregmatic_height         6.15
Basialveolar_length          5.33
Nasal_height                 7.96

$`Roman period`
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                28.63                -0.23               -1.88
Basibregmatic_height           -0.23                24.71               11.72
Basialveolar_length            -1.88                11.72               25.57
Nasal_height                   -1.99                 2.15                0.40
                     Nasal_height
Maximum_breadth             -1.99
Basibregmatic_height         2.15
Basialveolar_length          0.40
Nasal_height                13.83

Pooled covariance matrix
                     Maximum_breadth Basibregmatic_height Basialveolar_length
Maximum_breadth                21.11                 0.04                0.08
Basibregmatic_height            0.04                23.48                5.20
Basialveolar_length             0.08                 5.20               24.18
Nasal_height                    2.01                 2.85                1.13
                     Nasal_height
Maximum_breadth              2.01
Basibregmatic_height         2.85
Basialveolar_length          1.13
Nasal_height                10.15

Penrose distances
                      Early predynastic Late predynastic 12th and 13th Dynasty
Late predynastic                  0.023                                       
12th and 13th Dynasty             0.216            0.163                      
Ptolemaic period                  0.493            0.404                 0.108
Roman period                      0.736            0.583                 0.244
                      Ptolemaic period
Late predynastic                      
12th and 13th Dynasty                 
Ptolemaic period                      
Roman period                     0.066

References

Curran, J., & Hersh, T. (2021). Hotelling: Hotelling’s t^2 test and variants. https://doi.org/10.32614/CRAN.package.Hotelling
da Silva, A. R., Malafaia, G., & Menezes, I. P. P. de. (2017). biotools: Tools for biometry and applied statistics in agricultural science. Genetics and Molecular Research, 16. https://doi.org/10.4238/gmr16029655
da Silva, A. R. (2025). biotools: Tools for biometry and applied statistics in agricultural science. https://CRAN.R-project.org/package=biotools
Fox, J., & Weisberg, S. (2019). An R companion to applied regression (Third Edition). Sage. https://www.john-fox.ca/Companion/index.html
Manly, B. F. J., Navarro Alberto, J. A., & Gerow, K. (2024). Multivariate statistical methods. A primer (Fifth Edition). CRC Press. https://www.routledge.com/Multivariate-Statistical-Methods-A-Primer/Manly-NavarroAlberto-Gerow/p/book/9781032591971
Penrose, L. W. (1953). Distance, size and shape. Annals of Eugenics, 18, 337–343.
Valen, van. (1978). The statistics of variation. Evolutionary Theory, 4, 33–43. (Erratum Evolutionary Theory 4:202).