Wednesday, September 24, 2014

Bootstrapping factorial ANOVA in SPSS



The following question appeared on ResearchGate:

Elisabeth Fontaine
University of Adelaide · School of Psychology

Bootstrapping factorial ANOVA in SPSS v21?

Our institution provides SPSS v21 with the Bootstrap module loaded. I want to run a factorial ANOVA (2x2) with a continuous DV that is skewed in each condition. I understand bootstrapping is a robust tool to apply to skewed data. Andy Field gives an excellent YouTube explanation of how to run the bootstrap option in SPSS and explains that it is good for skewed data, but the actual 'beer goggles' data he uses is not skewed. Perhaps I'm over-thinking it, but my question is: given that the Bootstrap option is available for multi-way ANOVAs in SPSS, does it still matter that the raw data is skewed when I run the ANOVA?

Suggested answer (Herman Adèr): 

The problem with bootstrapping a factorial ANOVA is not the distribution of the dependent variable (skewness in this case) but that ANOVA is sensitive to imbalance in the factorial design itself (unequal cell counts). Different bootstrap samples can produce very unbalanced design tables, which in turn produce biased F-tests.
Instead, consider a bootstrapped least-squares regression analysis of the model:
y = const + factor1 + factor2 + factor1 * factor2 + epsilon
in which factor1 and factor2 are dummy variables. Regression analysis is not sensitive to the variance-covariance matrix within the cells and is therefore less sensitive to unbalanced cells.
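As a rough illustration of the regression approach (outside SPSS), here is a minimal Python sketch using only the standard library. It relies on one fact: with 0/1 dummies the 2x2 model above is saturated, so the least-squares coefficients are simple contrasts of the four cell means. The function names fit_2x2 and bootstrap_ci, the made-up skewed data, the case-resampling scheme and the percentile interval are all assumptions chosen for illustration, not something taken from the original thread.

```python
import random
import statistics


def fit_2x2(rows):
    """Least-squares fit of y = const + f1 + f2 + f1*f2 with 0/1 dummies.

    rows is a list of (f1, f2, y) tuples. The model is saturated, so the
    coefficients are contrasts of the four cell means. Returns None if
    any cell is empty (the model cannot be estimated then).
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for f1, f2, y in rows:
        cells[(f1, f2)].append(y)
    if any(len(values) == 0 for values in cells.values()):
        return None
    m = {key: statistics.fmean(values) for key, values in cells.items()}
    return {
        "const": m[(0, 0)],
        "factor1": m[(1, 0)] - m[(0, 0)],
        "factor2": m[(0, 1)] - m[(0, 0)],
        "interaction": m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)],
    }


def bootstrap_ci(rows, term, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for one coefficient (case resampling)."""
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < n_boot:
        # Resample whole cases with replacement; cell counts vary per draw.
        sample = rng.choices(rows, k=len(rows))
        coef = fit_2x2(sample)
        if coef is not None:  # redraw if a cell came out empty
            estimates.append(coef[term])
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up, right-skewed data: exponential noise plus additive effects.
data_rng = random.Random(0)
rows = [
    (f1, f2, data_rng.expovariate(1.0) + 0.5 * f1 + 0.3 * f2 + 1.0 * f1 * f2)
    for f1 in (0, 1)
    for f2 in (0, 1)
    for _ in range(30)
]

coef = fit_2x2(rows)
ci = bootstrap_ci(rows, "interaction")
print("interaction estimate:", coef["interaction"])
print("95% percentile bootstrap interval:", ci)
```

Resampling whole cases lets the cell counts vary from draw to draw, which is exactly the situation described above. Resampling within each cell would keep the design balanced, at the cost of treating the cell sizes as fixed.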
Maybe do both analyses and compare the results.

Another comment/answer (Noel Artiles-Leon, University of Puerto Rico at Mayagüez):
Before jumping on the bootstrapping bandwagon, I would try applying a transformation to the dependent variable to make the residuals normally distributed. You may first try Sqrt[Y] and Log[Y]; if these transformations fail, use a Box-Cox transformation.
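To make the transformation suggestion concrete, the following Python sketch (standard library only) compares the skewness of the within-cell residuals for the raw, square-root and log scales, and picks a Box-Cox lambda by a simple grid search over the profile log-likelihood. The helper names, the lambda grid from -2 to 2 and the simulated lognormal data are assumptions made for illustration. Box-Cox requires strictly positive values of Y.

```python
import math
import random
import statistics


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y."""
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def residuals(cells):
    """Within-cell residuals: each value minus its own cell mean."""
    out = []
    for values in cells.values():
        mean = statistics.fmean(values)
        out.extend(v - mean for v in values)
    return out


def skewness(xs):
    """Moment-based sample skewness (0 for a symmetric sample)."""
    mean = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mean) ** 2 for x in xs])
    m3 = statistics.fmean([(x - mean) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def profile_loglik(cells, lam):
    """Box-Cox profile log-likelihood, up to an additive constant."""
    transformed = {k: [box_cox(v, lam) for v in vals] for k, vals in cells.items()}
    res = residuals(transformed)
    n = len(res)
    sigma2 = sum(r * r for r in res) / n
    log_sum = sum(math.log(v) for vals in cells.values() for v in vals)
    return -0.5 * n * math.log(sigma2) + (lam - 1.0) * log_sum


def best_lambda(cells, grid=None):
    """Grid search for the lambda that maximises the profile log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_loglik(cells, lam))


# Simulated 2x2 design with lognormal (right-skewed) data in every cell.
rng = random.Random(42)
cells = {
    (f1, f2): [rng.lognormvariate(0.2 * f1 + 0.1 * f2, 0.6) for _ in range(50)]
    for f1 in (0, 1)
    for f2 in (0, 1)
}

skew_raw = skewness(residuals(cells))
skew_sqrt = skewness(residuals({k: [math.sqrt(v) for v in vals] for k, vals in cells.items()}))
skew_log = skewness(residuals({k: [math.log(v) for v in vals] for k, vals in cells.items()}))
lam = best_lambda(cells)

print("residual skewness (raw, sqrt, log):", skew_raw, skew_sqrt, skew_log)
print("Box-Cox lambda from grid search:", lam)
```

Skewness is only one rough check of normality. A Q-Q plot of the residuals on each scale would be a sensible complement.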

Comment to this:
If you distrust bootstrapping an ANOVA, use regression analysis, with or without bootstrapping.
Noel's transformation suggestion has the disadvantage of making the results harder to interpret, unless the transformation has a proper substantive interpretation, as it does for reaction times. If you have count data, you should use Poisson regression anyway.
Personally, I prefer the regression solution since it avoids biased results due to the strict assumptions of ANOVA.

Herman Adèr

