Wednesday, September 24, 2014

Bootstrapping factorial ANOVA in SPSS



The following question appeared on Research Gate:

Elisabeth Fontaine
University of Adelaide · School of Psychology

Bootstrapping factorial ANOVA in SPSS v21?

Our institution provides SPSS v21 with the Bootstrap module loaded. I want to run factorial ANOVA (2x2) with a continuous DV that is skewed in each condition. I understand bootstrapping is a robust tool to apply to skewed data. Andy Field gives an excellent Youtube explanation of how to run the bootstrap option in SPSS and explains it is good for skewed data but the actual 'beer goggles' data he uses is not skewed. Perhaps I'm over-thinking it, but my question is: given the Bootstrap option is available for multi-way ANOVAs in SPSS does it therefore matter that the raw data is skewed when I run the ANOVA?

Suggested answer (Herman Adèr): 

The problem with bootstrapping a factorial ANOVA is not the distribution of the dependent variable (skewness in this case) but that ANOVA is sensitive to imbalance in the factorial design itself (unequal cell counts). Different bootstrap samples can produce very unbalanced design tables, which in turn produce biased F-tests.
Instead, consider using a (bootstrapped, least squares) regression analysis of the model:
y = const + factor1 + factor2 + factor1 * factor2 + epsilon
in which factor1 and factor2 are dummy variables. Regression analysis is not sensitive to the variance-covariance matrix within the cells and is therefore less sensitive to unequal cell counts.
Maybe do both analyses and compare the results.
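
The regression suggestion can be illustrated with a minimal Python sketch (the thread itself concerns SPSS, so this is not the procedure used there). It assumes 0/1 dummy coding and resamples whole cases with replacement. Because the 2x2 model with interaction is saturated, the least-squares coefficients reduce to contrasts of the four cell means, which keeps the example free of external libraries. The function names fit_2x2 and bootstrap_2x2, the percentile intervals, and the choice to redraw resamples that leave a cell empty are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_2x2(rows):
    """Least-squares fit of y = const + b1*f1 + b2*f2 + b12*f1*f2.

    rows is a list of (f1, f2, y) with f1 and f2 coded 0/1. The model is
    saturated, so the coefficients are contrasts of the four cell means.
    Returns None when a cell is empty (coefficients not estimable).
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for f1, f2, y in rows:
        cells[(f1, f2)].append(y)
    if any(not values for values in cells.values()):
        return None
    m = {key: mean(values) for key, values in cells.items()}
    const = m[(0, 0)]
    b1 = m[(1, 0)] - m[(0, 0)]
    b2 = m[(0, 1)] - m[(0, 0)]
    b12 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]
    return const, b1, b2, b12


def bootstrap_2x2(rows, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap intervals for (const, b1, b2, b12)."""
    if fit_2x2(rows) is None:
        raise ValueError("every cell of the 2x2 design needs at least one case")
    rng = random.Random(seed)
    n = len(rows)
    draws = []
    while len(draws) < n_boot:
        # Resample whole cases, so cell counts vary from draw to draw.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        estimate = fit_2x2(sample)
        if estimate is not None:  # assumption: redraw if a cell came up empty
            draws.append(estimate)
    intervals = []
    for j in range(4):
        column = sorted(d[j] for d in draws)
        lower = column[int(n_boot * alpha / 2)]
        upper = column[max(0, int(n_boot * (1 - alpha / 2)) - 1)]
        intervals.append((lower, upper))
    return intervals
```

Calling bootstrap_2x2(rows) on a list of (factor1, factor2, y) triples returns four (lower, upper) pairs; an interaction interval that excludes zero points to an interaction effect. Comparing these intervals with the bootstrapped ANOVA output is one way to "do both analyses and compare".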

Another comment/answer (Noel Artiles-Leon, University of Puerto Rico at Mayagüez):
Before jumping on the bootstrapping bandwagon, I would try to apply a transformation to the dependent variable to make the residuals normally distributed. You may try Sqrt[Y] and Log[Y] first; if these transformations fail, use a Box-Cox transformation.
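
To make the transformation idea concrete, here is a small Python sketch. It assumes strictly positive values and, for brevity, looks at the skewness of the values themselves; in practice one would inspect the residuals of the fitted model. The helper names and the grid search that minimises absolute skewness are illustrative assumptions: the usual way to choose the Box-Cox parameter is maximum likelihood, which this sketch does not implement.

```python
import math


def skewness(values):
    """Skewness as the third standardised moment (population form)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def box_cox(values, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam is 0.

    lam = 0.5 is the square root up to a linear rescaling.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def least_skewed_lambda(values, grid=None):
    """Grid value of lam whose transform has the smallest |skewness|.

    A crude stand-in for the maximum-likelihood choice of lam.
    """
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0, -1.9, ..., 2.0
    return min(grid, key=lambda lam: abs(skewness(box_cox(values, lam))))
```

A selected value near 0 suggests the log transformation and a value near 0.5 the square root, which matches the order of attempts suggested above.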

Comment to this:
If you distrust bootstrapping an Anova, use regression analysis (with or) without bootstrapping.
Noel's transformation suggestion has the disadvantage of making the interpretation of the results less straightforward, unless the transformation has a proper substantive interpretation like it has in the case of reaction times. If you have count data you should use Poisson regression anyway. 
Personally, I prefer the regression solution since it avoids getting biased results due to the strict assumptions of Anova.

Herman Adèr


Advising on research methods: Selected topics 2013


In 2014, a third collection of selected topics:

Advising on research methods: Selected topics 2013
Herman J. Adèr and Gideon J. Mellenbergh (Eds.)

was published by Johannes van Kessel Publishing.

Like the previous booklets, this one also has its own website:
www.jvank.nl/ARMSelected2013

All books on methodological advising can be found at:
www.jvank.nl/publishing/scientific

The 2011 and 2012 selected topics booklets are also available as e-books. The 2013 edition will be converted soon.

Herman

CONTENTS

Random or non-random assignment: What difference does it make?
by Daan R. van Renswoude
Parametric IRT models and item analysis in R
by Joost Kruis
Comparing item imputation methods in questionnaire research
by Paul Lodder
Bootstrap basics
by Abe Huijbers
Data mining: Characteristics and application to the Math Garden data
by Lisa Wijsen
Interpreting economic games
by Simon Columbus






Friday, July 26, 2013

Advising on research methods: Selected topics 2011

A publication similar to the one mentioned in the previous post appeared the year before, following the same format. The title is:

Advising on research methods: Selected topics 2011
Edited by Herman J. Adèr and Gideon J. Mellenbergh

also published by Johannes van Kessel.
The paperback edition is sold out, but an eBook/iBook version is still available. This booklet is also indexed for Google Books. For more information, see its website.

Table of contents:



From research question to statistical model
by Anja Sommavilla and Corinne Brenner
Pitfalls and payoffs in Internet sampling
by Corinne Brenner and Charlotte M. W. Gaasterland
Introduction to Computerized Adaptive Testing
by César-Reyer Vroom and Daniel A. Bannan
A critique of stepwise model selection methods
by Daniel A. Bannan and César-Reyer Vroom
A short introduction into survival analysis
by Charlotte M. W. Gaasterland and Mattis van den Bergh
On the relevance of mixed methods
by Mattis van den Bergh and Anja Sommavilla

Advising on research methods: Selected topics 2012


During last year's course 'Advising on research methods', given at the Department of psychological methods by Don Mellenbergh and myself, the participants (master's students) wrote papers on methodological topics, which were afterwards collected in a booklet entitled:

Advising on research methods: Selected topics 2012
Edited by Herman J. Adèr and Gideon J. Mellenbergh

The booklet is published by Johannes van Kessel (ISBN 978-90-79418-21-3).
Since the papers were carefully reviewed, the contributions are of high quality.
The table of contents is:


Measurement Invariance
by Jonas Dalege and Loes Kreemers
A Comparison of Classical Test Theory and Item Response Theory
by Marie K. Deserno
Unit nonresponse in Surveys
by Milou K. M. Lünnemann
Effect size: A meta-analytic perspective
by Adam Sasiadek and David Scholz
Outliers and Extreme Observations: What are they and how to handle them?
by Suzan Q. Blommestijn and Esther A. Lietaert Peerbolte
Questionable research practices and scientific fraud
by Jochem Bout




The booklet is also available as an eBook/iBook.
For more information see the website of the booklet.

Herman



 

Tuesday, May 22, 2012

Confidence Intervals for Animal Resource Selection


(Question posed and answer given on Research Gate)

What is the best method for determining the Confidence Interval of Animal Resource Selection datasets?





I am not sure what your data set looks like.
However, a very general way to obtain confidence intervals for any statistic whose probability distribution is not available is bootstrapping.
This can be done in R, the freely available, high-quality statistical package.
Regards.
Herman

And in reply to Mohammad's follow-up question about bootstrapping with the statistical package R:

Dear Mohammad,
There is a package called boot, which you have to load.
Note that using R is not straightforward, but it has excellent online documentation. So if you are not familiar with it, you will need to take some time to get acquainted.
The boot package gives you several confidence intervals. The preferred one is BCa (bias-corrected and accelerated), but it cannot always be computed. In that case, look at the others and pick one of them.
The package accompanies Davison and Hinkley's book Bootstrap Methods and Their Application (1997); the classic text on the method is Efron and Tibshirani's An Introduction to the Bootstrap (1993).
Furthermore, if you have SPSS available: nowadays it also has provisions for bootstrapping. They may be less general than those offered in R, which lets you program a function that calculates the statistic you want to bootstrap.
There is also the book by myself, Don Mellenbergh and David Hand, 'Advising on research methods: A consultant's companion' (2008), which gives a concise but clear discussion of the bootstrap. It has also been indexed for Google Books, so that you can consult it online. See: www.jvank.nl/ARMHome
Best,
Herman
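
The idea of programming a function for the statistic you want to bootstrap can be sketched in a few lines. The example below is in Python rather than R and computes only the simple percentile interval, not the BCa interval preferred above; the function name bootstrap_percentile_ci and its defaults are assumptions made for illustration.

```python
import random
from statistics import median


def bootstrap_percentile_ci(data, statistic, n_boot=5000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for any statistic.

    data is a list of observations; statistic is a function that maps a
    list of observations to a single number.
    """
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[min(n_boot - 1, int(n_boot * (1 - alpha / 2)))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical observations, used only to show the call.
    observations = [2.1, 3.4, 1.9, 8.7, 2.8, 3.1, 12.5, 2.2, 4.0, 3.3]
    print(bootstrap_percentile_ci(observations, median))
```

Any function of the sample can be passed as the statistic, for instance a selection ratio computed from used and available habitat counts, provided each element of data is one independent sampling unit.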

Calculate interobserver agreement

I became a member of Research Gate (http://www.researchgate.net/home.Home.html). Below is one of the questions, followed by the answer I gave.

Which one is the best way to calculate interobserver agreement related with behavioral observations? (lizard stress study)
A colleague and I performed a study with lizards, where we subjected them to 4 different types of stress (cold, heat, low frequency noise and high frequency noise). We have videos of the behaviors they expressed during the experiment (flicking, head turns and so on). Now, to begin the analysis of the videos, we need to make sure that our observations are more or less the same, so we can exclude differences due to observer bias.

We have agreed on the behaviors that we are recording; some of them are frequencies of events, while others deal with the duration of events. So far, we have the data for a section of our recordings that we analyzed separately, and right now we need to show statistically that the data each of us produced have no meaningful differences. Is there any statistical method you recommend?



There is a difference between assessment of association and assessment of agreement between observers.





If the variable for which you wish to calculate the agreement between observers:

a) is continuous (or ordinal with more than 5 values), things are easiest: you can use variance component analysis and calculate the intraclass correlation coefficient (ICC), possibly corrected for any background factors (see Snijders & Bosker, below). Use procedure VARCOMP in SPSS (or a similar procedure in R).

b) is dichotomous or categorical, you can use Cohen's kappa. Kappa can be calculated in SPSS with the CROSSTABS procedure (request the KAPPA statistic).

c) is ordinal (with fewer than five values): use weighted kappas (the weights reflect the off-diagonal distance in an observer × observer cross-table for the item to be assessed). This can also be done in SPSS. But even if the number of options is fewer than 5, you can also apply variance component analysis as in a). In fact, the quadratically weighted kappa is asymptotically equivalent to the ICC.

For b) and c) there is also a commercially available program called AGREE developed by Popping.

a) can also be done using multilevel analysis (cf. MLwiN), but that requires some extra skills.

All the above can be found in section 17.8 (page 453) of Adèr, Mellenbergh & Hand (2008), which also gives the appropriate references. This book is indexed for Google Books. Use the book's website to consult Google Books on this topic: www.jvank.nl/ARMHome
Herman Adèr

Adèr, H. J., Mellenbergh, G. J. & Hand, D. J. (2008). Advising on research methods: A consultant's companion. Huizen, The Netherlands: van Kessel.

Snijders, T.A.B. & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage.
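
For readers who want to see how the kappa statistics in b) and c) are computed, here is a minimal Python sketch for two observers. It is not the SPSS or AGREE implementation. The function name, the argument layout, and the use of disagreement weights based on the distance between category positions are assumptions made for illustration; the ICC for continuous measures such as durations is not covered.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories, weight="none"):
    """Cohen's kappa for two observers who rated the same items.

    categories lists the possible codes in their natural order.
    weight is "none" (plain kappa), "linear" or "quadratic"; the weighted
    variants penalise a disagreement by the distance between categories.
    """
    if weight not in ("none", "linear", "quadratic"):
        raise ValueError("weight must be 'none', 'linear' or 'quadratic'")
    if not ratings_a or len(ratings_a) != len(ratings_b):
        raise ValueError("both observers must rate the same non-empty item set")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    observed = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    row = Counter(index[a] for a in ratings_a)  # marginal counts, observer A
    col = Counter(index[b] for b in ratings_b)  # marginal counts, observer B

    def disagreement(i, j):
        if i == j:
            return 0.0
        if weight == "none":
            return 1.0
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    obs = sum(disagreement(i, j) * observed[(i, j)] / n for i, j in cells)
    exp = sum(disagreement(i, j) * row[i] * col[j] / (n * n) for i, j in cells)
    if exp == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - obs / exp
```

With only two categories the weighted and unweighted versions coincide, so the choice of weights matters only for ordinal codes with three or more levels.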

A log book or diary to track all the steps taken for a project


I became a member of Research Gate (http://www.researchgate.net/home.Home.html).
In the next posts I give some of the questions, together with the answers I formulated.

The first one is posted by Julia Law:

Diary of research steps as preface to writing up methodology
I would appreciate advice on methods used to write up a log book or diary to track all the steps I am taking for a few projects I am working on. I would like to develop a template to use for future projects also, so any suggestions as to what works and what doesn't would be appreciated. Thanks!






Your question is an interesting one. I have done some work on this problem.





But the way you formulate your question is a bit too general to answer in a satisfactory way. In particular, it is unclear in what field of research your projects are. Different disciplines have different methods to plan and specify research. For instance, in Medicine it is usual to formulate a protocol that is first thoroughly discussed by a local research committee before it is submitted to a medical ethical committee for assessment. There is also a diagramming method developed to specify a clinical trial (CONSORT statement), although I am doubtful about its usefulness. However, most medical journals require such a statement for articles that describe intended research.

I have stressed the importance of properly documenting the steps taken in data analysis in our book (See Section 15.1 in Adèr, Mellenbergh and Hand, 2008). I also proposed a special diagramming method to represent methodological knowledge (See Appendix B of the same book). But in practice projects are quite varied and it is difficult to think up a general method to formally specify research procedures, let alone to develop a template that could be generally used. But maybe your own projects are quite similar and then the above may be useful.

Finally, for my own projects I always use a program called 'Advanced Diary'. It is commercially available for a few dollars and makes it possible to log each of your projects separately.

I hope this helps.

Herman Adèr


Adèr, H. J., Mellenbergh, G. J. and Hand, D. J. (2008). Advising on research methods: A consultant's companion. Johannes van Kessel: Huizen, The Netherlands.

The book has its own website: www.jvank.nl/ARMHome that links to Google Books so that you can inspect it online.