Tuesday, May 22, 2012

Calculate interobserver agreement

I became a member of ResearchGate (http://www.researchgate.net/home.Home.html). Below is one of the questions, together with the answer I formulated.

Which is the best way to calculate interobserver agreement for behavioral observations? (lizard stress study)
A colleague and I performed a study with lizards, in which we subjected them to 4 different types of stress (cold, heat, low-frequency noise and high-frequency noise). We have videos of the behaviors they expressed during the experiment (flicking, head turns and so on). Now, to begin the analysis of the videos, we need to make sure that our observations are more or less the same, so we can exclude differences due to observer bias.

We have agreed on the behaviors that we are recording; some of them are frequencies of events, while others are durations of events. So far, we have the data of a section of our recordings that we each analyzed separately, and now we need to show statistically that the data each of us produced show no meaningful differences. Is there any statistical method you recommend?



There is a difference between assessing association and assessing agreement between observers: two observers whose scores are perfectly correlated may still disagree systematically (for instance, when one of them consistently records higher counts), so a correlation coefficient by itself is not evidence of agreement.
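To make the distinction concrete, here is a minimal numerical sketch (in Python, with made-up flick counts): two observers whose counts are perfectly associated, yet who never agree on a single value.

import numpy as np

# Hypothetical flick counts for six lizards, scored by two observers.
# Observer B systematically records five events more than observer A.
obs_a = np.array([2, 4, 6, 8, 10, 12])
obs_b = obs_a + 5

# Association: the Pearson correlation between the observers is perfect ...
r = np.corrcoef(obs_a, obs_b)[0, 1]
print(f"Pearson r = {r:.2f}")  # prints 1.00

# ... yet the two observers never agree on a single count.
print("Exact agreements:", int(np.sum(obs_a == obs_b)), "out of", len(obs_a))

An agreement measure such as the ICC for absolute agreement penalises this systematic offset; a correlation coefficient does not.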

If the variable for which you wish to calculate the agreement between observers:

a) is continuous (or ordinal with more than 5 values), things are easiest: you can use variance component analysis and calculate the intraclass correlation coefficient (ICC), possibly corrected for any background factors (see Snijders & Bosker, below). Use the VARCOMP procedure in SPSS (or a similar procedure in R); an illustrative sketch of the calculation follows this list.

b) is dichotomous or categorical, you can use Cohen's kappa. Kappa can be calculated in SPSS with the RELIABILITY procedure (the kappa sketch after this list illustrates this case and the next).

c) is ordinal (with fewer than five values): use weighted kappa (the weights reflect the off-diagonal distance in an observer × observer crosstable for the item to be assessed). This can also be done in SPSS. But even with fewer than five options you can still apply variance component analysis as in a); in fact, the quadratically weighted kappa is equivalent to the ICC.
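For a), the calculation is straightforward once the mean squares of the two-way layout (subjects x observers) are available. As an illustration only, here is a minimal Python/NumPy sketch of the single-rater ICC for absolute agreement, ICC(2,1) in the Shrout & Fleiss notation; the durations in the example matrix are invented.

import numpy as np

def icc_2_1(ratings):
    """Single-rater ICC for absolute agreement, ICC(2,1) in Shrout & Fleiss (1979).

    ratings: an (n_subjects x n_observers) array with the scores of each
    observer on a continuous (or many-valued ordinal) variable.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Two-way ANOVA mean squares (subjects x observers, one score per cell).
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # observers
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical durations (seconds of head turning) scored by two observers.
ratings = np.array([[12.0, 13.5],
                    [ 4.0,  4.5],
                    [20.0, 21.0],
                    [ 9.5, 10.0],
                    [15.0, 16.5],
                    [ 7.0,  7.5]])
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")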
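The kappa calculations for b) and c) can be sketched in the same way. This example uses scikit-learn's cohen_kappa_score, which offers the plain kappa as well as linearly and quadratically weighted versions; the category codes are invented.

import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal codes (0 = no flick, 1 = single flick, 2 = repeated
# flicking) assigned to the same twelve video segments by two observers.
codes_a = np.array([0, 1, 2, 2, 1, 0, 1, 2, 0, 1, 2, 1])
codes_b = np.array([0, 1, 2, 1, 1, 0, 1, 2, 0, 2, 2, 1])

# b) nominal categories: plain (unweighted) Cohen's kappa.
print(f"Cohen's kappa                = {cohen_kappa_score(codes_a, codes_b):.2f}")

# c) ordinal categories with few values: weighted kappa. The weights penalise
# disagreements by their distance in the observer x observer crosstable.
print(f"Linearly weighted kappa      = {cohen_kappa_score(codes_a, codes_b, weights='linear'):.2f}")
print(f"Quadratically weighted kappa = {cohen_kappa_score(codes_a, codes_b, weights='quadratic'):.2f}")

If you treat these codes as numbers and feed them to the ICC sketch above, the result should come out close to the quadratically weighted kappa; that is the equivalence mentioned under c).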

For b) and c) there is also a commercially available program called AGREE developed by Popping.

a) can also be done using multilevel analysis (cf. MLwiN), but that requires some extra skills.
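As an impression of what the multilevel route looks like, here is a sketch with statsmodels (a Python mixed-model library) instead of MLwiN; the column names and values are invented. A random intercept per lizard captures the between-subject variance, a fixed observer effect absorbs any systematic difference between the observers, and the ICC is the subject variance as a share of subject plus residual variance.

import pandas as pd
import statsmodels.formula.api as smf

# Long-format data: one row per (lizard, observer) score; all values hypothetical.
df = pd.DataFrame({
    "lizard":   [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "observer": ["A", "B"] * 6,
    "duration": [12.0, 13.5, 4.0, 4.5, 20.0, 21.0,
                 9.5, 10.0, 15.0, 16.5, 7.0, 7.5],
})

# Random intercept per lizard; observer as a fixed effect.
model = smf.mixedlm("duration ~ C(observer)", df, groups=df["lizard"])
result = model.fit()

var_subject = float(result.cov_re.iloc[0, 0])   # between-lizard variance
var_resid = float(result.scale)                 # residual variance
icc = var_subject / (var_subject + var_resid)
print(f"ICC from variance components = {icc:.2f}")

Observers are treated as fixed here; a fully crossed model with random observer effects, as MLwiN allows, takes a bit more work.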

All of the above can be found in section 17.8 (page 453) of Adèr, Mellenbergh & Hand (2008), which also gives the appropriate references. The book is indexed in Google Books; use the book's website to consult it on this topic: www.jvank.nl/ARMHome
Herman Adèr

Adèr, H. J., Mellenbergh, G. J. & Hand, D. J. (2008). Advising on research methods: A consultant's companion. Huizen, The Netherlands: van Kessel.

Snijders, T.A.B. & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage.
