Saturday, June 23, 2007

An alternative modelling procedure based on a strong Theory

Suppose we have a strong theory T that translates into several (for instance, three) alternative models M1, M2 and M3. Furthermore, suppose we are able to construct a research design R, based on theory T, that allows us to verify the theory. Finally, suppose we implement an experiment (for instance, a clinical trial) based on R and collect data D during this experiment.

The usual procedure would be to test whether the data D are consistent with models M1, M2 and/or M3.

But we could also proceed as follows:

Generate data according to the models M1, M2 and M3, resulting in three data sets D1, D2 and D3, and test whether each of these data sets could have come from the same population as D.
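The procedure can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the three models are hypothetical one-dimensional stand-ins (two normal distributions and an exponential), and the comparison between D and each Di uses a two-sample Kolmogorov-Smirnov statistic with a permutation p-value, which is one possible choice of test and not one prescribed by the post.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a_sorted, x) / len(a_sorted)
            - bisect_right(b_sorted, x) / len(b_sorted))
        for x in a_sorted + b_sorted
    )


def permutation_p_value(observed, simulated, n_perm, rng):
    """P-value for 'both samples come from the same population' by relabelling."""
    stat = ks_statistic(observed, simulated)
    pooled = list(observed) + list(simulated)
    n = len(observed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:n], pooled[n:]) >= stat:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical models M1, M2, M3 (assumptions for illustration only).
MODELS = {
    "M1": lambda rng: rng.gauss(0.0, 1.0),
    "M2": lambda rng: rng.gauss(1.0, 1.0),
    "M3": lambda rng: rng.expovariate(1.0),
}


def compare_models(observed, models, n_perm=200, seed=1):
    """Simulate Di from each model Mi and test it against the observed data D."""
    rng = random.Random(seed)
    results = {}
    for name, draw in models.items():
        simulated = [draw(rng) for _ in range(len(observed))]
        results[name] = permutation_p_value(observed, simulated, n_perm, rng)
    return results


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Stand-in for the experimental data D; here it is drawn to match M1.
    D = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    for name, p in compare_models(D, MODELS).items():
        print(name, round(p, 3))
```

A small p-value for a model Mi suggests that Di and D do not come from the same population. In a real application the draw functions would be replaced by simulators derived from theory T, and a multivariate test would be needed for multivariate data.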

Remarks:
  • The above is only possible if we have a strong theory T on which we can base our models beforehand.
  • A methodological advantage is that the researcher is forced to formulate his or her theoretical concepts and translate them into models before starting the experiment.
  • A second advantage is that deviations between D and Di (i = 1, 2, 3) give information both on the relationships between variables and on the influence of the underlying (possibly multivariate) distributions. This assumes that our models are based on known theoretical distributions such as the normal distribution, which is common practice.
  • A third advantage seems to be that we can test the alternative hypothesis directly.
  • This procedure cannot be combined with cross-validation (randomly splitting the data into two parts, one to find models consistent with the data and the other to test those models). In the first part, models are formulated that are consistent with (possibly multivariate) distribution violations in the data, and the same violations are present in the second part of the data, too.

Questions:

  1. Does a weak theory simply translate into a larger set of models?
  2. Simulating data based on models M1, M2 and M3 may not be trivial. Can we use procedures similar to those used in MCMC (Markov chain Monte Carlo)?
  3. Can we use a Bayesian perspective, for instance by assuming that D1, D2 and D3 are based on prior distributions for the data D?
  4. Is the above approach known and described in the 'simulation community'?
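As a tentative illustration of question 2: when a model specifies a density only up to a normalising constant, a random-walk Metropolis sampler is one standard MCMC way to generate data from it. The sketch below is a minimal version under assumptions; the function name `metropolis`, the step size, the burn-in length and the example target are all hypothetical choices, not part of the procedure described above.

```python
import math
import random


def metropolis(log_density, start, n_samples, step=1.0, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density returns the log of the (possibly unnormalised) target density;
    start must be a point where that density is positive.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step)  # symmetric proposal
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        if i >= burn_in:
            samples.append(x)
    return samples


if __name__ == "__main__":
    # Hypothetical model: standard normal, known only up to a constant.
    draws = metropolis(lambda x: -0.5 * x * x, start=0.0, n_samples=5000)
    mean = sum(draws) / len(draws)
    print("sample mean:", round(mean, 2))
```

Successive draws from such a chain are correlated, so a simulated data set Di built this way would usually be thinned, or the chain run much longer than the desired sample size, before comparing it with D.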
