A comparison of DIMTEST and generalized dimensionality discrepancy approaches to assessing dimensionality in item response theory

Description
Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of items, may even render them unusable for some small-scale assessments. Other measures designed to assess dimensionality do not come with such limitations and, as such, may perform better under conditions that do not lend themselves to evaluation via statistics that rely on asymptotic theory. The current work aimed to evaluate the performance of one such metric, the standardized generalized dimensionality discrepancy measure (SGDDM; Levy & Svetina, 2011; Levy, Xu, Yel, & Svetina, 2012), under both large- and small-scale testing conditions. A Monte Carlo study was conducted to compare the performance of DIMTEST and the SGDDM statistic in terms of evaluating assumptions of unidimensionality in item response data under a variety of conditions, with an emphasis on the examination of these procedures in small-scale assessments. Similar to previous research, increases in either test length or sample size resulted in increased power. The DIMTEST procedure appeared to be a conservative test of the null hypothesis of unidimensionality. The SGDDM statistic exhibited rejection rates near the nominal rate of .05 under unidimensional conditions, though the reliability of these results may have been less than optimal due to high sampling variability resulting from a relatively limited number of replications.
Power values were at or near 1.0 for many of the multidimensional conditions. It was only when the sample size was reduced to N = 100 that the two approaches diverged in performance. Results suggested that both procedures may be appropriate for sample sizes as low as N = 250 and tests as short as J = 12 (SGDDM) or J = 19 (DIMTEST). When used as a diagnostic tool, SGDDM may be appropriate with as few as N = 100 cases combined with J = 12 items. The study was somewhat limited in that it did not include any complex factorial designs, nor was the strength of item discrimination parameters or the correlation between factors manipulated. It is recommended that further research be conducted with the inclusion of these factors, as well as an increase in the number of replications when using the SGDDM procedure.
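The abstract's concern about sampling variability from a limited number of replications can be made concrete with the binomial standard error of an empirical rejection rate. The sketch below is illustrative only: the abstract does not report how many replications were used, so the replication counts, the margin, and the function names are assumptions.

```python
import math


def rejection_rate_se(rate: float, replications: int) -> float:
    """Binomial standard error of an empirical rejection rate.

    Each replication either rejects or does not, so the observed
    rate over R replications has standard error sqrt(p * (1 - p) / R).
    """
    return math.sqrt(rate * (1.0 - rate) / replications)


def replications_needed(rate: float, margin: float, z: float = 1.96) -> int:
    """Replications required so the approximate 95% interval
    half-width around `rate` is no larger than `margin`."""
    return math.ceil(z ** 2 * rate * (1.0 - rate) / margin ** 2)


if __name__ == "__main__":
    # Hypothetical replication counts, not taken from the study.
    for r in (50, 100, 1000):
        se = rejection_rate_se(0.05, r)
        print(f"R = {r:5d}: SE of a .05 rejection rate = {se:.4f}")
```

With a true rate of .05, a small number of replications leaves the estimated Type I error rate uncertain by a few percentage points, which is why a rate "near .05" from such a design is weak evidence on its own.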
Date Created
2013

An investigation of power analysis approaches for latent growth modeling

Description
Designing studies that use latent growth modeling to investigate change over time calls for optimal approaches for conducting power analysis for a priori determination of required sample size. This investigation (1) studied the impacts of variations in specified parameters, design features, and model misspecification in simulation-based power analyses and (2) compared power estimates across three common power analysis techniques: the Monte Carlo method; the Satorra-Saris method; and the method developed by MacCallum, Browne, and Cai (MBC). Choice of sample size, effect size, and slope variance parameters markedly influenced power estimates; however, level-1 error variance and number of repeated measures (3 vs. 6) when study length was held constant had little impact on resulting power. Under some conditions, having a moderate versus small effect size or using a sample size of 800 versus 200 increased power by approximately .40, and a slope variance of 10 versus 20 increased power by up to .24. Decreasing error variance from 100 to 50, however, increased power by no more than .09 and increasing measurement occasions from 3 to 6 increased power by no more than .04. Misspecification in level-1 error structure had little influence on power, whereas misspecifying the form of the growth model as linear rather than quadratic dramatically reduced power for detecting differences in slopes. Additionally, power estimates based on the Monte Carlo and Satorra-Saris techniques never differed by more than .03, even with small sample sizes, whereas power estimates for the MBC technique appeared quite discrepant from the other two techniques. Results suggest the choice between using the Satorra-Saris or Monte Carlo technique in a priori power analyses for slope differences in latent growth models is a matter of preference, although features such as missing data can only be considered within the Monte Carlo approach. 
Further, researchers conducting power analyses for slope differences in latent growth models should pay the greatest attention to estimating slope difference, slope variance, and sample size. Arguments are also made for examining model-implied covariance matrices based on estimated parameters and graphic depictions of slope variance to help ensure parameter estimates are reasonable in a priori power analysis.
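The general logic of a Monte Carlo power analysis for a slope difference can be sketched as follows. This is a deliberately simplified stand-in, not the latent growth model estimated in the study: it fits an ordinary least squares slope to each simulated person and compares group mean slopes with a large-sample z test. All function names and parameter values are hypothetical, chosen only to echo the scale of the quantities mentioned in the abstract.

```python
import math
import random
import statistics


def ols_slope(times, ys):
    """Ordinary least squares slope of ys regressed on times."""
    t_bar = sum(times) / len(times)
    y_bar = sum(ys) / len(ys)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, ys))
    return sxy / sxx


def simulate_group_slopes(n, mean_slope, slope_var, error_var, times, rng):
    """Simulate n linear trajectories and return each person's OLS slope.

    Intercepts are fixed at zero because the OLS slope does not
    depend on a person's intercept.
    """
    slopes = []
    for _ in range(n):
        true_slope = rng.gauss(mean_slope, math.sqrt(slope_var))
        ys = [true_slope * t + rng.gauss(0.0, math.sqrt(error_var))
              for t in times]
        slopes.append(ols_slope(times, ys))
    return slopes


def monte_carlo_power(n_per_group, slope_diff, slope_var, error_var,
                      times, replications=200, seed=1):
    """Share of replications in which the group slope difference is
    significant at the two-sided .05 level (normal approximation)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        a = simulate_group_slopes(n_per_group, 0.0, slope_var,
                                  error_var, times, rng)
        b = simulate_group_slopes(n_per_group, slope_diff, slope_var,
                                  error_var, times, rng)
        se = math.sqrt(statistics.variance(a) / len(a)
                       + statistics.variance(b) / len(b))
        z = (statistics.mean(b) - statistics.mean(a)) / se
        if abs(z) > 1.96:
            rejections += 1
    return rejections / replications


if __name__ == "__main__":
    times = [0, 1, 2, 3]  # hypothetical measurement occasions
    for n in (50, 400):   # hypothetical per-group sample sizes
        power = monte_carlo_power(n, 1.5, 10.0, 50.0, times)
        print(f"n per group = {n}: estimated power = {power:.2f}")
```

Varying `n_per_group`, `slope_diff`, and `slope_var` in this sketch gives a rough feel for why the abstract singles out those three quantities, though a real a priori analysis would fit the intended growth model in each replication.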
Date Created
2011