
Keywords: structural equation modeling, confirmatory factor analysis, sample size, statistical power, Monte Carlo simulation, bias, solution propriety.

When contemplating sample size, investigators usually prioritize achieving adequate statistical power to observe true relationships in the data. Finally, this study also afforded the opportunity to compare sample size requirements for SEMs versus similar models based on single indicators. It is important to distinguish between two overarching approaches to the use of Monte Carlo analyses. A one-factor, four-indicator model with loadings of. Finally, we also compared power, bias, and solution propriety of path models that were based on single indicators versus latent variables. All variables were set to have a variance of 1. However, power is not the only consideration in determining sample size, as bias in the parameter estimates and standard errors also has bearing. All models were based on a single group with 10, replications of the simulated data. Advances in statistical modeling approaches and in the ease of use of related software have not only contributed to an increasing number of studies using latent variable analyses but also raised questions about how to estimate the requisite sample size for testing such models. We did so because determining the sample size required to observe a factor correlation is often an important consideration. We varied the number of indicators of these factors such that the one-factor model was indicated by four, six, or eight indicators and the two- and three-factor models were indicated by three, six, or eight indicators. Although statisticians have addressed many of these concerns in technical papers, our impression from serving as reviewers, consultants, and readers of other articles is that this knowledge may be inaccessible to many applied researchers, and so our overarching objective was to communicate this information to a broader audience. The reliability of each of the variables was specified in the data simulation phase of the analysis.
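The logic of a proactive Monte Carlo power analysis can be sketched outside of Mplus as well. The sketch below, in Python with NumPy, deliberately uses a simple single-indicator (observed-variable) regression rather than a latent variable model, and the population slope, replication count, and candidate sample sizes are hypothetical illustrations rather than values from the study: data are repeatedly generated from fixed population values, the model is fit to each replication, and power is the proportion of replications in which the effect is significant.

```python
import numpy as np

def mc_power(n, beta=0.30, n_reps=2000, seed=0):
    """Monte Carlo power for a single-indicator regression y = beta*x + e.

    Each replication draws a sample of size n from the population model,
    fits the regression, and tests the slope with a Wald z test.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_reps):
        x = rng.standard_normal(n)
        # Residual variance chosen so Var(y) = 1 (standardized population model)
        y = beta * x + rng.standard_normal(n) * np.sqrt(1 - beta**2)
        b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)       # OLS slope
        resid = y - (y.mean() - b * x.mean()) - b * x            # with intercept
        se = np.sqrt(resid @ resid / (n - 2) / np.sum((x - x.mean()) ** 2))
        hits += abs(b / se) > 1.96                               # normal approximation
    return hits / n_reps

# Is a candidate N sufficient for 80% power at this (hypothetical) effect size?
for n in (30, 100, 200):
    print(n, mc_power(n))
```

Note that this addresses power for one individual parameter, not the overall model; the same loop structure extends to any fitted parameter of interest.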
Second, as is evident in this study, the number of errors in the analysis of the generated data increased with decreasing sample size, which further suggests the need to consider such instances when evaluating sample size requirements. However, this effect also leveled off, as the transition from six to eight indicators did not result in as dramatic a decrease in sample size. This is a conservative criterion, as a single error in the analysis of 10, data sets is not particularly worrisome, assuming that all other criteria are met. Although not shown in the figures or tables, all models yielded acceptable coverage. Within the CFAs, increasing the number of latent variables in a model resulted in a significant increase in the minimum sample size when moving from one to two factors, but this effect plateaued, as the transition from two to three factors was not associated with a concomitant increase in the sample size. In these models, each factor was indicated by three observed variables, and all factor loadings were set to. We also evaluated a three-factor latent variable mediation model with regressive direct and indirect paths between factors. All loadings and structural paths in the models were completely standardized; factor variances were fixed to 1. Throughout this article, we describe the importance of conducting proactive Monte Carlo simulation studies for the purposes of sample size planning. Specifically, we set observed variable reliability to. This approach does not address the power of the overall model; it provides estimates of power and bias for individual effects of interest. All models were evaluated using Mplus version 5. Such rules are problematic because they are not model-specific and may lead to grossly over- or underestimated sample size requirements. We also evaluated the effects of missing data on the mediation model (Figure 2), where each factor was indicated by three observed variables loading at.
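The connection between small samples and analysis errors can be illustrated with a toy experiment. The sketch below (Python/NumPy, not the study's Mplus setup) uses the closed-form triad estimate of a loading in a one-factor, three-indicator model; with hypothetical loadings of .50, it counts how often a replication yields an improper implied communality (negative or greater than 1, akin to a Heywood case), a rate that rises as N shrinks.

```python
import numpy as np

def improper_rate(n, loading=0.5, n_reps=1000, seed=1):
    """Share of replications whose triad-based loading estimate is improper.

    For a one-factor model with loadings lam_i, population correlations are
    r_ij = lam_i * lam_j, so lam_1^2 = r12 * r13 / r23 (the triad formula).
    Sampling error in small samples can push this outside [0, 1].
    """
    rng = np.random.default_rng(seed)
    lam = np.full(3, loading)
    improper = 0
    for _ in range(n_reps):
        f = rng.standard_normal(n)                                # common factor
        e = rng.standard_normal((n, 3)) * np.sqrt(1 - lam**2)     # unique parts
        x = np.outer(f, lam) + e
        r = np.corrcoef(x, rowvar=False)
        lam1_sq = r[0, 1] * r[0, 2] / r[1, 2]
        improper += not (0.0 < lam1_sq <= 1.0)
    return improper / n_reps
```

Comparing, say, `improper_rate(30)` against `improper_rate(500)` shows the small-sample rate of improper solutions dropping toward zero as N grows, mirroring the pattern described above.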
However, this was not true for comparisons between two- and three-factor models. The strength of all the direct regressive paths was varied from. Readers are also referred to Paxton, Curran, Bollen, Kirby, and Chen, who provide a useful step-by-step discussion of conducting Monte Carlo analyses. This concerns whether there is a sufficient number of cases for the model to converge without improper solutions or impossible parameter estimates. Figure 1 provides an overview of the one-, two-, and three-factor CFAs evaluated.

Determining sample size requirements for structural equation modeling (SEM) is a challenge often faced by investigators, peer reviewers, and grant writers. A second aim was to examine how sample size requirements change as a function of elements in an SEM, such as number of factors, number of indicators, strength of indicator loadings, strength of regressive paths, degree of missing data, and type of model. Recent years have seen a large increase in SEMs in the behavioral science literature, but consideration of sample size requirements for applied SEMs often relies on outdated rules-of-thumb. Panels A to E show the results of the Monte Carlo simulation studies. One of the strengths of SEM is its flexibility, which permits examination of complex associations and use of various types of data. Depending on the number and type of errors, this can potentially yield overly optimistic summary statistics and also raises the possibility that a similarly sized sample of real data might produce improper solutions. We next evaluated the effects of missing data on sample size requirements for one CFA and the latent mediation model. We hope this will help researchers better understand the factors most relevant to SEM sample size determinations and encourage them to conduct their own Monte Carlo analyses rather than relying on rules-of-thumb. Therefore, we focused these analyses on the minimum sample size required to observe this indirect effect.
This model is shown in Figure 2. In so doing, we aimed to demonstrate the tremendous variability in SEM sample size requirements and the inadequacy of common rules-of-thumb. A standard error may also be biased if it is under- or overestimated, which increases the risk of Type I and Type II errors, respectively. We also sought to explore how systematically varying parameters within these models would affect sample size requirements. The former approach is a prospectively designed one that has its basis in theory and the relevant literature. A large number of data sets are then generated to match the population values; each individual data set is akin to a sample and is based on a user-determined number of cases. This model was structurally saturated. Figure 3 shows the minimum sample size required for each of the models to meet all a priori criteria. First, Mplus provides a description of any errors in the analysis of the generated data but omits these cases from the overall statistics summarized across replications. Third, we specified that the analysis could not yield any errors. The standardized magnitude of the three direct paths in the model was set to. Prior work has evaluated the effects of missing data on statistical power analyses in structural equation models. We did not evaluate a three-indicator, one-factor model because it would be just-identified. The specified associations are akin to the population estimates of the true relationships among the variables. To do so, we specified a path analysis in which the association between the observed X and Y variables was mediated by an observed third variable. The primary aim of this study was to provide applied behavioral science researchers with an accessible evaluation of sample size requirements for common types of latent variable models and to demonstrate the range of sample sizes that may be appropriate for SEM. We focused on the mediation model because of its popularity in the behavioral science literature.
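The single-indicator mediation path analysis described here can be simulated directly. In the hedged sketch below (Python/NumPy; the path values a = .3, b = .3, c = .1 and the replication count are hypothetical, not the study's), X, M, and Y are generated from fixed population paths and the indirect effect is tested via joint significance of the a and b paths.

```python
import numpy as np

def wald_z(X, y, j):
    """OLS fit of y on design matrix X; Wald z statistic for coefficient j."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta[j] / np.sqrt(sigma2 * xtx_inv[j, j])

def indirect_power(n, a=0.3, b=0.3, c=0.1, n_reps=2000, seed=2):
    """Monte Carlo power for the indirect effect a*b in X -> M -> Y,
    using joint significance of the a and b paths."""
    rng = np.random.default_rng(seed)
    e_var = max(1 - b**2 - c**2 - 2 * a * b * c, 0.05)  # keeps Var(Y) near 1
    hits = 0
    for _ in range(n_reps):
        x = rng.standard_normal(n)
        m = a * x + rng.standard_normal(n) * np.sqrt(1 - a**2)
        y = b * m + c * x + rng.standard_normal(n) * np.sqrt(e_var)
        za = wald_z(np.column_stack([np.ones(n), x]), m, 1)      # a path
        zb = wald_z(np.column_stack([np.ones(n), m, x]), y, 1)   # b path, X partialed
        hits += (abs(za) > 1.96) and (abs(zb) > 1.96)
    return hits / n_reps
```

Joint significance is one of several tests of an indirect effect; a fuller version might substitute a Sobel or bootstrap-style test, but the replication-and-count structure is the same.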
For illustrative purposes, suppose that the one-factor model tests the latent construct of depression, the two-factor model tests correlated latent constructs of depression and anxiety, and the three-factor model tests latent constructs of depression, anxiety, and substance use. In these cases, sample sizes were relatively unchanged, or in some cases decreased, with the addition of the third factor. Specifically, we examined the mean parameter estimates. With this method, associations among variables are set by the user based on a priori hypotheses. In the first model, we started with a sample size of because that has previously been suggested as the minimum for SEMs (Boomsma); starting sample sizes for subsequent models were determined based on the results of prior models. There has been a sharp increase in the number of SEM-based research publications that evaluate the structure of psychopathology and the correlates and course of psychological disorders and symptoms, yet applied information on how to determine adequate sample size for these studies has lagged behind. The hypothesized model is then evaluated in the generated data sets, and each parameter estimate is averaged across the simulations to determine if the specified number of cases is sufficient for reproducing the population values and obtaining statistically significant parameter estimates. We evaluated models with one to three factors, and each factor in the model was indicated by three to eight indicators, which loaded on their respective factors at. Across a series of simulations, we systematically varied key model properties, including number of indicators and factors, magnitude of factor loadings and path coefficients, and amount of missing data. The CFA that was evaluated was a two-factor model with each factor indicated by three observed variables loading on their respective factors at. The simulated data were dimensional.
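Averaging estimates across replications supports checks beyond power. The sketch below (again Python/NumPy with a hypothetical single-indicator regression and a population slope of .40, values not taken from the study) summarizes three of the quantities discussed in this article: relative bias of the parameter estimate, relative bias of the standard error (mean estimated SE against the empirical SD of the estimates), and 95% confidence interval coverage.

```python
import numpy as np

def summarize_replications(n, beta=0.40, n_reps=2000, seed=3):
    """Bias, SE bias, and 95% CI coverage across Monte Carlo replications."""
    rng = np.random.default_rng(seed)
    est, ses, covered = [], [], 0
    for _ in range(n_reps):
        x = rng.standard_normal(n)
        y = beta * x + rng.standard_normal(n) * np.sqrt(1 - beta**2)
        b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)      # OLS slope
        resid = y - (y.mean() - b * x.mean()) - b * x
        se = np.sqrt(resid @ resid / (n - 2) / np.sum((x - x.mean()) ** 2))
        est.append(b)
        ses.append(se)
        covered += (b - 1.96 * se) <= beta <= (b + 1.96 * se)
    est, ses = np.asarray(est), np.asarray(ses)
    return {
        # How far the mean estimate is from the population value, proportionally
        "param_bias": (est.mean() - beta) / beta,
        # Mean estimated SE versus the empirical SD of the estimates
        "se_bias": (ses.mean() - est.std(ddof=1)) / est.std(ddof=1),
        # Proportion of 95% CIs containing the population value
        "coverage": covered / n_reps,
    }
```

An underestimated SE shows up here as negative `se_bias` and coverage below the nominal .95, the mechanism behind the inflated Type I error risk noted above.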
Statistical power is the probability of rejecting the null hypothesis when it is false; that is, the probability of not making a Type II error. Arrows pointing toward the dependent variable show the amount of residual variance in each variable. Each panel shows the minimum sample size required for each permutation of each type of model that was evaluated. For example, as shown in Figure 3 (Panels A–C) and Supplemental Tables 1–3, sample size requirements at least doubled when comparing the simplest possible one-factor, four-indicator model, which required a sample of , 90, and 60 participants at factor loadings of. We also varied the factor loadings, and hence the unreliability of the indicators, using standardized loadings of. The model is fit in each subsample, and the resulting parameter estimates and fit statistics are then examined across all the random draws. We investigated how changes in these parameters affected sample size requirements with respect to statistical power, bias in the parameter estimates, and overall solution propriety. However, we adopted this rule for several reasons. This is important, as an investigator may be interested only in having sufficient sample size for select aspects of a given model. Three major approaches to evaluating sample size requirements in SEMs have been proposed: (a) the Satorra and Saris method, which estimates power based on the noncentrality parameter. We set the sample size of a given model and then adjusted it upwards or downwards based on whether the results met our criteria for acceptable precision of the estimates, statistical power, and overall solution propriety, as detailed below. The primary aim of this study was to evaluate sample size requirements for SEMs commonly applied in the behavioral sciences literature, including confirmatory factor analyses (CFAs), models with regressive paths, and models with missing data.
In many mediation models, the researcher is primarily interested in the size and significance of the indirect effect. Supplementary tables appear online. The figure shows a representation of the model characteristics that were varied in the Monte Carlo analyses of the CFAs. MacCallum et al. This study used Monte Carlo data simulation techniques to evaluate sample size requirements for common applied SEMs. Essentially, this is the assumption of a single-indicator analysis, as measurement error is not removed from the variance of the measure and the indicator is treated as if it were a perfect measure of the latent construct. The permutations of the regressive model. As an applied example, suppose this model tested whether chronic stress (the independent variable) predicted functional impairment (the dependent variable) via symptoms of depression (the mediator). We examined models that we thought would have broad applicability and included models with the absolute minimum number of indicators required to yield overidentified measurement models. Overall, models with fewer indicators required a larger sample relative to models with more indicators. Results revealed a range of sample size requirements. The latter approach is a post hoc method that is limited by the quality of the existing data and may lead to unwarranted confidence in the stability of the results or the appropriateness of the sample for the planned analyses. All criteria had to be met in order to accept a given N as the minimum sample size. Within a model, factor loadings were held constant across indicators. For example, this is crucial when evaluating convergent or discriminant validity, in determining the longitudinal stability of a trait, or in determining the similarity of scores across non-independent observations. All analyses were conducted using the maximum likelihood (ML) estimator.
For each model, we systematically varied the number of indicators of the latent variable(s) and the strength of the factor loadings and structural elements in the model to examine how these characteristics would affect statistical power, the precision of the parameter estimates, and the overall propriety of the results. Next, we continued to increase the sample size in increments of 10 or 20 to test the stability of the solution, as defined by both the minimum sample size and the next largest sample size meeting all a priori criteria.
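The incremental search procedure described above can be mocked up as a simple loop. In the sketch below (Python/NumPy), the criterion is Monte Carlo power of at least .80 for a hypothetical population correlation of .20 tested with Fisher's z; the study's actual criteria also covered estimate precision and solution propriety, which a fuller version would check at the same point in the loop.

```python
import numpy as np

def power_at(n, rho=0.2, n_reps=1000, seed=4):
    """Monte Carlo power for detecting a correlation via the Fisher z test."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_reps):
        x = rng.standard_normal(n)
        y = rho * x + rng.standard_normal(n) * np.sqrt(1 - rho**2)
        r = np.corrcoef(x, y)[0, 1]
        hits += abs(np.arctanh(r)) * np.sqrt(n - 3) > 1.96
    return hits / n_reps

def minimum_n(criterion=0.80, start=100, step=20, max_n=600):
    """Raise N in fixed increments until power meets the criterion at both the
    candidate N and the next increment (the stability check described above)."""
    n = start
    while n <= max_n:
        if power_at(n) >= criterion and power_at(n + step) >= criterion:
            return n
        n += step
    return None  # criterion not met within the search range
```

Because power is itself a Monte Carlo estimate, requiring the criterion to hold at two consecutive sample sizes guards against accepting an N that passed only through simulation noise.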