Search results
- Title
- The Use of a Meta-Analysis Technique in Equating and Its Comparison with Several Small Sample Equating Methods.
- Creator
-
Caglak, Serdar, Paek, Insu, Patrangenaru, Victor, Almond, Russell G., Roehrig, Alysia D., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
The main objective of this study was to investigate the improvement of the accuracy of small sample equating, which typically occurs in teacher certification/licensure examinations due to a low volume of test takers per test administration, under the Non-Equivalent Groups with Anchor Test (NEAT) design by combining previous and current equating outcomes using a meta-analysis technique. The proposed meta-analytic score transformation procedure was called "meta-equating" throughout this study. To conduct meta-equating, the previous and current equating outcomes obtained from the chosen equating methods (identity (ID), circle-arc (CA), and nominal weights mean (NW) equating) and synthetic functions (SFs) of these methods (CAS and NWS) were used, and then empirical Bayesian (EB) and meta-equating (META) procedures were implemented to estimate the equating relationship between test forms at the population level. The SFs were created by giving equal weight to each of the chosen equating methods and the identity (ID) equating. Finally, the chosen equating methods, the SFs of each method (e.g., CAS, NWS, etc.), and also the META and EB versions (e.g., NW-EB, CA-META, NWS-META, etc.) were investigated and compared under varying testing conditions. These steps involved manipulating some of the factors that influence the accuracy of test score equating. In particular, the effects of test form difficulty levels, the group-mean ability differences, the number of previous equatings, and the sample size on the accuracy of the equating outcomes were investigated. The Chained Equipercentile (CE) equating with 6-univariate and 2-bivariate moments log-linear presmoothing was used as the criterion equating function to establish the equating relationship between the new form and the base (reference) form with 50,000 examinees per test form.
To compare the performance of the equating methods, small numbers of examinee samples were randomly drawn from examinee populations with different ability levels in each simulation replication. Each pair of new and base test forms was randomly and independently selected from all available condition-specific test form pairs. Those test forms were then used to obtain previous equating outcomes. However, purposeful selections of the examinee ability and test form difficulty distributions were made to obtain the current equating outcomes in each simulation replication. The previous equating outcomes were later used for the implementation of both the META and EB score transformation procedures. The effects of study factors and their possible interactions on each of the accuracy measures were investigated along the entire-score range and the cut (reduced)-score range using a series of mixed-factorial ANOVA (MFA) procedures. The performances of the equating methods were also compared based on post-hoc tests. Results show that the behaviors of the equating methods vary with each level of the group ability difference, test form difficulty difference, and new group examinee sample size. Also, the use of both META and EB procedures improved the accuracy of equating results on average. The META and EB versions of the chosen equating methods therefore might be a solution for equating test forms that are similar in their psychometric characteristics and are taken by new form examinee samples smaller than 50. However, since there are many factors affecting the equating results in reality, one should always expect that equating methods and score transformation procedures, or in more general terms, estimation procedures, may function differently, to some degree, depending on the conditions in which they are implemented.
Therefore, one should consider the recommendations for the use of the proposed equating methods in this study as a piece of information and a rule of thumb, not an absolute guideline, for practicing small sample test equating in teacher certification/licensure examinations.
- Date Issued
- 2015
- Identifier
- FSU_2015fall_Caglak_fsu_0071E_12863
- Format
- Thesis
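The abstract above states that each synthetic function (SF) gives equal weight to a chosen equating method and the identity equating. A minimal sketch of that weighting follows. The function names and the stand-in equating function (a simple shift of two score points) are hypothetical and are not taken from the thesis; the circle-arc and nominal weights mean methods themselves are not implemented here.

```python
from typing import Callable

EquatingFunction = Callable[[float], float]


def identity_equating(score: float) -> float:
    """Identity equating: the new-form score is reported unchanged."""
    return score


def synthetic_function(equate: EquatingFunction, weight: float = 0.5) -> EquatingFunction:
    """Blend an equating function with the identity.

    weight = 0.5 corresponds to the equal weighting described in the abstract.
    """
    def blended(score: float) -> float:
        return weight * equate(score) + (1.0 - weight) * identity_equating(score)

    return blended


def example_shift_equating(score: float) -> float:
    """Hypothetical stand-in for an estimated equating function."""
    return score + 2.0


# Equal-weight synthetic version of the stand-in method.
example_synthetic = synthetic_function(example_shift_equating)
```

With equal weights, the synthetic function moves each score only halfway from the identity toward the estimated equating result, which is why it tends to be more stable when the estimate comes from a small sample.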
- Title
- Four Methods for Combining Dependent Effects from Studies Reporting Regression Analysis.
- Creator
-
Gunter, Tracey Danielle, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Almond, Russell G., Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
Over the years a variety of indices have been proposed to summarize regression analyses. Unfortunately the proposed indices are only appropriate when meta-analysts want to understand the role of a single predictor variable in predicting the outcome variable. However, sometimes meta-analysts want to understand the effect of a set of variables on an outcome variable. In this paper, four methods are presented for obtaining a composite effect for two focal predictor variables from a single regression model. The indices are the average of the standardized regression coefficients (ASC), the average of the standardized regression coefficients using Hedges and Olkin's (1985) approach (AHO), the sheaf coefficient (SC), and the squared multiple semi-partial correlation coefficient (MSP). A simulation study was conducted to examine the behavior of the indices and their variance when the number of predictor variables in the model, the sample size, the correlations between the focal predictor variables in the model, and the correlations between the focal and non-focal predictor variables in the model were manipulated. The results of the study show that the average bias values of the ASC and AHO estimates are small even when the sample size is small. Furthermore, the ASC and AHO estimates and their estimated variances are more precise than the other indices under all conditions examined. Therefore, when meta-analysts are interested in estimating the effect of a set of predictor variables on an outcome variable from a single regression model, the ASC or AHO procedures are preferred.
- Date Issued
- 2015
- Identifier
- FSU_2015fall_Gunter_fsu_0071E_12829
- Format
- Thesis
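The ASC index in the abstract above is described as the average of the standardized regression coefficients for the two focal predictors. A minimal sketch of that average follows, along with the textbook variance of a mean of two correlated estimates. The function names and the variance formula are assumptions for illustration; the abstract does not state which variance estimator the thesis uses.

```python
def average_standardized_coefficient(beta1: float, beta2: float) -> float:
    """ASC-style composite: the mean of two standardized regression coefficients."""
    return (beta1 + beta2) / 2.0


def variance_of_average(var1: float, var2: float, cov12: float) -> float:
    """Variance of the mean of two correlated estimates.

    Var((b1 + b2) / 2) = (Var(b1) + Var(b2) + 2 * Cov(b1, b2)) / 4.
    This is an assumed, generic formula, not the thesis's estimator.
    """
    return (var1 + var2 + 2.0 * cov12) / 4.0
```

The covariance term matters because the two coefficients come from the same regression model and are therefore dependent, which is the situation the title of the thesis refers to.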
- Title
- A Comparison of Three Approaches to Confidence Interval Estimation for Coefficient Omega.
- Creator
-
Xu, Jie, Yang, Yanyun, Becker, Betsy Jane, Almond, Russell G., Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
Coefficient Omega was introduced by McDonald (1978) as a reliability coefficient of composite scores for the congeneric model. Interval estimation (Neyman, 1937) on coefficient Omega provides a range of plausible values which is likely to capture the population reliability of composite scores. The Wald method, likelihood method, and bias-corrected and accelerated (BCa) bootstrap method are three methods to construct confidence intervals for coefficient Omega (e.g., Cheung, 2009b; Kelley & Cheng, 2012; Raykov, 2002, 2004, 2009; Raykov & Marcoulides, 2004; Padilla & Divers, 2013). A very limited number of studies on the evaluation of these three methods can be found in the literature (e.g., Cheung, 2007, 2009a, 2009b; Kelley & Cheng, 2012; Padilla & Divers, 2013). No simulation study has been conducted to evaluate the performance of these three methods for interval construction on coefficient Omega. In the current simulation study, I assessed these three methods by comparing their empirical performance on interval estimation for coefficient Omega. Four factors were included in the simulation design: sample size, number of items, factor loading, and degree of nonnormality. Two thousand datasets were generated in R 2.15.0 (R Core Team, 2012) for each condition. For each generated dataset, three approaches (i.e., the Wald method, likelihood method, and BCa bootstrap method) were used to construct 95% confidence intervals for coefficient Omega in R 2.15.0. The results showed that when the data were multivariate normally distributed, the three methods performed equally well and coverage probabilities were very close to the prespecified .95 confidence level. When the data were multivariate nonnormally distributed, coverage probabilities decreased and intervals became wider for all three methods as the degree of nonnormality increased.
In general, when the data departed from multivariate normality, the BCa bootstrap method performed better than the other two methods, with relatively higher coverage probabilities, while the Wald and likelihood methods were comparable and yielded narrower intervals than the BCa bootstrap method.
- Date Issued
- 2014
- Identifier
- FSU_migr_etd-9269
- Format
- Thesis
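For readers unfamiliar with the quantity studied in the abstract above, the following sketch computes coefficient Omega for a one-factor congeneric model from factor loadings and error variances. It assumes the factor variance is fixed at 1 and the errors are uncorrelated; the function name and example values are hypothetical, and the Wald, likelihood, and BCa bootstrap interval methods compared in the thesis are not implemented here.

```python
from typing import Sequence


def coefficient_omega(loadings: Sequence[float], error_variances: Sequence[float]) -> float:
    """Coefficient Omega for a unidimensional congeneric model.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    assuming a factor variance of 1 and uncorrelated errors.
    """
    common_variance = sum(loadings) ** 2
    return common_variance / (common_variance + sum(error_variances))


# Hypothetical four-item example with equal standardized loadings of .7.
example_omega = coefficient_omega([0.7, 0.7, 0.7, 0.7], [0.51, 0.51, 0.51, 0.51])
```

An interval method then attaches a lower and upper bound to this point estimate; the simulation in the thesis asks how often such bounds contain the population value.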
- Title
- Meta-Analysis of Factor Analyses: Comparison of Univariate and Multivariate Approaches Using Correlation Matrices and Factor Loadings.
- Creator
-
Cho, Kyunghwa, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
Currently, more sophisticated techniques such as factor analyses are frequently applied in primary research and thus may need to be meta-analyzed. This topic has been given little attention in the past due to its complexity. Because factor analysis is becoming more popular in research in many areas including education, social work, social science, and so on, the study of methods for the meta-analysis of factor analyses is also becoming more important. The first main purpose of this dissertation is to compare the results of seven different approaches to doing meta-analysis of confirmatory factor analyses. Specifically, five approaches are based on univariate meta-analysis methods. The next two approaches use multivariate meta-analysis to obtain the results of factor loadings and the standard errors of factor loadings. The results from each approach are compared. Given the fact that factor analyses are commonly used in many areas, the second purpose of this dissertation is to explore the appropriate approach or approaches to use for the meta-analysis of factor analyses, especially Confirmatory Factor Analysis (CFA). When the average sample size was small, the IRD, WMC, WMFL, and GLS-MFL approaches performed better than the UMC, MFL, and GLS-MC approaches in estimating parameters. With large average sample sizes (larger than 150), all seven approaches seemed to perform similarly in estimating the parameters. Based on my simulation results, researchers who want to conduct meta-analytic confirmatory factor analysis can apply any of these approaches to synthesize the results from primary studies if their studies have n > 150.
- Date Issued
- 2015
- Identifier
- FSU_migr_etd-9570
- Format
- Thesis
- Title
- Investigating the Chi-Square-Based Model-Fit Indexes for WLSMV and ULSMV Estimators.
- Creator
-
Xia, Yan, Yang, Yanyun, Huffer, Fred W. (Fred William), Almond, Russell G., Becker, Betsy Jane, Paek, Insu, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
In structural equation modeling (SEM), researchers use the model chi-square statistic and model-fit indexes to evaluate model-data fit. Root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker-Lewis index (TLI) are widely applied model-fit indexes. When data are ordered and categorical, the most popular estimator is the diagonally weighted least squares (DWLS) estimator. Robust corrections have been proposed to adjust the uncorrected chi-square statistic from DWLS so that its first and second order moments are in alignment with the target central chi-square distribution under correctly specified models. DWLS with such a correction is called the mean- and variance-adjusted weighted least squares (WLSMV) estimator. An alternative to WLSMV is the mean- and variance-adjusted unweighted least squares (ULSMV) estimator, which has been shown to perform as well as, or slightly better than, WLSMV. Because the chi-square statistic is corrected, the chi-square-based RMSEA, CFI, and TLI are thus also corrected by replacing the uncorrected chi-square statistic with the robust chi-square statistic. The robust model fit indexes calculated in such a way are named the population-corrected robust (PR) model fit indexes following Brosseau-Liard, Savalei, and Li (2012). The PR model fit indexes are currently reported in almost every application when WLSMV or ULSMV is used. Nevertheless, previous studies have found that the PR model fit indexes from WLSMV are sensitive to several factors such as sample sizes, model sizes, and thresholds for categorization. The first focus of this dissertation is on the dependency of model fit indexes on the thresholds for ordered categorical data. Because the weight matrix in the WLSMV fit function and the correction factors for both WLSMV and ULSMV include the asymptotic variances of thresholds and polychoric correlations, the model fit indexes are very likely to depend on the thresholds.
The dependency of model fit indexes on the thresholds is not a desirable property, because when the misspecification lies in the factor structures (e.g., cross loadings are ignored or two factors are considered as a single factor), model fit indexes should reflect such misspecification rather than the threshold values. As alternatives to the PR model fit indexes, Brosseau-Liard et al. (2012), Brosseau-Liard and Savalei (2014), and Li and Bentler (2006) proposed the sample-corrected robust (SR) model fit indexes. The PR fit indexes are found to converge to distorted asymptotic values, but the SR fit indexes converge to their definitions asymptotically. However, the SR model fit indexes were proposed for continuous data, and have been neither investigated nor implemented in SEM software when WLSMV and ULSMV are applied. This dissertation thus investigates the PR and SR model fit indexes for WLSMV and ULSMV. The first part of the simulation study examines the dependency of the model fit indexes on the thresholds when the model misspecification results from omitting cross-loadings or collapsing factors in confirmatory factor analysis. The study is conducted on extremely large computer-generated datasets in order to approximate the asymptotic values of model fit indexes. The results show that only the SR fit indexes from ULSMV are independent of the population threshold values, given the other design factors. The PR fit indexes from ULSMV, and the PR and SR fit indexes from WLSMV, are influenced by thresholds, especially when data are binary and the hypothesized model is greatly misspecified. The second part of the simulation varies the sample sizes from 100 to 1000 to investigate whether the SR fit indexes under finite samples are more accurate estimates of the defined values of RMSEA, CFI, and TLI, compared with the uncorrected model fit indexes without robust correction and the PR fit indexes. Results show that the SR fit indexes are more accurate in general.
However, when the thresholds are different across items, data are binary, and sample size is less than 500, all versions of these indexes can be very inaccurate. In such situations, larger sample sizes are needed. In addition, the conventional cutoffs developed from continuous data with maximum likelihood (e.g., RMSEA < .06, CFI > .95, and TLI > .95; Hu & Bentler, 1999) have been applied to WLSMV and ULSMV regardless of the arguments against such a practice (e.g., Marsh, Hau, & Wen, 2004). For comparison purposes, this dissertation reports the RMSEA, CFI, and TLI based on continuous data using maximum likelihood before the variables are categorized to create ordered categorical data. Results show that the model fit indexes from maximum likelihood are very different from those from WLSMV and ULSMV, suggesting that the conventional rules should not be applied to WLSMV and ULSMV.
- Date Issued
- 2016
- Identifier
- FSU_2016SU_Xia_fsu_0071E_13379
- Format
- Thesis
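The abstract above describes RMSEA, CFI, and TLI as chi-square-based indexes, with the PR versions obtained by substituting the robust chi-square statistic for the uncorrected one. The sketch below shows the commonly used sample formulas for the three indexes as functions of a model chi-square and a baseline-model chi-square. The function names and example numbers are hypothetical, the use of N - 1 in the RMSEA denominator is one convention among several (some software uses N), and the SR corrections studied in the thesis are not implemented.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Sample RMSEA; negative noncentrality estimates are truncated at zero."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Comparative fit index from model and baseline chi-square statistics."""
    numerator = max(chi2_model - df_model, 0.0)
    denominator = max(chi2_model - df_model, chi2_base - df_base, 0.0)
    return 1.0 if denominator == 0.0 else 1.0 - numerator / denominator


def tli(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Tucker-Lewis index; undefined when the baseline ratio equals 1."""
    base_ratio = chi2_base / df_base
    model_ratio = chi2_model / df_model
    return (base_ratio - model_ratio) / (base_ratio - 1.0)


# Hypothetical statistics: passing robust chi-square values here instead of
# uncorrected ones corresponds to the PR substitution the abstract describes.
example_rmsea = rmsea(85.0, 40, 501)
example_cfi = cfi(85.0, 40, 1040.0, 55)
example_tli = tli(85.0, 40, 1040.0, 55)
```

Because every index is a function of the chi-square statistic alone (given the degrees of freedom and sample size), anything that shifts the statistic, such as the threshold values discussed in the abstract, shifts all three indexes with it.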
- Title
- The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring.
- Creator
-
Yun, Jiyeo, Becker, Betsy Jane, Huffer, Fred W. (Fred William), Paek, Insu, Zhang, Qian, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
Since researchers began investigating automated scoring systems in writing assessments, they have examined relationships between human and machine scoring and have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used to assess the relatedness of human and automated essay scoring, and to examine impacts of rater variability on inter-rater agreement. To implement the investigations, my study consists of two parts: empirical and simulation studies. Based on the results from the empirical study, the overall effects for inter-rater agreement were .63 and .99 for exact and adjacent proportions of agreement, .48 for kappas, and between .75 and .78 for correlations. Additionally, significant differences existed between 6-point scales and the other scales (i.e., 3-, 4-, and 5-point scales) for correlations, kappas, and proportions of agreement. Moreover, based on the results of the simulated data, the highest agreements and lowest discrepancies were achieved in the matched rater distribution pairs. Specifically, the means of exact and adjacent proportions of agreement, kappa and weighted kappa values, and correlations were .58, .95, .42, .78, and .78, respectively. Meanwhile, the average standardized mean difference was .0005 in the matched rater distribution pairs. Acceptable values for inter-rater agreement as evaluation criteria for automated essay scoring, impacts of rater variability on inter-rater agreement, and relationships among inter-rater agreement indices are discussed.
- Date Issued
- 2017
- Identifier
- FSU_FALL2017_Yun_fsu_0071E_14144
- Format
- Thesis
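Several of the agreement indices named in the abstract above can be computed directly from paired human and machine scores. The sketch below shows exact agreement, adjacent agreement, and unweighted Cohen's kappa. The function names and example scores are hypothetical, and treating "adjacent" as scores within one point of each other (exact matches included) is an assumption about the definition used in the thesis.

```python
from collections import Counter
from typing import Sequence


def exact_agreement(human: Sequence[int], machine: Sequence[int]) -> float:
    """Proportion of essays given identical scores by both raters."""
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def adjacent_agreement(human: Sequence[int], machine: Sequence[int]) -> float:
    """Proportion of essays whose two scores differ by at most one point."""
    return sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / len(human)


def cohen_kappa(human: Sequence[int], machine: Sequence[int]) -> float:
    """Unweighted Cohen's kappa: observed agreement corrected for chance."""
    n = len(human)
    observed = exact_agreement(human, machine)
    human_counts, machine_counts = Counter(human), Counter(machine)
    # Chance agreement from the product of the two raters' marginal proportions.
    expected = sum(human_counts[c] * machine_counts[c] for c in human_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for six essays on a 4-point scale.
example_human = [1, 2, 3, 4, 2, 3]
example_machine = [1, 2, 4, 4, 3, 3]
```

Kappa is lower than the exact agreement proportion whenever some agreement is expected by chance, which is consistent with the ordering of the values reported in the abstract.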
- Title
- A Weakly-Informative Group-Specific Prior Distribution for Meta-Analysis.
- Creator
-
Thompson, Christopher, Becker, Betsy Jane, Clark, Kathleen M., Almond, Russell G., Aloe, Ariel M., Yang, Yanyun, Florida State University, College of Education, Department of Educational Psychology and Learning Systems
- Abstract/Description
-
While Bayesian meta-analysis has flourished both in methodological and substantive work, group-specific Bayesian modeling remains scarce. Common practice for choosing prior distributions entails using typical non-informative priors. Currently, there is a push to use more informative prior distributions. In this dissertation I propose a group-specific weakly informative prior distribution. The new prior distribution uses a frequentist estimate of between-studies heterogeneity as the noncentrality parameter in a folded noncentral t distribution. This new distribution is then modeled individually for groups based on some categorical factor. An extensive simulation study was performed to assess the performance of the new group-specific prior distribution relative to several non-informative prior distributions in a variety of meta-analytic scenarios. An application using data from a previously published meta-analysis on dynamic geometry software is also provided.
- Date Issued
- 2016
- Identifier
- FSU_2016SP_Thompson_fsu_0071E_13051
- Format
- Thesis