A Quant Methods Blog

What are ROC curves and how are these used to aid decision making?

One of the most vexing challenges in all of statistics is the need to make a valid and reliable probabilistic assessment about some unknown condition or state of affairs based solely on information gathered from sample data. Indeed, this is the foundation of traditional null hypothesis testing: there is some unknown condition in the population (the null hypothesis is either...

What exactly qualifies as intensive longitudinal data and why am I not able to use more traditional growth models to study stability and change over time?

Recent years have seen increasing interest in the collection and analysis of intensive longitudinal data (or ILD) to generate unique insights into within-person processes and change over time. In this post, we first define ILD by contrasting it to data obtained from other common longitudinal designs. Next, we consider the distinct features of ILD that we must address and can...

What’s the best way to determine the number of latent classes in a finite mixture analysis?

Finite mixture models, which include latent class analysis, latent profile analysis, and growth mixture models, have grown greatly in popularity over the past decade or so. Most statistical models assume a unitary (or homogeneous) population wherein all observations are governed by the same basic process. In contrast, finite mixture models aim to identify latent subgroups within the population, sometimes referred...

My advisor told me to use principal components analysis to examine the structure of my items and compute scale scores, but I was taught not to use it because it is not a “true” factor analysis. Help!

Help, indeed. This issue has been a source of both confusion and contention for more than 75 years, and papers have been published on this topic as recently as just a few years ago. A thorough discussion of principal components analysis (PCA) and the closely related methods of exploratory factor analysis (EFA) would require pages of text and dozens of...

I fit a multilevel model and got the warning message “G Matrix is Non-Positive Definite.” What does this mean and what should I do about it?

Anyone who uses multilevel models will eventually encounter the dreaded “G matrix is non-positive definite” message (or “Tau Matrix” or “Psi Matrix”, depending on the labeling used by your software). This anxiety-inducing warning is often a signal that something is wrong in your model that needs fixing. With an NPD G matrix, the obtained estimates usually reflect an “improper” solution...

I’m reporting within- and between-group effects from a multilevel model, and my reviewer says I need to address “sampling error” in the group means. What does this mean, and what can I do to address this?

This is a long-neglected topic, and one that is receiving increasing attention in the methodological literature. The problem that the reviewer is referring to is that the usual ways we obtain within- and between-group effects for lower-level predictors within the multilevel model (MLM) oftentimes generate biased estimates. The same issue arises with any form of clustering, such as when trying...

My advisor told me I should group-mean center my predictors in my multilevel model because it might “make my effects significant” but this doesn’t seem right to me. What exactly is involved in centering predictors within the multilevel model?

This is an excellent question and the topic of centering is often a source of confusion when using multilevel models (MLMs) in practice. This confusion is in part due to the need to address a complexity that arises within the MLM but is not relevant within the traditional multiple regression model: when modeling the effect of a lower-level predictor on...

A reviewer recently asked me to comment on the issue of equivalent models in my structural equation model. What is the difference between alternative models and equivalent models within an SEM?

Differentiating between alternative models and equivalent models has long been a point of confusion in many research applications. Although the challenge of equivalent models can arise within almost any analytic setting, this is particularly salient within the structural equation model (or SEM). It is helpful to first distinguish between alternative and equivalent models. To begin, one of the greatest strengths of...

I have a fair amount of missing data that I don’t want to delete prior to my analysis. What are the best options available for me to retain these partially missing cases?

Missing data are a common problem faced by nearly all data analysts, particularly with the increasing emphasis on the collection of repeated assessments over time. Data values can be missing for a variety of reasons. A common situation is when a subject provides data at one time point but fails to provide data at a later time point; this is...

The Cronbach’s Alphas for all the scales in my path analysis are in the .7s, so why is a reviewer criticizing me for not paying sufficient attention to reliability?

Reliability is a complex and often misunderstood issue. Entire textbooks have been written about reliability, validity, and scale construction, so we only briefly touch on the key issues here (see Bandalos, 2018, for an excellent recent example). To begin, in most areas across the behavioral, educational, and health sciences, theoretical constructs are hypothesized to exist...

In their blog, Dan and Patrick respond to commonly asked questions about a variety of topics in behavioral, educational, and health research, including experimental design, measurement, data analysis, and interpretation of findings. The responses are intentionally brief and concise (sort of), and additional resources are provided, such as recommended readings, exemplar data and computer code, or links to other potential learning materials. Readers are welcome to submit questions for future responses.