StatLab Articles

Detecting Influential Points in Regression with DFBETA(S)

In regression modeling, influential points are observations that, individually, exert large effects on a model’s results: the parameter estimates (\(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_j\)) and, consequently, the model’s predictions (\(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\)).
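
For a quick taste of these tools in R (a minimal sketch using base R’s dfbeta() and dfbetas() on the built-in mtcars data, not the article’s own example):

    # Fit a simple linear regression on the built-in mtcars data
    fit <- lm(mpg ~ wt, data = mtcars)

    # DFBETA: the change in each coefficient when observation i is deleted
    head(dfbeta(fit))

    # DFBETAS: the same changes, scaled by the coefficients' standard errors
    head(dfbetas(fit))

    # A common rule of thumb flags observations with |DFBETAS| > 2/sqrt(n)
    which(abs(dfbetas(fit)[, "wt"]) > 2 / sqrt(nrow(mtcars)))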

R, statistical methods, DFBETA, regression diagnostics, Jacob Goldstein-Greenwood

Comparing Mixed-Effect Models in R and SPSS

Occasionally we are asked to help students or faculty implement a mixed-effect model in SPSS. Our training and expertise are primarily in R, so it can be challenging to transfer and apply our knowledge to SPSS. In this article we document for posterity how to fit some basic mixed-effect models in R using the lme4 and nlme packages, and how to replicate the results in SPSS.

In this article we work with R 4.2.0, lme4 version 1.1-29, nlme version 3.1-157, and SPSS version 28.0.1.1.
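
For readers who just want the flavor of the R side, here is a minimal sketch using lme4’s built-in sleepstudy data (an illustration, not necessarily the models fit in the article):

    # Random-intercept, random-slope model with lme4
    library(lme4)
    m1 <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
    summary(m1)

    # The same model specified with nlme
    library(nlme)
    m2 <- lme(Reaction ~ Days, random = ~ Days | Subject, data = sleepstudy)
    summary(m2)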

R, mixed effect models, statistical methods, SPSS, Clay Ford

ROC Curves and AUC for Models Used for Binary Classification

This article assumes basic familiarity with the use and interpretation of logistic regression, odds and probabilities, and true/false positives/negatives. The examples are coded in R. ROC curves and AUC have important limitations, and I encourage reading through the section at the end of the article to get a sense of when and why these tools can fall short.
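
As a minimal sketch of the mechanics (assuming the pROC package and simulated data, not the article’s example):

    library(pROC)

    # Simulate a binary outcome and fit a logistic regression
    set.seed(1)
    x <- rnorm(200)
    y <- rbinom(200, 1, plogis(-0.5 + 1.2 * x))
    fit <- glm(y ~ x, family = binomial)

    # ROC curve and AUC based on the fitted probabilities
    roc_obj <- roc(y, fitted(fit))
    plot(roc_obj)
    auc(roc_obj)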

R, logistic regression, statistical methods, visualization, AUC, binary classification, ROC curves, Jacob Goldstein-Greenwood

Comparing the Accuracy of Two Binary Diagnostic Tests in a Paired Study Design

There are many medical tests for detecting the presence of a disease or condition. Some examples include tests for lesions, cancer, pregnancy, and COVID-19. While these tests are usually accurate, they’re not perfect. In addition, some tests are designed to detect the same condition but use different methods. Recent examples are the PCR and antigen tests for COVID-19. In these cases we might want to compare the two tests on the same subjects. This is known as a paired study design.
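
In R, this comparison often starts with McNemar’s test on the subjects where the two tests disagree; a minimal sketch with made-up counts:

    # Hypothetical paired results: each subject receives both tests
    # Rows: test A (+/-); columns: test B (+/-)
    tab <- matrix(c(45,  5,
                    12, 38),
                  nrow = 2, byrow = TRUE,
                  dimnames = list("Test A" = c("+", "-"),
                                  "Test B" = c("+", "-")))

    # McNemar's test asks whether the discordant counts (5 vs. 12) differ
    mcnemar.test(tab)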

R, statistical methods, McNemar's test, sensitivity, specificity, NPV, PPV, Clay Ford

Correlation of Fixed Effects in lme4

If you have ever used the R package lme4 to perform mixed-effect modeling, you may have noticed the “Correlation of Fixed Effects” section at the bottom of the summary output. This article aims to shed some light on what this section means and how you might interpret it.
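
As a preview, that section is simply the correlation matrix implied by the fixed effects’ variance-covariance matrix; a minimal sketch using lme4’s built-in sleepstudy data:

    library(lme4)
    fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

    # The "Correlation of Fixed Effects" block appears at the bottom
    summary(fit)

    # It can be reproduced directly from the fixed-effect covariance matrix
    cov2cor(as.matrix(vcov(fit)))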

R, mixed effect models, statistical methods, lme4, Clay Ford

Getting Started with the Kruskal-Wallis Test

One of the best-known statistical tests for analyzing differences between group means is the ANOVA (analysis of variance) test. While ANOVA is a great tool, it assumes that the data in question follow a normal distribution. What if your data don’t follow a normal distribution, or your sample size is too small to assess normality? That’s where the Kruskal-Wallis test comes in.
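
The article itself works in Python, but for R users the test is a one-liner in base R as well; a minimal sketch with simulated skewed data:

    # Simulate three groups with skewed (non-normal) distributions
    set.seed(1)
    values <- c(rexp(15, rate = 1), rexp(15, rate = 1.5), rexp(15, rate = 3))
    group  <- factor(rep(c("A", "B", "C"), each = 15))

    # Kruskal-Wallis rank-sum test: a nonparametric alternative to one-way ANOVA
    kruskal.test(values ~ group)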

Python, statistical methods, nonparametric statistics, kruskal-wallis, Samantha Lomuscio

A Beginner’s Guide to Marginal Effects

What are average marginal effects? If we unpack the phrase, it looks like we have effects that are marginal to something, all of which we average. So let’s look at each piece of this phrase and see if we can help you get a better handle on this topic.
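
To preview where the article ends up: for a logistic regression, an average marginal effect can be approximated by hand as the mean slope of the predicted probabilities. A minimal sketch with simulated data (a numerical-derivative illustration, not necessarily the article’s code):

    # Simulate data and fit a logistic regression
    set.seed(1)
    dat <- data.frame(x = rnorm(500))
    dat$y <- rbinom(500, 1, plogis(-1 + 0.8 * dat$x))
    fit <- glm(y ~ x, family = binomial, data = dat)

    # Marginal effect of x for each observation: the slope of the predicted
    # probability, approximated by nudging x a tiny amount
    h <- 1e-6
    nudged <- transform(dat, x = x + h)
    me <- (predict(fit, nudged, type = "response") -
           predict(fit, dat, type = "response")) / h

    # The average marginal effect is the mean of those per-observation slopes
    mean(me)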

R, logistic regression, statistical methods, marginal effects, marginal means, emmeans, Clay Ford

The Intuition Behind Confidence Intervals

Say it with me: An X% confidence interval captures the population parameter in X% of repeated samples.

In the course of our statistical educations, many of us had that line (or some variant of it) crammed, wedged, stuffed, and shoved into our skulls until definitional precision was leaking out of our noses and pooling on our upper lips like prop blood.

Or, at least, I felt that way.
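
The definition is easy to demonstrate by simulation; a minimal sketch in R, repeatedly sampling from a known population and checking how often the 95% interval captures the true mean:

    set.seed(1)
    mu <- 10  # the true (and, in practice, unknowable) population mean

    # Draw 5,000 samples; record whether each 95% CI captures mu
    captured <- replicate(5000, {
      x  <- rnorm(30, mean = mu, sd = 2)
      ci <- t.test(x)$conf.int
      ci[1] <= mu && mu <= ci[2]
    })

    mean(captured)  # should be close to 0.95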

R, simulation, statistical methods, confidence intervals, Jacob Goldstein-Greenwood

Power and Sample Size Analysis Using Simulation

The power of a test is the probability of correctly rejecting a false null hypothesis. For example, let’s say we suspect a coin is not fair and lands heads 65% of the time.
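
The simulation logic is simple; a minimal sketch for the coin example, assuming 100 flips per experiment and a two-sided test at the 0.05 level:

    set.seed(1)

    # Simulate 2,000 experiments of 100 flips with a biased coin, P(heads) = 0.65
    p_values <- replicate(2000, {
      heads <- rbinom(1, size = 100, prob = 0.65)
      binom.test(heads, n = 100, p = 0.5)$p.value
    })

    # Power: the proportion of experiments that reject the "fair coin" null
    mean(p_values < 0.05)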

R, power analysis, simulation, statistical methods, Clay Ford

Post Hoc Power Calculations Are Not Useful

It is well documented that post hoc power calculations are not useful (Althouse, 2020; Goodman & Berlin, 1994; Hoenig & Heisey, 2001). Also known as observed power or retrospective power, post hoc power purports to estimate the power of a test given an observed effect size. The idea is to show that a “non-significant” hypothesis test failed to achieve significance because it wasn’t powerful enough. This allows researchers to entertain the notion that their hypothesized effect may actually exist; they just needed to use a bigger sample size.
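
To see what the practice looks like (not to endorse it), here is a minimal sketch that plugs a hypothetical observed effect size from a “non-significant” two-sample t-test into a standard power calculation:

    # Hypothetical: 20 subjects per group, an observed standardized mean
    # difference of 0.4, and a "non-significant" result
    power.t.test(n = 20, delta = 0.4, sd = 1, sig.level = 0.05)

    # The resulting "observed power" is low, which post hoc reasoning
    # (wrongly) takes as evidence that a real effect was merely missed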

R, power analysis, simulation, statistical methods, Clay Ford