What the Heck Are Sums of Squares in Regression?
In regression, "sums of squares" are used to represent variation. In this post, we’ll use some sample data to walk through these calculations. The sample data used in this post is available within...
Creating Value from Your Data
There may be huge potential benefits waiting in the data in your servers. These data may be used for many different purposes. Better data allows better decisions, of course. Banks, insurance firms, and...
How to Save a Failing Regression with PLS
Face it, you love regression analysis as much as I do. Regression is one of the most satisfying analyses in Minitab: get some predictors that should have a relationship to a response, go through a...
How to Identify the Most Important Predictor Variables in Regression Models
You’ve performed multiple linear regression and have settled on a model which contains several predictor variables that are statistically significant. At this point, it’s common to ask, “Which variable...
Problems Using Data Mining to Build Regression Models
Data mining uses algorithms to explore correlations in data sets. An automated procedure sorts through large numbers of variables and includes them in the model based on statistical significance alone....
Problems Using Data Mining to Build Regression Models, Part Two
Data mining can be helpful in the exploratory phase of an analysis. If you're in the early stages and you're just figuring out which predictors are potentially correlated with your response variable,...
So Why Is It Called "Regression," Anyway?
Did you ever wonder why statistical analyses and concepts often have such weird, cryptic names? One conspiracy theory points to the workings of a secret committee called the ICSSNN. The International...
R-Squared: Sometimes, a Square is just a Square
If you regularly perform regression analysis, you know that R² is a statistic used to evaluate the fit of your model. You may even know the standard definition of R²: the percentage of variation in the...
Gleaning Insights from Election Data with Basic Statistical Tools
One of the biggest pieces of international news last year was the so-called "Brexit" referendum, in which a majority of voters in the United Kingdom cast their ballots to leave the European Union...
What Is the Difference between Linear and Nonlinear Equations in Regression...
Previously, I’ve written about when to choose nonlinear regression and how to model curvature with both linear and nonlinear regression. Since then, I’ve received several comments expressing confusion...
How to Estimate the Probability of a No-Show using Binary Logistic Regression
In April 2017, overbooking of flight seats hit the headlines when a United Airlines customer was dragged off a flight. A TED talk by Nina Klietsch gives a good but simplistic explanation of why...
The Easiest Way to Do Multiple Regression Analysis
Maybe you're just getting started with analyzing data. Maybe you're reasonably knowledgeable about statistics, but it's been a long time since you did a particular analysis and you feel a little bit...
How to Avoid Overfitting Your Regression Model
Overfitting a model is a real problem you need to beware of when performing regression analysis. An overfit model results in misleading regression coefficients, p-values, and R-squared statistics....
Fighting Wildfires with Statistical Analysis
Wildfires in California have killed at least 40 people and burned more than 217,000 acres in the past few weeks. Nearly 8,000 firefighters are trying to contain the blazes with the aid of more than 800...
Can Regression and Statistical Software Help You Find a Great Deal on a Used...
You need to consider many factors when you’re buying a used car. Once you narrow your choice down to a particular car model, you can get a wealth of information about individual cars on the market...