ERIC Identifier: ED470205
Publication Date: 2002-08-00
Author: Osborne, Jason W. - Waters, Elaine
Source: ERIC Clearinghouse on Assessment and Evaluation College Park MD.
Multiple Regression Assumptions. ERIC Digest.
Most statistical tests rely upon certain assumptions about the variables used
in the analysis. When these assumptions are not met, the results may not be
trustworthy, resulting in a Type I or Type II error, or over- or underestimation
of significance or effect size(s). As Pedhazur (1997, p. 33) notes, "Knowledge
and understanding of the situations when violations of assumptions lead to
serious biases, and when they are of little consequence, are essential to
meaningful data analysis." However, as Osborne, Christensen, and Gunter (2001)
observe, few articles report having tested assumptions of the statistical tests
they rely on for drawing their conclusions. This creates a situation in which we
have a rich literature in education and social science, but we are forced to
call into question the validity of many of these results, conclusions, and
assertions, as we have no idea whether the assumptions of the statistical tests
were met. Our goal for this Digest is to present a discussion of the assumptions
of multiple regression tailored toward the practicing researcher.
Several assumptions of multiple regression are "robust" to violation (e.g.,
normal distribution of errors), and others are fulfilled in the proper design of
a study (e.g., independence of observations). Therefore, we will focus on the
assumptions of multiple regression that are not robust to violation, and that
researchers can deal with if violated. Specifically, we will discuss the
assumptions of normality, linearity, reliability of measurement, and
homoscedasticity.
Regression assumes that variables have
normal distributions. Non-normally distributed variables (highly skewed or
kurtotic variables, or variables with substantial outliers) can distort
relationships and significance tests. There are several pieces of information
that are useful to the researcher in testing this assumption: visual inspection
of data plots, skew, kurtosis, and P-P plots give researchers information about
normality, and Kolmogorov-Smirnov tests provide inferential statistics on
normality. Outliers can be identified either through visual inspection of
histograms or frequency distributions, or by converting data to z-scores.
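
The descriptive checks named above (skew, kurtosis, and z-scores for outlier screening) can be sketched with the Python standard library alone. This is a minimal illustration, not a prescribed procedure: the function names, the |z| > 3 cutoff, and the scores are all assumptions made for the example, and the Kolmogorov-Smirnov test is left to a statistical package.

```python
import statistics

def skewness(values):
    """Population skewness: mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def excess_kurtosis(values):
    """Population excess kurtosis: mean fourth-power z-score minus 3."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 4 for v in values) / len(values) - 3.0

def zscore_outliers(values, cutoff=3.0):
    """Return (index, value, z) for cases whose |z| exceeds the cutoff.

    The cutoff of 3 is an assumed convention, not a value taken from the Digest.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    flagged = []
    for i, v in enumerate(values):
        z = (v - mean) / sd
        if abs(z) > cutoff:
            flagged.append((i, v, z))
    return flagged

# Hypothetical test scores with one suspicious case (illustrative data only).
scores = [52, 55, 57, 58, 60, 61, 61, 62, 63, 64,
          65, 66, 67, 68, 70, 71, 73, 75, 78, 140]

print("skew:", round(skewness(scores), 2))
print("excess kurtosis:", round(excess_kurtosis(scores), 2))
print("flagged cases:", zscore_outliers(scores))
```

In a small sample a single extreme case inflates the standard deviation used to compute its own z-score, so a fixed cutoff is only a rough screen and is best read alongside a histogram.
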
Bivariate/multivariate data cleaning can also be important (Tabachnick &
Fidell, 2001, p. 139) in multiple regression. Most regression or multivariate
statistics texts (e.g., Pedhazur, 1997; Tabachnick & Fidell, 2001) discuss
the examination of standardized or studentized residuals, or indices of
leverage. The removal of univariate and bivariate outliers can reduce the
probability of Type I and Type II errors and improve accuracy of estimates.
Outlier (univariate or bivariate) removal is straightforward in most
statistical software. However, it is not always desirable to remove outliers. In
this case, transformations (e.g., square root, log, or inverse) can improve
normality but complicate the interpretation of the results and should be used
deliberately and in an informed manner. A full treatment of transformations is
beyond the scope of this Digest, but is discussed in many popular statistical
texts.
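
As a rough illustration of how the transformations mentioned above change the shape of a distribution, the sketch below compares skew before and after square root, log, and inverse transformations. The data are hypothetical and the helper name is an assumption of the example. Note that the inverse transformation reverses the order of the cases, which is one of the ways transformations complicate interpretation.

```python
import math
import statistics

def skewness(values):
    """Population skewness: mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical positively skewed variable (illustrative data only).
raw = [210, 225, 240, 250, 265, 280, 300, 330, 370, 420, 500, 640, 900, 1500]

# Each transformation requires strictly positive values.
transforms = {
    "raw": lambda v: v,
    "square root": math.sqrt,
    "log10": math.log10,
    "inverse": lambda v: 1.0 / v,  # reverses the rank order of the cases
}

skews = {name: skewness([f(v) for v in raw]) for name, f in transforms.items()}
for name, value in skews.items():
    print(f"{name:12s} skew = {value:6.2f}")
```
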
Standard multiple regression can only
accurately estimate the relationship between dependent and independent variables
if the relationships are linear in nature. Because there are many instances in
the social sciences in which nonlinear relationships occur (e.g., anxiety), it
is essential to examine analyses for nonlinearity. If the relationship between
independent variables (IV) and the dependent variable (DV) is not linear, the
results of the regression analysis will underestimate the true relationship.
This underestimation carries two risks: increased chance of a Type II error for
that IV, and, in the case of multiple regression, an increased risk of Type I
errors (overestimation) for other IVs that share variance with that IV.
Authors such as Pedhazur (1997), Cohen and Cohen (1983), and Berry and
Feldman (1985) suggest three primary ways to detect nonlinearity. The first
method is to use theory or previous research to inform current analyses.
However, because many prior researchers have probably overlooked the possibility
of nonlinear relationships, this method is not foolproof. A preferable method of
detection is to examine residual plots (plots of the standardized residuals as a
function of standardized predicted values, readily available in most statistical
software).
The third method of detecting curvilinearity is to routinely run regression
analyses that incorporate curvilinear components (squared and cubic terms; see
Goldfeld and Quandt, 1976 or most regression texts for details on how to do
this) or use the nonlinear regression option available in many statistical
packages. It is important that the nonlinear aspects of the relationship be
accounted for in order to best assess the relationship between variables.
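
The third method can be sketched as follows: fit the model with and without a squared term and compare the variance accounted for. This is a minimal standard-library illustration under stated assumptions. The data are hypothetical (an inverted-U relationship between anxiety and performance), the function names are invented for the example, and the predictor is centered before squaring, which is a common practice rather than something the Digest specifies.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R squared)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Hypothetical data: performance rises and then falls as anxiety increases.
anxiety = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
performance = [42, 55, 63, 70, 74, 73, 69, 62, 54, 41]

mean_anxiety = sum(anxiety) / len(anxiety)
centered = [a - mean_anxiety for a in anxiety]   # centering before squaring
squared = [c ** 2 for c in centered]

_, r2_linear = fit_ols([centered], performance)
beta_quad, r2_quadratic = fit_ols([centered, squared], performance)

print(f"R^2, linear term only:    {r2_linear:.3f}")
print(f"R^2, with squared term:   {r2_quadratic:.3f}")
print(f"squared-term coefficient: {beta_quad[2]:.2f}")
```

With data like these, the linear-only model accounts for almost none of the variance even though the two variables are strongly related, which is the underestimation described above.
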
The nature of our educational and
social science research means that many variables we are interested in are also
difficult to measure, making measurement error a particular concern. In simple
correlation and regression, unreliable measurement causes relationships to be
underestimated, increasing the risk of Type II errors. In the case of multiple
regression or partial correlation, effect sizes of other variables can be
overestimated if the covariate is not reliably measured because the full effect
of the covariate(s) would not be removed. This is a significant concern if the
goal of research is to accurately model the "real" relationships evident in the
population. Although most authors assume that reliability estimates (Cronbach
alphas) of .7 to .8 are acceptable (e.g., Nunnally, 1978), and Osborne,
Christensen, and Gunter (2001) report that the average alpha reported in top
educational psychology journals is .83, measurement of this quality still
contains enough measurement error to make correction worthwhile, as illustrated
below.
Correction for low reliability is simple, and widely disseminated in most
texts on regression, but rarely seen in the literature. We argue that authors
should correct for low reliability to obtain a more accurate picture of the
"true" relationship in the population and, in the case of multiple regression or
partial correlation, to avoid over-estimating the effect of another variable.
Since "the presence of measurement errors in behavioral research is the rule
rather than the exception" and "reliabilities of many measures used in the
behavioral sciences are, at best, moderate" (Pedhazur, 1997, p. 172), it is
important that researchers be aware of accepted methods of dealing with this
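issue. For simple regression, the correction for attenuation,
$$r^{*}_{12} = \frac{r_{12}}{\sqrt{r_{11}\, r_{22}}}$$
provides an estimate of the "true" relationship between the IV and DV in the
population. In this equation, r*12 is the corrected correlation, r12 is the
observed correlation, and r11 and r22 are the reliability estimates of the
variables.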
Even in cases in which reliability is .80, correction for attenuation
substantially changes the effect size. When reliability drops to .70 or below,
this correction yields a substantially different picture of the "true" nature of
the relationship and potentially avoids a Type II error. With each independent
variable added to the regression equation, the effects of less-than-perfect
reliability on the strength of the relationship become more complex, and the
results of the analysis become more questionable.
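
For the simple-regression case, the standard correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities. The sketch below applies it across several reliability levels. The function name and the observed correlation of .30 are assumptions chosen for illustration, not values from the Digest.

```python
import math

def correct_for_attenuation(r12, r11, r22):
    """Estimate the correlation between true scores.

    r12 is the observed correlation; r11 and r22 are the reliability
    estimates (e.g., Cronbach's alpha) of the two variables.
    """
    if not (0 < r11 <= 1 and 0 < r22 <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r12 / math.sqrt(r11 * r22)

observed = 0.30  # hypothetical observed correlation
for reliability in (0.95, 0.80, 0.70, 0.60):
    corrected = correct_for_attenuation(observed, reliability, reliability)
    print(f"reliability {reliability:.2f}: corrected r = {corrected:.3f}, "
          f"corrected r^2 = {corrected ** 2:.3f}")
```

With both reliabilities at .80, an observed correlation of .30 corrects to .30 / .80 = .375, so the variance accounted for rises from .09 to about .14. The corrected value is an estimate and can exceed 1.0 when reliabilities are low or poorly estimated, so it should be interpreted with care.
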
Homoscedasticity means that the
variance of errors is the same across all levels of the IV. When the variance of
errors differs at different values of the IV, heteroscedasticity is indicated.
According to Berry and Feldman (1985), slight heteroscedasticity has little
effect on significance tests; however, when heteroscedasticity is marked, it can
lead to serious distortion of findings and seriously weaken the analysis, thus
increasing the possibility of a Type I error.
This assumption can be checked by visual examination of a plot of the
standardized residuals (the errors) by the regression standardized predicted
value. Most modern statistical packages include this as an option.
Ideally, residuals are randomly scattered around 0 (the horizontal line),
providing a relatively even distribution. Heteroscedasticity is indicated when
the residuals are not evenly scattered around the line. There are many forms
heteroscedasticity can take, such as a bow-tie or fan shape. When the plot of
residuals appears to deviate substantially from normal, more formal tests for
heteroscedasticity should be performed. Possible tests for this are the
Goldfeld-Quandt test when the error term either decreases or increases
consistently as the value of the DV increases as shown in the fan-shaped plot,
or the Glejser tests for heteroscedasticity when the error term has small
variances at central observations and larger variance at the extremes of the
observations as in the bow tie-shaped plot (Berry & Feldman, 1985). In cases
where skew is present in the IVs, transformation of variables can reduce the
heteroscedasticity.
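
The idea behind a Goldfeld-Quandt-style check can be sketched in a few lines: order the cases, set aside the central ones, fit a separate regression in the low and high groups, and compare their residual variances. The version below is a simplified illustration under several assumptions: it uses a single IV, orders cases by that IV, drops the middle 20% of cases, and reports only the variance ratio. A formal test would compare that ratio to an F distribution, which is left to a statistical package. The data and function names are hypothetical.

```python
def simple_ols(x, y):
    """Return (intercept, slope) for a one-predictor least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_variance(x, y):
    """Residual sum of squares divided by its degrees of freedom (n - 2)."""
    intercept, slope = simple_ols(x, y)
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return rss / (len(x) - 2)

def variance_ratio(x, y, drop_fraction=0.2):
    """Residual variance in the high-x group divided by that in the low-x group."""
    pairs = sorted(zip(x, y))                   # order cases by the IV
    n_drop = int(len(pairs) * drop_fraction)    # central cases set aside
    n_group = (len(pairs) - n_drop) // 2
    low, high = pairs[:n_group], pairs[-n_group:]
    return residual_variance(*zip(*high)) / residual_variance(*zip(*low))

# Hypothetical data built from a fixed error pattern (illustrative only).
x = list(range(1, 21))
pattern = [1, -1, 0.5, -0.5]
fan = [2 * xi + 0.3 * xi * pattern[i % 4] for i, xi in enumerate(x)]  # spread grows with x
even = [2 * xi + 3.0 * pattern[i % 4] for i, xi in enumerate(x)]      # constant spread

ratio_fan = variance_ratio(x, fan)
ratio_even = variance_ratio(x, even)
print(f"variance ratio, fan-shaped errors: {ratio_fan:.2f}")
print(f"variance ratio, even errors:       {ratio_even:.2f}")
```

A ratio near 1 is what homoscedastic errors would be expected to produce, while a ratio far from 1 in either direction suggests that the error variance changes across levels of the IV.
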
The goal of this Digest was to raise awareness
of the importance of checking assumptions in simple and multiple regression. We
focused on four assumptions that are not highly robust to violation and not
easily handled through study design, that researchers can readily check and
address, and that, in our opinion, carry substantial benefits when checked.
We believe that checking these assumptions carries significant benefits for
the researcher. Making sure an analysis meets the associated assumptions helps
avoid Type I and II errors. Attending to issues such as attenuation due to low
reliability, curvilinearity, and non-normality often boosts effect sizes,
usually a desirable outcome.
Finally, there are many nonparametric statistical techniques available to
researchers when the assumptions of a parametric statistical technique are not
met. Although these are often somewhat lower in power than parametric
techniques, they provide valuable alternatives, and researchers should be
familiar with them.
Berry, W. D., & Feldman, S. (1985). Multiple Regression in Practice. Sage
University Paper Series on Quantitative Applications in the Social Sciences,
Series No. 07-050. Newbury Park, CA: Sage.
Cohen, J., & Cohen, P. (1983). Applied Multiple Regression/Correlation
Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum.
Goldfeld, S. M., & Quandt, R. E. (1976). Studies in Nonlinear Estimation.
Cambridge, MA: Ballinger Publishing Company.
Nunnally, J. C. (1978). Psychometric Theory (2nd ed.). New York: McGraw Hill.
Osborne, J. W., Christensen, W. R., & Gunter, J. (April 2001).
Educational psychology from a statistician's perspective: A review of the power
and goodness of educational psychology research. Paper presented at the annual
meeting of the American Educational Research Association, Seattle, WA.
Pedhazur, E. J. (1997). Multiple Regression in Behavioral Research (3rd ed.).
Orlando, FL: Harcourt Brace.
Tabachnick, B. G., & Fidell, L. S. (2001). Using Multivariate Statistics
(4th ed.). Needham Heights, MA: Allyn and Bacon.