ERIC Identifier: ED470205
Publication Date: 20020800
Author: Osborne, Jason W.; Waters, Elaine
Source: ERIC Clearinghouse on Assessment and Evaluation, College Park, MD.

Multiple Regression Assumptions. ERIC Digest.

Most statistical tests rely upon certain assumptions about the variables used in the analysis. When these assumptions are not met, the results may not be trustworthy, resulting in a Type I or Type II error, or over- or underestimation of significance or effect size(s). As Pedhazur (1997, p. 33) notes, "Knowledge and understanding of the situations when violations of assumptions lead to serious biases, and when they are of little consequence, are essential to meaningful data analysis."

However, as Osborne, Christensen, and Gunter (2001) observe, few articles report having tested assumptions of the statistical tests they rely on for drawing their conclusions. This creates a situation in which we have a rich literature in education and social science, but we are forced to call into question the validity of many of these results, conclusions, and assertions, as we have no idea whether the assumptions of the statistical tests were met.

Our goal for this Digest is to present a discussion of the assumptions of multiple regression tailored toward the practicing researcher. Several assumptions of multiple regression are "robust" to violation (e.g., normal distribution of errors), and others are fulfilled in the proper design of a study (e.g., independence of observations). Therefore, we will focus on the assumptions of multiple regression that are not robust to violation, and that researchers can deal with if violated. Specifically, we will discuss the assumptions of normality, linearity, reliability of measurement, and homoscedasticity.

NORMALITY ASSUMPTION

Regression assumes that variables have
normal distributions. Nonnormally distributed variables (highly skewed or
kurtotic variables, or variables with substantial outliers) can distort
relationships and significance tests. There are several pieces of information
that are useful to the researcher in testing this assumption: visual inspection
of data plots, skew, kurtosis, and P-P plots give researchers information about
normality, and Kolmogorov-Smirnov tests provide inferential statistics on
normality. Outliers can be identified either through visual inspection of
histograms or frequency distributions, or by converting data to z-scores.
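
The descriptive checks mentioned above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a substitute for a statistical package. The function names, the `scores` data, and the |z| > 3 cutoff are assumptions made for this example and do not come from the Digest.

```python
import statistics


def skewness(values):
    """Mean cubed z-score (population form); 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def excess_kurtosis(values):
    """Mean fourth-power z-score minus 3; 0 for a normal distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 4 for v in values) / len(values) - 3.0


def zscore_outliers(values, cutoff=3.0):
    """Return values whose z-score exceeds the cutoff in absolute value.

    The cutoff of 3.0 is an illustrative convention, not a rule from the text.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs((v - mean) / sd) > cutoff]


# Hypothetical test scores with one extreme value.
scores = [52, 55, 57, 58, 60, 61, 61, 62, 63, 64,
          65, 66, 67, 68, 70, 71, 73, 75, 78, 140]

outliers = zscore_outliers(scores)
cleaned = [s for s in scores if s not in outliers]

print("flagged as outliers:", outliers)
print("skew before / after removal:", skewness(scores), skewness(cleaned))
print("excess kurtosis before / after:", excess_kurtosis(scores), excess_kurtosis(cleaned))
```

Comparing skew and kurtosis before and after removing the flagged value shows how much a single extreme case can distort the shape of a distribution. Whether to remove, transform, or keep such a case remains a substantive decision for the researcher.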

Bivariate/multivariate data cleaning can also be important (Tabachnick & Fidell, 2001, p. 139) in multiple regression. Most regression or multivariate statistics texts (e.g., Pedhazur, 1997; Tabachnick & Fidell, 2001) discuss the examination of standardized or studentized residuals, or indices of leverage.

The removal of univariate and bivariate outliers can reduce the probability of Type I and Type II errors and improve accuracy of estimates. Outlier (univariate or bivariate) removal is straightforward in most statistical software. However, it is not always desirable to remove outliers. In this case, transformations (e.g., square root, log, or inverse) can improve normality but complicate the interpretation of the results and should be used deliberately and in an informed manner. A full treatment of transformations is beyond the scope of this Digest, but is discussed in many popular statistical textbooks.

LINEARITY ASSUMPTION

Standard multiple regression can only
accurately estimate the relationship between dependent and independent variables
if the relationships are linear in nature. Because there are many instances in
the social sciences in which nonlinear relationships occur (e.g., anxiety), it
is essential to examine analyses for nonlinearity. If the relationship between
independent variables (IV) and the dependent variable (DV) is not linear, the
results of the regression analysis will underestimate the true relationship.
This underestimation carries two risks: increased chance of a Type II error for
that IV, and, in the case of multiple regression, an increased risk of Type I
errors (overestimation) for other IVs that share variance with that IV.
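
To illustrate the underestimation described above, the following sketch fits a straight line and then a model with an added squared term to simulated inverted-U data. The data, the variable names (`anxiety`, `performance`), and the helper `ols_r_squared` are hypothetical and chosen only for illustration; in practice a statistical package would do the fitting.

```python
import random


def ols_r_squared(predictors, y):
    """R^2 from an ordinary least squares fit of y on the predictor columns.

    An intercept is added automatically. The normal equations are solved by
    Gauss-Jordan elimination, which is adequate for a small illustration.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))]
           for a in range(k)]
    for i in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for j in range(k):
            if j != i:
                factor = aug[j][i] / aug[i][i]
                aug[j] = [aj - factor * ai for aj, ai in zip(aug[j], aug[i])]
    coefs = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Simulated inverted-U relationship: performance peaks at moderate anxiety.
random.seed(0)
anxiety = [i / 10 for i in range(101)]
performance = [-(a - 5.0) ** 2 + random.gauss(0, 2) for a in anxiety]

linear_r2 = ols_r_squared([anxiety], performance)
quadratic_r2 = ols_r_squared([anxiety, [a ** 2 for a in anxiety]], performance)

print("R^2, linear term only:", linear_r2)
print("R^2, linear + squared term:", quadratic_r2)
```

In data of this shape the straight-line model accounts for almost none of the variance even though the two variables are strongly related, which is the Type II error risk the text describes.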

Authors such as Pedhazur (1997), Cohen and Cohen (1983), and Berry and Feldman (1985) suggest three primary ways to detect nonlinearity. The first method is to use theory or previous research to inform current analyses. However, because many prior researchers have probably overlooked the possibility of nonlinear relationships, this method is not foolproof. A preferable method of detection is to examine residual plots (plots of the standardized residuals as a function of standardized predicted values, readily available in most statistical software). The third method of detecting curvilinearity is to routinely run regression analyses that incorporate curvilinear components (squared and cubic terms; see Goldfeld and Quandt, 1976 or most regression texts for details on how to do this) or use the nonlinear regression option available in many statistical packages. It is important that the nonlinear aspects of the relationship be accounted for in order to best assess the relationship between variables.

RELIABILITY ASSUMPTION

The nature of our educational and
social science research means that many variables we are interested in are also
difficult to measure, making measurement error a particular concern. In simple
correlation and regression, unreliable measurement causes relationships to be
underestimated, increasing the risk of Type II errors. In the case of multiple
regression or partial correlation, effect sizes of other variables can be
overestimated if the covariate is not reliably measured because the full effect
of the covariate(s) would not be removed. This is a significant concern if the
goal of research is to accurately model the "real" relationships evident in the
population. Although most authors assume that reliability estimates (Cronbach
alphas) of .7 to .8 are acceptable (e.g., Nunnally, 1978), and Osborne,
Christensen, and Gunter (2001) report that the average alpha reported in top
educational psychology journals is .83, measurement of this quality still
contains enough measurement error to make correction worthwhile, as illustrated
below.

Correction for low reliability is simple and is described in most texts on regression, but it is rarely seen in the literature. We argue that authors should correct for low reliability to obtain a more accurate picture of the "true" relationship in the population and, in the case of multiple regression or partial correlation, to avoid overestimating the effect of another variable. Since "the presence of measurement errors in behavioral research is the rule rather than the exception" and "reliabilities of many measures used in the behavioral sciences are, at best, moderate" (Pedhazur, 1997, p. 172), it is important that researchers be aware of accepted methods of dealing with this issue.

For simple regression, $r^{*}_{12} = r_{12} / \sqrt{r_{11} r_{22}}$ provides an estimate of the "true" relationship between the IV and DV in the population. In this equation, $r_{12}$ is the observed correlation, and $r_{11}$ and $r_{22}$ are the reliability estimates of the variables. Even in cases in which reliability is .80, correction for attenuation substantially changes the effect size. For example, an observed correlation of .40 between two variables that each have a reliability of .80 corrects to .40/.80 = .50, so the variance accounted for rises from 16% to 25%. When reliability drops to .70 or below, this correction yields a substantially different picture of the "true" nature of the relationship and potentially avoids a Type II error. With each independent variable added to the regression equation, the effects of less-than-perfect reliability on the strength of the relationship become more complex, and the results of the analysis become more questionable.

HOMOSCEDASTICITY ASSUMPTION

Homoscedasticity means that the
variance of errors is the same across all levels of the IV. When the variance of
errors differs at different values of the IV, heteroscedasticity is indicated.
According to Berry and Feldman (1985), slight heteroscedasticity has little
effect on significance tests; however, when heteroscedasticity is marked, it can
lead to serious distortion of findings and seriously weaken the analysis, thus
increasing the possibility of a Type I error.
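
The definition above can be made concrete with a small simulation. The sketch below generates one data set with constant error variance and one in which the error spread grows with the IV, fits a simple regression to each, and compares the residual variance in the top third of IV values with that in the bottom third. The simulated data, the function names, and the choice of thirds are assumptions for illustration; the ratio is a rough informal check, not a formal significance test.

```python
import random
import statistics


def simple_ols_residuals(x, y):
    """Residuals from a least squares fit of y on a single predictor x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def residual_variance_ratio(x, residuals):
    """Residual variance in the top third of x divided by the bottom third.

    Values near 1 are consistent with homoscedasticity; values far from 1
    suggest that the error variance changes with x.
    """
    ordered = [r for _, r in sorted(zip(x, residuals))]
    third = len(ordered) // 3
    return statistics.pvariance(ordered[-third:]) / statistics.pvariance(ordered[:third])


random.seed(1)
x = [random.uniform(1, 10) for _ in range(300)]
# Constant error spread at every value of x.
even_y = [2 + 0.5 * xi + random.gauss(0, 1.0) for xi in x]
# Error spread proportional to x, which produces a fan-shaped residual plot.
fan_y = [2 + 0.5 * xi + random.gauss(0, 0.3 * xi) for xi in x]

even_ratio = residual_variance_ratio(x, simple_ols_residuals(x, even_y))
fan_ratio = residual_variance_ratio(x, simple_ols_residuals(x, fan_y))

print("variance ratio, constant error spread:", even_ratio)
print("variance ratio, spread growing with x:", fan_ratio)
```

A ratio close to 1 is what homoscedastic errors should produce, while the second data set yields a much larger ratio because its errors are more variable at high values of the IV.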

This assumption can be checked by visual examination of a plot of the standardized residuals (the errors) by the regression standardized predicted value. Most modern statistical packages include this as an option. Ideally, residuals are randomly scattered around 0 (the horizontal line), providing a relatively even distribution. Heteroscedasticity is indicated when the residuals are not evenly scattered around the line. There are many forms heteroscedasticity can take, such as a bow tie or fan shape. When the plot of residuals appears to deviate substantially from normal, more formal tests for heteroscedasticity should be performed. Possible tests for this are the Goldfeld-Quandt test when the error term either decreases or increases consistently as the value of the DV increases as shown in the fan-shaped plot, or the Glejser tests for heteroscedasticity when the error term has small variances at central observations and larger variance at the extremes of the observations as in the bow tie-shaped plot (Berry & Feldman, 1985). In cases where skew is present in the IVs, transformation of variables can reduce the heteroscedasticity.

CONCLUSION

The goal of this Digest was to raise awareness
of the importance of checking assumptions in simple and multiple regression. We
focused on four assumptions that are neither highly robust to violation nor
easily handled through study design, that researchers can readily check and
address, and that, in our opinion, carry substantial benefits when attended to.

We believe that checking these assumptions carries significant benefits for the researcher. Making sure an analysis meets the associated assumptions helps avoid Type I and II errors. Attending to issues such as attenuation due to low reliability, curvilinearity, and nonnormality often boosts effect sizes, usually a desirable outcome. Finally, there are many nonparametric statistical techniques available to researchers when the assumptions of a parametric statistical technique are not met. Although these are often somewhat lower in power than parametric techniques, they provide valuable alternatives, and researchers should be familiar with them.

REFERENCES

Berry, W. D., & Feldman, S. (1985). Multiple Regression in Practice. Sage University Paper Series on Quantitative Applications in the Social Sciences, Series No. 07-050. Newbury Park, CA: Sage Publications, Inc.

Cohen, J., & Cohen, P. (1983). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Goldfeld, S. M., & Quandt, R. E. (1976). Studies in Nonlinear Estimation. Cambridge, MA: Ballinger Publishing Company.

Nunnally, J. C. (1978). Psychometric Theory (2nd ed.). New York: McGraw-Hill.

Osborne, J. W., Christensen, W. R., & Gunter, J. (2001, April). Educational psychology from a statistician's perspective: A review of the power and goodness of educational psychology research. Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.

Pedhazur, E. J. (1997). Multiple Regression in Behavioral Research (3rd ed.). Orlando, FL: Harcourt Brace.

Tabachnick, B. G., & Fidell, L. S. (2001). Using Multivariate Statistics (4th ed.). Needham Heights, MA: Allyn and Bacon.
