Plotting partial correlation and regression in ecological studies
Depto. de Ecología Funcional y Evolutiva, Estación Experimental de Zonas Áridas, CSIC, General Segura, 1, Almería, 04001 Almería, Spain
Abstract. Multiple regression, the General linear model (GLM) and the Generalized linear model (GLZ) are widely used in ecology. In simple linear regression, patterns are commonly documented with graphs that include the fitted regression line; this practice extends easily to these multivariate techniques through plots that show the partial relationship of the dependent variable with each independent variable. However, such partial plots are not nearly as widely used in ecological studies. In fact, a brief review of the recent ecological literature showed that in ca. 20% of the papers the results of multiple regression are displayed by plotting the dependent variable against the raw values of the independent variable. Plotting against raw values may be misleading because the partial slope may differ in magnitude, and even in sign, from the slope obtained in simple least-squares regression. Plots of partial relationships should be used in these situations. Using numerical simulations and real data, we show that plots of partial relationships are also useful for: 1) visualizing the true scatter of points around the partial regression line, and 2) identifying influential observations and non-linear patterns more efficiently than plots of residuals vs. fitted values. To help in the assessment of data quality, we show that partial residual plots (residuals from the overall model + predicted values from the explanatory variable vs. the explanatory variable) should only be used in restricted situations, and that partial regression plots (residuals of Y on the remaining explanatory variables vs. residuals of the target explanatory variable on the remaining explanatory variables) should be the ones displayed in publications because they accurately reflect the scatter of partial correlations.
Similarly, these partial plots can be applied to visualize the effect of continuous variables in GLM and GLZ for normal distributions and identity link functions.
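The two plot types defined in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the simulations or data used in the paper: the simulated data, the variable names (x1, x2, y) and the helper functions are all hypothetical. It builds the coordinates of a partial regression plot and of a partial residual plot for one target explanatory variable, and compares the simple slope with the partial slope.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def partial_regression_coords(y, x_target, x_others):
    """Partial regression plot: residuals of the target variable on the other
    explanatory variables (x-axis) and residuals of Y on the other
    explanatory variables (y-axis)."""
    Z = np.column_stack([np.ones(len(y)), x_others])
    res_x = x_target - Z @ ols_fit(Z, x_target)
    res_y = y - Z @ ols_fit(Z, y)
    return res_x, res_y

def partial_residual_coords(y, x_target, x_others):
    """Partial residual plot: the target variable (x-axis) and the residuals of
    the overall model plus the fitted contribution of the target (y-axis)."""
    X = np.column_stack([np.ones(len(y)), x_target, x_others])
    beta = ols_fit(X, y)
    residuals = y - X @ beta
    return x_target, residuals + beta[1] * x_target

# Hypothetical simulated data: x1 and x2 are positively correlated, and the
# partial effect of x1 on y is negative.
rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + 0.6 * rng.normal(size=n)
y = 1.0 - 0.5 * x1 + 1.5 * x2 + rng.normal(scale=0.5, size=n)

# Slope of y on the raw values of x1 (simple least-squares regression).
simple_slope = np.polyfit(x1, y, 1)[0]

# Partial slope of x1 in the multiple regression y ~ x1 + x2.
partial_slope = ols_fit(np.column_stack([np.ones(n), x1, x2]), y)[1]

# Slope and correlation displayed by the partial regression plot.
rx, ry = partial_regression_coords(y, x1, x2)
partial_regression_slope = (rx @ ry) / (rx @ rx)
partial_regression_corr = np.corrcoef(rx, ry)[0, 1]

# Slope and correlation displayed by the partial residual plot.
px, py = partial_residual_coords(y, x1, x2)
partial_residual_slope = np.polyfit(px, py, 1)[0]
partial_residual_corr = np.corrcoef(px, py)[0, 1]

print("simple slope:", simple_slope)
print("partial slope:", partial_slope)
print("partial regression plot (slope, r):", partial_regression_slope, partial_regression_corr)
print("partial residual plot (slope, r):", partial_residual_slope, partial_residual_corr)
```

In this sketch both plots recover the partial slope, but only the partial regression plot displays the partial correlation. When the target variable is correlated with the other explanatory variables, the partial residual plot shows a correlation that is at least as strong in absolute value, so it can understate the scatter around the partial regression line.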