Chapter 5 Generalized Least Squares (GLS)

5.1 Exercise 1

This first exercise will consider data collected by Eric North, an urban forester who took my class the first or second year I taught it. Some of Eric’s master’s research was aimed at predicting damage to sidewalks from boulevard trees planted between the curb and sidewalk (North, Johnson, & Burk, 2015). The degree of damage largely depends on how much the trunk of the tree spreads out (measured by “trunk flare”; Figure 5.1, right), which can be predicted from the diameter at breast height (Figure 5.1, left).

Figure 5.1: Measurements of trunk flare diameter (right photo) and diameter at breast height (DBH, left photo).

We will explore linear models relating trunk flare circumference (\(Y\)) to diameter at breast height (\(X\)). As noted by Hilbert et al. (2020) and North et al. (2015), there are several potential uses of this type of model, including:

  • determining planting space needed for trees based on their potential size
  • identifying areas where there might be current or future damage to infrastructure due to tree roots
  • estimating stump size for cost estimates associated with stump removal.

The data for this set of exercises is contained in the Data4Ecologists package and can be accessed using:

library(Data4Ecologists)
data(trunkfl)

  1. Subset the data to include only Acer saccharinum (Species = 4; silver maple).
  2. Fit a linear model relating trunk flare (\(Y\)) to diameter at breast height (\(X\)) using lm. Evaluate the assumptions of the model using residual plots.
  3. Fit a GLS model in which the variance is assumed to increase with DBH, using gls with the varPower variance function (both from the nlme package).
  4. Describe the fitted GLS model using a set of equations. Match the parameters in your equations to the output generated using the summary function applied to your fitted GLS model.
  5. Create a plot of Pearson residuals, \(r_i = \frac{Y_i - \hat{Y}_i}{\sqrt{\widehat{var}(Y_i)}}\) versus \(\hat{Y}_i\) (you can use plot(modelname) to accomplish this task). Comment on whether you think the GLS model is appropriate.
  6. Use AIC to compare the fit of the standard linear model and the fit of the GLS model (use AIC(model1, model2), where model1 and model2 are the names you assign to the two fitted models). Before making this comparison, refit the linear model using gls (this will ensure both models are fit using “REML”, which will be discussed in a later chapter). Which model appears to give the better fit (note: lower AIC is “better”)?
  7. Inspect the estimated regression coefficients and their standard errors. Do they change much when you account for non-constant variance?
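
The workflow in steps 2 through 7 can be sketched in R as follows. This is only an illustration under stated assumptions, not a solution for the trunkfl data: it uses simulated data, and the names sim, dbh, and flare are hypothetical stand-ins for whatever the subsetted data frame and its columns are called. With varPower, the fitted model has the form \(Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\) with \(\epsilon_i \sim N(0, \sigma^2 |X_i|^{2\delta})\). In the summary output, \(\beta_0\) and \(\beta_1\) appear in the coefficient table, \(\delta\) is reported as the "power" parameter of the variance function, and \(\sigma\) is the residual standard error.

```r
library(nlme)  # provides gls() and varPower()

# Simulated stand-in for the subsetted data (hypothetical names: dbh, flare).
set.seed(1)
n <- 200
dbh <- runif(n, min = 5, max = 80)
# Mean is linear in dbh; the error SD grows as a power of dbh (sigma = 0.3, delta = 0.8).
flare <- 10 + 1.5 * dbh + rnorm(n, mean = 0, sd = 0.3 * dbh^0.8)
sim <- data.frame(dbh = dbh, flare = flare)

# Step 2: ordinary linear model; plot(fit_lm) would show the usual residual plots.
fit_lm <- lm(flare ~ dbh, data = sim)

# Step 3: GLS model with Var(Y_i) = sigma^2 * |dbh_i|^(2 * delta).
fit_gls1 <- gls(flare ~ dbh, weights = varPower(form = ~ dbh), data = sim)
summary(fit_gls1)

# Step 4: extract the variance parameters to match against the equations.
delta_hat <- coef(fit_gls1$modelStruct$varStruct, unconstrained = FALSE)
sigma_hat <- fit_gls1$sigma

# Step 5: Pearson residuals versus fitted values.
r_pearson <- resid(fit_gls1, type = "pearson")
plot(fitted(fit_gls1), r_pearson,
     xlab = "Fitted values", ylab = "Pearson residuals")
abline(h = 0, lty = 2)

# Step 6: refit the constant-variance model with gls() (REML by default),
# then compare the two fits with AIC.
fit_gls0 <- gls(flare ~ dbh, data = sim)
aic_tab <- AIC(fit_gls0, fit_gls1)
aic_tab

# Step 7: coefficients and standard errors under each variance assumption.
coef(summary(fit_lm))
summary(fit_gls1)$tTable
```

If the variance model is adequate, the Pearson residuals should show roughly constant spread across the fitted values, in contrast to the fan shape expected in the residuals from the ordinary linear model.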

References

Hilbert, D. R., North, E. A., Hauer, R. J., Koeser, A. K., McLean, D. C., Northrop, R. J., … Parbs, S. (2020). Predicting trunk flare diameter to prevent tree damage to infrastructure. Urban Forestry & Urban Greening, 49, 126645.
North, E. A., Johnson, G. R., & Burk, T. E. (2015). Trunk flare diameter predictions as an infrastructure planning tool to reduce tree and sidewalk conflicts. Urban Forestry & Urban Greening, 14(1), 65–71.