Curious or spurious correlations within a national-scale forest inventory?

Informally Refereed

Abstract

Foresters are increasingly required to assess trends not only in traditional forest attributes (e.g., growing-stock volumes), but also across suites of forest health indicators and site/climate variables. Given the tenuous relationship between correlation and causality within extremely large datasets, the goal of this study was to use a nationwide annual forest inventory to determine levels of correlation among a wide array of database fields, to aid foresters in separating correlation from causality in comprehensive forest resource assessments. In examining more than 15,000 individual correlations, we found that the overwhelming majority (>85 percent) of correlation coefficients were below 0.1. Site variables (e.g., elevation) had the highest mean correlations with all other variables, while tree variables (e.g., live aboveground biomass) had the lowest. Nearly all the high correlations (>0.6) were between substantially autocorrelated variables (e.g., site class code and site index). Because most correlations within a large-scale forest inventory dataset are very low, and the remainder are nonsensical or autocorrelated, a highly correlated pair of variables with no apparent autocorrelation deserves further exploration.
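
The kind of screening the abstract describes can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not name the correlation statistic or say whether the thresholds apply to absolute values, so the sketch assumes Pearson's r and compares |r| against 0.1 and 0.6. The field names and values in `plots` are hypothetical toy data, not inventory records.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant field
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Return {(field_a, field_b): r} for every unordered pair of fields."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(sorted(columns), 2)
    }


def summarize(correlations, low=0.1, high=0.6):
    """Share of pairs with |r| < low, and the pairs with |r| > high."""
    defined = {p: r for p, r in correlations.items() if not math.isnan(r)}
    share_low = sum(abs(r) < low for r in defined.values()) / len(defined)
    high_pairs = sorted(p for p, r in defined.items() if abs(r) > high)
    return share_low, high_pairs


# Hypothetical toy data: one list of per-plot values for each field.
plots = {
    "elevation": [100, 200, 300, 400, 500],
    "site_index": [50, 60, 70, 80, 90],
    "site_class": [5, 4, 3, 2, 1],
    "biomass": [3, 1, 4, 1, 5],
}

correlations = pairwise_correlations(plots)
share_low, high_pairs = summarize(correlations)
```

In this toy example the high-correlation pairs are the ones built to move together (such as site class and site index), which mirrors the abstract's point that high correlations tend to flag related fields rather than causal findings.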

Citation

Woodall, Christopher W.; Westfall, James A. 2012. Curious or spurious correlations within a national-scale forest inventory? In: McWilliams, Will; Roesch, Francis A. eds. 2012. Monitoring Across Borders: 2010 Joint Meeting of the Forest Inventory and Analysis (FIA) Symposium and the Southern Mensurationists. e-Gen. Tech. Rep. SRS-157. Asheville, NC: U.S. Department of Agriculture Forest Service, Southern Research Station. 39-43.
https://www.fs.usda.gov/research/treesearch/40970