Classifying forest inventory data into species-based forest community types at broad extents: exploring tradeoffs among supervised and unsupervised approaches


Background: Knowledge of the different kinds of tree communities that currently exist can provide a baseline for assessing the ecological attributes of forests and monitoring future changes. Forest inventory data can facilitate the development of this baseline knowledge across broad extents, but they first must be classified into forest community types. Here, we compared three alternative classifications across the United States using data from over 117,000 U.S. Department of Agriculture Forest Service Forest Inventory and Analysis (FIA) plots.

Methods: Each plot had three forest community type labels: (1) “FIA” types were assigned by the FIA program using a supervised method; (2) “USNVC” types were assigned via a key based on the U.S. National Vegetation Classification; (3) “empirical” types resulted from unsupervised clustering of tree species information. We assessed the degree to which analog classes occurred among classifications, compared indicator species values, and used random forest models to determine how well the classifications could be predicted using environmental variables.
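As a rough illustration of what an indicator species value measures, the sketch below computes the widely used Dufrêne-Legendre IndVal (specificity times fidelity, scaled to 0-100) for each species in each class. The abstract does not say which indicator metric the study used, so the choice of IndVal is an assumption, and the function name `indicator_values`, the plot data, and the class labels "A" and "B" are hypothetical toy values rather than FIA data.

```python
from collections import defaultdict


def indicator_values(abundances, labels):
    """Return {(species, group): IndVal in [0, 100]}.

    abundances: list of dicts, one per plot, mapping species -> abundance
    labels: list of group (class) labels, one per plot
    """
    plots_per_group = defaultdict(int)
    abundance_sum = defaultdict(float)  # (species, group) -> summed abundance
    presence_count = defaultdict(int)   # (species, group) -> plots where present
    species_seen = set()

    for plot, group in zip(abundances, labels):
        plots_per_group[group] += 1
        for species, value in plot.items():
            if value > 0:
                abundance_sum[(species, group)] += value
                presence_count[(species, group)] += 1
                species_seen.add(species)

    result = {}
    for species in species_seen:
        # Mean abundance of the species within each group (absent plots count as 0).
        mean_by_group = {
            g: abundance_sum[(species, g)] / n for g, n in plots_per_group.items()
        }
        total_mean = sum(mean_by_group.values())
        for g, n in plots_per_group.items():
            # Specificity: share of the species' mean abundance found in this group.
            specificity = mean_by_group[g] / total_mean
            # Fidelity: share of this group's plots where the species is present.
            fidelity = presence_count[(species, g)] / n
            result[(species, g)] = 100.0 * specificity * fidelity
    return result


# Toy data (hypothetical): four plots assigned to two classes.
plots = [
    {"longleaf pine": 10, "loblolly pine": 2},
    {"longleaf pine": 8},
    {"loblolly pine": 5, "red maple": 3},
    {"loblolly pine": 4, "red maple": 1},
]
classes = ["A", "A", "B", "B"]
values = indicator_values(plots, classes)
```

In this toy case a species confined to one class and present in all of its plots scores 100 for that class, while a species spread across both classes scores lower in each, which mirrors the abstract's point that wide-ranging species make classes harder to distinguish.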

Results: The classifications generated groups of classes that had broadly similar distributions, but often there was no one-to-one analog across the classifications. The longleaf pine forest community type stood out as the exception: it was the only class with strong analogs across all classifications. Analogs were most lacking for forest community types with species that occurred across a range of geographic and environmental conditions, such as loblolly pine types. Indicator species metrics were generally high for the USNVC, suggesting that USNVC classes are floristically well-defined. The empirical classification was best predicted by environmental variables. The most important predictors differed slightly but were broadly similar across all classifications, and included slope, amount of forest in the surrounding landscape, average minimum temperature, and other climate variables.

Conclusions: The classifications have similarities and differences that reflect their differing approaches and objectives. They are most consistent for forest community types that occur in a relatively narrow range of environmental conditions, and differ most for types with wide-ranging tree species. Environmental variables at a variety of scales were important for predicting all classifications, though strongest for the empirical and FIA classifications, suggesting that each is useful for studying how forest communities respond to multi-scale environmental processes, including global change drivers.

  • Citation: Costanza, Jennifer K.; Faber-Langendoen, Don; Coulston, John W.; Wear, David N. 2018. Classifying forest inventory data into species-based forest community types at broad extents: exploring tradeoffs among supervised and unsupervised approaches. Forest Ecosystems. 5(1): 121-.
  • Keywords: Big data, Correspondence analysis, Dominant species, Forest communities, Global change, Hierarchical classification, Indicator species, Random forests, Species assemblages
  • Posted Date: May 23, 2018
  • Modified Date: September 6, 2018
    Publication Notes

    • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.