Abstract
Three nonparametric k nearest neighbour (kNN) approaches, namely most similar neighbour inference (MSN), random forests (RF) and random forests based on conditional inference trees (CF), were compared for spatial prediction of standing timber volume with respect to tree species composition and for prediction of stem number distributions over diameter classes. Various metrics derived from airborne laser scanning (ALS) data and characteristics of tree species composition obtained from coarse stand-level ground surveys were used as auxiliary variables. Iterative variable selection showed that only the ALS data formed a relevant set of predictor variables. The three NN approaches were evaluated in terms of bias and root mean squared difference (RMSD) at the plot level and standard errors at the stand level. Spatial correlation was accounted for in the statistical models. While CF and MSN performed similarly well, large biases were observed for RF. The results suggest that the biases in the RF predictions were caused by inherent problems of the RF approach. Example maps of Norway spruce and European beech timber volume were created. The RMSD values of CF at the plot level for total volume and for the species-specific volumes of European beech, Norway spruce, European silver fir and Douglas fir were 32.8, 80.5, 99.0, 137.0 and 261.1%, respectively. These RMSD values were smaller than the standard deviation, even though Douglas fir volume was not among the actual response variables. All three nonparametric approaches were also capable of predicting diameter distributions. The standard errors of the nearest neighbour predictions at the stand level were generally smaller than the standard error of the sample plot inventory. In addition, the model-based approach employed here allowed kNN predictions of means and standard errors for stands without sample plots.
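The plot-level evaluation described in the abstract can be illustrated with a minimal sketch. It is not the MSN, RF or CF procedure used in the study: it swaps in a plain Euclidean-distance kNN with leave-one-out prediction, and it assumes that bias and RMSD are expressed as percentages of the observed mean. The function names and the toy ALS metrics and volumes are invented for illustration only.

```python
import math


def knn_predict(ref_x, ref_y, query, k=1):
    """Mean response of the k reference plots closest to `query`.

    Assumption: plain Euclidean distance in the space of ALS metrics,
    which is simpler than the MSN, RF or CF distance measures.
    """
    ranked = sorted(range(len(ref_x)), key=lambda i: math.dist(ref_x[i], query))
    nearest = ranked[:k]
    return sum(ref_y[i] for i in nearest) / len(nearest)


def leave_one_out(x, y, k=1):
    """Predict each plot from all other plots (leave-one-out)."""
    predictions = []
    for i in range(len(x)):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        predictions.append(knn_predict(rest_x, rest_y, x[i], k))
    return predictions


def relative_bias_and_rmsd(observed, predicted):
    """Return (bias %, RMSD %), both relative to the observed mean (assumed)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    diffs = [p - o for o, p in zip(observed, predicted)]
    bias = sum(diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    return 100.0 * bias / mean_obs, 100.0 * rmsd / mean_obs


if __name__ == "__main__":
    # Hypothetical plots: (mean ALS height in m, canopy cover fraction)
    als_metrics = [(12.0, 0.55), (18.5, 0.70), (25.0, 0.85), (30.5, 0.90), (21.0, 0.75)]
    # Hypothetical observed timber volumes in m^3 per ha
    volumes = [150.0, 260.0, 410.0, 520.0, 330.0]

    predicted = leave_one_out(als_metrics, volumes, k=2)
    bias_pct, rmsd_pct = relative_bias_and_rmsd(volumes, predicted)
    print(f"bias = {bias_pct:.1f}%, RMSD = {rmsd_pct:.1f}%")
```

In practice the ALS metrics would be standardised before distances are computed, since variables on different scales otherwise dominate the neighbour search.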
Acknowledgments
We would like to thank Ron McRoberts for comments on variance estimators, Torsten Hothorn for his advice regarding conditional inference trees, and Andrew Hudak for providing his unpublished papers. Valuable comments from two reviewers helped to improve the quality of the manuscript. This study was funded in part by the WoodWisdom project WW-IRIS.
Additional information
Communicated by T. Knoke.
This article belongs to the special issue “Linking Forest Inventory and Optimization”.
Cite this article
Breidenbach, J., Nothdurft, A. & Kändler, G. Comparison of nearest neighbour approaches for small area estimation of tree species-specific forest inventory attributes in central Europe using airborne laser scanner data. Eur J Forest Res 129, 833–846 (2010). https://doi.org/10.1007/s10342-010-0384-1
DOI: https://doi.org/10.1007/s10342-010-0384-1