Applying Nearest Neighbor Gaussian Processes to Massive Spatial Data Sets: Forest Canopy Height Prediction Across Tanana Valley Alaska

2017 
Light detection and ranging (LiDAR) data provide critical information on the three-dimensional structure of forests. However, collecting wall-to-wall LiDAR data at regional and global scales is cost-prohibitive. As a result, studies employing LiDAR data from airborne platforms typically collect data via strip sampling, leaving large swaths of the forest domain unmeasured by the instrument. Frameworks that accommodate incomplete LiDAR coverage are essential to advance our understanding of forest structure and to begin effectively monitoring forest resource dynamics over time. Here, we define and assess several spatial regression models capable of delivering complete-coverage forest canopy height prediction maps, with associated uncertainty estimates, from sparsely sampled LiDAR data. Despite the sparsity of the LiDAR data considered, the number of observations is large, e.g., n = 5 × 10^6. The computational hurdles associated with developing the desired data products are overcome by using highly scalable hierarchical Nearest Neighbor Gaussian Process (NNGP) models. We outline new Markov chain Monte Carlo (MCMC) algorithms that provide improved convergence and run time over existing algorithms. We also propose an MCMC-free hybrid implementation of NNGP. We assess the computational and inferential benefits of these alternate NNGP specifications using simulated data sets and LiDAR data collected over the US Forest Service Tanana Inventory Unit (TIU) in a remote portion of Interior Alaska. The resulting data product is the first statistically robust map of forest canopy for the TIU.