Modified 4 Jun 2024; Added by Bas Kempen
Tutorial
Machine Learning for Digital Soil Mapping
This tutorial shows how to fit a random forest model to soil data and how to use this model to predict a soil property of interest across a mapping area, using the statistical software R. The processing steps are illustrated with a sample dataset from North Macedonia. The tutorial consists of five parts:
- ‘Model fit’: fitting a random forest model with the ranger package.
- ‘Recursive feature elimination’: method for removing redundant covariates from the covariate stack.
- ‘Model accuracy’: assessing accuracy statistics of the fitted random forest model.
- ‘Spatial prediction’: applying the model to create a gridded soil map for an area of interest.
- ‘Uncertainty assessment’: calculating the 90% prediction interval with the quantile random forest method.
After completing this tutorial you will be able to:
- Train a random forest model with the ranger package.
- Assess and interpret the model outputs, including the accuracy statistics.
- Apply the model to a covariate stack to generate a gridded soil map.
- Quantify prediction uncertainty of a trained random forest model.
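The four outcomes above map onto a short ranger workflow. The sketch below is a minimal illustration under stated assumptions, not the tutorial's own code: the point data are synthetic (not the North Macedonia dataset), the target `soc` and the covariates `elevation`, `slope` and `ndvi` are hypothetical names, and a plain data-frame grid stands in for a raster covariate stack.

```r
library(ranger)

set.seed(1)

# Synthetic stand-in for a soil point dataset (NOT the North Macedonia data).
# 'soc' and the three covariates are hypothetical names used for illustration.
n <- 300
pts <- data.frame(
  elevation = runif(n, 200, 2000),
  slope     = runif(n, 0, 40),
  ndvi      = runif(n, 0, 1)
)
pts$soc <- 5 + 0.01 * pts$elevation + 10 * pts$ndvi + rnorm(n, sd = 2)
covs <- c("elevation", "slope", "ndvi")

# Model fit: quantreg = TRUE keeps what is needed for quantile predictions.
fit <- ranger(soc ~ ., data = pts, num.trees = 500,
              importance = "permutation", quantreg = TRUE, seed = 1)

# Model accuracy: out-of-bag statistics reported by ranger.
oob_rmse <- sqrt(fit$prediction.error)  # prediction.error is the OOB MSE
oob_r2   <- fit$r.squared

# Spatial prediction: predict at every cell of the (assumed) covariate grid.
grid <- expand.grid(
  elevation = seq(200, 2000, length.out = 20),
  slope     = seq(0, 40, length.out = 5),
  ndvi      = seq(0, 1, length.out = 5)
)
grid$soc_pred <- predict(fit, data = grid[, covs])$predictions

# Uncertainty: the 0.05 and 0.95 quantiles bound a 90% prediction interval.
q <- predict(fit, data = grid[, covs], type = "quantiles",
             quantiles = c(0.05, 0.95))$predictions
grid$pi_lower <- q[, 1]
grid$pi_upper <- q[, 2]
grid$pi_width <- grid$pi_upper - grid$pi_lower
```

With real data, the grid would typically be replaced by a raster covariate stack, and `pi_width` can be mapped as a per-cell indication of prediction uncertainty.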
Materials
License: This tutorial is released under the GNU GPL v3.0 license, a strong copyleft license. This means that you may use and modify the code. If you distribute copies or modifications of the code, you are required to release them under the GPL v3 license.
Disclaimer: This tutorial is provided without warranty. ISRIC is not obliged to provide updates or “bug fixes” of any kind. ISRIC will not provide user support for this tutorial.
Even though this tutorial was created with the utmost care, ISRIC cannot be held liable for any damage caused by using this tutorial or any of its content, in whatever form, whether or not caused by possible errors or faults, nor for any consequences thereof.