Digital Soil Mapping

Digital soil mapping is the creation of spatial soil information systems using field and laboratory methods coupled with spatial and non-spatial soil inference systems (Lagacherie, McBratney and Voltz, 2006). AfSIS is producing digital soil maps using legacy data, for example from the existing ISRIC-WISE and SOTER databases as well as the new legacy data collection, the Africa Soil Profiles Database (version 1.0 now available). These maps are being developed using approaches and standards that are fully compliant with the Global Digital Soil Map initiative. The first round of these maps, using the Africa Soil Profiles Database, is now available.

We will also produce digital soil maps using the sentinel site soil data, MODIS, Landsat and SRTM derivatives. Both types of maps are expected to be available for the entire project area as “version 1.0” products at the end of the 4-year period. 

As illustrated in the diagram below, a digital soil map is a spatial database of soil properties that is based on a statistical sample of landscapes or regions and that permits functional interpretation, spatial prediction and mapping of soil properties relevant to soil management and policy decisions. A digital soil map provides (i) information on a soil’s capacity to provide ecosystem services (such as the ability to infiltrate water, produce crops and store carbon); (ii) a geographical representation of soil constraints (such as aluminum toxicity, carbon deficit and sub-soil restrictions) with known confidence; (iii) spatial targeting of management recommendations; and (iv) a baseline for change detection and impact assessment.


Digital soil mapping uses statistical models to predict soil functional properties and degradation prevalence at unobserved locations in the landscape. The most basic model for soil-landscape prediction can be written as:

s_i = f(Q)_i + e_i

where s_i is a soil property or condition of interest at a given geographical location i, Q is a vector of covariates (such as reflectance data from satellite images, digital terrain models and/or climate surfaces), and e_i is an error term that represents the uncertainty at that location. This is essentially the classical state factor model of soil formation, which states that soil condition, or more broadly ecosystem condition (L, for larger system), is a function of state factors including climate (cl), organisms (o), relief (r), parent material (p), system age (or time, t) and any other, typically more local and historically contingent, factors (Jenny, 1941; Amundson & Jenny, 1997). The model implies that once the spatial distribution of the state factors is known, specific soil properties or conditions may be inferred geographically on the basis of f(Q) and the residual (spatial) distribution of e.
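As a minimal sketch of the basic model above, the Python example below fits a linear form of f(Q) by ordinary least squares and recovers the residuals e. Everything in it is an assumption made for illustration: the data are synthetic, the three covariates are hypothetical stand-ins for quantities such as reflectance, elevation and rainfall, and a linear f is only one of many possible choices.

```python
import numpy as np

def fit_trend(Q, s):
    """Least-squares fit of a linear f(Q); returns coefficients, intercept first."""
    X = np.column_stack([np.ones(len(Q)), Q])
    beta, *_ = np.linalg.lstsq(X, s, rcond=None)
    return beta

def predict_trend(beta, Q):
    """Evaluate the fitted linear f(Q) at the given covariate rows."""
    X = np.column_stack([np.ones(len(Q)), Q])
    return X @ beta

# Synthetic example (assumed data, not AfSIS measurements).
rng = np.random.default_rng(0)
n = 200
Q = rng.normal(size=(n, 3))                    # hypothetical standardized covariates
true_beta = np.array([1.5, 0.8, -0.5, 0.3])    # assumed "true" coefficients
e = rng.normal(scale=0.2, size=n)              # error term e_i
s = true_beta[0] + Q @ true_beta[1:] + e       # s_i = f(Q)_i + e_i

beta_hat = fit_trend(Q, s)
residuals = s - predict_trend(beta_hat, Q)     # estimate of e at the sampled sites
print("estimated coefficients:", np.round(beta_hat, 2))
print("residual standard deviation:", round(float(residuals.std()), 3))
```

The residuals are the part of the soil property that the covariates do not explain; any spatial structure left in them is what the geostatistical step of a digital soil mapping workflow tries to exploit.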

A variety of statistical approaches have been used to parameterize this basic model. They differ in their representational realism and computational complexity, and include classical geostatistics (e.g., regression kriging and co-simulation) as well as more recent approaches based on hierarchical models (Pinheiro & Bates, 2002), generalized estimating equations (Liang & Zeger, 1986), additive models (Hastie & Tibshirani, 1990), and Markov chain Monte Carlo simulation (MCMC; Clark & Gelfand, 2006), among others. In the context of this project we will develop pragmatic guidelines as to when and how these different techniques can be used appropriately. The guidelines will be supported by comparative, worked examples and code implemented in the freely available R environment for statistical computing.
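To make one of the listed approaches concrete, the following is a simplified sketch of regression kriging in Python with NumPy. It is not the project's implementation (which is to be written in R): the trend is fitted by ordinary least squares rather than generalized least squares, the residuals are interpolated by simple kriging with an exponential covariance whose sill, range and nugget are assumed to be known rather than estimated from a variogram, and the returned variance ignores uncertainty in the trend coefficients. All data are synthetic.

```python
import numpy as np

def exp_cov(d, sill, range_):
    """Exponential covariance function of distance d."""
    return sill * np.exp(-d / range_)

def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, 2) and rows of b (m, 2)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

def regression_kriging(xy, Q, s, xy_new, Q_new, sill, range_, nugget):
    """Return (prediction, kriging variance of the residual) at new locations."""
    # Step 1: linear trend f(Q) fitted by ordinary least squares.
    X = np.column_stack([np.ones(len(Q)), Q])
    beta, *_ = np.linalg.lstsq(X, s, rcond=None)
    resid = s - X @ beta
    # Step 2: simple kriging of the (zero-mean) residuals.
    C = exp_cov(pairwise_dist(xy, xy), sill, range_) + nugget * np.eye(len(xy))
    c0 = exp_cov(pairwise_dist(xy, xy_new), sill, range_)   # shape (n, m)
    w = np.linalg.solve(C, c0)                              # kriging weights
    resid_pred = w.T @ resid
    var = sill + nugget - np.sum(w * c0, axis=0)
    # Step 3: add the trend back to the interpolated residuals.
    X_new = np.column_stack([np.ones(len(Q_new)), Q_new])
    return X_new @ beta + resid_pred, var

# Synthetic sites with spatially correlated residuals (assumed data).
rng = np.random.default_rng(1)
n = 60
xy = rng.uniform(0.0, 10.0, size=(n, 2))
Q = rng.normal(size=(n, 2))
C_true = exp_cov(pairwise_dist(xy, xy), sill=0.5, range_=2.0)
e = np.linalg.cholesky(C_true + 1e-10 * np.eye(n)) @ rng.normal(size=n)
s = 2.0 + Q @ np.array([1.0, -0.5]) + e

# One location inside the sampled area and one far outside it.
xy_new = np.array([[5.0, 5.0], [1000.0, 1000.0]])
Q_new = np.zeros((2, 2))
pred, var = regression_kriging(xy, Q, s, xy_new, Q_new,
                               sill=0.5, range_=2.0, nugget=0.0)
print("predictions:", pred)
print("kriging variances:", var)
```

Far from any observation the kriging weights vanish, so the prediction falls back to the trend alone and the variance returns to the sill; close to observations the variance shrinks. This is the sense in which such a map carries "known confidence".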

The example from Western Kenya below illustrates how remote sensing and new ground observations are being combined to produce digital soil maps.


The grids shown in the figure below are a sampling of those currently being used to produce the first generation of continent-wide soil property maps.


Characterization of additional sentinel sites would further reduce the statistical uncertainties in the underlying spatial models that will be developed under this activity. Based on the current sample, we will be able to describe these uncertainties in quantitative terms, and thus provide spatially explicit recommendations as to where, and over what areal extent, additional sampling and surveillance activities should be undertaken.
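One simple way to turn a map of prediction uncertainty into such a recommendation is sketched below in Python. The grid of prediction standard deviations, the threshold and the cell size are all hypothetical values chosen for illustration; the project may well use a different criterion.

```python
import numpy as np

def high_uncertainty_area(pred_sd, threshold, cell_size_m):
    """Flag cells whose prediction standard deviation exceeds the threshold
    and return the mask together with their total area in square kilometres."""
    mask = pred_sd > threshold
    area_km2 = mask.sum() * (cell_size_m ** 2) / 1e6
    return mask, area_km2

# Hypothetical 3 x 3 grid of prediction standard deviations (assumed values).
pred_sd = np.array([[0.10, 0.40, 0.60],
                    [0.20, 0.70, 0.30],
                    [0.05, 0.10, 0.90]])

mask, area = high_uncertainty_area(pred_sd, threshold=0.5, cell_size_m=1000.0)

# Rank all cells from most to least uncertain to prioritize new sampling.
rows, cols = np.unravel_index(np.argsort(-pred_sd, axis=None), pred_sd.shape)
print("cells flagged for additional sampling:", int(mask.sum()))
print("areal extent (km^2):", area)
print("most uncertain cell (row, col):", (int(rows[0]), int(cols[0])))
```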