DNA detected biodiversity in cryptic habitats

Introduction to DNA detected biodiversity in cryptic habitats

The purpose of this use case is to help biodiversity researchers and monitoring initiatives select localities and areas for further sampling when targeting cryptic biodiversity with eDNA metabarcoding methods. It is envisioned that every new sample will update the model and recalculate priorities. DNA-based methods are highly efficient in targeting organism groups that normally receive little attention, e.g. fungi, bacteria, archaea, protists, nematodes and micro-invertebrates. The methods are simple and cost-effective, even at scales where traditional sampling regimes would be too costly in terms of money, time and labour. However, to include such data in global biodiversity conservation efforts, it is necessary both to sample more widely across the globe and to understand diversity in cryptic environments better. This case study will therefore focus on how a digital twin such as BioDT can be used to identify priority areas for further sampling based on user-defined criteria and constraints such as:

  • Geographical constraints (countries, user-defined larger areas)
  • Landscape constraints (e.g. habitat type / land use class / ecoregion constraints)
  • Taxonomic constraints (bacteria, fungi, protozoa, eukaryotes)
  • Prioritization parameters (e.g. heterogeneity of communities within units)
  • Number of samples

The digital twin (DT) will focus on Denmark and use eDNA metabarcoding datasets for fungi and bacteria, as well as national maps of land-use types. It will allow the user to set basic constraints for future samples, and then use existing data to predict where further samples are best placed. Information from new samples will eventually feed into the model and provide an updated map of priority areas.
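
As an illustration of how such user-defined constraints might be represented and applied, the following is a minimal Python sketch. It is not the BioDT implementation: the class `SamplingConstraints`, the function `select_priority_cells`, the field names and the example values are all hypothetical, and ranking cells by the highest value of the chosen metric is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass
class SamplingConstraints:
    """Hypothetical container for the user-defined constraints listed above."""
    countries: tuple = ("DK",)            # geographical constraint
    land_use_classes: tuple = ()          # empty tuple means no restriction
    taxa: tuple = ("Fungi", "Bacteria")   # applied when filtering occurrence data, not used below
    priority_metric: str = "heterogeneity"
    n_samples: int = 10


def select_priority_cells(cells, constraints):
    """Return ids of the top-ranked cells that satisfy the constraints.

    Each cell is a dict with 'id', 'country', 'land_use' and metric values.
    Assumption: a higher metric value means a higher sampling priority.
    """
    eligible = [
        c for c in cells
        if c["country"] in constraints.countries
        and (not constraints.land_use_classes
             or c["land_use"] in constraints.land_use_classes)
    ]
    ranked = sorted(eligible, key=lambda c: c[constraints.priority_metric], reverse=True)
    return [c["id"] for c in ranked[: constraints.n_samples]]


# Invented example cells, for illustration only.
cells = [
    {"id": "a", "country": "DK", "land_use": "forest", "heterogeneity": 0.4},
    {"id": "b", "country": "DK", "land_use": "agriculture", "heterogeneity": 0.9},
    {"id": "c", "country": "DK", "land_use": "forest", "heterogeneity": 0.7},
    {"id": "d", "country": "SE", "land_use": "forest", "heterogeneity": 0.95},
]
constraints = SamplingConstraints(land_use_classes=("forest",), n_samples=2)
picked = select_priority_cells(cells, constraints)
```

With these invented values, only the Danish forest cells remain eligible, and they are returned in order of decreasing heterogeneity.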

DNA detected biodiversity, poorly known habitats and Digital Twin Models

For the development of this BioDT prototype, the use case will use GBIF-mediated soil metabarcoding data on bacteria and fungi from the Biowide study and additional Danish metabarcoding studies. The base map of habitat types will be Basemap04, a land use / land cover map that combines a number of publicly available datasets into one nationwide map for Denmark. GBIF data may be accessed through APIs and filtered at the source, or downloaded as a full dataset and filtered locally. The taxonomically and geographically filtered data are then binned spatially; binning of species occurrences can use Uber's H3 system (a hexagonal hierarchical spatial index).
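
The binning step can be sketched in a few lines of Python. To keep the example free of external dependencies, it uses a simple square latitude/longitude grid as a stand-in for H3; this is not the hexagonal index itself. The function names `grid_cell` and `bin_occurrences`, the 0.1 degree cell size and the example records are assumptions for illustration only.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_deg=0.1):
    """Return a square-grid cell id for a coordinate (stand-in for an H3 index)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def bin_occurrences(records, cell_fn=grid_cell):
    """Group taxa by spatial cell.

    Each record is a (taxon, latitude, longitude) tuple, assumed to be
    already filtered taxonomically and geographically.
    """
    cells = defaultdict(set)
    for taxon, lat, lon in records:
        cells[cell_fn(lat, lon)].add(taxon)
    return dict(cells)


# Invented example records, for illustration only.
records = [
    ("OTU_1", 56.12, 10.21),
    ("OTU_2", 56.13, 10.22),
    ("OTU_1", 55.68, 12.57),
]
binned = bin_occurrences(records)
```

Because `cell_fn` is a parameter, a wrapper around the H3 library's coordinate-to-cell function could be passed in place of `grid_cell` without changing the rest of the sketch.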

For specific or weighted prioritization, a number of metrics are calculated per hexagon and per land-use type, along with dissimilarities between hexagons. Richness per unit is the count of unique taxonomic units. Uniquity is a quantitative and spatially scalable measure of the uniqueness of an area based on its species composition, correcting for sampling (or land-use type) bias. Coverage measures the detected richness relative to the estimated total richness. Other measures may be relevant to include.
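
A minimal Python sketch of some of these metrics follows. The text above does not specify which dissimilarity measure or richness estimator is used, so Jaccard dissimilarity and the bias-corrected Chao1 estimator are assumptions chosen here for illustration; Uniquity is not attempted. All function names and example values are hypothetical.

```python
def richness(taxa):
    """Count of unique taxonomic units in one spatial unit."""
    return len(set(taxa))


def jaccard_dissimilarity(a, b):
    """1 - |A ∩ B| / |A ∪ B| for the taxon sets of two units (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon counts (assumed estimator)."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)   # taxa seen exactly once
    f2 = sum(1 for c in counts if c == 2)   # taxa seen exactly twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def coverage(counts):
    """Detected richness divided by estimated total richness."""
    observed = sum(1 for c in counts if c > 0)
    estimated = chao1(counts)
    return observed / estimated if estimated else 0.0


# Invented per-taxon counts for one hexagon, for illustration only.
counts = [5, 1, 1, 2]
estimated_total = chao1(counts)
detected_share = coverage(counts)
dissimilarity = jaccard_dissimilarity({"OTU_1", "OTU_2"}, {"OTU_2", "OTU_3"})
```

For the invented counts, four taxa are observed, two of them once and one twice, giving an estimated total richness of 4.5 and a coverage of about 0.89.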


Watch an interview with Dmitry Schigel from GBIF on the Prototype Digital Twin (pDT) for DNA detected biodiversity in cryptic habitats