Dispersal-Niche Continuum Index (DNCI) Functions
The Dispersal-Niche Continuum Index (DNCI) functions in MetaCommunityMetrics quantify the balance between dispersal and niche processes within a metacommunity, providing insight into community structure and the relative influence of these two key ecological drivers. The function DNCI_multigroup in this package is adapted from the R package DNCImper: Assembly process identification based on SIMPER analysis. These methods, originally developed by Clarke (1993) and later refined by Gibert & Escarguel (2019) and Vilmi et al. (2021), offer powerful tools for identifying the processes underlying species assembly in metacommunities.
Background
The DNCI functions are built around the PerSIMPER and DNCI analyses. PerSIMPER builds on the Similarity Percentage (SIMPER) analysis developed by Clarke (1993), which assesses the contribution of individual taxa to the overall average dissimilarity (OAD) between groups of assemblages. PerSIMPER extends SIMPER by comparing empirical SIMPER plots with randomized plots generated through matrix permutation, which helps identify whether niche processes, dispersal processes, or both are driving community assembly.
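As a rough illustration of the SIMPER step that PerSIMPER builds on, the sketch below computes each taxon's average contribution to the Bray-Curtis dissimilarity between two groups of presence-absence samples. The function name simper_contributions and the toy matrix are hypothetical and are not part of MetaCommunityMetrics; the permutation step of PerSIMPER is not shown.

```julia
# Hypothetical helper, not part of MetaCommunityMetrics: per-taxon SIMPER
# contributions between two groups of samples (rows = sites, columns = taxa).
function simper_contributions(comm::AbstractMatrix{<:Real}, groups::AbstractVector)
    labels = unique(groups)
    length(labels) == 2 || error("this sketch handles exactly two groups")
    rows_a = findall(==(labels[1]), groups)
    rows_b = findall(==(labels[2]), groups)

    contrib = zeros(Float64, size(comm, 2))
    npairs = 0
    for j in rows_a, k in rows_b
        total = sum(comm[j, :]) + sum(comm[k, :])
        total == 0 && continue  # skip pairs of empty samples
        # Each taxon's share of the Bray-Curtis dissimilarity of this pair
        contrib .+= abs.(comm[j, :] .- comm[k, :]) ./ total
        npairs += 1
    end
    return contrib ./ npairs  # average over all between-group pairs
end

# Toy presence-absence matrix (made-up data): 4 sites, 3 taxa, 2 groups
comm = [1 1 0;
        1 0 0;
        0 1 1;
        0 0 1]
groups = [1, 1, 2, 2]

contrib = simper_contributions(comm, groups)
overall = sum(contrib)                   # overall average dissimilarity
ranked = sortperm(contrib, rev = true)   # taxa ordered by contribution
```

Summing the per-taxon contributions recovers the overall average dissimilarity between the two groups, and ranking the contributions gives the ordering that a SIMPER profile is drawn from.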
The DNCI (Dispersal-Niche Continuum Index) further extends this approach by transforming the qualitative results of PerSIMPER into a quantitative index, providing a straightforward measure of the influence of niche and dispersal processes on community structure.
Functionality Overview
The DNCI functions in MetaCommunityMetrics allow you to analyze the processes driving species assembly within your dataset. By comparing empirical data with randomized permutations, you can determine the extent to which niche and dispersal processes influence the structure of metacommunities. Before calculating the DNCI, groupings of sites (clusters) are required, as the DNCI relies on analyzing community composition across spatial groups. This package provides a function to perform the necessary clustering, which is not available in the equivalent R package.
When the DNCI value is significantly below zero, dispersal processes are likely the dominant drivers of community composition. In contrast, a DNCI value significantly above zero suggests that niche processes play a primary role in shaping community composition. If the DNCI value is not significantly different from zero, dispersal and niche processes contribute equally to variations in community composition.
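The interpretation rule above can be sketched as a small helper. This is only an illustration: interpret_dnci is a hypothetical function, not part of MetaCommunityMetrics, and it assumes that "significantly different from zero" means the interval DNCI ± CI_DNCI excludes zero.

```julia
# Hypothetical helper, not part of MetaCommunityMetrics. Assumes that
# "significantly different from zero" means the interval DNCI ± CI excludes zero.
function interpret_dnci(dnci::Real, ci::Real)
    if dnci + ci < 0
        return :dispersal   # whole interval lies below zero
    elseif dnci - ci > 0
        return :niche       # whole interval lies above zero
    else
        return :both        # interval overlaps zero
    end
end

# DNCI and CI_DNCI taken from the DNCI_multigroup example on this page
sample_case = interpret_dnci(0.533635, 7.06888)
# Made-up values, for illustration only
low_case = interpret_dnci(-6.2, 1.5)
high_case = interpret_dnci(4.0, 1.0)
```

Under this assumption, the sample result falls in the :both category because its interval spans zero.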
The Functions
- create_clusters: Groups sampling locations based on their spatial attributes and species richness, which can then be used to assess DNCI.
- plot_clusters: Visualizes the clusters created, allowing for an intuitive understanding of spatial groupings.
- DNCI_multigroup: Computes the Dispersal-Niche Continuum Index (DNCI) across multiple groups, helping to quantify the relative influence of niche versus dispersal processes.
MetaCommunityMetrics.create_clusters - Function
create_clusters(time::Vector{Int}, latitude::Vector{Float64}, longitude::Vector{Float64}, patch::Vector{Int}, total_richness::Vector{Int}) -> Dict{Int, DataFrame}
This function creates clusters (groupings of patches/sites) for each unique time step in a dataset, which can then be used for calculating the DNCI. Only presence-absence data can be used. Please remove singletons (taxa/species that occur at only one patch/site within a time step) before using this function.
Arguments
- time::Vector: A vector indicating the time each sample was taken.
- latitude::Vector: A vector indicating the latitude of each sample.
- longitude::Vector: A vector indicating the longitude of each sample.
- patch::Vector: A vector indicating the spatial location (patch) of each sample. At least 10 patches are required for clustering.
- total_richness::Vector: A vector indicating the total species richness at each plot at each time step.
Returns
- Dict{Int, DataFrame}: A dictionary where each key represents a unique time point from the input data, with the corresponding value being a DataFrame for that time step. Each DataFrame contains the following columns: Time, Latitude, Longitude, Patch, Total_Richness, and Group (indicating the assigned cluster).
Details
This function performs hierarchical clustering on the geographical coordinates of sampling patches/sites at each unique time step, assuming that organism dispersal occurs within the study region. It incorporates checks and adjustments to ensure the following conditions are met: at least 2 clusters, a minimum of 5 patches/sites per cluster, and that the variation in the number of taxa/species and patches/sites per group does not exceed 40% and 30%, respectively. These conditions are critical for calculating an unbiased DNCI value, and the function will issue warnings if any are not fulfilled.
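To make the conditions above concrete, here is a minimal sketch of how a grouping could be checked against them. It is not the package's internal code: check_cluster_conditions and relative_spread are hypothetical names, "variation" is assumed to mean (max - min) / max across groups, and summed richness per group is used as a stand-in for the number of taxa per group.

```julia
# Hypothetical check, not the package's internal code. "Variation" is assumed
# to mean (max - min) / max across groups; the package may define it differently.
relative_spread(x) = (maximum(x) - minimum(x)) / maximum(x)

function check_cluster_conditions(group::AbstractVector{<:Integer},
                                  richness::AbstractVector{<:Real})
    labels = sort(unique(group))
    patches_per_group = [count(==(g), group) for g in labels]
    # Summed richness per group stands in for the number of taxa per group
    taxa_per_group = [sum(richness[group .== g]) for g in labels]
    return (
        enough_clusters = length(labels) >= 2,
        enough_patches  = all(>=(5), patches_per_group),
        taxa_balanced   = relative_spread(taxa_per_group) <= 0.40,
        patch_balanced  = relative_spread(patches_per_group) <= 0.30,
    )
end

# Group and Total_Richness columns of the time step 1 result in the example below
group    = [1, 1, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, 2, 2]
richness = [1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
checks = check_cluster_conditions(group, richness)
```

Each field of the returned named tuple corresponds to one of the four conditions, so a failing condition can be reported individually.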
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames
julia> df = load_sample_data()
48735×10 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0
48725 rows omitted
julia> total_presence_df=@pipe df|>
groupby(_,[:Species,:Sampling_date_order])|>
combine(_,:Presence=>sum=>:Total_Presence) |>
filter(row -> row[:Total_Presence] > 1, _)
791×3 DataFrame
Row │ Species Sampling_date_order Total_Presence
│ String3 Int64 Int64
─────┼──────────────────────────────────────────────
1 │ BA 41 2
2 │ BA 50 2
3 │ BA 51 8
4 │ BA 52 19
5 │ BA 53 18
⋮ │ ⋮ ⋮ ⋮
787 │ SH 56 3
788 │ SH 60 4
789 │ SH 70 3
790 │ SH 73 5
791 │ SH 117 4
781 rows omitted
julia> total_richness_df= @pipe df|>
innerjoin(_, total_presence_df, on = [:Species, :Sampling_date_order], makeunique = true) |>
groupby(_,[:plot,:Sampling_date_order,:Longitude, :Latitude])|>
combine(_,:Presence=>sum=>:Total_Richness)|>
filter(row -> row[:Total_Richness] > 0, _)
2565×5 DataFrame
Row │ plot Sampling_date_order Longitude Latitude Total_Richness
│ Int64 Int64 Float64 Float64 Int64
──────┼─────────────────────────────────────────────────────────────────
1 │ 1 41 -110.0 35.0 5
2 │ 2 41 -109.5 35.0 4
3 │ 4 41 -108.5 35.0 2
4 │ 8 41 -109.5 35.5 2
5 │ 9 41 -109.0 35.5 3
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
2561 │ 9 117 -109.0 35.5 5
2562 │ 10 117 -108.5 35.5 3
2563 │ 12 117 -107.5 35.5 6
2564 │ 16 117 -108.5 36.0 4
2565 │ 23 117 -108.0 36.5 5
2555 rows omitted
julia> result = create_clusters(total_richness_df.Sampling_date_order, total_richness_df.Latitude, total_richness_df.Longitude, total_richness_df.plot, total_richness_df.Total_Richness)
Dict{Int64, DataFrame} with 117 entries:
5 => 17×6 DataFrame…
56 => 23×6 DataFrame…
55 => 24×6 DataFrame…
35 => 23×6 DataFrame…
110 => 24×6 DataFrame…
114 => 22×6 DataFrame…
60 => 24×6 DataFrame…
30 => 20×6 DataFrame…
32 => 22×6 DataFrame…
6 => 19×6 DataFrame…
67 => 23×6 DataFrame…
45 => 23×6 DataFrame…
117 => 24×6 DataFrame…
73 => 23×6 DataFrame…
⋮ => ⋮
julia> println(result[1])
14×6 DataFrame
Row │ Time Latitude Longitude Patch Total_Richness Group
│ Int64 Float64 Float64 Int64 Int64 Int64
─────┼──────────────────────────────────────────────────────────
1 │ 1 35.0 -110.0 1 1 1
2 │ 1 35.0 -109.5 2 1 1
3 │ 1 35.5 -109.5 8 1 1
4 │ 1 35.5 -109.0 9 1 1
5 │ 1 35.5 -108.0 11 2 2
6 │ 1 36.0 -109.5 14 2 1
7 │ 1 36.0 -108.0 17 1 2
8 │ 1 36.5 -108.5 22 1 2
9 │ 1 35.0 -107.5 6 1 2
10 │ 1 36.0 -110.0 13 1 1
11 │ 1 36.0 -109.0 15 1 1
12 │ 1 36.5 -109.5 20 1 1
13 │ 1 36.5 -109.0 21 1 2
14 │ 1 36.5 -108.0 23 1 2
MetaCommunityMetrics.plot_clusters - Function
plot_clusters(latitude::Vector{Float64}, longitude::Vector{Float64}, group::Union{AbstractVector, String})
Plots the clustering result of the create_clusters function at one time step, using the geographic coordinates and cluster assignments of patches/sites.
Arguments
- latitude::Vector{Float64}: A vector of latitude coordinates of the patches/sites.
- longitude::Vector{Float64}: A vector of longitude coordinates of the patches/sites.
- group::Union{AbstractVector, String}: A vector or string indicating the cluster assignments for each data point.
Returns
- A plot showing the patches/sites colored by the cluster assignment from the create_clusters function.
Details
- The function assigns a unique color to each cluster and plots the patches/sites based on their geographic coordinates.
- The patches/sites are colored according to the cluster assignment.
- The plot includes black borders around the markers for better visibility.
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames
julia> df = load_sample_data()
48735×10 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0
48725 rows omitted
julia> total_presence_df=@pipe df|>
groupby(_,[:Species,:Sampling_date_order])|>
combine(_,:Presence=>sum=>:Total_Presence) |>
filter(row -> row[:Total_Presence] > 1, _)
791×3 DataFrame
Row │ Species Sampling_date_order Total_Presence
│ String3 Int64 Int64
─────┼──────────────────────────────────────────────
1 │ BA 41 2
2 │ BA 50 2
3 │ BA 51 8
4 │ BA 52 19
5 │ BA 53 18
⋮ │ ⋮ ⋮ ⋮
787 │ SH 56 3
788 │ SH 60 4
789 │ SH 70 3
790 │ SH 73 5
791 │ SH 117 4
781 rows omitted
julia> total_richness_df= @pipe df|>
innerjoin(_, total_presence_df, on = [:Species, :Sampling_date_order], makeunique = true) |>
groupby(_,[:plot,:Sampling_date_order,:Longitude, :Latitude])|>
combine(_,:Presence=>sum=>:Total_Richness)|>
filter(row -> row[:Total_Richness] > 0, _)
2565×5 DataFrame
Row │ plot Sampling_date_order Longitude Latitude Total_Richness
│ Int64 Int64 Float64 Float64 Int64
──────┼─────────────────────────────────────────────────────────────────
1 │ 1 41 -110.0 35.0 5
2 │ 2 41 -109.5 35.0 4
3 │ 4 41 -108.5 35.0 2
4 │ 8 41 -109.5 35.5 2
5 │ 9 41 -109.0 35.5 3
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
2561 │ 9 117 -109.0 35.5 5
2562 │ 10 117 -108.5 35.5 3
2563 │ 12 117 -107.5 35.5 6
2564 │ 16 117 -108.5 36.0 4
2565 │ 23 117 -108.0 36.5 5
2555 rows omitted
julia> clustering_result = create_clusters(total_richness_df.Sampling_date_order, total_richness_df.Latitude, total_richness_df.Longitude, total_richness_df.plot, total_richness_df.Total_Richness)
Dict{Int64, DataFrame} with 117 entries:
5 => 17×6 DataFrame…
56 => 23×6 DataFrame…
55 => 24×6 DataFrame…
35 => 23×6 DataFrame…
110 => 24×6 DataFrame…
114 => 22×6 DataFrame…
60 => 24×6 DataFrame…
30 => 20×6 DataFrame…
32 => 22×6 DataFrame…
6 => 19×6 DataFrame…
67 => 23×6 DataFrame…
45 => 23×6 DataFrame…
117 => 24×6 DataFrame…
73 => 23×6 DataFrame…
⋮ => ⋮
julia> println(clustering_result[1])
14×6 DataFrame
Row │ Time Latitude Longitude Patch Total_Richness Group
│ Int64 Float64 Float64 Int64 Int64 Int64
─────┼──────────────────────────────────────────────────────────
1 │ 1 35.0 -110.0 1 1 1
2 │ 1 35.0 -109.5 2 1 1
3 │ 1 35.5 -109.5 8 1 1
4 │ 1 35.5 -109.0 9 1 1
5 │ 1 35.5 -108.0 11 2 2
6 │ 1 36.0 -109.5 14 2 1
7 │ 1 36.0 -108.0 17 1 2
8 │ 1 36.5 -108.5 22 1 2
9 │ 1 35.0 -107.5 6 1 2
10 │ 1 36.0 -110.0 13 1 1
11 │ 1 36.0 -109.0 15 1 1
12 │ 1 36.5 -109.5 20 1 1
13 │ 1 36.5 -109.0 21 1 2
14 │ 1 36.5 -108.0 23 1 2
julia> plot_clusters(clustering_result[1].Latitude, clustering_result[1].Longitude, clustering_result[1].Group)
This plot shows the clustering result for time step 1 based on geographic coordinates:
MetaCommunityMetrics.DNCI_multigroup - Function
DNCI_multigroup(comm::Matrix, groups::Vector, Nperm::Int=1000; count::Bool=true) -> DataFrame
Calculates the Dispersal-Niche Continuum Index (DNCI) for multiple groups, a metric proposed by Vilmi et al. (2021). The DNCI quantifies the balance between dispersal and niche processes within a metacommunity, providing insight into community structure and the relative influence of these two key ecological drivers. Please remove singletons (taxa/species that occur at only one patch/site within a time step) before using this function.
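If the community data are already in matrix form, singleton removal can be sketched as a column filter. The helper drop_singletons below is hypothetical and not part of MetaCommunityMetrics; it assumes a presence-absence matrix with sites as rows and taxa as columns, and the toy matrix is made up.

```julia
# Hypothetical helper, not part of MetaCommunityMetrics: drop taxa (columns)
# that occur at no more than one site in a presence-absence matrix.
function drop_singletons(comm::AbstractMatrix{<:Integer})
    occurrences = vec(sum(comm, dims = 1))   # number of sites per taxon
    keep = occurrences .> 1
    return comm[:, keep], keep
end

# Toy matrix (made-up data): 4 sites, 4 taxa
comm = [1 0 1 0;
        1 0 0 0;
        0 1 1 0;
        1 0 1 0]
filtered, keep = drop_singletons(comm)
```

Sites left with no taxa after this step would also need to be dropped, together with their entries in groups; the example below applies the same idea with a Total_Richness > 0 filter.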
Arguments
- comm::Matrix: A presence-absence data matrix where rows represent observations (e.g., sites or samples) and columns represent species.
- groups::Vector: A vector indicating the group membership for each row in the comm matrix. You can use the create_clusters function to generate the group membership.
- Nperm::Int=1000: The number of permutations for significance testing. Default is 1000.
- count::Bool=true: A flag indicating whether the number of permutations is printed. Default is true.
Returns
- DataFrame: A DataFrame containing the DNCI value, the associated confidence interval (CI_DNCI), and the variance (S_DNCI) for each pair of groups.
Details
- The function calculates the DNCI for each pair of groups in the input data.
- When the DNCI value is significantly below zero, dispersal processes are likely the dominant drivers of community composition.
- In contrast, a DNCI value significantly above zero suggests that niche processes play a primary role in shaping community composition.
- If the DNCI value is not significantly different from zero, it indicates that dispersal and niche processes contribute equally to variations in community composition.
- Please remove singletons (taxa/species that occur at only one patch/site within a time step) before using this function.
- This function is a translation/adaptation of a function from the R package DNCImper, licensed under GPL-3.
- Original package and documentation available at: https://github.com/Corentin-Gibert-Paleontology/DNCImper
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames
julia> df = load_sample_data()
48735×10 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0
48725 rows omitted
julia> total_presence_df=@pipe df|>
groupby(_,[:Species,:Sampling_date_order])|>
combine(_,:Presence=>sum=>:Total_Presence) |>
filter(row -> row[:Total_Presence] > 1, _)
791×3 DataFrame
Row │ Species Sampling_date_order Total_Presence
│ String3 Int64 Int64
─────┼──────────────────────────────────────────────
1 │ BA 41 2
2 │ BA 50 2
3 │ BA 51 8
4 │ BA 52 19
5 │ BA 53 18
⋮ │ ⋮ ⋮ ⋮
787 │ SH 56 3
788 │ SH 60 4
789 │ SH 70 3
790 │ SH 73 5
791 │ SH 117 4
781 rows omitted
julia> total_richness_df= @pipe df|>
innerjoin(_, total_presence_df, on = [:Species, :Sampling_date_order], makeunique = true) |>
groupby(_,[:plot,:Sampling_date_order,:Longitude, :Latitude])|>
combine(_,:Presence=>sum=>:Total_Richness)|>
filter(row -> row[:Total_Richness] > 0, _)
2565×5 DataFrame
Row │ plot Sampling_date_order Longitude Latitude Total_Richness
│ Int64 Int64 Float64 Float64 Int64
──────┼─────────────────────────────────────────────────────────────────
1 │ 1 41 -110.0 35.0 5
2 │ 2 41 -109.5 35.0 4
3 │ 4 41 -108.5 35.0 2
4 │ 8 41 -109.5 35.5 2
5 │ 9 41 -109.0 35.5 3
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
2561 │ 9 117 -109.0 35.5 5
2562 │ 10 117 -108.5 35.5 3
2563 │ 12 117 -107.5 35.5 6
2564 │ 16 117 -108.5 36.0 4
2565 │ 23 117 -108.0 36.5 5
2555 rows omitted
julia> clustering_result = create_clusters(total_richness_df.Sampling_date_order, total_richness_df.Latitude, total_richness_df.Longitude, total_richness_df.plot, total_richness_df.Total_Richness)
Dict{Int64, DataFrame} with 117 entries:
5 => 17×6 DataFrame…
56 => 23×6 DataFrame…
55 => 24×6 DataFrame…
35 => 23×6 DataFrame…
110 => 24×6 DataFrame…
114 => 22×6 DataFrame…
60 => 24×6 DataFrame…
30 => 20×6 DataFrame…
32 => 22×6 DataFrame…
6 => 19×6 DataFrame…
67 => 23×6 DataFrame…
45 => 23×6 DataFrame…
117 => 24×6 DataFrame…
73 => 23×6 DataFrame…
⋮ => ⋮
julia> println(clustering_result[1])
14×6 DataFrame
Row │ Time Latitude Longitude Patch Total_Richness Group
│ Int64 Float64 Float64 Int64 Int64 Int64
─────┼──────────────────────────────────────────────────────────
1 │ 1 35.0 -110.0 1 1 1
2 │ 1 35.0 -109.5 2 1 1
3 │ 1 35.5 -109.5 8 1 1
4 │ 1 35.5 -109.0 9 1 1
5 │ 1 35.5 -108.0 11 2 2
6 │ 1 36.0 -109.5 14 2 1
7 │ 1 36.0 -108.0 17 1 2
8 │ 1 36.5 -108.5 22 1 2
9 │ 1 35.0 -107.5 6 1 2
10 │ 1 36.0 -110.0 13 1 1
11 │ 1 36.0 -109.0 15 1 1
12 │ 1 36.5 -109.5 20 1 1
13 │ 1 36.5 -109.0 21 1 2
14 │ 1 36.5 -108.0 23 1 2
julia> comm= @pipe df|>
innerjoin(_, total_presence_df, on = [:Species, :Sampling_date_order], makeunique = true) |>
innerjoin(_, total_richness_df, on = [:plot, :Sampling_date_order], makeunique = true) |>
filter(row -> row[:Sampling_date_order] == 1, _) |>
select(_, [:plot, :Species, :Presence]) |>
unstack(_, :Species, :Presence, fill=0) |>
select(_, Not(:plot)) |>
Matrix(_)
14×3 Matrix{Int64}:
1 0 0
1 0 0
1 0 0
1 0 0
1 1 0
1 1 0
1 0 0
0 0 1
0 1 0
0 1 0
1 0 0
0 1 0
0 0 1
0 0 1
julia> DNCI_result = DNCI_multigroup(comm, clustering_result[1].Group, 1000; count = false)
1×5 DataFrame
Row │ group1 group2 DNCI CI_DNCI S_DNCI
│ Int64 Int64 Float64 Float64 Float64
─────┼────────────────────────────────────────────
1 │ 1 2 0.533635 7.06888 3.53444
References
- Clarke, K. R. Non-parametric multivariate analyses of changes in community structure. Australian Journal of Ecology 18, 117-143 (1993). https://doi.org/10.1111/j.1442-9993.1993.tb00438.x
- Gibert, C. & Escarguel, G. PER-SIMPER—A new tool for inferring community assembly processes from taxon occurrences. Global Ecology and Biogeography 28, 374-385 (2019). https://doi.org/10.1111/geb.12859
- Vilmi, A. et al. Dispersal–niche continuum index: a new quantitative metric for assessing the relative importance of dispersal versus niche processes in community assembly. Ecography 44, 370-379 (2021). https://doi.org/10.1111/ecog.05356