Hypervolume Functions
The niche hypervolume concept was originally proposed by Hutchinson (1957). The functions below calculate the volume of the niche of a given species and thus the niche overlap between two species. These measures help to infer how the niche breadth of species contributes to the co-occurrence of different species in the same location at the same time. The hypervolume functions provided by this package are adapted from the R package MVNH (https://github.com/lvmuyang/MVNH).
The Functions
MetaCommunityMetrics.MVNH_det — Function
MVNH_det(data::DataFrame; var_names::Vector{String}=String[]) -> DataFrame
Calculate the niche hypervolume of a species based on environmental variables.
Arguments
- data::DataFrame: DataFrame where each row represents an observation of a species (presence points) and columns represent environmental variables.
- var_names::Vector{String}=String[]: Optional vector specifying names for the environmental variables. If empty, default names "variable1", "variable2", etc. will be used.
Returns
- DataFrame: A DataFrame containing:
  - Correlation: The correlation component (calculated as det(COV)/prod(variances))
  - One column for each environmental variable showing its variance
  - total: The total hypervolume (calculated as the determinant of the covariance matrix)
Details
- Environmental variables are assumed to follow a multivariate normal distribution
- Variables should be normalized before using this function to avoid bias from different scales
- The function computes the covariance matrix of the input data, extracts variances, and calculates the determinant
- This function is a Julia implementation of the MVNH_det function from the R package MVNH (GPL-3)
- Original package and documentation: https://github.com/lvmuyang/MVNH
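The computation described in the Details above can be sketched with the Julia standard library alone. This is a minimal illustration, not the package's implementation: the helper name hypervolume_sketch and the toy matrix are assumptions made for the example, and the input is a plain matrix rather than a DataFrame.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper (not part of MetaCommunityMetrics): splits the
# determinant of the covariance matrix into per-variable variances and
# a correlation component.
function hypervolume_sketch(X::AbstractMatrix{<:Real})
    S = cov(X)                               # rows = observations, columns = variables
    variances = diag(S)                      # one variance per variable
    total = det(S)                           # total hypervolume
    correlation = total / prod(variances)    # det(COV) / prod(variances)
    return (total = total, variances = variances, correlation = correlation)
end

# Toy data invented for illustration (5 observations, 2 variables)
X = [1.0 2.0; 2.0 1.0; 3.0 5.0; 4.0 3.0; 5.0 6.0]
res = hypervolume_sketch(X)

# z-score standardization: every column gets mean 0 and variance 1,
# so only the correlation component remains in the total
Z = (X .- mean(X, dims=1)) ./ std(X, dims=1)
res_z = hypervolume_sketch(Z)
```

After standardization every variance equals 1, so the total hypervolume of Z reduces to the correlation component of X. This is one way to see why unscaled variables with large variances dominate the total.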
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames, Statistics
julia> df = load_sample_data()
48735×12 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude temperature precipitation
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0 19.0414 36.519
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5 9.38964 64.928
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5 9.47682 15.1981
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0 12.915 45.2296
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0 16.4379 42.2394
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0 15.7166 159.86
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5 15.7419 96.5712
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5 20.0481 32.7878
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5 15.1438 34.5151
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0 15.854 80.9382
48725 rows omitted
julia> data = @pipe df |>
filter(row -> row[:Presence] > 0, _) |>
filter(row -> row[:Species] == "BA", _) |>
select(_, [:temperature, :precipitation])
143×2 DataFrame
Row │ temperature precipitation
│ Float64 Float64
─────┼────────────────────────────
1 │ 7.99303 64.8003
2 │ 1.95878 48.6883
3 │ 15.995 51.9702
4 │ -0.0812167 39.1034
5 │ 19.1706 58.3263
⋮ │ ⋮ ⋮
139 │ 17.8614 66.4085
140 │ 14.5365 12.0312
141 │ 17.5499 14.3885
142 │ 10.2389 34.7715
143 │ 14.0303 39.0852
133 rows omitted
julia> result = MVNH_det(data; var_names=["Temperature", "Precipitation"])
1×4 DataFrame
Row │ Correlation Precipitation Temperature total
│ Float64 Float64 Float64 Float64
─────┼──────────────────────────────────────────────────
1 │ 0.997928 879.781 24.7847 21760.0
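As a quick consistency check on the printed result, the total should equal the product of the correlation component and the two variances. The sketch below simply copies the rounded values displayed above, so agreement is only expected up to display rounding.

```julia
# Values copied from the printed MVNH_det result (rounded as displayed)
correlation   = 0.997928
precipitation = 879.781
temperature   = 24.7847
total         = 21760.0

# total = correlation component × product of the variances
reconstructed = correlation * precipitation * temperature
matches = isapprox(reconstructed, total; rtol=1e-4)
```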
MetaCommunityMetrics.MVNH_dissimilarity — Function
MVNH_dissimilarity(data_1::DataFrame, data_2::DataFrame; var_names::Vector{String}=String[]) -> Dict{String, DataFrame}
Calculate niche dissimilarity between two species based on their environmental variables, using the Bhattacharyya distance and its components.
Arguments
- data_1::DataFrame: DataFrame for the first species, where each row represents an observation and columns represent environmental variables.
- data_2::DataFrame: DataFrame for the second species, with the same structure as data_1.
- var_names::Vector{String}=String[]: Optional vector specifying names for the environmental variables. If empty, default names "variable1", "variable2", etc. will be used.
Returns
- Dict{String, DataFrame}: A dictionary containing three DataFrames:
  - "Bhattacharyya_distance": The total Bhattacharyya distance and its components
  - "Mahalanobis_distance": The Mahalanobis component of the Bhattacharyya distance
  - "Determinant_ratio": The determinant ratio component of the Bhattacharyya distance
Each DataFrame contains:
- total: The total value of the respective distance measure
- correlation: The correlation component of the distance measure
- One column for each environmental variable showing its contribution to the distance
Details
- The Bhattacharyya distance is calculated as the sum of two components:
  - Mahalanobis component: (1/8) × (μ₁-μ₂)ᵀ × ((S₁+S₂)/2)⁻¹ × (μ₁-μ₂)
  - Determinant ratio component: (1/2) × log(det((S₁+S₂)/2) / sqrt(det(S₁) × det(S₂)))
- Each component is further decomposed into individual variable contributions and correlation effects
- Environmental variables are assumed to follow a multivariate normal distribution
- Variables should be normalized before using this function to avoid bias from different scales
- This function is a Julia implementation inspired by the MVNH R package (GPL-3)
- Original package and documentation: https://github.com/lvmuyang/MVNH
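The two components listed in the Details can be sketched directly from their formulas. This is a minimal illustration under stated assumptions: bhattacharyya_sketch is a hypothetical helper, the inputs are plain matrices with invented values, and the further per-variable and correlation decomposition performed by the package is not reproduced here.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper (not part of MetaCommunityMetrics): Bhattacharyya
# distance between two samples, assuming multivariate normality.
function bhattacharyya_sketch(X1::AbstractMatrix{<:Real}, X2::AbstractMatrix{<:Real})
    μ1 = vec(mean(X1, dims=1))               # mean vector of species 1
    μ2 = vec(mean(X2, dims=1))               # mean vector of species 2
    S1, S2 = cov(X1), cov(X2)                # covariance matrices
    S = (S1 + S2) / 2                        # average covariance
    d = μ1 - μ2
    mahalanobis = dot(d, S \ d) / 8          # (1/8) dᵀ S⁻¹ d
    det_ratio = log(det(S) / sqrt(det(S1) * det(S2))) / 2
    return (total = mahalanobis + det_ratio,
            mahalanobis = mahalanobis,
            det_ratio = det_ratio)
end

# Toy data invented for illustration
X1 = [1.0 2.0; 2.0 1.0; 3.0 5.0; 4.0 3.0; 5.0 6.0]
X2 = X1 .+ [2.0 -1.0]                        # same shape, shifted means
X3 = [2.0 8.0; 4.0 7.0; 3.0 3.0; 6.0 5.0]    # different means and covariance

shifted = bhattacharyya_sketch(X1, X2)
general = bhattacharyya_sketch(X1, X3)
```

In the shifted case the covariance matrices are identical, so the determinant ratio component vanishes and the whole distance comes from the Mahalanobis component. Solving S \ d avoids forming the matrix inverse explicitly.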
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames, Statistics
julia> df = load_sample_data()
48735×12 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude temperature precipitation
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0 19.0414 36.519
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5 9.38964 64.928
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5 9.47682 15.1981
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0 12.915 45.2296
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0 16.4379 42.2394
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0 15.7166 159.86
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5 15.7419 96.5712
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5 20.0481 32.7878
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5 15.1438 34.5151
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0 15.854 80.9382
48725 rows omitted
julia> data_1 = @pipe df |>
filter(row -> row[:Presence] > 0, _) |>
filter(row -> row[:Species] == "BA", _) |>
select(_, [:temperature, :precipitation])
143×2 DataFrame
Row │ temperature precipitation
│ Float64 Float64
─────┼────────────────────────────
1 │ 7.99303 64.8003
2 │ 1.95878 48.6883
3 │ 15.995 51.9702
4 │ -0.0812167 39.1034
5 │ 19.1706 58.3263
⋮ │ ⋮ ⋮
139 │ 17.8614 66.4085
140 │ 14.5365 12.0312
141 │ 17.5499 14.3885
142 │ 10.2389 34.7715
143 │ 14.0303 39.0852
133 rows omitted
julia> data_2 = @pipe df |>
filter(row -> row[:Presence] > 0, _) |>
filter(row -> row[:Species] == "SH", _) |>
select(_, [:temperature, :precipitation])
58×2 DataFrame
Row │ temperature precipitation
│ Float64 Float64
─────┼────────────────────────────
1 │ 16.5938 21.3197
2 │ 16.1385 20.8511
3 │ 18.4342 60.962
4 │ 11.8572 49.7807
5 │ 12.0075 30.7462
⋮ │ ⋮ ⋮
54 │ 12.206 23.0181
55 │ 13.9233 62.2122
56 │ 20.2339 37.0448
57 │ 7.40668 116.384
58 │ 20.0481 32.7878
48 rows omitted
julia> result = MVNH_dissimilarity(data_1, data_2; var_names=["Temperature", "Precipitation"])
Dict{String, DataFrame} with 3 entries:
"Determinant_ratio" => 1×4 DataFrame…
"Bhattacharyya_distance" => 1×4 DataFrame…
"Mahalanobis_distance" => 1×4 DataFrame…
julia> result["Determinant_ratio"]
1×4 DataFrame
Row │ total correlation Temperature Precipitation
│ Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────
1 │ 0.00738662 2.84135e-5 0.000672539 0.00668567
julia> result["Bhattacharyya_distance"]
1×4 DataFrame
Row │ total correlation Temperature Precipitation
│ Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────
1 │ 0.0100511 0.000122951 0.00296459 0.00696354
julia> result["Mahalanobis_distance"]
1×4 DataFrame
Row │ total correlation Temperature Precipitation
│ Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────
1 │ 0.00266447 9.4538e-5 0.00229205 0.000277876
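The three printed tables can be cross-checked against each other: the Bhattacharyya total should equal the sum of the Mahalanobis and determinant ratio totals, and each total should equal the sum of its correlation and per-variable columns. The sketch below copies the rounded values displayed above, so agreement holds only up to display rounding.

```julia
# Totals copied from the printed tables (rounded as displayed)
mahalanobis_total   = 0.00266447
determinant_total   = 0.00738662
bhattacharyya_total = 0.0100511

# Bhattacharyya distance = Mahalanobis component + determinant ratio component
components_match = isapprox(mahalanobis_total + determinant_total,
                            bhattacharyya_total; rtol=1e-4)

# Determinant ratio row: correlation + Temperature + Precipitation
row_sum = 2.84135e-5 + 0.000672539 + 0.00668567
row_matches = isapprox(row_sum, determinant_total; rtol=1e-4)
```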
MetaCommunityMetrics.average_MVNH_det — Function
average_MVNH_det(data::DataFrame, presence_absence::Vector{Int}, species::Union{AbstractVector, String};
                 var_names::Vector{String}=String[]) -> Float64
Calculate the average niche hypervolume across multiple species in a community dataset.
Arguments
- data::DataFrame: DataFrame containing environmental variables where each row represents an observation.
- presence_absence::Vector{Int}: Vector indicating presence (1) or absence (0) for each observation in data.
- species::Union{AbstractVector, String}: Vector containing species identifiers corresponding to each observation in data.
- var_names::Vector{String}=String[]: Optional vector specifying names for the environmental variables. If empty, default names will be used.
Returns
- Float64: The average hypervolume across all species with presence data.
Details
- For each unique species, the function:
  - Filters observations where the species is present (presence_absence = 1)
  - Calculates the niche hypervolume using the MVNH_det function
  - Extracts the total hypervolume value
- The function then computes the mean of all individual species hypervolumes
- Species with no presence data are skipped in the calculation
- Environmental variables are assumed to follow a multivariate normal distribution
- Variables should be normalized before using this function to avoid bias from different scales
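The per-species loop described above can be sketched as follows. This is an illustration only: average_hypervolume_sketch is a hypothetical helper working on a plain matrix, the toy data are invented, and the determinant of the covariance matrix stands in for the total returned by MVNH_det.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper (not part of MetaCommunityMetrics): mean of the
# per-species hypervolumes, skipping species without presence records.
function average_hypervolume_sketch(X::AbstractMatrix{<:Real},
                                    presence::AbstractVector{<:Integer},
                                    species::AbstractVector)
    volumes = Float64[]
    for sp in unique(species)
        rows = (species .== sp) .& (presence .== 1)   # presence records of this species
        any(rows) || continue                         # no presence data: skip
        push!(volumes, det(cov(X[rows, :])))          # total hypervolume of this species
    end
    return mean(volumes)
end

# Toy data invented for illustration; species "C" is never present
X = [1.0 2.0; 2.0 1.0; 3.0 5.0; 4.0 3.0; 5.0 6.0; 6.0 4.0; 7.0 9.0; 8.0 7.0]
species  = ["A", "A", "A", "A", "B", "B", "B", "C"]
presence = [1, 1, 1, 0, 1, 1, 1, 0]

avg = average_hypervolume_sketch(X, presence, species)
```

Here only species A (rows 1 to 3) and B (rows 5 to 7) contribute to the mean. The sketch assumes every included species has at least two presence records, since a covariance matrix cannot be estimated from a single observation.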
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames, Statistics
julia> df = load_sample_data()
48735×12 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude temperature precipitation
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0 19.0414 36.519
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5 9.38964 64.928
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5 9.47682 15.1981
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0 12.915 45.2296
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0 16.4379 42.2394
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0 15.7166 159.86
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5 15.7419 96.5712
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5 20.0481 32.7878
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5 15.1438 34.5151
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0 15.854 80.9382
48725 rows omitted
julia> data = @pipe df |>
select(_, [:temperature, :precipitation])
48735×2 DataFrame
Row │ temperature precipitation
│ Float64 Float64
───────┼────────────────────────────
1 │ 19.0414 36.519
2 │ 9.38964 64.928
3 │ 9.47682 15.1981
4 │ 12.915 45.2296
5 │ 16.4379 42.2394
⋮ │ ⋮ ⋮
48731 │ 15.7166 159.86
48732 │ 15.7419 96.5712
48733 │ 20.0481 32.7878
48734 │ 15.1438 34.5151
48735 │ 15.854 80.9382
48725 rows omitted
julia> result = average_MVNH_det(data, df.Presence, df.Species; var_names=["Temperature", "Precipitation"])
22427.757500223863
MetaCommunityMetrics.average_MVNH_dissimilarity — Function
average_MVNH_dissimilarity(data::DataFrame, presence_absence::Vector{Int}, species::Union{AbstractVector, String};
                           var_names::Vector{String}=String[]) -> Float64
Calculate the average niche dissimilarity between all unique pairs of species in a community dataset using Bhattacharyya distance.
Arguments
- data::DataFrame: DataFrame containing environmental variables where each row represents an observation.
- presence_absence::Vector{Int}: Vector indicating presence (1) or absence (0) for each observation in data.
- species::Union{AbstractVector, String}: Vector containing species identifiers corresponding to each observation in data.
- var_names::Vector{String}=String[]: Optional vector specifying names for the environmental variables. If empty, default names will be used.
Returns
- Float64: The average Bhattacharyya distance across all unique species pairs.
Details
- For each unique pair of species, the function:
  - Filters observations where each species is present (presence_absence = 1)
  - Calculates the niche dissimilarity using the MVNH_dissimilarity function
  - Extracts the total Bhattacharyya distance value
- The function then computes the mean of all pairwise Bhattacharyya distances
- Species pairs where either species has no presence data are skipped
- Each species pair is processed only once (i.e., sp1-sp2 is calculated, but sp2-sp1 is skipped)
- Environmental variables are assumed to follow a multivariate normal distribution
- Variables should be normalized before using this function to avoid bias from different scales
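The pairwise procedure described above can be sketched as follows. Both helpers are hypothetical and operate on plain matrices with invented toy data. They illustrate the once-per-pair iteration and the skipping rule, not the package's actual code.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper: total Bhattacharyya distance between two samples
function bhattacharyya_total(X1::AbstractMatrix{<:Real}, X2::AbstractMatrix{<:Real})
    S1, S2 = cov(X1), cov(X2)
    S = (S1 + S2) / 2
    d = vec(mean(X1, dims=1)) - vec(mean(X2, dims=1))
    return dot(d, S \ d) / 8 + log(det(S) / sqrt(det(S1) * det(S2))) / 2
end

# Hypothetical helper: mean distance over all unordered species pairs
function average_dissimilarity_sketch(X::AbstractMatrix{<:Real},
                                      presence::AbstractVector{<:Integer},
                                      species::AbstractVector)
    spp = unique(species)
    distances = Float64[]
    # j starts at i + 1, so sp1-sp2 is visited but sp2-sp1 is not
    for i in 1:length(spp)-1, j in i+1:length(spp)
        rows_i = (species .== spp[i]) .& (presence .== 1)
        rows_j = (species .== spp[j]) .& (presence .== 1)
        (any(rows_i) && any(rows_j)) || continue   # skip pairs lacking presence data
        push!(distances, bhattacharyya_total(X[rows_i, :], X[rows_j, :]))
    end
    return mean(distances)
end

# Toy data invented for illustration; species "D" is never present
X = [1.0 2.0; 2.0 1.0; 3.0 5.0;
     5.0 6.0; 6.0 4.0; 7.0 9.0;
     2.0 8.0; 4.0 7.0; 3.0 3.0;
     9.0 9.0]
species  = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D"]
presence = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

avg = average_dissimilarity_sketch(X, presence, species)
```

With four species, three of which have presence records, the mean is taken over the three pairs A-B, A-C and B-C. Every pair involving D is skipped.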
Example
julia> using MetaCommunityMetrics, Pipe, DataFrames, Statistics
julia> df = load_sample_data()
48735×12 DataFrame
Row │ Year Month Day Sampling_date_order plot Species Abundance Presence Latitude Longitude temperature precipitation
│ Int64 Int64 Int64 Int64 Int64 String3 Int64 Int64 Float64 Float64 Float64 Float64
───────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 2010 1 16 1 1 BA 0 0 35.0 -110.0 19.0414 36.519
2 │ 2010 1 16 1 2 BA 0 0 35.0 -109.5 9.38964 64.928
3 │ 2010 1 16 1 8 BA 0 0 35.5 -109.5 9.47682 15.1981
4 │ 2010 1 16 1 9 BA 0 0 35.5 -109.0 12.915 45.2296
5 │ 2010 1 16 1 11 BA 0 0 35.5 -108.0 16.4379 42.2394
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
48731 │ 2023 3 21 117 9 SH 0 0 35.5 -109.0 15.7166 159.86
48732 │ 2023 3 21 117 10 SH 0 0 35.5 -108.5 15.7419 96.5712
48733 │ 2023 3 21 117 12 SH 1 1 35.5 -107.5 20.0481 32.7878
48734 │ 2023 3 21 117 16 SH 0 0 36.0 -108.5 15.1438 34.5151
48735 │ 2023 3 21 117 23 SH 0 0 36.5 -108.0 15.854 80.9382
48725 rows omitted
julia> data = @pipe df |>
select(_, [:temperature, :precipitation])
48735×2 DataFrame
Row │ temperature precipitation
│ Float64 Float64
───────┼────────────────────────────
1 │ 19.0414 36.519
2 │ 9.38964 64.928
3 │ 9.47682 15.1981
4 │ 12.915 45.2296
5 │ 16.4379 42.2394
⋮ │ ⋮ ⋮
48731 │ 15.7166 159.86
48732 │ 15.7419 96.5712
48733 │ 20.0481 32.7878
48734 │ 15.1438 34.5151
48735 │ 15.854 80.9382
48725 rows omitted
julia> result = average_MVNH_dissimilarity(data, df.Presence, df.Species; var_names=["Temperature", "Precipitation"])
0.029651910867403767