Contemporary species distribution modeling tools for python.
Source code: earth-chris/elapid
elapid is a series of species distribution modeling tools for python. This includes a custom implementation of Maxent and a suite of methods to simplify working with biogeography data.
The name is an homage to A Biogeographic Analysis of Australian Elapid Snakes (H.A. Nix, 1986), the paper widely credited with defining the essential bioclimatic variables to use in species distribution modeling. It's also a snake pun (a python wrapper for mapping snake biogeography).
pip install elapid or
conda install -c conda-forge elapid
Installing glmnet is optional but recommended. You can install it with
pip install elapid[glmnet] or
conda install -c conda-forge elapid glmnet. For more support, and for information on why this package is recommended, see this page.
conda install is recommended for Windows users. While there is a
pip distribution, you may experience some challenges. The easiest way to overcome them is to use Windows Subsystem for Linux (WSL). Otherwise, see this page for support.
Why use elapid?
The amount and quality of biogeographic data have increased dramatically over the past decade, as have cloud-based tools for working with it.
elapid was designed to provide a set of modern, python-based tools for working with species occurrence records and environmental covariates to map different dimensions of a species' niche.
elapid supports working with modern geospatial data formats and uses contemporary approaches to training statistical models. It uses
sklearn conventions to fit and apply models,
rasterio to handle raster operations,
geopandas for vector operations, and processes data under the hood with
This makes it easier to do things like fit/apply models to multi-temporal and multi-scale data, fit geographically-weighted models, create ensembles, precisely define background point distributions, and summarize model predictions.
It does the following things reasonably well:
Select random geographic point samples (aka background or pseudoabsence points) within polygons or rasters, handling
nodata locations, as well as sampling from bias maps (using
Extract and annotate point data from rasters, creating
GeoDataFrames with sample locations and their matching covariate values (using
elapid.annotate()). On-the-fly reprojection, dropping nodata, multi-band inputs and multi-file inputs are all supported.
Calculate zonal statistics from multi-band, multi-raster data into a single
GeoDataFrame from one command (using
Transform covariate data into derivative
features to expand data dimensionality and improve prediction accuracy (like
elapid.HingeTransformer(), or the all-in-one
Species distribution modeling
Train and apply species distribution models based on annotated point data, configured with sensible defaults (like
Training spatially-aware models
Compute spatially-explicit sample weights, checkerboard train/test splits, or geographically-clustered cross-validation splits to reduce spatial autocorrelation effects (with
Applying models to rasters
Apply any pixel-based model with a
.predict() method to raster data to easily create prediction probability maps (like training a
RandomForestClassifier() and applying with
Cloud-native geo support
Work with cloud- or web-hosted raster/vector data (on
s3://, etc.) to keep your disk free of temporary files.
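
To make the idea of derivative features in the list above more concrete, here is a minimal, standard-library sketch of forward hinge features. The function name `toy_hinge_features`, the `n_knots` parameter, and the even knot spacing are assumptions made for illustration. This is not elapid's implementation, which may place knots and scale features differently.

```python
def toy_hinge_features(values, n_knots=4):
    """Expand one covariate into several forward-hinge features.

    Each feature is 0 at or below its knot and rises linearly to 1 at the
    maximum observed value. Hypothetical helper, for illustration only.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a non-zero range")
    # Evenly spaced knots from the minimum up to (but excluding) the maximum.
    knots = [lo + (hi - lo) * i / n_knots for i in range(n_knots)]
    return [
        [max(0.0, (v - k) / (hi - k)) for k in knots]
        for v in values
    ]


# Example covariate values (made up for this sketch).
temperatures = [0.0, 5.0, 10.0, 15.0, 20.0]
features = toy_hinge_features(temperatures, n_knots=4)
```

Each input value becomes a row of `n_knots` features, so a single covariate can contribute several piecewise-linear terms to a model.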
Check out some example code snippets and workflows on the examples page.
elapid requires some effort on the user's part to draw samples and extract covariate data. This is by design.
Selecting background samples, computing sample weights, splitting train/test data, and specifying training parameters are all critical modeling choices that have profound effects on inference and interpretation.
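
As one illustration of such a choice, a checkerboard train/test split can be sketched in plain Python. The helper `toy_checkerboard_split` and its `grid_size` parameter are hypothetical names used only here. The sketch assumes points are `(x, y)` tuples in projected units and is not a substitute for the library's own tools.

```python
import math


def toy_checkerboard_split(points, grid_size):
    """Split (x, y) points into train/test sets by checkerboard cell.

    Points in cells where (column + row) is even go to train, and the
    rest go to test. Hypothetical helper, for illustration only.
    """
    train, test = [], []
    for x, y in points:
        # math.floor keeps cell indices consistent for negative coordinates.
        col = math.floor(x / grid_size)
        row = math.floor(y / grid_size)
        if (col + row) % 2 == 0:
            train.append((x, y))
        else:
            test.append((x, y))
    return train, test


# Made-up sample locations for this sketch.
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (-0.5, 0.5)]
train, test = toy_checkerboard_split(points, grid_size=1.0)
```

Larger `grid_size` values place train and test points further apart on average, which is the lever for reducing spatial autocorrelation between the two sets.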
The extra flexibility provided by
elapid gives users more control over the seemingly black-box approach of Maxent, so they can better tune and evaluate their models.
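
To show what drawing background samples involves at its simplest, the sketch below picks random cells from a small in-memory grid while skipping nodata cells. The function `toy_sample_background`, the `NODATA` value, and the grid itself are assumptions made for illustration. Real workflows would read rasters from disk and return geographic coordinates rather than array indices.

```python
import random


def toy_sample_background(grid, nodata, n_points, seed=0):
    """Draw random (row, col) cells from a 2D grid, skipping nodata cells.

    Hypothetical helper, for illustration only.
    """
    valid = [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value != nodata
    ]
    rng = random.Random(seed)
    # Sample without replacement, capped at the number of valid cells.
    return rng.sample(valid, min(n_points, len(valid)))


NODATA = -9999
# Made-up covariate grid with three nodata cells.
grid = [
    [1.2, NODATA, 3.4],
    [NODATA, 2.2, 0.7],
    [5.1, 4.8, NODATA],
]
background = toy_sample_background(grid, NODATA, n_points=4)
```

Fixing the seed makes the draw repeatable, which helps when comparing models trained on the same background sample.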