One third of all islands in the Indo-Pacific are atolls. Despite being the most common island type, atolls remain widely overlooked in global biodiversity studies. The importance of seabirds for the natural functioning and resilience of atolls is becoming increasingly recognised. Conversely, however, it remains unquantified how important atolls are for seabirds at a macro-ecological scale. In this study, we gathered seabird nesting census data from atolls across the Indo-Pacific and developed a Bayesian predictive model to estimate seabird nesting colonies on all atolls of the Indo-Pacific. More than 31 million seabirds nest on atolls - more than across all of Europe, and about 25% of all tropical seabirds worldwide. Atolls are therefore globally important sites for seabird nesting. Protecting and future-proofing atolls against global change has to become a priority to preserve a significant fraction of the world's tropical seabirds.
We compiled environmental data for each of the Indo-Pacific's 280 atolls from literature reports and remote-sensing satellite databases. Seabird nesting census data for each atoll were either abundance-based (counts), incidence-based (presence/absence), or unavailable. We computed a Principal Component Analysis (PCA) on the environmental data and created two separate seabird datasets: (1) count (abundance)-based and (2) incidence-based. In the first step, we used the presence/absence seabird data to predict seabird occurrence (or absence) on the unsurveyed atolls. In the second step, we used the count-based seabird data to predict seabird nesting abundances on the unsurveyed atolls.
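For orientation only, the sketch below illustrates what such a two-step workflow can look like in Julia. The package choices (MultivariateStats.jl, Turing.jl), priors, likelihoods, and placeholder names (`env_surveyed`, `env_unsurveyed`, `presence`, `counts`, `counts_available`) are assumptions for illustration, not the models used in the paper, which live in the `julia/` directory of this repository.

```julia
# Illustrative sketch only - not the repository's actual model code.
# Assumed inputs:
#   env_surveyed, env_unsurveyed : (variables x atolls) matrices of standardised
#                                  environmental predictors
#   presence                     : 0/1 vector of nesting presence on surveyed atolls
#   counts, counts_available     : nesting counts and a Bool mask marking the
#                                  surveyed atolls that have count data
using MultivariateStats, Turing, Distributions, LinearAlgebra

# PCA on the environmental predictors (observations in columns)
pca = fit(PCA, env_surveyed; maxoutdim=3)
scores            = predict(pca, env_surveyed)'      # atolls x components
scores_unsurveyed = predict(pca, env_unsurveyed)'    # used later for prediction

# Step 1: occurrence (presence/absence) model on the principal components
@model function occurrence(X, y)
    α ~ Normal(0, 1)
    β ~ filldist(Normal(0, 1), size(X, 2))
    y ~ arraydist(BernoulliLogit.(α .+ X * β))
end

# Step 2: abundance model, fitted only to atolls with count data
@model function abundance(X, log_counts)
    α ~ Normal(0, 2)
    β ~ filldist(Normal(0, 1), size(X, 2))
    σ ~ truncated(Normal(0, 1); lower=0)
    log_counts ~ MvNormal(α .+ X * β, σ^2 * I)
end

chn_occ = sample(occurrence(scores, presence), NUTS(), 1_000)
chn_abd = sample(abundance(scores[counts_available, :], log.(counts)), NUTS(), 1_000)
# The posterior draws would then be combined with `scores_unsurveyed` to predict
# occurrence and, where nesting is predicted, abundance on the unsurveyed atolls.
```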
@article{steibl2024seabird,
title={Atolls are globally important sites for tropical seabirds},
author={Steibl, Sebastian and Steiger, Simon and Wegmann, Alex S and Holmes, Nick D and Young, Hillary S and Carr, Pete and Russell, James C},
journal={Nature Ecology \& Evolution},
year={2024},
doi={10.1038/s41559-024-02496-4}
}
- Install the R programming language (Windows, MacOS, Linux)
- Install the Julia programming language. In the installation wizard, check the option to Add julia to the PATH to skip step 3.
- Add R and Julia to the PATH (Instructions for Julia; the process is the same for R, except that you will have to add the directory of the R binary). You can test whether you previously added R or Julia to the PATH by running `R` or `julia` in a Unix shell (see also the optional check below this list). Tip: if you do not know the path to your Julia binary, you can run `Sys.BINDIR` in the Julia REPL. For the equivalent in R, run `R.home("bin")` in an R console.
- Install an environment that allows you to run bash scripts, like Git Bash.
Unix-based systems support executing bash scripts natively. However, since the default shell used by MacOS is now the Z shell, we recommend running the commands under Reproducing results in the bash shell.
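As an optional sanity check (not part of the original instructions), you can ask Julia itself whether both binaries are visible on the PATH:

```julia
# Both calls should return a file path rather than `nothing`
# once R and Julia have been added to the PATH.
Sys.which("julia")
Sys.which("R")
```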
You can reproduce the analysis and corresponding visualisations by executing the `reproduce.sh` script.
To do so, start by making this file executable on your machine:
$ chmod +x /path/to/repository/reproduce.sh
Replace the segment `/path/to/repository` with the path to the project folder on your machine.
Next, run all scripts by executing `reproduce.sh`:
$ ./reproduce.sh /path/to/repository true true false
Again, replace the segment `/path/to/repository` with the path to the project folder on your machine.
Note on Z shell: To run the above command in Z shell, prefix it with `bash`, i.e., `% bash ./reproduce.sh ...`.
Additional arguments: The extra arguments to `reproduce.sh` (`true true false`) are forwarded to the Julia scripts and are intended to give the user more fine-grained control over the runtime (see the sketch after this list).
- The first argument determines whether the analysis scripts sample from the posterior (`true`) or attempt to load previously saved chains (`false`). Loading saved chains requires having sampled from the posterior at least once on your machine.
- The second argument determines whether cross-validation is performed (`true`) or skipped (`false`).
- The third argument determines whether sensitivity analyses with different prior settings are run (`true`), or only the default prior is used (`false`). Note that running sensitivity analyses quadruples the runtime.
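As a hypothetical sketch of how such flags might be consumed on the Julia side (the actual handling lives in `julia/reproduce.jl` and the analysis scripts, so the variable names here are illustrative):

```julia
# Hypothetical sketch of reading the three Boolean flags from ARGS;
# see julia/reproduce.jl for the repository's actual argument handling.
length(ARGS) >= 3 || error("expected three Boolean flags, e.g. `true true false`")

sample_posterior    = parse(Bool, ARGS[1])  # true: sample with MCMC; false: load saved chains
run_crossvalidation = parse(Bool, ARGS[2])  # true: perform cross-validation; false: skip it
run_sensitivity     = parse(Bool, ARGS[3])  # true: rerun the models under alternative priors
```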
Run-time: The analyses were performed on a MacBook Pro (M1) and took 90 minutes, including 10 minutes of package download and precompilation. These values depend on the speed of your computer and internet connection.
The repository separates R code (data wrangling of the remote-sensing data and figure creation), Julia code (the Bayesian analysis), input data, modelling results, and the manuscript:
├── R # R scripts
│ ├── wrangle # Data wrangling of remote-sensing data
│ └── create_figures.R # Creates raw figures for article
├── data # CSV files which are *inputs to* the model
├── figures # Final figures used in the article
├── julia # Julia scripts
│ ├── scripts # Analysis scripts
│ ├── src # Modules defining functions and variables
│ └── reproduce.jl # Runs all julia scripts
├── manuscript # Directory with manuscript
├── renv # Renv for storing R package versions
├── results # Outputs of modeling
│ ├── chains # Chains will be saved here
│ ├── data # CSV files which are *outputs of* the model
│ ├── png # PNG files
│ └── svg # SVG files
.
.
.
└── reproduce.sh # Execute entire model pipeline, see "Installation" above for instructions
The raw data downloaded from Copernicus and JISAO are too large to share on GitHub.
We are only uploading the specifications we used to download them and the scripts we used to clean them.
The scripts can be found in the `R/wrangle` directory.