
radEmu is an R package for estimating
changes in the abundance of microbial taxa using amplicon or shotgun
sequencing technologies. Online documentation is available here.
If you are a microbial ecologist or
bioinformatician, some of the things that you may like
about radEmu include
radEmu uses your amplicon or shotgun sequencing to
estimate changes in the “absolute abundance” of microbial taxa. Here,
“absolute abundance” could be interpreted on the cell count, cell
concentration or DNA concentration scale. Yes! It’s true!
radEmu formalizes some of the nice things about
log-ratio-type methods for differential abundance, including
radEmu is robust to differential detection of taxa, so
you don’t have to worry about (e.g.) the different extraction/PCR
efficiency of your protocol
radEmu is robust to unequal sampling effort. No need to
rarefy! (Actually, please don’t.)
radEmu deals with zeroes natively, without any need for
arbitrary parameters like pseudocounts
radEmu does not require that you have a “reference
taxon” that is not changing in abundance across samples
radEmu estimates differences in abundance
across taxa
radEmu is most similar in
flavor to ALDEx2 and ANCOM (and ANCOM relatives), but doesn’t
require priors, log-ratio transformations (and thus pseudocounts), or a
reference taxon!
radEmu can adjust for relevant covariates, including
precision variables and confounders
radEmu achieves all of the above by jointly modeling
all taxa (i.e., it’s not a taxon-by-taxon model like corncob). This
makes estimation harder to parallelize, but fortunately testing can be
parallelized easily. (There is an example in the preprint’s supplementary
material, but let us know if you want a tutorial on how!)
On a standard desktop, radEmu can handle 1000 taxa, 800
samples and 12 covariates. You may want to take a 35-minute coffee break
while it runs, though.
radEmu is publicly available in open-source software…
right here!
If you are a statistician, some of the things that
you may like about radEmu include
Sadly we do not yet have a nice-looking logo. If you
would like to design us one, please let Amy know!
To download the radEmu package, use the code below.
# install.packages("devtools")
devtools::install_github("statdivlab/radEmu")
library(radEmu)
The vignettes demonstrate example usage of the main functions. Please
file an issue
if you have a request for a tutorial that is not currently included. The
following code shows the easy-to-use syntax if your data is in a
phyloseq object, and you want to estimate parameters for
all taxa and run a test for the parameter associated with “Group” and
taxon 1:
ch_fit <- emuFit(formula = ~ Group + Study + Gender + Sampling,
                 Y = my_phyloseq_object,
                 test_kj = data.frame(k = 2, j = 1))
If your abundances and covariates are in a data frame, you can use the following to estimate parameters for all taxa and run tests for the parameters associated with “Group” for all taxa:
all_fit <- emuFit(formula = ~ Group + Study + Gender + Sampling,
                  data = my_covariates_df,
                  Y = my_abundances_df,
                  test_kj = data.frame(k = 2, j = 1:ncol(my_abundances_df)))
We additionally have a pkgdown website that contains
pre-built versions of our function documentation
and our vignettes (an introductory vignette,
an introductory vignette
that uses phyloseq data, an introductory vignette
that uses TreeSummarizedExperiment data, a vignette
for optionally specifying a reference taxon for an analysis, a vignette
for running radEmu tests in parallel for more efficient
computation, and a vignette
for running radEmu with clustered data).
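The test_kj argument in the calls above pairs a covariate index k with a taxon index j. The sketch below shows one way to look up k rather than hard-coding it. It assumes that k indexes the columns of the design matrix implied by the formula, with the intercept in column 1; this is consistent with k = 2 selecting “Group” above, but please check the function documentation before relying on it. The covariate values and the number of taxa are hypothetical and exist only to make the example run in base R.

```r
# Hypothetical covariate table; the column names mirror the formula used above.
my_covariates_df <- data.frame(
  Group    = factor(c("A", "B", "A", "B", "A", "B"), levels = c("A", "B")),
  Study    = factor(c("S1", "S1", "S2", "S2", "S1", "S2"), levels = c("S1", "S2")),
  Gender   = factor(c("F", "M", "M", "F", "F", "M"), levels = c("F", "M")),
  Sampling = factor(c("x", "x", "y", "y", "y", "x"), levels = c("x", "y"))
)

# Base R design matrix for the same formula. Column 1 is the intercept.
design <- model.matrix(~ Group + Study + Gender + Sampling,
                       data = my_covariates_df)

# Assumption: k refers to a column of this design matrix.
k_group <- grep("^Group", colnames(design))

# Hypothetical number of taxa (columns of the abundance table).
n_taxa <- 5

# One test (Group, taxon 1) and one test per taxon (Group, all taxa).
test_one <- data.frame(k = k_group, j = 1)
test_all <- data.frame(k = k_group, j = seq_len(n_taxa))

print(colnames(design))
print(test_all)
```

Looking up the column by name guards against the index shifting if covariates are added to or reordered in the formula.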
If you use radEmu for your analysis, please cite our
manuscript.
David S Clausen, Sarah V Teichman, and Amy D Willis (2026). “Estimating Ratios of Means of Multicategory Data Observed with Sample and Category Perturbations.” Biometrika. https://doi.org/10.1093/biomet/asag009.
Huge thanks to the NIGMS for funding this work through Amy’s R35!
If you encounter a bug or would like to make a change request, please file it as an issue here.
If you’re a developer, we would love to review your pull requests.
When we are not developing fast, robust and interpretable estimation
methods, we enjoy making up silly names for our fast, robust and
interpretable estimation methods. radEmu abbreviates
radEmuAbPill, which denotes “using
relative abundance
data to estimate
multiplicative differences in absolute
abundances with partially identified
log-linear models.”
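To give a feel for what “partially identified” means here, the following is a small arithmetic sketch in base R. It is not radEmu's estimator, and all numbers are made up. It assumes that expected counts are the product of a sampling effort, a taxon-specific detection efficiency, and the true mean absolute abundance. Under that assumption, the detection efficiencies cancel out of between-group log ratios, and sampling effort shifts every taxon's log ratio by the same constant, so fold-differences are recoverable up to a shift shared by all taxa. Centering by the median across taxa is used purely for illustration; the constraint that radEmu applies may differ.

```r
# True (unobserved) mean absolute abundances for 4 hypothetical taxa.
mu_group1   <- c(100, 200, 50, 400)
fold_change <- c(1, 1, 4, 0.5)          # true multiplicative differences
mu_group2   <- mu_group1 * fold_change

# Hypothetical taxon-specific detection efficiencies and
# group-specific sampling efforts (both unknown in practice).
efficiency    <- c(0.9, 0.1, 0.5, 0.3)
effort_group1 <- 0.002
effort_group2 <- 0.010

# Expected observed counts under the assumed multiplicative model.
expected1 <- effort_group1 * efficiency * mu_group1
expected2 <- effort_group2 * efficiency * mu_group2

observed_log_fc <- log(expected2 / expected1)
true_log_fc     <- log(fold_change)

# Efficiencies cancel; what remains is one offset shared by all taxa,
# equal to the log ratio of the sampling efforts.
offset <- observed_log_fc - true_log_fc
print(offset)

# After centering across taxa, observed and true log fold-differences agree.
centered_observed <- observed_log_fc - median(observed_log_fc)
centered_true     <- true_log_fc - median(true_log_fc)
print(all.equal(centered_observed, centered_true))
```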