| Type: | Package |
| Title: | A Logistic Regression Model for Testing Microbial Differential Abundance |
| Version: | 1.0 |
| Date: | 2026-04-18 |
| Maintainer: | Yi-Juan Hu <yijuanhu@bicmr.pku.edu.cn> |
| Description: | Testing differential abundance at individual taxa and in a whole microbial community. The tests are based on the log-ratio of relative abundances. The tests accommodate continuous, discrete (binary, categorical), and multivariate traits, and allow adjustment of confounders. For more details see He (2026) <doi:10.64898/2026.04.07.716976>. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| RoxygenNote: | 7.3.2 |
| Depends: | R (≥ 3.5.0) |
| Imports: | Rcpp (≥ 1.0.11), stats, permute, parallel, BiocParallel, matrixStats, abind, car |
| LinkingTo: | Rcpp, RcppArmadillo |
| Suggests: | testthat, survival |
| URL: | https://github.com/yijuanhu/LOCOM2 |
| BugReports: | https://github.com/yijuanhu/LOCOM2/issues |
| Encoding: | UTF-8 |
| LazyData: | true |
| NeedsCompilation: | yes |
| Packaged: | 2026-04-30 01:08:36 UTC; yhu30 |
| Author: | Yi-Juan Hu [aut, cre] |
| Repository: | CRAN |
| Date/Publication: | 2026-05-04 12:10:14 UTC |
A logistic regression model for testing differential abundance in compositional microbiome data (LOCOM2)
Description
This function tests (1) whether any OTU (or taxon) is associated with the trait of interest, with FDR control, based on log ratios of relative abundances between pairs of taxa; and (2) whether the whole community is associated with the trait (a global test), based on the harmonic mean method for combining individual p-values. The tests accommodate continuous, discrete (binary or categorical), and multivariate traits, and allow adjustment for confounders.
Usage
locom2(
otu.table,
Y,
C = NULL,
fdr.nominal = 0.1,
filter = TRUE,
permute = TRUE,
n.perm.max = 1000,
n.rej.stop = 100,
n.cores = 1,
seed = NULL,
verbose = TRUE,
Firth.thresh = 0.4
)
Arguments
otu.table |
The OTU table (or taxa count table), where rows correspond to samples and columns correspond to OTUs (taxa). |
Y |
The trait of interest, which can be a vector, matrix, or data frame. It must be numeric; for example, a factor should be represented by its corresponding design matrix. When specified as a matrix or data frame, all components are tested jointly for microbial association. |
C |
The additional (confounding) covariates to be adjusted for. See the requirements for |
fdr.nominal |
The nominal FDR level, with a default of 0.1. |
filter |
A logical value indicating whether to filter out rare taxa. The default is TRUE, using a filtering threshold of min(0.1*n.sam, 10). |
permute |
A logical value indicating whether to perform permutation. The default is TRUE. |
n.perm.max |
The maximum number of permutations. The default is 1,000, used for the Wald-type test. The full permutation procedure as in LOCOM is performed when |
n.rej.stop |
The minimum number of rejections (i.e., instances where the permutation test statistic exceeds the observed test statistic) required before stopping the permutation procedure. The default is 100. |
n.cores |
The number of cores to be used for parallel computing. The default is 1. |
seed |
A user-supplied integer seed for the random number generator in the permutation procedure. The default is NULL, in which case an integer seed is generated internally at random. In either case, the seed is stored in the output object to enable reproducibility of the permutation replicates. |
verbose |
A logical value indicating whether to produce verbose output during the permutation process. The default is TRUE. |
Firth.thresh |
The threshold (between 0 and 1) of taxon prevalence for applying the Firth correction. The default is 0.4. |
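The requirement that Y and C be numeric can be met with base R alone. The following is a minimal sketch, assuming a hypothetical three-level factor `site` and a hypothetical `sex` covariate (neither is part of the package data); it shows one way to turn a factor into a design matrix with `model.matrix()` from the stats package.

```r
# Hypothetical three-level trait for six samples (illustrative data only).
site <- factor(c("A", "B", "C", "A", "B", "C"))

# model.matrix() builds the design matrix; dropping the intercept column
# leaves one 0/1 indicator column per non-reference level.
Y.design <- model.matrix(~ site)[, -1, drop = FALSE]

# A binary confounder coded numerically, in the same style as the
# package example (hypothetical values).
sex <- c("Male", "Female", "Male", "Female", "Female", "Male")
C.num <- ifelse(sex == "Male", 0, 1)

dim(Y.design)       # 6 samples by 2 indicator columns
colnames(Y.design)  # "siteB" "siteC"
```

A matrix such as `Y.design` could then be supplied as Y, in which case its columns are tested jointly as described for the Y argument.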
Details
This function extends LOCOM (Hu et al., 2022, PNAS) in the following ways:
- accommodating both relative abundance and read count data for OTUs;
- refining the weighting scheme in LOCOM to eliminate confounding by library size;
- incorporating a series of adjustments to ensure stable and reliable inference, even under extreme conditions such as rare taxa and highly unbalanced case–control designs;
- replacing the computationally intensive permutation procedure with a Wald-type test (using a fixed 1,000 permutation replicates).
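As background on why log-ratio quantities can be formed from either read counts or relative abundances, the sketch below illustrates a general property of log ratios: the library size cancels in the ratio. It uses a made-up count table and an arbitrarily chosen reference taxon, and it is not a description of the package's internal algorithm (which fits a logistic regression model and also has to handle zero counts, which this toy table avoids).

```r
# Toy count table: 3 samples (rows) by 4 taxa (columns); values are made up.
counts <- matrix(c( 10, 20, 30, 40,
                     5, 15, 25, 55,
                   100, 50, 25, 25),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("s", 1:3), paste0("otu", 1:4)))

# Relative abundances: divide each row by its library size.
rel.abund <- counts / rowSums(counts)

# Log ratio of every taxon against an arbitrarily chosen reference (otu4).
ref <- "otu4"
lr.counts <- log(counts / counts[, ref])
lr.rel    <- log(rel.abund / rel.abund[, ref])

# The library size cancels in the ratio, so both versions agree.
all.equal(lr.counts, lr.rel)
```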
Value
A list consisting of
p.otu.Wald - Wald p-values for OTU-specific tests
q.otu.Wald - Wald q-values (adjusted p-values by BH) for OTU-specific tests
detected.otu.Wald - OTUs detected by the Wald test at the nominal FDR level
p.otu.perm - permutation p-values for OTU-specific tests
q.otu.perm - permutation q-values (adjusted p-values by BH) for OTU-specific tests
detected.otu.perm - OTUs detected by the permutation test at the nominal FDR level
p.otu.asymptotic - asymptotic p-values for OTU-specific tests
q.otu.asymptotic - asymptotic q-values (adjusted p-values by BH) for OTU-specific tests
detected.otu.asymptotic - OTUs detected by the asymptotic test at the nominal FDR level
beta - effect size at each OTU, defined as beta_j - median (beta_j'), after Yeo–Johnson transformation if the Wald test is used
beta.var - estimated variance for each beta
ref.otu - reference OTU
p.global - p-value for the global test (not available in the asymptotic version). The global test is based on the harmonic mean of individual p-values, using all permutation replicates generated up to the point when the procedure terminates.
n.perm.completed - number of permutations completed
seed - the seed used to generate the permutation replicates
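To make the relationship between the p-values, q-values, and detected OTUs concrete, here is a minimal sketch using made-up p-values. It applies the Benjamini-Hochberg adjustment via `p.adjust()` from the stats package and computes the harmonic mean of the p-values. The strict comparison `q < fdr.nominal` is an assumption, and the sketch computes only the harmonic mean statistic itself, not the permutation-based global p-value that the function returns.

```r
# Made-up p-values for five OTUs (illustrative only).
p.otu <- c(otu1 = 0.001, otu2 = 0.02, otu3 = 0.04, otu4 = 0.30, otu5 = 0.80)
fdr.nominal <- 0.1

# Benjamini-Hochberg adjusted p-values (q-values).
q.otu <- p.adjust(p.otu, method = "BH")

# OTUs whose q-value falls below the nominal FDR level
# (strict inequality is an assumption of this sketch).
detected <- names(q.otu)[q.otu < fdr.nominal]

# Harmonic mean of the individual p-values; small p-values dominate it.
hm <- length(p.otu) / sum(1 / p.otu)

q.otu
detected
hm
```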
Examples
data("throat.otu.table.filter")
data("throat.meta.filter")
data("throat.otu.taxonomy")
Y <- ifelse(throat.meta.filter$SmokingStatus == "NonSmoker", 0, 1)
C <- ifelse(throat.meta.filter$Sex == "Male", 0, 1)
##################
# running LOCOM2
##################
## LOCOM2 (Wald) is the most recommended test; consider n.cores = 4 to speed it up
res <- locom2(otu.table = throat.otu.table.filter, Y = Y, C = C, seed = 123, n.perm.max = 100)
res$detected.otu.Wald
res$p.otu.Wald[res$detected.otu.Wald]
Metadata of the throat microbiome samples
Description
This data set includes samples from the microbiome of the nasopharynx and oropharynx on each side of the body. It was generated to study the effect of smoking on the microbiota of the upper respiratory tract in 57 individuals, after filtering out three individuals with antibiotic use.
Usage
data("throat.meta.filter")
Format
A data frame with 57 observations on 16 variables.
Source
Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.
References
R package "GUniFrac"
Examples
data(throat.meta.filter)
OTU count table from 16S sequencing of the throat microbiome samples
Description
This data set contains 57 subjects, after filtering out three subjects with antibiotic use. Microbiome data were collected from the right and left nasopharynx and oropharynx regions to form an OTU table with 856 OTUs.
Usage
data("throat.otu.table.filter")
Format
A data frame with 57 observations on 856 variables.
Source
Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.
References
R package "GUniFrac"
Examples
data(throat.otu.table.filter)
Taxonomy names for OTUs from 16S sequencing of the throat microbiome samples
Description
This file contains 5683 taxonomy names.
Usage
data("throat.otu.taxonomy")
Format
A vector with 5683 taxonomy names
Source
Charlson ES, Chen J, Custers-Allen R, Bittinger K, Li H, et al. (2010) Disordered Microbial Communities in the Upper Respiratory Tract of Cigarette Smokers. PLoS ONE 5(12): e15216.
References
R package "GUniFrac"
Examples
data(throat.otu.taxonomy)