Introduction to tabbitR

Siobhan McAndrew

Overview

tabbitR automates the production of large sets of weighted crosstabulation tables and exports them directly to Excel.
It is designed for situations where analysts need many tables at once, such as:

This is a routine but time-consuming task in survey research, monitoring and evaluation, and exploratory data analysis. Doing it manually is repetitive, error-prone, and difficult to keep consistent.

tabbitR::tabbit_excel() solves this by producing, for each outcome × breakdown variable pair:

The goal is to reduce manual work, enforce consistency, and accelerate the early stages of analysis.

tabbitR use cases

Survey teams, academic researchers, and data analysts often need to deliver hundreds or thousands of tables, for example:

Manual workflows struggle with:

tabbitR automates all of this in one reproducible command.

Key features

Weighted percentages

Percentages are computed using user-supplied weights:

Percentage formatting respects the decimals = argument.
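
As a rough illustration of what a weighted column percentage is, the base R sketch below computes one by hand. It is not tabbitR's internal code. It assumes that percentages are taken within each breakdown category and that rows with a missing outcome are left out of the denominator. The object names (toy, ok, w_tab, pct) are made up for the illustration; the values mirror the minimal example later in this vignette.

```r
# Toy data with two missing outcomes (object name is hypothetical)
toy <- data.frame(
  outcome = factor(c("A", "B", "A", NA, "C", NA)),
  sex     = factor(c("Male", "Male", "Female", "Female",
                     "Prefer not to say", "Male")),
  weight  = c(1, 2, 1, 1, 0.75, 3)
)

# Assumption: rows with a missing outcome do not enter the denominator
ok <- !is.na(toy$outcome)

# Sum of weights in each outcome x sex cell
w_tab <- xtabs(weight ~ outcome + sex, data = toy[ok, ])

# Column percentages, rounded in the spirit of decimals = 1
decimals <- 1
pct <- round(100 * prop.table(w_tab, margin = 2), decimals)
pct
```

In this sketch the "Male" column is built from weights 1 (outcome A) and 2 (outcome B), so A accounts for one third of the weighted total and B for two thirds.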

Unweighted counts

Alongside the percentage table, tabbit_excel() writes:

Missing values: clear and explicit

tabbitR does not silently drop missing responses.

Users may choose:

Flexible layout

Designed for large projects

tabbitR is especially useful when producing:

Usage

A minimal example

library(tabbitR)

df <- data.frame(
  outcome = factor(c("A", "B", "A", NA, "C", NA)),
  sex     = factor(c("Male", "Male", "Female", "Female",
                     "Prefer not to say", "Male")),
  weight  = c(1, 2, 1, 1, 0.75, 3)
)

tmp <- tempfile(fileext = ".xlsx")

tabbit_excel(
  data        = df,
  vars        = "outcome",
  breakdown   = "sex",
  wtvar       = "weight",
  file        = tmp,
  decimals    = 1
)

tmp
#> [1] "C:\\Users\\siobh\\AppData\\Local\\Temp\\RtmpIftjlZ\\file482c63c8662c.xlsx"

# The workbook is written to a temporary location (tmp).
# Open the file in a spreadsheet application to inspect the output.

Multiple outcomes and multiple breakdowns

# Example toy survey data
set.seed(123)

survey_df <- data.frame(
  outcome1       = factor(sample(c("Agree", "Neutral", "Disagree"), 200, replace = TRUE)),
  outcome2       = factor(sample(c("Often", "Sometimes", "Never"), 200, replace = TRUE)),
  outcome3       = factor(sample(c("Yes", "No"), 200, replace = TRUE)),
  sex            = factor(sample(c("Male", "Female"), 200, replace = TRUE)),
  age            = factor(sample(c("18-34", "35-54", "55+"), 200, replace = TRUE)),
  region         = factor(sample(c("North", "Midlands", "South"), 200, replace = TRUE)),
  survey_weight  = runif(200, 0.5, 2)
)

vars   <- c("outcome1", "outcome2", "outcome3")
breaks <- c("sex", "age", "region")

tmp2 <- tempfile(fileext = ".xlsx")

tabbit_excel(
  data        = survey_df,
  vars        = vars,
  breakdown   = breaks,
  wtvar       = "survey_weight",
  file        = tmp2,
  by_breakdown = TRUE,
  decimals    = 1
)

tmp2
#> [1] "C:\\Users\\siobh\\AppData\\Local\\Temp\\RtmpIftjlZ\\file482c5a837032.xlsx"

Understanding the options

Required

Main options

Missingness

Display

Treatment of missing values

tabbitR always makes missingness explicit.

Default

missingasrow = TRUE

Missing values appear as "Response missing" inside the main table.
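
The idea of an explicit missing row can be sketched in base R by recoding NA to its own factor level before tabulating. This only illustrates the concept and is not how tabbitR is implemented. The object names are made up, and the sketch shows weighted totals rather than percentages, because the denominator used in this mode is not stated here.

```r
# Toy data with two missing outcomes (object name is hypothetical)
toy <- data.frame(
  outcome = factor(c("A", "B", "A", NA, "C", NA)),
  sex     = factor(c("Male", "Male", "Female", "Female",
                     "Prefer not to say", "Male")),
  weight  = c(1, 2, 1, 1, 0.75, 3)
)

# Recode NA as an explicit level placed after the substantive categories
outcome_chr <- as.character(toy$outcome)
outcome_chr[is.na(outcome_chr)] <- "Response missing"
toy$outcome_explicit <- factor(
  outcome_chr,
  levels = c(levels(toy$outcome), "Response missing")
)

# Weighted totals with the missing category shown as its own row
w_tab <- xtabs(weight ~ outcome_explicit + sex, data = toy)
w_tab
```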

nomissing = TRUE

No missing information is displayed in either table.

How weighted bases are computed

For each breakdown column:

weighted base W = sum(weights for non-missing outcomes)

Weighted bases are rounded to whole numbers for readability.
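
The definition above can be checked by hand in base R. The sketch below is only an illustration of the formula, not tabbitR's own code; the object names (toy, ok, w_base) are made up, and the values mirror the minimal example.

```r
# Toy data with two missing outcomes (object name is hypothetical)
toy <- data.frame(
  outcome = factor(c("A", "B", "A", NA, "C", NA)),
  sex     = factor(c("Male", "Male", "Female", "Female",
                     "Prefer not to say", "Male")),
  weight  = c(1, 2, 1, 1, 0.75, 3)
)

# Keep only rows with a non-missing outcome
ok <- !is.na(toy$outcome)

# Weighted base per breakdown column: sum of weights among those rows
w_base <- tapply(toy$weight[ok], toy$sex[ok], sum)

# Round to whole numbers for display
w_base_rounded <- round(w_base)
w_base_rounded
```

Here the "Male" base is 3 rather than 6, because the male respondent with weight 3 has a missing outcome and is excluded from the sum.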

Citation

If you use tabbitR in published work, please cite:

McAndrew, S. (2025). tabbitR: Automated weighted cross-tabulations for survey analysis. R package.
https://github.com/smmcandrew/tabbitR

A BibTeX entry is available via:

citation("tabbitR")

Session info

sessionInfo()
#> R version 4.5.0 (2025-04-11 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#> 
#> Matrix products: default
#>   LAPACK version 3.12.1
#> 
#> locale:
#> [1] LC_COLLATE=C                           
#> [2] LC_CTYPE=English_United Kingdom.utf8   
#> [3] LC_MONETARY=English_United Kingdom.utf8
#> [4] LC_NUMERIC=C                           
#> [5] LC_TIME=English_United Kingdom.utf8    
#> 
#> time zone: Europe/London
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tabbitR_0.1.3
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     R6_2.6.1          fastmap_1.2.0     xfun_0.54        
#>  [5] cachem_1.1.0      knitr_1.50        htmltools_0.5.9   rmarkdown_2.30   
#>  [9] lifecycle_1.0.4   cli_3.6.5         zip_2.3.3         openxlsx_4.2.8.1 
#> [13] sass_0.4.10       jquerylib_0.1.4   compiler_4.5.0    rstudioapi_0.17.1
#> [17] tools_4.5.0       evaluate_1.0.5    bslib_0.9.0       Rcpp_1.1.0       
#> [21] yaml_2.3.11       rlang_1.1.6       jsonlite_2.0.0    stringi_1.8.7