rolescry: Name-Blind Variable-Role Detection by Data Signature

Deterministic, name-blind detection of variable roles (group, outcome, survival time and event, paired and agreement measurements, repeated measures, scale items, subject identifier, covariate) in tabular data. Roles are assigned from each column's information-theoretic signature – Shannon entropy, normalized mutual information, and distributional shape – rather than from column names, so renaming columns to 'col_1', 'col_2', ... does not change the result ("Data inspice, non nomen": inspect the data, not the name). An optional, capped name-based hint and automatic header-row detection are also provided. The package uses no large language models and transmits no data externally. Extracted from the 'MDStatR' biostatistics engine; see Boynukara (2026) <doi:10.5281/zenodo.20707791>.
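
The idea of a name-blind signature in the description above can be illustrated with a minimal base-R sketch. It is not the package's API: the function names `entropy_bits`, `normalized_mi`, and `column_signature` are hypothetical, and normalizing mutual information by the geometric mean of the two entropies is only one common convention, assumed here for illustration. The sketch shows that a signature computed from values alone is unchanged when columns are renamed.

```r
# Illustrative sketch only; these helpers are hypothetical and are NOT
# functions exported by rolescry.

# Shannon entropy (in bits) from a table of counts; zero cells are ignored.
entropy_bits <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}

# Normalized mutual information between two discrete vectors.
# Assumption: normalization by sqrt(H(x) * H(y)); other conventions exist.
normalized_mi <- function(x, y) {
  keep <- !is.na(x) & !is.na(y)          # use complete pairs only
  x <- x[keep]
  y <- y[keep]
  hx  <- entropy_bits(table(x))
  hy  <- entropy_bits(table(y))
  hxy <- entropy_bits(table(x, y))       # joint entropy from the cross-table
  denom <- sqrt(hx * hy)
  if (denom == 0) return(0)              # a constant column carries no information
  (hx + hy - hxy) / denom
}

# Per-column entropy signature, computed from values only (names are dropped).
column_signature <- function(df) {
  vapply(unname(as.list(df)),
         function(col) entropy_bits(table(col)),
         numeric(1))
}

# Toy data: a two-level grouping column, an unrelated two-level column,
# and a unique identifier.
df <- data.frame(
  arm  = rep(c("A", "B"), each = 50),
  site = rep(c("x", "y"), times = 50),
  id   = seq_len(100)
)

# Renaming the columns leaves the signature untouched.
renamed <- df
names(renamed) <- c("col_1", "col_2", "col_3")

sig_original <- column_signature(df)
sig_renamed  <- column_signature(renamed)
same_signature <- identical(sig_original, sig_renamed)

nmi_self        <- normalized_mi(df$arm, df$arm)   # a column against itself
nmi_independent <- normalized_mi(df$arm, df$site)  # balanced, unrelated columns
```

In this toy data the identifier column has the highest entropy (every value is unique) while the grouping columns have low entropy, which is the kind of contrast a signature-based approach can exploit. How rolescry actually combines such quantities into role assignments is documented in its reference manual and vignette.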

Version: 0.1.0
Depends: R (≥ 4.0.0)
Imports: stats, utils
Suggests: moments, diptest, stringdist, readxl, openxlsx, haven, testthat (≥ 3.0.0), knitr, rmarkdown, spelling
Published: 2026-06-22
DOI: 10.32614/CRAN.package.rolescry (may not be active yet)
Author: Can Boynukara [aut, cre, cph], M. Yasir Ceyhan [ctb]
Maintainer: Can Boynukara <canboynukara1 at gmail.com>
BugReports: https://github.com/canboynukara/rolescry/issues
License: Apache License (== 2.0)
URL: https://github.com/canboynukara/rolescry
NeedsCompilation: no
Language: en-US
Citation: rolescry citation info
Materials: README
CRAN checks: rolescry results

Documentation:

Reference manual: rolescry.html, rolescry.pdf
Vignettes: Name-blind variable-role detection with rolescry (source, R code)

Downloads:

Package source: rolescry_0.1.0.tar.gz
Windows binaries: r-devel: not available, r-release: not available, r-oldrel: not available
macOS binaries: r-release (arm64): rolescry_0.1.0.tgz, r-oldrel (arm64): rolescry_0.1.0.tgz, r-release (x86_64): rolescry_0.1.0.tgz, r-oldrel (x86_64): rolescry_0.1.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=rolescry to link to this page.