tableParser: Parse Tabled Content to Text Vector and Extract Statistical Standard Results

Extracts tabled content from NISO-JATS-coded XML, any native HTML or HML file, DOCX, and PDF documents, and collapses it into a human-readable text format by mimicking the behavior of a screen reader. Because tables within PDF documents are extracted with the 'tabulapdf' package and their captions and footnotes cannot be extracted, results for tables within PDF documents should be considered less precise. The function 'table2matrix()' returns a list of the tables within a document as character matrices. 'table2text()' collapses the matrix content into a list of character strings by imitating the behavior of a screen reader. The textual representation of characters and numbers can be unified with 'unifyMatrix()' before parsing. The function 'table2stats()' extracts the tabled statistical test results from the collapsed text with the function 'standardStats()' from the 'JATSdecoder' package and, if activated, checks the reported and coded p-values for consistency. Because table structures vary greatly and can be complex, parsing accuracy may vary.
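To make the screen-reader idea concrete, the following base R sketch collapses a small character matrix into one text string per table row by pairing each cell with its column header. It is only an illustration under stated assumptions: 'collapse_rows()' is a hypothetical helper, not a function of this package, and the sketch does not reproduce how 'table2text()' actually handles headers, spanning cells, captions, or footnotes.

```r
# Hypothetical helper (not part of tableParser): assumes the first row of the
# matrix holds the column names and the first column holds the row labels.
collapse_rows <- function(m) {
  header <- m[1, -1]                 # column names without the label column
  body <- m[-1, , drop = FALSE]      # data rows, kept as a matrix
  vapply(seq_len(nrow(body)), function(i) {
    # pair every cell with its column header, e.g. "b: 0.12"
    cells <- paste0(header, ": ", body[i, -1])
    # prefix the row label and join the cells into one string
    paste0(body[i, 1], ", ", paste(cells, collapse = ", "))
  }, character(1))
}

# Made-up example table, similar in shape to a character matrix of a
# regression table.
m <- rbind(
  c("Predictor", "b", "SE", "p"),
  c("Age", "0.12", "0.05", ".02"),
  c("Income", "-0.30", "0.11", ".006")
)

out <- collapse_rows(m)
print(out)
```

Each resulting string keeps a statistic next to its label ("b: 0.12", "p: .02"), which is the kind of flat text a pattern-based extractor can then search for test results.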

Version: 1.0.1
Depends: R (≥ 4.1)
Imports: utils, JATSdecoder, tabulapdf
Published: 2026-01-27
DOI: 10.32614/CRAN.package.tableParser (may not be active yet)
Author: Ingmar Böschen [aut, cre]
Maintainer: Ingmar Böschen <ingmar.boeschen at uni-hamburg.de>
BugReports: https://github.com/ingmarboeschen/tableParser/issues
License: GPL-3
URL: https://github.com/ingmarboeschen/tableParser
NeedsCompilation: no
Language: en-US
CRAN checks: tableParser results

Documentation:

Reference manual: tableParser.html, tableParser.pdf

Downloads:

Package source: tableParser_1.0.1.tar.gz
Windows binaries: r-devel: not available, r-release: not available, r-oldrel: not available
macOS binaries: r-release (arm64): not available, r-oldrel (arm64): not available, r-release (x86_64): not available, r-oldrel (x86_64): not available

Linking:

Please use the canonical form https://CRAN.R-project.org/package=tableParser to link to this page.