Title: Perform the Pearson-Quetelet Analysis on Two-Way Contingency Tables
Version: 1.0.0
Description: Tools to perform Pearson-Quetelet analysis on two-way contingency tables. The package computes absolute and relative frequencies, Quetelet indices, Pearson-Quetelet decomposition, apex tables, and chi-square summaries for interpreting associations between categorical variables.
License: LGPL-3
LazyData: true
Encoding: UTF-8
RoxygenNote: 7.3.2
Depends: R (>= 3.5)
Suggests: testthat (>= 3.0.0)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-03-25 17:20:37 UTC; lcorag
Author: Boris Mirkin [aut], Luca Coraggio [aut, cre], Trevor Fenner [aut], Zina Taran [aut]
Maintainer: Luca Coraggio <luca.coraggio@unina.it>
Repository: CRAN
Date/Publication: 2026-03-30 17:10:08 UTC

BMI Category vs Mortality Outcome (Excluding First 5 Years)

Description

Cross-classified counts of participants by BMI category at study entry and all-cause mortality outcome from the Leisure World Cohort Study (1981-2004), excluding the first 5 years of follow-up (Table 6 in the cited paper).

Usage

bmi_mortality

Format

A numeric matrix (also an array) with 4 rows and 2 columns:

Rows (BMI category):

Underweight, Normal, Overweight, Obese.

Columns (mortality outcome):

Died, Survived (Participants - Deaths).

Values

Frequencies (counts).

Details

BMI thresholds at study entry:

Category-level metadata (excluding first 5 years):

Totals in this dataset: 11,375 participants and 9,127 deaths.

Source

Corrada, Maria M., Kawas, Claudia H., Mozaffar, Farah, and Paganini-Hill, Annlia (2006). Association of Body Mass Index and Weight Change with All-Cause Mortality in the Elderly. American Journal of Epidemiology, 163(10), 938-949. Table 6, values excluding the first 5 years of follow-up. doi:10.1093/aje/kwj114

Examples

data(bmi_mortality)
bmi_mortality
rowSums(bmi_mortality)
colSums(bmi_mortality)
sum(bmi_mortality)


BMI Category vs Mortality Outcome (Total Sample)

Description

Cross-classified counts of participants by BMI category at study entry and all-cause mortality outcome from the Leisure World Cohort Study (1981-2004), using the total sample values reported in Table 6 of the cited paper.

Usage

bmi_mortality_all

Format

A numeric matrix (also an array) with 4 rows and 2 columns:

Rows (BMI category):

Underweight, Normal, Overweight, Obese.

Columns (mortality outcome):

Died, Survived (Participants - Deaths).

Values

Frequencies (counts).

Details

BMI thresholds at study entry:

Category-level metadata (total sample):

Totals in this dataset: 13,451 participants and 11,203 deaths.
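
The Survived column is defined as participants minus deaths. As a worked check that uses only the totals quoted in the Details sections of the two BMI datasets (per-category counts are not reproduced here), a minimal base-R sketch:

```r
# Totals quoted in the documentation for the two versions of the table.
totals <- data.frame(
  sample       = c("excluding first 5 years", "total sample"),
  participants = c(11375, 13451),
  deaths       = c(9127, 11203)
)

# Survived = Participants - Deaths
totals$survived <- totals$participants - totals$deaths
totals

# Difference between the two samples, for participants and for deaths.
diff(totals$participants)
diff(totals$deaths)
```

Both versions yield the same number of survivors, and the participant and death totals differ by the same amount. If the column is named Survived as listed under Format, the survivor total should agree with the sum of that column in each dataset.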

Source

Corrada, Maria M., Kawas, Claudia H., Mozaffar, Farah, and Paganini-Hill, Annlia (2006). Association of Body Mass Index and Weight Change with All-Cause Mortality in the Elderly. American Journal of Epidemiology, 163(10), 938-949. Table 6, total sample values. doi:10.1093/aje/kwj114

Examples

data(bmi_mortality_all)
bmi_mortality_all
rowSums(bmi_mortality_all)
colSums(bmi_mortality_all)
sum(bmi_mortality_all)


Pearson-Quetelet Analysis for Two-Way Contingency Tables

Description

Performs Pearson-Quetelet analysis (PQA) to examine associations between categorical variables through the Quetelet index and its decomposition of the chi-square statistic.

Usage

pqa(x)

Arguments

x

A two-way table of counts. Higher-dimensional tables are not supported.
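
Because only two-way tables are supported, a higher-dimensional table has to be collapsed before it is passed to pqa(). The following is a minimal base-R sketch using the UCBAdmissions dataset that ships with R (it is not part of this package); the helper name is_two_way is hypothetical and only illustrates the dimension check.

```r
# UCBAdmissions is a 3-way table (Admit x Gender x Dept) from R's datasets package.
tab3 <- UCBAdmissions
length(dim(tab3))   # number of dimensions of the original table

# Collapse over Dept by summing counts, keeping Admit (1) and Gender (2).
tab2 <- margin.table(tab3, margin = c(1, 2))
tab2

# Hypothetical helper: TRUE when the object has exactly two dimensions.
is_two_way <- function(x) length(dim(x)) == 2L

is_two_way(tab3)
is_two_way(tab2)
```

Collapsing preserves the grand total but discards the association with the dropped variable, so the choice of which margins to keep is an analysis decision.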

Details

The Quetelet index is computed as q_{ij} = p_{ij} / (p_{i+} p_{+j}) - 1, where p_{ij} is the relative frequency of cell (i, j) and p_{i+}, p_{+j} are the row and column marginal proportions. A value of 0 indicates independence, positive values indicate a higher-than-expected frequency, and negative values indicate a lower-than-expected frequency. The decomposition pq has entries p_{ij} q_{ij}, which sum to \phi^2; apex rescales pq to percentage contributions. When \phi^2 = 0 (perfect independence), apex is returned as a zero table.
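
The quantities described above can be reproduced by hand in base R. The sketch below is not the package's implementation; it recomputes the formulas on the small 2 x 2 table used in Example 2 of this help page, and the variable names are chosen for illustration only.

```r
# Illustrative 2x2 counts (same values as Example 2 of this help page).
counts <- matrix(c(10, 20, 15, 25), nrow = 2,
                 dimnames = list(Gender = c("Male", "Female"),
                                 Preference = c("A", "B")))

n        <- sum(counts)
rel      <- counts / n                              # relative frequencies p_ij
expected <- outer(rowSums(rel), colSums(rel))       # products of the marginal proportions

q    <- rel / expected - 1                          # Quetelet index q_ij
pq   <- rel * q                                     # Pearson-Quetelet decomposition
phi2 <- sum(pq)                                     # phi^2
apex <- if (phi2 > 0) 100 * pq / phi2 else pq * 0   # percentage contributions

# n * phi^2 equals Pearson's chi-square statistic (no continuity correction).
chi2 <- unname(chisq.test(counts, correct = FALSE)$statistic)
all.equal(n * phi2, chi2)
```

Individual entries of pq and apex can be negative (cells observed less often than expected), while the entries of apex sum to 100 whenever \phi^2 is positive.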

The function automatically handles missing factor/level names and assesses chi-square validity based on expected frequencies:

Value

An object of class pqa, which is a list containing:

abs

Absolute frequencies (counts).

rel

Relative frequencies (proportions).

q

Quetelet index values, measuring relative change in probability.

pq

Pearson-Quetelet decomposition of the chi-square statistic.

apex

Percentage contributions of each cell to the chi-square statistic.

chisq

A list of class pqa.chisq with test results (stat, df, pval) and a validity flag.

See Also

table for creating contingency tables, chisq.test for chi-square tests

Examples

# Example 1: Using the built-in usa_voting_prefs dataset
data(usa_voting_prefs)
result <- pqa(usa_voting_prefs)
print(result$abs)  # View absolute frequencies
print(result$chisq)  # View chi-square test results

# Example 2: Using a matrix (converted to table first)
data_matrix <- matrix(c(10, 20, 15, 25), nrow = 2, ncol = 2)
dimnames(data_matrix) <- list(Gender = c("Male", "Female"), Preference = c("A", "B"))
result <- pqa(as.table(data_matrix))


Print Pearson-Quetelet Analysis Object

Description

Displays a summary of the available components within a pqa object.

Usage

## S3 method for class 'pqa'
print(x, pp = NULL, ...)

Arguments

x

A pqa object.

pp

Logical; if TRUE, prints a formatted summary. If FALSE, prints the raw structure. Defaults to the "pqa.pretty_print" option.

...

Further arguments passed to or from other methods.

Details

Components include absolute (abs) and relative (rel) frequencies, Quetelet indices (q), Pearson-Quetelet decomposition (pq), apex (apex), and chi-square results (chisq).
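
The pp = NULL default defers to the global "pqa.pretty_print" option. A minimal sketch of how such a fallback is commonly written in base R follows; resolve_pp is a hypothetical helper, and falling back to TRUE when the option has never been set is an assumption, not documented package behavior.

```r
# Hypothetical helper mirroring a pp = NULL argument that defers to an option.
resolve_pp <- function(pp = NULL) {
  if (is.null(pp)) {
    # Assumption: fall back to TRUE when the option has never been set.
    pp <- getOption("pqa.pretty_print", default = TRUE)
  }
  isTRUE(pp)
}

old <- options(pqa.pretty_print = FALSE)  # set the option, remembering the old value
resolve_pp()                              # FALSE, taken from the option
resolve_pp(pp = TRUE)                     # TRUE, the explicit argument wins
options(old)                              # restore the previous setting
```

Setting the option once per session with options() avoids passing pp to every print call.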

Value

Invisibly returns the input object.

See Also

pqa, summary.pqa, print.pqa.subtable

Examples

data(usa_voting_prefs)
qt <- pqa(usa_voting_prefs)
print(qt)


Print Chi-Square Test Results

Description

Formatted print method for pqa.chisq objects, showing test statistics and validity assessments.

Usage

## S3 method for class 'pqa.chisq'
print(x, pp = NULL, ...)

Arguments

x

A pqa.chisq object.

pp

Logical; if TRUE, prints formatted results. Defaults to the "pqa.pretty_print" option.

...

Further arguments passed to or from other methods.

Details

Displays the null hypothesis, chi-square statistic, degrees of freedom, and p-value. Includes warnings if the test is unreliable (expected frequencies < 5) or cannot be computed.
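
The reliability warning concerns expected cell frequencies below 5. A minimal base-R sketch of such a check follows. The exact rule applied by the package (for example, any cell versus a share of cells) is not stated here, so the "any cell below 5" rule, the made-up counts, and the name expected_counts are all assumptions.

```r
# Small illustrative table with sparse cells (made-up counts).
counts <- matrix(c(3, 12, 2, 20), nrow = 2)

# Expected counts under independence: (row total * column total) / grand total.
expected_counts <- outer(rowSums(counts), colSums(counts)) / sum(counts)
expected_counts

# Assumed rule: flag the test as unreliable if any expected count is below 5.
unreliable <- any(expected_counts < 5)
unreliable
```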

Value

Invisibly returns the input object.

See Also

pqa, print.pqa

Examples

data(usa_voting_prefs)
qt <- pqa(usa_voting_prefs)
print(qt$chisq)


Print Pearson-Quetelet Analysis Subtables

Description

Formatted print method for pqa.subtable components such as absolute frequencies, relative frequencies, Quetelet indices, decompositions, and apex.

Usage

## S3 method for class 'pqa.subtable'
print(x, pp = NULL, ...)

Arguments

x

A pqa.subtable object.

pp

Logical; if TRUE, prints a formatted contingency table. Defaults to the "pqa.pretty_print" option.

...

Further arguments passed to or from other methods.

Details

Formatting (rounding, scaling, and marginals) automatically adapts to the subtable type:

If pp = FALSE, the raw matrix-like object is printed via print.AsIs().

Value

Invisibly returns the input object.

See Also

pqa, print.pqa, print.pqa.chisq

Examples

data(usa_voting_prefs)
qt <- pqa(usa_voting_prefs)
print(qt$abs)
print(qt$q)


Summarize a Pearson-Quetelet Analysis

Description

Prints a textual summary of a pqa object, including absolute frequencies, chi-square test output, Quetelet signals of association/indifference, and apex-based contribution notes.

Usage

## S3 method for class 'pqa'
summary(object, ...)

Arguments

object

A pqa object.

...

Further arguments passed to or from other methods.

Details

The summary output includes:

Value

Invisibly returns the input pqa object.

See Also

pqa, print.pqa, print.pqa.subtable, print.pqa.chisq

Examples

# Create a pqa from the built-in usa_voting_prefs dataset and get summary
data(usa_voting_prefs)
qt <- pqa(usa_voting_prefs)

# Get comprehensive summary
summary(qt)


UK Crime Survey: Rubbish on Street vs Crime Victimization

Description

A cross-classified data table from the British Crime Survey (2007-2008) showing the relationship between the perceived frequency of rubbish on streets and crime victimization status. This dataset is useful for illustrating contingency table analysis and chi-square tests of independence in statistical education and research.

Usage

uk_crime_rubbish

Format

An object of class table with 4 rows (Rubbish on street categories) and 2 columns (Crime victimization status):

Rows (Rubbish on street):
  • Very common: Rubbish on street is very common

  • Fairly common: Rubbish on street is fairly common

  • Not very common: Rubbish on street is not very common

  • Not at all common: Rubbish on street is not at all common

Columns (Crime victimization status):
  • Not a victim of crime: Respondent was not a victim of crime

  • Victim of crime: Respondent was a victim of crime

Values

Frequencies or counts of survey respondents (integer numbers).

Source

BMRB Social Research and Home Office, Research, Development and Statistics Directorate (2022). British Crime Survey, 2007-2008 (data collection), 4th Edition. UK Data Service, SN: 6066. doi:10.5255/UKDA-SN-6066-2

Examples

# Load the dataset into the workspace
data(uk_crime_rubbish)

# Display the entire table
print(uk_crime_rubbish)

# Calculate marginal totals (row sums and column sums)
rowSums(uk_crime_rubbish)
colSums(uk_crime_rubbish)

# Perform chi-square test of independence
chisq.test(uk_crime_rubbish)


US Mortality Data by Age and Gender (2020 vs 2015-2019 Average)

Description

A dataset containing US mortality statistics by age group and gender, comparing 2020 deaths (including COVID-19 impact) with 2015-2019 averages. Includes all-cause deaths, non-COVID-19 deaths, and population data.

Usage

us_covid_mortality

Format

A data.frame with 22 rows (11 age groups × 2 genders) and 8 variables:

Age

Character vector: age groups (<1, 1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+)

Gender

Character vector: "Male" or "Female"

Deaths_2020

Numeric: Total deaths in 2020

NonCOVID_Deaths_2020

Numeric: Non-COVID-19 deaths in 2020

COVID_Deaths_Percentage

Numeric: Percentage of deaths attributed to COVID-19

Population_2020

Numeric: Population in 2020

Average_Deaths_2015_2019

Numeric: Average deaths for 2015-2019 period

Average_Population_2015_2019

Numeric: Average population for 2015-2019 period
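
Because pqa() expects a two-way table rather than a data frame, a long-format layout like this one has to be cross-tabulated first. The sketch below uses a small made-up data frame with the same column names (the numbers are placeholders, not values from the dataset) and base R's xtabs(); whether the resulting object is accepted by pqa() without further conversion is an assumption.

```r
# Made-up placeholder rows that mimic the layout of us_covid_mortality.
toy <- data.frame(
  Age         = rep(c("65-74", "75-84", "85+"), times = 2),
  Gender      = rep(c("Male", "Female"), each = 3),
  Deaths_2020 = c(120, 150, 140, 90, 130, 200)
)

# Cross-tabulate deaths by age group (rows) and gender (columns).
deaths_tab <- xtabs(Deaths_2020 ~ Age + Gender, data = toy)
deaths_tab

dim(deaths_tab)   # 3 age groups by 2 genders
```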

Source

Jacobson, Sheldon H. and Jokela, Janet A. (2021). Beyond COVID-19 Deaths during the COVID-19 Pandemic in the United States. Health Care Management Science, 24, 661-665. doi:10.1007/s10729-021-09570-4

Examples

# Load the dataset
data(us_covid_mortality)

# View the structure
str(us_covid_mortality)

# Summary statistics by gender
aggregate(Deaths_2020 ~ Gender, data = us_covid_mortality, FUN = sum)

# Excess deaths in 2020 relative to the 2015-2019 average
us_covid_mortality$COVID_Impact <- with(
  us_covid_mortality,
  Deaths_2020 - Average_Deaths_2015_2019
)
summary(us_covid_mortality$COVID_Impact)


US Construction Fall Accidents by Occupation and Injury Degree

Description

Cross-classified counts of US construction fall accidents by occupation and injury degree, derived from the 2000-2020 data analysis reported by Halabi et al. (2022). The table summarizes how fall accidents are distributed across 17 occupation groups and 3 injury-severity categories.

Usage

us_fall_accidents

Format

An object of class table with 17 rows (occupation groups) and 3 columns (injury degree):

Rows (occupation groups, shown here with full names):

Roofers; Construction laborers; Carpenters; Laborers, except construction; Supervisors and Engineers; Painters, plasterers, construction and maintenance; Installers and Repairers; Structural metal workers; Operators; Electricians; Truck drivers, heavy; Technicians, Mechanics; Janitors and cleaners; Helpers, construction trades; Installers (Drywalls, elevators); Plumbers; Sales engineers, workers.

Columns (injury degree):

Fatality, Hospitalized, Non Hospitalized.

Values

Frequencies (counts of fall accidents).

Details

Totals in this dataset: 15,495 accidents overall, including 5,701 fatal cases, 8,955 hospitalized cases, and 839 non-hospitalized cases.

The table stored in the package uses abbreviated row names for compact display; the full occupation labels are listed above for readability.

The largest occupation groups by total number of accidents are Roofers (2,967), Construction laborers (2,725), and Carpenters (1,665).
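
As a worked check of the totals quoted in this Details section, the shares of each injury degree and of the three largest occupation groups can be computed directly. This is a sketch built only from the figures stated above; it does not read the dataset itself.

```r
# Column totals quoted in the Details section.
injury_totals <- c(Fatality = 5701, Hospitalized = 8955, `Non Hospitalized` = 839)

sum(injury_totals)                         # 15495, the overall total quoted above
round(100 * prop.table(injury_totals), 1)  # percentage share of each injury degree

# Totals for the three largest occupation groups, also from the Details section.
top3 <- c(Roofers = 2967, `Construction laborers` = 2725, Carpenters = 1665)
round(100 * sum(top3) / sum(injury_totals), 1)  # combined share of all accidents
```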

Source

Halabi, Y., Xu, H., Long, D., Chen, Y., Yu, Z., Alhaek, F., and Alhaddad, W. (2022). Causal Factors and Risk Assessment of Fall Accidents in the US Construction Industry: A Comprehensive Data Analysis (2000-2020). Safety Science, 146, 105537. Table 6 (p. 8), "Fall accidents distributed by occupation and injury degree". doi:10.1016/j.ssci.2021.105537

Examples

data(us_fall_accidents)
us_fall_accidents
rowSums(us_fall_accidents)
colSums(us_fall_accidents)
sum(us_fall_accidents)


USA School Readiness of Toddlers by Parent Education

Description

A cross-classified data table from the National Survey of Children's Health showing school readiness for children aged 3-5 years by the highest level of education of an adult in the household. The table reports nationwide counts for children who are on track versus those who need support.

Usage

usa_toddlers

Format

An object of class table with 4 rows (parent education categories) and 2 columns (school readiness):

Rows (parent education):
  • Less than high school

  • High School/GED

  • College/Technical

  • College or more

Columns (school readiness):
  • On track

  • Need support

Values

Frequencies or counts of children (integer numbers).

Details

Totals in this dataset: 23,176 children overall, including 15,964 classified as On track and 7,212 as Need support.

The largest parent-education group is College or more (15,718 children), followed by College/Technical (4,388 children).

Source

Data Resource Center for Child & Adolescent Health (2024). National Survey of Children's Health: School Readiness (Age 3-5 Years) by Parent Education. Nationwide tabulation based on the highest level of education of an adult in the household. https://www.childhealthdata.org/ Accessed 30 October 2024.

Examples

# Load the dataset into the workspace
data(usa_toddlers)

# Display the table
print(usa_toddlers)

# Calculate marginal totals
rowSums(usa_toddlers)
colSums(usa_toddlers)
sum(usa_toddlers)


USA Residents Voting Preferences by Income Category

Description

A cross-classified data table of the voting preferences of USA residents by income category, from a survey by the Pew Research Center (2014). The table is used in this package's examples to illustrate contingency-table computations.

Usage

usa_voting_prefs

Format

An object of class table with 4 rows (Income Categories) and 3 columns (Political Affiliations):

Rows (Income Categories):
  • I: Less than $30,000

  • II: More than $30,000 but less than $50,000

  • III: More than $50,000 but less than $100,000

  • IV: $100,000 or more

Columns (Political Affiliations):
  • R: Republican or leaning toward Republican

  • U: Undecided

  • D: Democrat or leaning toward Democrat

Values

Frequencies or counts of respondents (integer numbers).

Source

Pew Research Center (2014). Religious Landscape Study: Compare Party Affiliation by Income Distribution. https://www.pewresearch.org/religion/religious-landscape-study/compare/party-affiliation/by/income-distribution/ Accessed 08 July 2022.

Examples

# Load the dataset into the workspace
data(usa_voting_prefs)

# Display the entire table
print(usa_voting_prefs)

# Calculate marginal totals (row sums and column sums)
rowSums(usa_voting_prefs)
colSums(usa_voting_prefs)