README

The RegAssure package simplifies the validation of regression model assumptions in R. It provides tools for checking linearity, homoscedasticity, independence, normality, and collinearity, so you can judge how far a model's results can be trusted.

Installation

# install.packages("devtools")
devtools::install_github("nrubiog/RegAssure")

Example: Linear Regression

Here’s a basic example that checks the assumptions of a linear regression model with RegAssure:

# Install the package
# devtools::install_github("nrubiog/RegAssure")

# Load the package
library(RegAssure)

# Create a regression model
lm_model <- lm(mpg ~ wt + hp, data = mtcars)

# Check assumptions
check_lm_assumptions(lm_model)
#> 
#> The assumption tests have been completed and the results are available in a list. Enjoy it :)
#> Las pruebas de supuestos han sido completadas y los resultados están disponibles en una lista. Disfrútalo :)
#> $Linearity
#> [1] 1.075529e-16
#> 
#> $Homoscedasticity
#> 
#>  studentized Breusch-Pagan test
#> 
#> data:  model
#> BP = 0.88072, df = 2, p-value = 0.6438
#> 
#> 
#> $Independence
#> 
#>  Durbin-Watson test
#> 
#> data:  model
#> DW = 1.3624, p-value = 0.04123
#> alternative hypothesis: true autocorrelation is not 0
#> 
#> 
#> $Normality
#> 
#>  Shapiro-Wilk normality test
#> 
#> data:  model$residuals
#> W = 0.92792, p-value = 0.03427
#> 
#> 
#> $Multicollinearity
#>       wt       hp 
#> 1.766625 1.766625
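
To see where two of these numbers come from, the sketch below recomputes them in base R without calling any RegAssure function. It assumes, based only on the printed output above, that the Normality entry is a Shapiro-Wilk test on the model residuals and that the Multicollinearity entry holds variance inflation factors; the helper vif_manual() is hypothetical and not part of the package.

```r
# Base-R sketch (no RegAssure functions): recompute two of the diagnostics.
lm_model <- lm(mpg ~ wt + hp, data = mtcars)

# Normality: Shapiro-Wilk test on the residuals.
sw <- shapiro.test(residuals(lm_model))
normality_rejected <- sw$p.value < 0.05  # TRUE means rejected at the 5% level

# Multicollinearity: VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by
# regressing predictor j on the remaining predictors.
# vif_manual() is a hypothetical helper written for this illustration.
vif_manual <- function(predictor, others, data) {
  aux_model <- lm(reformulate(others, response = predictor), data = data)
  1 / (1 - summary(aux_model)$r.squared)
}

vifs <- c(
  wt = vif_manual("wt", "hp", mtcars),
  hp = vif_manual("hp", "wt", mtcars)
)
round(vifs, 6)
```

With only two predictors both VIFs are identical, because each auxiliary regression has the same R-squared. A common rule of thumb treats VIF values above 5 or 10 as a sign of problematic collinearity, so values near 1.77 are unremarkable.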

Example: Logistic Regression

Here’s an additional example demonstrating the use of RegAssure with logistic regression:

# Load the packages
library(RegAssure)
library(titanic)

# Load the dataframe
titanic <- titanic_train

# Create a binary logistic regression model
logit_model <- glm(Survived ~ Pclass + Sex, data = titanic, family = "binomial")

# Check assumptions for binary logistic regression
check_logit(logit_model, data = titanic, tipo_modelo = "binario", vars_numericas = "Pclass", y = "Survived")
#> logit_model has no missing values.
#> 
#> Tests performed for binary/binomial model.
#> Warning in check_logit(logit_model, data = titanic, tipo_modelo = "binario", : Box-Tidwell Test cannot be done.
#> Warning in check_logit(logit_model, data = titanic, tipo_modelo = "binario", : Variance Inflation Factor Test cannot be done.
#> 
#> The assumption tests have been completed and the results are available in a list. Enjoy it :)
#> Las pruebas de supuestos han sido completadas y los resultados están disponibles en una lista. Disfrútalo :)
#> $model_type
#> [1] "binary/binomial"
#> 
#> $Confusion
#>         Predicciones
#> Variable   0   1
#>        0 468  81
#>        1 109 233
#> 
#> $ROC
#> 
#> Call:
#> roc.default(response = new_data[[y]], predictor = pred_logit,     smooth = TRUE, auc = TRUE, ci = TRUE, ret = TRUE)
#> 
#> Data: pred_logit in 549 controls (new_data[[y]] 0) < 342 cases (new_data[[y]] 1).
#> Smoothing: binormal 
#> Area under the curve: 0.8453
#> 95% CI: 0.8115-0.8729 (2000 stratified bootstrap replicates)
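
The confusion matrix above can be summarised with a few standard classification metrics. The sketch below is a base-R illustration that re-enters the printed counts by hand; it assumes rows are observed classes and columns are predicted classes, which is consistent with the row totals (549 and 342) matching the controls and cases reported in the ROC output. The object names are chosen for this example only.

```r
# Counts copied from the $Confusion output above (rows: observed, cols: predicted).
conf <- matrix(
  c(468, 109, 81, 233),          # filled column by column
  nrow = 2,
  dimnames = list(observed = c("0", "1"), predicted = c("0", "1"))
)

accuracy    <- sum(diag(conf)) / sum(conf)        # correct / all cases
sensitivity <- conf["1", "1"] / sum(conf["1", ])  # true positives / observed 1s
specificity <- conf["0", "0"] / sum(conf["0", ])  # true negatives / observed 0s

round(c(accuracy = accuracy,
        sensitivity = sensitivity,
        specificity = specificity), 4)
```

Under these assumptions the model classifies about 79% of passengers correctly, and it identifies non-survivors (specificity) more reliably than survivors (sensitivity).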

Example: Storing Data

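Here’s an example of how to use the get_predict() function to compare observed and predicted values from a model. In the output, the columns reales and predichos hold the observed ("real") and predicted values: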

# Load the package
library(RegAssure)

# Create a regression model
lm_model <- lm(mpg ~ wt + hp, data = mtcars)

# Get predictions and compare with real values
predictions <- get_predict(lm_model, mtcars, mtcars$mpg, n = 3)

# Print the results
print(predictions[1:7, ])
#>                   reales predichos  Error
#> Mazda RX4           21.0    23.572 -2.572
#> Mazda RX4 Wag       21.0    22.583 -1.583
#> Datsun 710          22.8    25.276 -2.476
#> Hornet 4 Drive      21.4    21.265  0.135
#> Hornet Sportabout   18.7    18.327  0.373
#> Valiant             18.1    20.474 -2.374
#> Duster 360          14.3    15.599 -1.299
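
For comparison, a similar table can be built in base R. This is a sketch rather than a description of how get_predict() works internally: it assumes the Error column is observed minus predicted and that n = 3 controls rounding, both inferred from the printed output. The column names observed, predicted, and error are chosen for this example.

```r
# Base-R sketch: observed vs. predicted values for the same model.
lm_model <- lm(mpg ~ wt + hp, data = mtcars)
fitted_values <- unname(predict(lm_model, newdata = mtcars))

comparison <- data.frame(
  observed  = mtcars$mpg,
  predicted = round(fitted_values, 3),
  row.names = rownames(mtcars)
)
# Assumed sign convention: error = observed - predicted
comparison$error <- round(comparison$observed - comparison$predicted, 3)

head(comparison, 7)

# Two common summaries of prediction error (computed on unrounded values)
rmse <- sqrt(mean((mtcars$mpg - fitted_values)^2))
mae  <- mean(abs(mtcars$mpg - fitted_values))
```

Because these predictions are made on the same data used to fit the model, the errors describe in-sample fit and will usually understate the error on new data.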

By incorporating RegAssure into your workflow, you can run the main assumption checks for a model with a single function call, which makes violations easier to spot and address before you interpret the results.