---
title: "How To Use CGMissingDataR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{How To Use CGMissingDataR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# CGMissingDataR

CGMissingDataR is an R package, based on the CGMissingData Python library, for evaluating model performance under feature missingness. It works by:

* injecting missing values into feature columns at specified masking rates,
* imputing the missing values with a Multiple Imputation by Chained Equations (MICE)-style iterative imputer, and
* training Random Forest and k-Nearest Neighbors regressors, then reporting Mean Absolute Percentage Error (MAPE) and $R^2$ at each missingness level.

## Installation

Before installing CGMissingDataR, make sure the following R packages are installed:

```r
install.packages(c("FNN", "ranger", "mice"))
```

Then install the development version of CGMissingDataR from GitHub (this requires the devtools package):

```r
devtools::install_github("saraswatsh/CGMissingDataR")
```

## Example

The brief example below runs the benchmark on the example dataset that ships with the package.

```{r, setup, cache = TRUE}
library(CGMissingDataR)

# Load the example dataset
data("CGMExampleData")

# Run the missingness benchmark at four masking rates
results <- run_missingness_benchmark(
  CGMExampleData,
  mask_rates = c(0.05, 0.10, 0.15, 0.20),
  target_col = "LBORRES",
  feature_cols = c("TimeDifferenceMinutes", "TimeSeries", "USUBJID")
)

# Display the results
print(results)
```
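
To make the masking and scoring steps from the overview more concrete, here is a minimal base-R sketch. It is not the package's implementation: `mask_features()`, `mape()`, and `r_squared()` are hypothetical helpers written for illustration, and the sketch assumes values are masked completely at random within each feature column, which may differ from what `run_missingness_benchmark()` does internally. The imputation and model-fitting steps are left out.

```r
# Hypothetical helper (not part of CGMissingDataR): set a fraction of the
# rows in each chosen feature column to NA, completely at random.
mask_features <- function(data, feature_cols, mask_rate, seed = 1) {
  set.seed(seed)
  masked <- data
  for (col in feature_cols) {
    n_mask <- round(mask_rate * nrow(masked))
    idx <- sample.int(nrow(masked), n_mask)
    masked[idx, col] <- NA
  }
  masked
}

# Mean absolute percentage error, in percent.
# Assumes `actual` contains no zeros.
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

# Coefficient of determination (R squared).
r_squared <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

# Toy data, made up for this sketch: two features and a target.
toy <- data.frame(x1 = 1:20, x2 = seq(2, 40, by = 2), y = (1:20) * 3)

# Mask 10% of each feature column; the target column is left untouched.
masked <- mask_features(toy, c("x1", "x2"), mask_rate = 0.10)
colSums(is.na(masked))  # 2 masked cells in x1, 2 in x2, 0 in y

mape(c(100, 200), c(110, 180))     # 10: both predictions are off by 10%
r_squared(c(1, 2, 3), c(1, 2, 3))  # 1: perfect predictions
```

In a full benchmark, the masked features would be imputed and the two regressors fitted before these metrics are computed on held-out predictions, once per masking rate.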