---
title: "Introduction to lorbridge: Bridging Log-Odds Ratios and Correspondence Analysis"
author: "Se-Kang Kim, Ph.D."
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to lorbridge}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Why lorbridge?

Clinical and medical researchers routinely report **odds ratios (ORs)** from
logistic regression as their primary measure of association. An OR of 1.83,
for example, means the odds of the outcome are 83% higher for a one-unit
increase in the predictor, a statement that requires statistical training to
interpret intuitively.

**lorbridge** provides a formal mathematical bridge (Kim & Grochowalski, 2019)
that re-expresses log-odds ratios (LORs) as **cosine theta**, a metric bounded
between −1 and +1 and immediately interpretable like a Pearson correlation. At
the same time, the package extends this bridge into singly-ordered (SONSCA)
and doubly-ordered (DONSCA) nonsymmetric correspondence analysis, giving
researchers visual geometric maps alongside their regression results.

---

## Dataset

The package includes `lorbridge_data`, an individual-level dataset (N = 900)
with Vocabulary Meaning (VM) scores and binary minority/majority group
membership:

```{r data}
library(lorbridge)
data(lorbridge_data)
str(lorbridge_data)
```

---

## Subprogram 1: Binary Logistic Regression

### 1a. Continuous predictor (VM per 1 SD)

```{r blr_continuous}
res_1a <- blr_continuous(
  outcome   = lorbridge_data$minority,
  predictor = lorbridge_data$VM
)
print(res_1a$summary_table[, c("LOR", "OR", "OR_lo", "OR_hi", "p",
                               "Nagelkerke_R2", "YuleQ", "r_meta")],
      digits = 4)
```

**Plain-English interpretation:** A one-standard-deviation increase in VM
score is associated with an OR of approximately 0.65 for minority membership.
The LOR of −0.43 translates to an r_meta of approximately −0.12 on the
familiar −1 to +1 scale, a small but statistically reliable negative
association.

---

### 1b. Categorical predictor (VM bins, VM4 as reference)

```{r blr_categorical}
res_1b <- blr_categorical(
  outcome   = lorbridge_data$minority,
  predictor = lorbridge_data$VMbin,
  ref_level = "VM4"
)
print(res_1b$results[, c("Category", "LOR", "OR", "p",
                         "YuleQ", "r_meta", "cos_theta")],
      digits = 4)
```

**Note:** In a 2-row table, the 1D correspondence analysis solution yields
cosine thetas of exactly ±1. The **sign** carries the substantive
information: positive = minority over-represented relative to VM4;
negative = under-represented.

---

## Subprogram 2: SONSCA

Singly-Ordered Nonsymmetric Correspondence Analysis is applied to the
IQ-by-race and VM-by-race contingency tables, with Race2 and VM4 (or IQ4) as
the row and column anchors respectively.

```{r sonsca_setup}
data(tab_IQ)
row_anchor <- "Race2"
col_anchor <- "IQ4"
races <- setdiff(rownames(tab_IQ), row_anchor)
bins  <- setdiff(colnames(tab_IQ), col_anchor)
```

```{r sonsca_ccms}
# Pairwise CCMs for Race1 vs Race2 at IQ1 vs IQ4
sonsca_ccm(tab_IQ,
           row_k = "Race1", bin_j = "IQ1",
           row_anchor = row_anchor, col_anchor = col_anchor)
```

```{r sonsca_cosines}
# SONSCA coordinates and cosine theta matrix
sc  <- sonsca_coords(tab_IQ)
cos <- sonsca_cosines(sc$row_coords, sc$col_coords,
                      row_anchor = row_anchor, col_anchor = col_anchor)
round(cos[races, bins], 3)
```

```{r inertia}
pct <- inertia_pct(tab_IQ)
cat(sprintf("Dimension 1: %.1f%% | Dimension 2: %.1f%%\n", pct[1], pct[2]))
```

---

## Subprogram 3: DONSCA

Doubly-Ordered Nonsymmetric Correspondence Analysis is applied to the 6 × 6
IQ × VM table, with IQ4 and VM4 as the row and column anchors.
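
To make the role of the row and column anchors concrete, here is a minimal base-R sketch. It assumes the usual odds ratio formed against a reference row and a reference column, log(n_ij · n_aa / (n_ia · n_aj)), where a marks the anchor in each margin. The toy counts and the helper name `anchored_lor` are hypothetical and are not part of lorbridge; the package's own computations may differ in detail.

```r
# Hypothetical 3 x 3 table of counts (made-up numbers, not package data)
toy <- matrix(c(30, 20, 10,
                20, 25, 15,
                10, 15, 35),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("R", 1:3), paste0("C", 1:3)))

# Illustrative helper (not a lorbridge function): log-odds ratio of every
# cell relative to one anchor row and one anchor column.
anchored_lor <- function(tab, row_anchor_idx, col_anchor_idx) {
  a_row  <- tab[row_anchor_idx, ]                # counts in the anchor row
  a_col  <- tab[, col_anchor_idx]                # counts in the anchor column
  a_cell <- tab[row_anchor_idx, col_anchor_idx]  # count in the anchor cell
  # log( n_ij * n_aa / (n_ia * n_aj) ) for all i, j
  log(tab * a_cell / outer(a_col, a_row))
}

lor <- anchored_lor(toy, row_anchor_idx = 2, col_anchor_idx = 2)
round(lor, 3)
```

By construction the anchor row and anchor column are exactly zero, so every other entry is read as a departure from the anchor category in both margins.
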

```{r donsca}
data(tab_IQ_VM)
fit   <- donsca_fit(tab_IQ_VM)
cos_d <- donsca_cosines(fit, col_anchor_idx = 4, row_anchor_idx = 4)
head(cos_d, 6)
```

### Multinomial logistic regression with CCMs

```{r mlr_ccm, message = FALSE}
data(lorbridge_data)
# use VM as numeric predictor, VMbin as outcome proxy
# Illustrative: treat VM bins as the outcome and VM numeric as predictor
# (In practice use IQ bins as outcome and VM as predictor per the paper)
data(tab_IQ_VM)

# Build long-format data from tab_IQ_VM for multinomial logit
vm_vals <- c(54, 59, 62, 63, 65, 67, 69, 71, 73, 74, 76, 78, 80, 81, 82, 84,
             85, 86, 87, 89, 90, 92, 93, 95, 96, 98, 100, 101, 103, 104, 105,
             107, 108, 110, 112, 113, 115, 117, 119, 121, 123, 125, 126, 128,
             130, 132, 134, 136, 138, 139, 143, 147, 149)
rows6 <- paste0("IQ", 1:6)
# (Full X_wide matrix omitted here for brevity; see unified analysis script)
```

---

## Key References

Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of
continuous data: The correspondence analysis biplot approach.
*Behavior Research Methods*, 51(2), 589–601.
https://doi.org/10.3758/s13428-018-1161-1

Kim, S.-K. (2020). Test treatment effect differences in repeatedly measured
symptoms with binary values: The matched correspondence analysis approach.
*Behavior Research Methods*, 52, 1480–1490.

Kim, S.-K. (2024). Factorization of person response profiles to identify
summative profiles carrying central response patterns.
*Psychological Methods*, 29(4), 723–730.