---
title: "Plotting Functions"
author: "US EPA's Center for Computational Toxicology and Exposure ccte@epa.gov"
output:
  rmarkdown::html_vignette
bibliography: '`r system.file("REFERENCES.bib", package="invitroTKstats")`'
vignette: >
  %\VignetteIndexEntry{Plotting Functions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = 'center'
)
```

## Introduction

The goal of this vignette is to showcase the two plotting functions in `invitroTKstats`: `plot_clint` and `plot_fup_uc`. Both functions expect level-2 data as input. Level-2 data is the level-1 data returned by `format_clint` or `format_fup_uc`, respectively, with a verification column (`Verified`) added by `sample_verification`. A "Y" in the verification column indicates that the row is valid for analysis and plotting.
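As a rough illustration of how the verification column is used, the sketch below builds a small toy data frame and keeps only the rows marked "Y". The values, the chemical identifiers, and the exclusion text are invented for illustration; only the column names `DTXSID`, `Response`, and `Verified` are taken from this vignette.

```{r}
# Hypothetical toy data: all values are invented for illustration only
toy_l2 <- data.frame(
  DTXSID = c("DTXSID_A", "DTXSID_A", "DTXSID_B", "DTXSID_B", "DTXSID_B"),
  Response = c(0.82, 0.79, 1.10, 0.05, 0.97),
  Verified = c("Y", "Y", "Y", "Excluded: hypothetical reason", NA),
  stringsAsFactors = FALSE
)

# %in% treats a missing verification value as "not verified"
toy_verified <- toy_l2[toy_l2$Verified %in% "Y", ]
nrow(toy_verified)

# == propagates NA, so the row with a missing value comes back as an all-NA row
toy_with_na_row <- toy_l2[toy_l2$Verified == "Y", ]
nrow(toy_with_na_row)
```

Both forms give the same result when the verification column has no missing values. The `%in%` form is simply a more defensive choice if missing values might be present.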

## Set Up

```{r setup, warning=FALSE, message=FALSE}
# Primary Package #
library(invitroTKstats)
# Data Formatting Packages #
library(tidyverse)
# Data Plotting Package #
library(ggplot2)
library(gridExtra)
library(gridtext)
```

## plot_clint.R

`plot_clint` generates a time-versus-response scatter plot for a specified chemical. The data set used in this example is `clint_L2` [@smeltz2023plasma].

```{r example clint single, fig.width=5, fig.height=4}
# load data in 
level2.clint <- invitroTKstats::clint_L2

# keep only the verified data
verified_data_clint <- level2.clint[level2.clint$Verified == "Y", ]

id <- unique(verified_data_clint$DTXSID)[1]
plot_clint(verified_data_clint, id)
```

**Figure 1:** Time vs. MS Response Factor plot generated by `plot_clint`. The title contains the DTXSID of the compound, and the compound name is displayed below the x-axis. Each point represents a measurement of the mass-spectrometry response factor (`Response`). Sample type and calibration are represented by shape and color, respectively.

The following code demonstrates how to display plots for multiple compounds together in a grid. Each plot displays a legend specific to the data available for that compound, so we do not recommend a common legend. We do recommend reducing the legend size, which the code below also demonstrates.

```{r example clint multiple, warning=FALSE, fig.width=7, fig.height=7}
# plot all 3 chemicals in the dataset  
p <- list()
for (id in unique(verified_data_clint$DTXSID)) {
  p[[id]] <- plot_clint(verified_data_clint, id) + 
    theme(legend.title = element_text(size = 7),
          legend.text = element_text(size = 7),
          plot.title = element_text(size = 7))
}
do.call(grid.arrange, p)
```

**Figure 2:** Three example plots generated by `plot_clint`. The font sizes of the legend text and the titles are scaled down for better visualization in the grid arrangement.

## plot_fup_uc.R

`plot_fup_uc` generates a response-by-sample-type plot for a specified chemical. The data set used in this example is `fup_uc_L2` [@smeltz2023plasma].

```{r example fup_uc single, fig.width=5, fig.height=4}
# load in data
level2.uc <- invitroTKstats::fup_uc_L2

# keep only the verified data
verified_data_uc <- level2.uc[level2.uc$Verified == "Y", ]

id <- unique(verified_data_uc$DTXSID)[2]
plot_fup_uc(verified_data_uc, id)
```

**Figure 3:** Boxplots of MS Response Factor (`Response`) generated by `plot_fup_uc`. The title contains the DTXSID of the compound, and the compound name is displayed below the x-axis. Each boxplot shows the distribution of responses for one sample type and calibration, with the individual data points overlaid. The boxplot outlines and data points are colored by calibration when more than one calibration is present.

The following code demonstrates how to display plots for multiple compounds together in a grid. The y-axis label can be long and is generally the same for all plots, so repeating it on every plot is often unsuitable in a multi-plot layout. The code below shows how to use a single common axis label to prevent overlapping axis labels when using `grid.arrange`.

```{r example fup_uc, fig.height=7, fig.width=7, warning = FALSE}
# plot all three chemicals 
p2 <- list()
for (id in unique(verified_data_uc$DTXSID)) {
  p2[[id]] <- plot_fup_uc(verified_data_uc, id) + 
    theme(legend.title = element_text(size = 7),
          legend.text = element_text(size = 7),
          plot.title = element_text(size = 7))
}

# remove the y-axis label from every plot
p2 <- p2 %>% map(~ .x + labs(y = NULL))
# create a single label shared by all plots
yleft <- gridtext::richtext_grob("Mass Spec. Intensity / Fraction Unbound", rot = 90)
grid.arrange(grobs = p2, ncol = 2, nrow = 2, left = yleft)
```

**Figure 4:** Three example plots generated by `plot_fup_uc`. The font sizes of the legend text and the titles are scaled down, and a common y-axis label is used for better visualization in the grid arrangement.
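The grid above hard-codes a 2-by-2 layout for three plots. When the number of chemicals varies, one possible approach is to derive a near-square layout from the number of plots. The sketch below uses a hypothetical helper, `grid_dims`, which is not part of `invitroTKstats` or `gridExtra`.

```{r}
# Hypothetical helper (not part of any package): near-square grid dimensions
grid_dims <- function(n_plots) {
  stopifnot(n_plots >= 1)
  n_col <- ceiling(sqrt(n_plots))
  n_row <- ceiling(n_plots / n_col)
  c(nrow = n_row, ncol = n_col)
}

# three plots fit in a 2-by-2 grid, seven plots in a 3-by-3 grid
dims_three <- grid_dims(3)
dims_seven <- grid_dims(7)
dims_three
dims_seven
```

The result could then be supplied to the `nrow` and `ncol` arguments of `grid.arrange` in place of the fixed values, for example `nrow = dims_three[["nrow"]]` and `ncol = dims_three[["ncol"]]`.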

By default, the plot compares responses across sample types. When a chemical has many calibrations, displaying them all on one plot can make the plot hard to interpret. In these cases we recommend faceting the plot by calibration, as shown in the code chunks below. Note that our example dataset contains only one calibration value, so Figures 5 and 6 are identical; the code is presented for demonstration purposes.

```{r, fig.width=5, fig.height=4}
dtxsid <- unique(verified_data_uc$DTXSID)[2]
p3 <- plot_fup_uc(verified_data_uc, dtxsid)
p3
```

**Figure 5:** Plot generated with `plot_fup_uc` with all calibrations displayed on the same plot.

```{r, fig.width=5, fig.height=4}
p3 <- plot_fup_uc(verified_data_uc, dtxsid) + facet_wrap(~Calibration)
p3
```

**Figure 6:** Plot generated with `plot_fup_uc` and faceted by calibration such that each plot would correspond to a single calibration.
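To decide whether faceting is worthwhile for a given chemical, it can help to count the distinct calibrations per chemical first. The sketch below uses a hypothetical helper, `count_calibrations` (not part of `invitroTKstats`), on invented toy data. It assumes the data has `DTXSID` and `Calibration` columns; the second calibration value is made up.

```{r}
# Hypothetical helper (not part of invitroTKstats):
# number of distinct calibrations for each chemical
count_calibrations <- function(data) {
  vapply(split(data$Calibration, data$DTXSID),
         function(x) length(unique(x)),
         integer(1))
}

# Hypothetical toy data: identifiers and the second calibration are invented
toy_uc <- data.frame(
  DTXSID = c("DTXSID_A", "DTXSID_A", "DTXSID_A", "DTXSID_B"),
  Calibration = c("030821", "030821", "041521", "030821"),
  stringsAsFactors = FALSE
)

n_cal <- count_calibrations(toy_uc)
n_cal

# chemicals with more than one calibration are candidates for faceting
facet_candidates <- names(n_cal)[n_cal > 1]
facet_candidates
```

Under the same column-name assumption, the helper could be applied to `verified_data_uc` to list the chemicals for which `facet_wrap(~Calibration)` would change the plot.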

The `compare` argument of `plot_fup_uc` controls which groups the boxplots compare. By default, boxplots are compared across sample types (`compare = "type"`), but one can also compare across calibrations (`compare = "cal"`). The code below shows how to compare across calibrations and facet the plot so that each panel corresponds to a sample type.

```{r, fig.width=7, fig.height=7}
p4 <- plot_fup_uc(verified_data_uc, dtxsid, compare = "cal")
p5 <- p4 + facet_wrap(~Sample.Type)

grid.arrange(p4, p5)
```

**Figure 7:** Two plots generated by `plot_fup_uc` comparing across calibrations. The top plot shows all boxplots in a single plot and the bottom plot is faceted by sample type such that each sub-plot relates to a single sample type. Note that there is only one calibration value in our example dataset, "030821".

## References 

\insertRef{smeltz2023plasma}{invitroTKstats}