--- title: "Getting Started with iAR" author: "Felipe Elorrieta, Cesar Ojeda, Susana Eyheramendy, Wilfredo Palma" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 3 vignette: > %\VignetteIndexEntry{Getting Started with iAR} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4 ) ``` ## Overview The **iAR** package provides autoregressive models for time series observed at *irregularly spaced* time points, a common situation in astronomy, finance, and environmental science. It implements three complementary model classes: | Class | Coefficient | Autocorrelation | Reference | |-------|-------------|-----------------|-----------| | `iAR` | Scalar $\phi \in (0,1)$ | Positive only | Eyheramendy et al. (2018) | | `CiAR` | Vector $(\phi_R, \phi_I)$, $\|\phi\| < 1$ | Positive *and* negative | Elorrieta et al. (2019) | | `BiAR` | Vector $(\phi_R, \phi_I)$, $\|\phi\| < 1$ | Bivariate, cross-correlated | Elorrieta et al. (2021) | All three share the same workflow: **construct** → **simulate** → **estimate** → **forecast / interpolate**. ```{r load-pkg} library(iAR) ``` --- ## Generating Irregular Observation Times Real astronomical or environmental data is rarely sampled on a regular grid. The `gentime()` function generates realistic irregular time grids from several distributions. The first argument `x` defaults to `NULL`, so the `utilities` object is created internally and you can call `gentime()` directly without any setup step. ```{r gentime} set.seed(2847) # x defaults to NULL — no setup needed times <- gentime(n = 100)@times cat("Number of observations:", length(times), "\n") cat("Time span: [", round(min(times), 1), ",", round(max(times), 1), "]\n") cat("Mean gap between observations:", round(mean(diff(times)), 2), "\n") ``` The default distribution (`"expmixture"`) mixes two exponential distributions, mimicking data sets where most observations are closely spaced but occasional long gaps occur. Alternative distributions include `"uniform"`, `"exponential"`, and `"gamma"`. ```{r gentime-plot, fig.cap = "Distribution of inter-observation gaps (log scale) for 100 irregular times drawn from an exponential mixture."} gaps <- diff(times) hist(log1p(gaps), main = "Inter-observation gaps (log1p scale)", xlab = "log(1 + gap)", col = "steelblue", border = "white", breaks = 20) ``` --- ## iAR: Univariate Irregular Autoregressive Model ### Simulation Construct an `iAR` object with the desired coefficient, then call `sim()`. ```{r iar-sim} set.seed(2847) times <- gentime(n = 100)@times # Build the model: family = "norm", phi = 0.9 m_iar <- iAR(family = "norm", times = times, coef = 0.9) m_iar <- sim(m_iar) head(m_iar@series) ``` ```{r iar-plot, fig.cap = "Simulated iAR series (phi = 0.9). The irregular spacing is visible in the unequal distances along the x-axis."} plot(m_iar, type = "o",main = "Simulated iAR series (phi = 0.9)",ylab="Values",xlab="Time",pch=20) ``` ### Parameter Estimation Two estimators are available: * **`loglik()`** — direct MLE; supports all three families (`"norm"`, `"t"`, `"gamma"`). * **`kalman()`** — Kalman-filter MLE; available for all three model classes (`iAR`, `CiAR`, `BiAR`). Before estimation, standardise the series (zero mean, unit variance) and supply a vector of error standard deviations. For simulated data with no measurement noise, zeros suffice. ```{r iar-prep} y <- m_iar@series y_std <- y / sd(y) # unit variance m_iar@series <- y_std m_iar@series_esd <- rep(0, length(times)) ``` #### Using `loglik()` ```{r iar-loglik} m_loglik <- loglik(m_iar) cat("loglik estimate: phi =", round(m_loglik@coef, 4), "\n") cat("Log-likelihood: ", round(m_loglik@loglik, 4), "\n") ``` #### Using `kalman()` ```{r iar-kalman} m_kalman <- kalman(m_iar) cat("kalman estimate: phi =", round(m_kalman@coef, 4), "\n") cat("Kalman log-lik: ", round(m_kalman@kalmanlik, 4), "\n") ``` Both estimators should recover a value close to the true $\phi = 0.9$. #### Standard Errors via the Hessian Set `hessian = TRUE` to obtain standard errors and p-values. ```{r iar-hessian} m_h <- iAR(family = "norm", times = times, coef = 0.9, hessian = TRUE) m_h@series <- y_std m_h@series_esd <- rep(0, length(times)) m_h <- loglik(m_h) # Summary contains the coefficient, SE, p-value, information criteria, and residual diagnostics summary(m_h) ``` ### Forecasting After estimation, use `forecast()` to predict `tAhead` time units into the future. ```{r iar-forecast} m_fc <- forecast(m_kalman, tAhead = 10) cat("Forecast (10 time units ahead):", round(m_fc@forecast, 4), "\n") ``` ```{r iar-plot-forecast, fig.cap = "iAR series with one-step-ahead forecast (blue dot)."} plot_forecast(m_fc,ylab="Values",xlab="Time",type="o",pch=20) ``` ### Interpolation (Imputing Missing Values) `interpolation()` imputes `NA` values in the series using the fitted model. ```{r iar-interp} # Introduce a missing value at position 10 m_interp <- m_kalman m_interp@series[10] <- NA m_interp <- interpolation(m_interp) cat("Interpolated value at position 10:", round(m_interp@interpolated_values, 4), "\n") cat("Original value at position 10 :", round(y_std[10], 4), "\n") ``` --- ## CiAR: Complex Irregular Autoregressive Model `CiAR` extends `iAR` to allow **negative autocorrelation** via a complex coefficient $\phi = \phi_R + i\,\phi_I$. The stationarity condition is $|\phi| = \sqrt{\phi_R^2 + \phi_I^2} < 1$. ### Simulation and Estimation ```{r ciar-sim} set.seed(2847) times <- gentime(n = 100)@times # phi = (0.9, 0): purely real CiAR — same as iAR but more general m_ciar <- CiAR(times = times, coef = c(0.9, 0)) m_ciar <- sim(m_ciar) # Standardise y_c <- m_ciar@series y_c_std <- y_c / sd(y_c) m_ciar@series <- y_c_std m_ciar@series_esd <- rep(0, length(times)) ``` ```{r ciar-kalman} m_ciar <- kalman(m_ciar) cat("CiAR estimate: phiR =", round(m_ciar@coef[1], 4), " phiI =", round(m_ciar@coef[2], 4), "\n") cat("Modulus |phi| =", round(sqrt(sum(m_ciar@coef^2)), 4), "\n") ``` ### Forecasting ```{r ciar-forecast} m_ciar <- forecast(m_ciar, tAhead = 10) cat("CiAR forecast:", round(m_ciar@forecast, 4), "\n") ``` --- ## BiAR: Bivariate Irregular Autoregressive Model `BiAR` models two time series observed at the **same** irregular time points with a shared autoregressive structure and cross-correlation $\rho$. ### Simulation ```{r biar-sim} set.seed(2847) times <- gentime(n = 100)@times # coef = c(phiR, phiI), rho = cross-series correlation m_biar <- BiAR(times = times, coef = c(0.9, 0.3), rho = 0.9) m_biar <- sim(m_biar) head(m_biar@series) # matrix: column 1 = series 1, column 2 = series 2 ``` ### Estimation ```{r biar-kalman} n <- length(times) y_b <- m_biar@series y_b_std <- y_b / matrix(apply(y_b, 2, sd), nrow = n, ncol = 2, byrow = TRUE) m_biar@series <- y_b_std m_biar@series_esd <- matrix(0, n, 2) m_biar <- kalman(m_biar) cat("BiAR estimate: phiR =", round(m_biar@coef[1], 4), " phiI =", round(m_biar@coef[2], 4), "\n") ``` ### Forecasting ```{r biar-forecast} m_biar <- forecast(m_biar, tAhead = 10) cat("BiAR forecast (series 1):", round(m_biar@forecast[1], 4), "\n") cat("BiAR forecast (series 2):", round(m_biar@forecast[2], 4), "\n") ``` --- ## Real Data Example: AGN Light Curve The package includes the `agn` dataset — 237 K-band flux measurements of the active galactic nucleus MCG-6-30-15, taken between 2006 and 2011. ```{r agn-load} data(agn) head(agn) ``` ```{r agn-plot, fig.cap = "AGN MCG-6-30-15 light curve. Gaps in the observation schedule produce the irregular spacing typical of ground-based astronomical surveys."} plot(agn$t, agn$m, xlab = "Heliocentric JD - 2450000", ylab = "Flux [10^-15 erg/s/cm^2/A]", main = "AGN MCG-6-30-15 (K band)",type="o",pch=20) ``` ### Fitting an iAR Model ```{r agn-fit} # Standardise: centre and scale by SD (measurement errors available) y_agn <- agn$m y_agn_c <- (y_agn - mean(y_agn)) / sd(y_agn) esd_agn <- agn$merr / sd(y_agn) m_agn <- iAR( family = "norm", times = agn$t, series = y_agn_c, series_esd = esd_agn ) m_agn <- kalman(m_agn) cat("Estimated phi:", round(m_agn@coef, 4), "\n") cat("Kalman log-lik:", round(m_agn@kalmanlik, 4), "\n") ``` ### Fitted Values and Forecast ```{r agn-fit-plot, fig.cap = "AGN light curve (standardised) with fitted iAR values overlaid."} m_agn <- fit(m_agn) plot_fit(m_agn, xlab = "Heliocentric JD - 2450000", ylab = "Flux [10^-15 erg/s/cm^2/A]",type="o",pch=20) ``` ```{r agn-forecast} m_agn <- forecast(m_agn, tAhead = 1) cat("One-step-ahead forecast:", round(m_agn@forecast, 4), "\n") ``` --- ## Utility: Pairing Two Irregular Time Series When two series are observed at different (but overlapping) time points, `pairingits()` matches them within a tolerance window before fitting a `BiAR` model. ```{r pair} data(cvnovag) data(cvnovar) # Pair the G-band and R-band CV Nova light curves paired <- pairingits(data1 = cvnovag, data2 = cvnovar, tol = 0.01) cat("Paired observations:", nrow(paired@paired), "\n") head(paired@paired) ``` --- ## Summary The table below collects the key functions for each step of the modelling workflow. | Step | iAR | CiAR | BiAR | |------|-----|------|------| | Construct | `iAR(family, times, coef)` | `CiAR(times, coef)` | `BiAR(times, series, coef, rho)` | | Simulate | `sim(m)` | `sim(m)` | `sim(m)` | | Estimate (direct MLE) | `loglik(m)` | — | — | | Estimate (Kalman) | `kalman(m)` | `kalman(m)` | `kalman(m)` | | Forecast | `forecast(m, tAhead)` | `forecast(m, tAhead)` | `forecast(m, tAhead)` | | Interpolate | `interpolation(m)` | `interpolation(m)` | `interpolation(m)` | | Plot fitted | `plot_fit(m)` | `plot_fit(m)` | `plot_fit(m)` | | Plot forecast | `plot_forecast(m)` | `plot_forecast(m)` | `plot_forecast(m)` | | Generate times | `gentime(n = ...)` | — | — | | Pair two series | `pairingits(data1, data2)` | — | — | ## References Eyheramendy, S., Elorrieta, F., & Palma, W. (2018). An irregular discrete time series model to identify residuals with autocorrelation in astronomical light curves. *Monthly Notices of the Royal Astronomical Society*, **481**(4), 4990–5003. Elorrieta, F., Eyheramendy, S., & Palma, W. (2019). Discrete semi-parametric model for the investigation of variable stars. *Astronomy & Astrophysics*, **627**, A120. Elorrieta, F., Eyheramendy, S., Palma, W., & Ojeda, C. (2021). A bivariate irregular autoregressive model. *Monthly Notices of the Royal Astronomical Society*, **505**(1), 1105–1116.