---
title: "Getting Started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE,
  comment = "#>"
)
```

`puremoe` provides a unified interface to PubMed and NLM data. Search with `search_pubmed()`, then retrieve data from any of five endpoints with `get_records()`.

```{r libs}
library(puremoe)
library(dplyr)
library(DT)
```

## Search

`search_pubmed()` accepts standard PubMed query syntax and returns a vector of PMIDs.

```{r search}
pmids <- puremoe::search_pubmed('("political ideology"[TiAb])')
length(pmids)
```

```{r subset}
pmids_sub <- head(pmids, 50L)
```

## Abstracts

```{r abstracts}
abstracts <- puremoe::get_records(
  pmids_sub,
  endpoint = "pubmed_abstracts",
  cores = 1L,
  sleep = 0.5
)

abstracts <- abstracts |> mutate(pmid = as.character(pmid))
```

```{r abstracts-table}
abstracts |>
  select(pmid, year, journal, articletitle) |>
  DT::datatable(rownames = FALSE)
```

The `annotations` column is a list of per-article data frames containing MeSH terms, chemical names, and keywords.
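
As a minimal sketch of working with a list of per-article data frames, the toy example below binds such a list while recording which article each row came from. The data and the inner column names (`type`, `term`) are hypothetical stand-ins, not the package's documented schema; if the real data frames already carry a `pmid` column, the `.id` step is unnecessary.

```{r annotations-sketch}
library(dplyr)

# Hypothetical toy data: one small data frame per article.
# Column names `type` and `term` are assumptions made for illustration.
toy_annotations <- list(
  data.frame(type = c("MeSH", "Keyword"), term = c("Politics", "ideology")),
  data.frame(type = "MeSH", term = "Attitude")
)

# Hypothetical PMIDs, in the same order as the list elements.
toy_pmids <- c("111", "222")

# Naming the list lets `.id` record the source article for each row.
toy_long <- bind_rows(setNames(toy_annotations, toy_pmids), .id = "pmid")
toy_long
```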

```{r annotations}
bind_rows(abstracts$annotations) |>
  head(20) |>
  DT::datatable(rownames = FALSE)
```

## Affiliations

```{r affiliations}
affiliations <- puremoe::get_records(
  head(pmids_sub, 25L),
  endpoint = "pubmed_affiliations",
  cores = 1L,
  sleep = 0.5
)

affiliations |> DT::datatable(rownames = FALSE)
```

## iCite metrics

```{r icites}
icites <- puremoe::get_records(
  pmids_sub,
  endpoint = "icites",
  cores = 1L,
  sleep = 0.25
)

icites |>
  mutate(pmid = as.character(pmid)) |>
  select(-citation_net, -cited_by_clin) |>
  DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

## PubTator annotations

```{r pubtations}
pubtations <- puremoe::get_records(
  head(pmids_sub, 30L),
  endpoint = "pubtations",
  cores = 1L
)

pubtations |> DT::datatable(rownames = FALSE)
```

## Full text

Full-text retrieval requires open-access PMC articles. `pmid_to_ftp()` resolves PMIDs to XML URLs via the PMC Cloud Service on AWS S3, filtering to only those with open-access full text available. In August 2026, NCBI will complete its migration from the legacy PMC FTP Service to the Cloud Service; `puremoe` already uses the new service.

```{r ftp}
ftp <- puremoe::pmid_to_ftp(pmids = pmids_sub)

ftp |> DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

```{r fulltext}
fulltext <- puremoe::get_records(
  head(ftp$url, 2L),
  endpoint = "pmc_fulltext",
  cores = 1L
)

fulltext |>
  mutate(text = sapply(
    strsplit(text, "\\s+"),
    function(w) paste0(paste(head(w, 15), collapse = " "), "...")
  )) |>
  slice(1:5) |>
  DT::datatable(rownames = FALSE, options = list(scrollX = TRUE))
```

## Endpoint schemas

`endpoint_info()` returns column definitions, rate limits, and notes for any endpoint.

```{r endpoint-info}
puremoe::endpoint_info()
```

```{r endpoint-detail}
puremoe::endpoint_info("icites")
```