
Search, download, and process public domain texts from the Project Gutenberg collection.
Install the released version from CRAN:
install.packages("gutenbergr")
Install the development version from GitHub:
# install.packages("pak")
pak::pak("ropensci/gutenbergr")
Load the package and any other required libraries:
library(gutenbergr)
library(dplyr)
We’ll get and set our Project Gutenberg mirror:
gutenberg_get_mirror()
#> [1] "https://aleph.pglaf.org"
Search through the metadata to find Jane Austen’s Persuasion:
gutenberg_works(title == "Persuasion")
#> # A tibble: 1 × 8
#>   gutenberg_id title      author       gutenberg_author_id language
#>          <int> <chr>      <chr>                      <int> <fct>
#> 1          105 Persuasion Austen, Jane                  68 en
#>   gutenberg_bookshelf                           rights                    has_text
#>   <chr>                                         <fct>                     <lgl>
#> 1 Category: Novels/Category: British Literature Public domain in the USA. TRUE
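The same kind of filter works on other metadata columns. As a minimal sketch, assuming the author is stored as "Austen, Jane" (the form shown in the output above), the following lists the works credited to her; the `austen` variable name is only illustrative.

```r
library(gutenbergr)
library(dplyr)

# Filter the bundled metadata by author instead of title.
# gutenberg_works() accepts filter conditions on the metadata columns.
austen <- gutenberg_works(author == "Austen, Jane")

# Keep just the columns needed to pick an ID for downloading.
austen |>
  select(gutenberg_id, title)
```

This only reads the metadata that ships with the package, so it should not need a network connection.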
Persuasion’s gutenberg_id is 105. We’ll use
this ID to download it and also set our cache option to
"persistent" so that we don’t have to re-download it
later.
options(gutenbergr_cache_type = "persistent")
persuasion <- gutenberg_download(105)
persuasion
#> # A tibble: 8,357 × 2
#>    gutenberg_id text
#>           <int> <chr>
#>  1          105 "Persuasion"
#>  2          105 ""
#>  3          105 ""
#>  4          105 "by Jane Austen"
#>  5          105 ""
#>  6          105 "(1818)"
#>  7          105 ""
#>  8          105 ""
#>  9          105 ""
#> 10          105 ""
#> # ℹ 8,347 more rows
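Because the result is an ordinary tibble with one row per line of text, it can be processed with dplyr. The sketch below is not part of gutenbergr: it builds a small stand-in tibble (`sample_text`, a hypothetical name) that mirrors the first six rows shown above, so it runs without downloading anything, and then drops the blank lines.

```r
library(dplyr)

# Stand-in for the first rows of the downloaded book (assumed shape:
# one row per line, with gutenberg_id and text columns).
sample_text <- tibble::tibble(
  gutenberg_id = 105L,
  text = c("Persuasion", "", "", "by Jane Austen", "", "(1818)")
)

# Drop empty lines, keeping only rows that contain text.
non_blank <- sample_text |>
  filter(text != "")

non_blank
```

On this stand-in data, three rows remain. The same `filter()` call could be applied to `persuasion` itself.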
Multiple works can be downloaded at once. We’ll also download Edna
St. Vincent Millay’s Renascence and Other Poems
(gutenberg_id 161) and use meta_fields to add the title
from the metadata to each row.
books <- gutenberg_download(c(105, 161), meta_fields = "title")
books |> count(title)
#> # A tibble: 2 × 2
#>   title                           n
#>   <chr>                       <int>
#> 1 Persuasion                   8357
#> 2 Renascence, and Other Poems  1222
See the package vignettes for more advanced usage of gutenbergr.
See the data-raw directory for the scripts that generate the package metadata.
The metadata was generated from the Project Gutenberg catalog on 13 March 2026.
The package follows Project Gutenberg’s rules:
.zip files to minimize bandwidth
See their Terms of Use for details.
See CONTRIBUTING.md.
Note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.