The package provides simple functions for reading RSS feeds from news outlets and returning them conveniently as a tibble.
The newscatcheR package provides a dataset of news sites and
their RSS feeds, together with some characteristics of the websites such
as the topic, country, or language of the website, and a few functions
to explore and access the feeds from R.
Two functions that work as wrappers around tidyRSS can be used to fetch the feed of a given website. Two additional functions can be used to conveniently browse the websites dataset.
The first function, get_news(), returns a tibble of the
RSS feed of a given site.
The second function, get_headlines(), is a helper function
that returns a tibble of just the headlines instead of the full RSS
feed.
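As a minimal sketch of the two fetching functions, the example below calls each with a single site name. The site "news.ycombinator.com" is an illustrative choice and is assumed to be present in the package dataset. The tryCatch() wrappers are an addition for robustness, not something the package requires: they return NULL if the site is unknown or the request fails.

```r
# Illustrative site; assumed to be listed in the package dataset
site <- "news.ycombinator.com"

# Full RSS feed as a tibble, or NULL if the call fails
feed <- tryCatch(newscatcheR::get_news(site), error = function(e) NULL)

# Headlines only, or NULL if the call fails
headlines <- tryCatch(newscatcheR::get_headlines(site), error = function(e) NULL)

# Show the first rows of whatever was retrieved
if (!is.null(feed)) print(head(feed))
if (!is.null(headlines)) print(head(headlines))
```

Both calls need network access, so the result depends on the feed being reachable at the time.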
Because some websites have multiple feeds divided by topic,
describe_url(website) can be helpful to see the topics of a
given website.
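A short sketch of such a lookup follows. The site "bbc.com" is an illustrative choice, assumed to be in the package dataset. The exact shape of the value returned by describe_url() is not spelled out here, so the example simply stores and prints it.

```r
# Illustrative site; assumed to be listed in the package dataset
site <- "bbc.com"

# Look up the topics available for the site; NULL if the lookup fails
topics <- tryCatch(newscatcheR::describe_url(site), error = function(e) NULL)

print(topics)
```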
Finally, filter_urls(topic, country, language) can be
used to browse the dataset by topic, country, or language.
filter_urls(topic = "tech", country = "IT", language = "it")
#> # A tibble: 5 × 7
#>   clean_url       language topic_unified main  clean_country rss_url  GlobalRank
#>   <chr>           <chr>    <chr>         <chr> <chr>         <chr>    <chr>     
#> 1 repubblica.it   it       tech          None  IT            http://… 1086      
#> 2 lastampa.it     it       tech          None  IT            http://… 2413      
#> 3 ilsole24ore.com it       tech          None  IT            http://… 2681      
#> 4 corriere.it     it       tech          None  IT            http://… 1328      
#> 5 ansa.it         it       tech          None  IT            http://… 2248
This package can be convenient if you need to fetch news from various websites for further analysis and you don’t want to search manually for the URLs of their RSS feeds.
Assuming we have the news sites we want to follow:
We can get a list of data frames with:
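The sketch below is one way to do both steps. The three site names are illustrative and assumed to be in the package dataset. The fetch_feed() helper is a hypothetical wrapper, not part of newscatcheR, that returns NULL when a feed cannot be fetched so that one failure does not stop the others.

```r
# Hypothetical selection of sites; assumed to be listed in the package dataset
sites <- c("bbc.com", "spiegel.de", "washingtonpost.com")

# Hypothetical helper: fetch one feed, returning NULL rather than stopping on failure
fetch_feed <- function(site) {
  tryCatch(newscatcheR::get_news(site), error = function(e) NULL)
}

# One list element per site, named after the site it came from
feeds <- lapply(sites, fetch_feed)
names(feeds) <- sites
```

Sites whose feed could not be fetched can then be dropped with Filter(Negate(is.null), feeds).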