
Rank provides a customizable alternative to the built-in
rank() function. The package offers the following
features:
Frequency-based ranking of categorical variables: choose whether to rank based on alphabetic order or element frequency.
Control over sorting order: use desc = TRUE to rank in descending rather than ascending order.
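As a rough illustration of what frequency-based ranking means, the sketch below reproduces the idea with base R only. It is not the package's implementation: the helper name freq_rank_sketch is hypothetical, and breaking frequency ties alphabetically is an assumption chosen to match the example output shown elsewhere in this README.

```r
# Hypothetical helper (not part of rank): rank a character vector by how
# often each value occurs, using only base R.
freq_rank_sketch <- function(x, desc = FALSE) {
  counts <- table(x)                 # occurrences of each distinct value
  # Order distinct values by count, then by name (assumed tie-break)
  lv <- names(counts)[order(as.integer(counts), names(counts))]
  key <- match(x, lv)                # position of each element's value in lv
  if (desc) key <- -key              # flip direction for descending ranks
  rank(key)                          # tied elements share the average rank
}

fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
freq_rank_sketch(fruits)
#> [1] 2.5 4.5 2.5 1.0 4.5
freq_rank_sketch(fruits, desc = TRUE)
#> [1] 3.5 1.5 3.5 5.0 1.5
```

Elements with the same value always receive the same rank here because base rank() averages ties by default.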
To install rank from CRAN, run:
install.packages("rank")

You can install the development version of rank like so:
# install.packages('remotes')
remotes::install_github("selkamand/rank")

library(rank)
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
# rank alphabetically
smartrank(fruits)
#> [1] 1.5 3.5 1.5 5.0 3.5
# rank based on frequency
smartrank(fruits, sort_by = "frequency")
#> [1] 2.5 4.5 2.5 1.0 4.5
# rank based on descending order of frequency
smartrank(fruits, sort_by = "frequency", desc = TRUE)
#> [1] 3.5 1.5 3.5 5.0 1.5

# rank numerically
smartrank(c(1, 3, 2))
#> [1] 1 3 2
# rank numerically based on descending order
smartrank(c(1, 3, 2), desc = TRUE)
#> [1] 3 1 2

We can use order() to sort vectors based on their ranks.
For example, we can sort the fruits vector based on the
frequency of each element.
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
ranks <- smartrank(fruits, sort_by = "frequency")
fruits[order(ranks)]
#> [1] "Pear"   "Apple"  "Apple"  "Orange" "Orange"

rank_by_priority() assigns the top ranks (1, 2, and so on) to the
specified values, in the order given, while all remaining values share
a single tied rank below them.
reorder_by_priority() uses those ranks to move priority
values to the front of the vector.
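The behaviour described above can be approximated in a few lines of base R. This is only a sketch under the assumption that non-priority values tie for the next available rank; priority_rank_sketch is a hypothetical name and not a function exported by rank.

```r
# Hypothetical helper (not part of rank): priority values get ranks 1, 2, ...
# in the order given; everything else ties for the rank after them.
priority_rank_sketch <- function(x, priority_values) {
  pos <- match(x, priority_values)                # NA for non-priority values
  pos[is.na(pos)] <- length(priority_values) + 1  # one shared slot for the rest
  rank(pos)                                       # ties share the average rank
}

x <- c("A", "B", "C", "D")
r <- priority_rank_sketch(x, priority_values = c("D", "C"))
r
#> [1] 3.5 3.5 2.0 1.0
x[order(r)]  # order() leaves tied elements in their original order
#> [1] "D" "C" "A" "B"
```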
# Prioritise D first, then C; A and B follow in original order
rank_by_priority(c("A", "B", "C", "D"), priority_values = c("D", "C"))
#> [1] 3.5 3.5 2.0 1.0
# Reorder so priorities come first
reorder_by_priority(c("A", "B", "C", "D"), priority_values = c("D", "C"))
#> [1] "D" "C" "A" "B"

rank_stratified() computes a single combined rank across
all columns of a data frame, where each column is ranked within groups
defined by all previous columns. This produces a true hierarchical
ordering.
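One way to picture this hierarchical ordering is to build it by hand with base R. The sketch below is an assumption-laden illustration, not how rank_stratified() is implemented: n_per_row is a hypothetical helper, and using the values themselves to break frequency ties is an assumption chosen so that the result matches the row order of the rank_stratified() example in this README.

```r
data <- data.frame(
  gender = c("male", "male", "male", "male", "female", "female", "male", "female"),
  pet = c("cat", "cat", "magpie", "magpie", "giraffe", "cat", "giraffe", "cat")
)

# Hypothetical helper: for each row, how often does its key occur overall?
n_per_row <- function(key) as.integer(table(key)[key])

gender_n <- n_per_row(data$gender)
# Pet frequency is counted within each gender by keying on the pair
pet_n <- n_per_row(paste(data$gender, data$pet, sep = "|"))

# Sort by gender frequency first, then by within-gender pet frequency.
# The values themselves break frequency ties (assumed tie-break).
ord <- order(gender_n, data$gender, pet_n, data$pet, decreasing = TRUE)
ord
#> [1] 3 4 1 2 7 6 8 5
```

Indexing with data[ord, ] then lists the most frequent gender first and, inside each gender, the most frequent pets first.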
data <- data.frame(
  gender = c("male", "male", "male", "male", "female", "female", "male", "female"),
  pet = c("cat", "cat", "magpie", "magpie", "giraffe", "cat", "giraffe", "cat")
)
# Hierarchical ranking:
# 1. Rank gender (globally, by frequency)
# 2. Within each gender, rank pet by within-gender frequency
r <- rank_stratified(
  data,
  sort_by = c("frequency", "frequency"),
  desc = TRUE
)
data[order(r), ]
#>   gender     pet
#> 3   male  magpie
#> 4   male  magpie
#> 1   male     cat
#> 2   male     cat
#> 7   male giraffe
#> 6 female     cat
#> 8 female     cat
#> 5 female giraffe

smartrank can be used to arrange data.frames based on
one or more columns, while maintaining complete control over how each
column contributes to the final row order.
For example, we can sort the following dataframe based on frequency of fruits, but break any ties based on the alphabetical order of the picker.
data <- data.frame(
  fruits = c("Apple", "Orange", "Apple", "Pear", "Orange"),
  picker = c("Elizabeth", "Damian", "Bob", "Cameron", "Alice")
)
# rank_stratified():
# 1. Rank fruits by frequency (globally)
# 2. Within each fruit, rank pickers alphabetically
strat_ranks <- rank_stratified(
  data,
  cols = c("fruits", "picker"),
  sort_by = c("frequency", "alphabetical"),
  desc = c(TRUE, FALSE)
)
data[order(strat_ranks), ]
#>   fruits    picker
#> 5 Orange     Alice
#> 2 Orange    Damian
#> 3  Apple       Bob
#> 1  Apple Elizabeth
#> 4   Pear   Cameron

An equivalent way to hierarchically sort data.frames is to use the
tidyverse arrange() function:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
arrange(
  data,
  rank_stratified(
    data,
    cols = c("fruits", "picker"),
    sort_by = c("frequency", "alphabetical"),
    desc = c(TRUE, FALSE)
  )
)
#>   fruits    picker
#> 1 Orange     Alice
#> 2 Orange    Damian
#> 3  Apple       Bob
#> 4  Apple Elizabeth
#> 5   Pear   Cameron

See CONTRIBUTING.md.