Type: | Package |
Title: | Generate Summary Tables for Continuous, Ordinal, and Categorical Data |
Version: | 0.1.0 |
Maintainer: | Ama Nyame-Mensah <ama@anyamemensah.com> |
URL: | https://anyamemensah.github.io/summarytabl/ |
Description: | Provides functions for tabulating and summarizing continuous, ordinal, and categorical variables in data frames. The package was designed to streamline exploratory data analysis and simplify the creation of summary tables for reports and other purposes. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | dplyr, purrr, rlang, stats, tibble, tidyr |
RoxygenNote: | 7.3.3 |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
VignetteBuilder: | knitr |
Depends: | R (≥ 4.1.0) |
NeedsCompilation: | no |
Packaged: | 2025-09-30 02:56:26 UTC; AmaNM |
Author: | Ama Nyame-Mensah [aut, cre] |
Repository: | CRAN |
Date/Publication: | 2025-10-06 08:00:02 UTC |
summarytabl: Generate Summary Tables for Continuous, Ordinal, and Categorical Data
Description
Provides functions for tabulating and summarizing continuous, ordinal, and categorical variables in data frames. The package was designed to streamline exploratory data analysis and simplify the creation of summary tables for reports and other purposes.
Author(s)
Maintainer: Ama Nyame-Mensah ama@anyamemensah.com
See Also
Useful links: https://anyamemensah.github.io/summarytabl/
Summarize a categorical variable by a grouping variable
Description
cat_group_tbl() presents frequency counts and percentages (count, percent) for nominal or categorical variables by some grouping variable. Relative frequencies and percentages of each level of the primary categorical variable (row_var) within each level of the grouping variable (col_var) can be returned. Missing data can be excluded from the calculations for either variable. By default, the table is returned in the long format.
Usage
cat_group_tbl(
data,
row_var,
col_var,
na.rm.row_var = FALSE,
na.rm.col_var = FALSE,
only = NULL,
ignore = NULL,
pivot = "longer"
)
Arguments
data |
A data frame. |
row_var |
A character string of the name of a column in |
col_var |
A character string of the name of a column in |
na.rm.row_var |
A logical value indicating whether missing values for |
na.rm.col_var |
A logical value indicating whether missing values for |
only |
A character string or vector of strings indicating the types of summary
data to return. The default is |
ignore |
A named character vector or list containing values to ignore from
|
pivot |
A character string specifying the format of the returned summary table.
The default is |
Value
A tibble displaying relative frequency counts and/or percentages of row_var, grouped by col_var. When the output is in wider format, columns prefixed with count_ and percent_ contain the frequency and proportion, respectively, for each distinct response value of row_var within each level of col_var.
Author(s)
Ama Nyame-Mensah
Examples
cat_group_tbl(data = nlsy,
row_var = "gender",
col_var = "bthwht",
pivot = "wider",
only = "count")
cat_group_tbl(data = nlsy,
row_var = "birthord",
col_var = "breastfed",
pivot = "longer")
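The within-group percentages described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the data frame `toy` and its columns are invented for illustration, and reporting the share on a 0 to 1 scale is an assumption.

```r
# Toy data (hypothetical; not one of the package datasets)
toy <- data.frame(
  row_var = c("a", "a", "b", "b", "b", "a"),
  col_var = c("x", "x", "x", "y", "y", "y"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are levels of row_var, columns are levels of col_var
counts <- table(toy$row_var, toy$col_var)

# Proportions within each level of col_var (each column sums to 1)
props <- prop.table(counts, margin = 2)

# Long format: one row per (row_var, col_var) combination
long_tbl <- as.data.frame(counts, stringsAsFactors = FALSE)
names(long_tbl) <- c("row_var", "col_var", "count")
long_tbl$percent <- as.vector(props)

long_tbl
```

Because `as.data.frame()` on a table and `as.vector()` on the proportion table both unroll in column-major order, the `count` and `percent` columns line up row by row.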
Summarize a categorical variable
Description
cat_tbl() presents frequency counts and percentages (count, percent) for nominal or categorical variables. Missing data can be excluded from the calculations.
Usage
cat_tbl(data, var, na.rm = FALSE, only = NULL, ignore = NULL)
Arguments
data |
A data frame. |
var |
A character string of the name of a variable in |
na.rm |
A logical value indicating whether missing values should be
removed before calculations. Default is |
only |
A character string, or vector of character strings, of the
types of summary data to return. Default is |
ignore |
An optional vector that contains values to exclude from the data.
Default is |
Value
A tibble displaying the relative frequency counts and/or percentages of var.
Author(s)
Ama Nyame-Mensah
Examples
cat_tbl(data = nlsy, var = "gender")
cat_tbl(data = nlsy, var = "race", only = "count")
cat_tbl(data = nlsy,
var = "race",
ignore = "Hispanic",
only = "percent",
na.rm = TRUE)
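As a rough illustration of how excluding values and dropping missing data change a frequency table, the base R sketch below tabulates a toy vector. It assumes ignored values are removed before counting, which may not match the package's internals; `x`, `ignore_vals`, and `drop_na` are hypothetical names.

```r
# Hypothetical vector standing in for a categorical column
x <- c("Black", "Hispanic", "Black", NA, "Non-Black,Non-Hispanic", "Hispanic")

ignore_vals <- "Hispanic"  # values to drop (assumed to be removed before tabulating)
drop_na <- TRUE            # mirrors the idea behind na.rm = TRUE

# Mark ignored values; NA entries are never treated as ignored here
ignored <- !is.na(x) & x %in% ignore_vals
kept <- x[!ignored]
if (drop_na) kept <- kept[!is.na(kept)]

# Count each remaining value; useNA = "ifany" keeps an NA row when NAs are retained
counts <- table(kept, useNA = "ifany")
result <- data.frame(
  value = names(counts),
  count = as.vector(counts),
  percent = as.vector(counts) / sum(counts),
  stringsAsFactors = FALSE
)

result
```

With `drop_na <- FALSE`, the missing entry would stay in the denominator and appear as its own row.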
Check a named vector
Description
This function checks whether a named list or vector has invalid values (like NULL or NA) or invalid names (such as missing or empty names), confirms that the count of valid names matches the count of provided values, and verifies that the valid names obtained from the named object align with the supplied names. If any check fails, the default value is returned.
Usage
check_named_vctr(x, names, default)
Arguments
x |
A named vector. |
names |
A character vector specifying the names to be matched. |
default |
Default value to return |
Value
Either the original object, x, or the default value.
Author(s)
Ama Nyame-Mensah
Examples
# returns NULL
check_named_vctr(x = c(one = 1, two = 2, 3),
names = c("one", "two", "three"),
default = NULL)
# returns x
check_named_vctr(x = list(one = 1, two = 2, three = 3),
names = list("one", "two", "three"),
default = NULL)
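The checks listed in the description can be sketched in base R as follows. `check_named_sketch()` and its `expected_names` argument are hypothetical stand-ins written for illustration; the package's own function may differ in its details, particularly in how it decides that names "align".

```r
# Hypothetical re-implementation of the checks described above
check_named_sketch <- function(x, expected_names, default) {
  # Values: no NULL or NA entries allowed
  has_bad_value <- any(vapply(x, function(v) is.null(v) || anyNA(v), logical(1)))
  if (has_bad_value) return(default)

  # Names: must exist and be neither missing nor empty
  nms <- base::names(x)
  if (is.null(nms)) return(default)
  valid_nms <- nms[!is.na(nms) & nzchar(nms)]

  # Every value needs a valid name
  if (length(valid_nms) != length(x)) return(default)

  # Valid names must all appear among the supplied names (assumed meaning of "align")
  if (!all(valid_nms %in% unlist(expected_names))) return(default)

  x
}

# One unnamed element, so the default is returned
check_named_sketch(c(one = 1, two = 2, 3), c("one", "two", "three"), default = NULL)

# All elements are named and match, so x is returned
check_named_sketch(list(one = 1, two = 2, three = 3),
                   list("one", "two", "three"),
                   default = NULL)
```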
Depressive Symptoms Data
Description
These data are a subset from the National Longitudinal Survey of Youth (NLSY) 1979 Children and Young Adults. The dataset includes information about depressive symptoms in children and young adults. The dataset has 11,551 observations and 12 variables.
For more information about the National Longitudinal Survey of Youth, visit https://www.nlsinfo.org/.
Usage
depressive
Format
A data.frame with 11,551 rows and 12 columns:
- cid
Child identification number
- race
race of child (1 = Hispanic, 2 = Black, 3 = Non-Black, Non-Hispanic)
- gender
gender of child (1 = male, 2 = female)
- yob
year of child's birth
- dep_1
how often child feels sad and blue (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_2
how often child feels nervous, tense, or on edge (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_3
how often child feels happy (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_4
how often child feels bored (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_5
how often child feels lonely (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_6
how often child feels tired or worn out (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_7
how often child feels excited about something (1 = often, 2 = sometimes, 3 = hardly ever)
- dep_8
how often child feels too busy to get everything (1 = often, 2 = sometimes, 3 = hardly ever)
Summarize continuous variables by group
Description
mean_group_tbl() presents descriptive statistics (mean, sd, minimum, maximum, number of non-missing observations) for interval (e.g., test scores) and ratio level (e.g., age) variables with the same variable stem by some grouping variable. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables 'belong_belongStem_w1' and 'belong_belongStem_w2' share the variable stem 'belong_belongStem' (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.
Usage
mean_group_tbl(
data,
var_stem,
group,
escape_stem = FALSE,
ignore_stem_case = FALSE,
group_type = "variable",
group_name = NULL,
escape_group = FALSE,
ignore_group_case = FALSE,
remove_group_non_alnum = TRUE,
na_removal = "listwise",
only = NULL,
var_labels = NULL,
ignore = NULL
)
Arguments
data |
A data frame. |
var_stem |
A character string of a variable stem or the full name of a variable in
|
group |
A character string of a variable in |
escape_stem |
A logical value indicating whether to escape |
ignore_stem_case |
A logical value indicating whether the search for columns matching
the supplied |
group_type |
A character string that defines the type of grouping variable. Should be
one of |
group_name |
A character string piped to the final table to replace the name of |
escape_group |
A logical value indicating whether to escape string supplied to |
ignore_group_case |
A logical value indicating whether |
remove_group_non_alnum |
A logical value indicating whether to remove all non-
alphanumeric characters (anything that is not a letter or number) from |
na_removal |
A character string specifying how to remove missing values. Should be
one of |
only |
A character string or vector of character strings of the types of
summary data to return. Default is |
var_labels |
An optional named character vector or list where each element maps
labels to variable names. If any element is unnamed or if any labels do not match
variables returned from |
ignore |
An optional named vector or list specifying values to exclude from the
dataset and analysis. By default, |
Value
A tibble presenting summary statistics (e.g., mean, standard deviation, minimum value, maximum, number of non-missing observations) for a set of variables sharing the same variable stem. The results are grouped by either a grouping variable in the data or by a pattern matched with variable names.
Author(s)
Ama Nyame-Mensah
Examples
mean_group_tbl(data = stem_social_psych,
var_stem = "belong_welcomedStem",
group = "_w\\d",
group_type = "pattern",
na_removal = "pairwise",
var_labels = c(belong_welcomedStem_w1 = "I feel welcomed in STEM workplaces",
belong_welcomedStem_w2 = "I feel welcomed in STEM workplaces"),
group_name = "wave")
mean_group_tbl(data = social_psy_data,
var_stem = "belong",
group = "gender",
group_type = "variable",
na_removal = "pairwise",
var_labels = c(belong_1 = "I feel like I belong at this institution",
belong_2 = "I feel like part of the community",
belong_3 = "I feel valued by this institution"),
group_name = "gender_identity")
grouped_data <-
data.frame(
symptoms.t1 = sample(c(1:5, -999), replace = TRUE, size = 50),
symptoms.t2 = sample(c(NA, 1:5, -999), replace = TRUE, size = 50)
)
mean_group_tbl(data = grouped_data,
var_stem = "symptoms",
group = ".t\\d",
group_type = "pattern",
escape_group = TRUE,
na_removal = "listwise",
ignore = c(symptoms = -999))
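Grouping by a pattern in the variable names, rather than by a column in the data, can be pictured with the base R sketch below. It is an illustration under assumptions, not the package's code: the toy data frame `df`, the variable `group_pattern`, and the choice to drop missing values per variable are all hypothetical.

```r
# Toy data (hypothetical): two waves of one item plus an unrelated column
df <- data.frame(
  symptoms.t1 = c(1, 2, 3, NA),
  symptoms.t2 = c(2, 4, NA, 6),
  other       = c(9, 9, 9, 9)
)

stem <- "symptoms"
group_pattern <- "\\.t[0-9]"  # a literal dot, then "t", then one digit

# Columns whose names start with the stem
cols <- grep(paste0("^", stem), names(df), value = TRUE)

# Pull out the part of each name that matches the grouping pattern
grp <- regmatches(cols, regexpr(group_pattern, cols))

# Strip non-alphanumeric characters from the group label (".t1" becomes "t1")
grp <- gsub("[^[:alnum:]]", "", grp)

# One row per variable, labelled with its group; NAs are dropped per variable
summary_tbl <- data.frame(
  variable = cols,
  group = grp,
  mean = unname(vapply(df[cols], mean, numeric(1), na.rm = TRUE)),
  nobs = unname(vapply(df[cols], function(v) sum(!is.na(v)), integer(1))),
  stringsAsFactors = FALSE
)

summary_tbl
```

The dot in the pattern is escaped by hand here so that it matches a literal "." rather than any character.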
Summarize continuous variables
Description
mean_tbl() presents descriptive statistics (mean, sd, minimum, maximum, number of non-missing observations) for interval (e.g., test scores) and ratio level (e.g., age) variables with the same variable stem. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables 'belong_belongStem_w1' and 'belong_belongStem_w2' share the variable stem 'belong_belongStem' (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.
Usage
mean_tbl(
data,
var_stem,
escape_stem = FALSE,
ignore_stem_case = FALSE,
na_removal = "listwise",
only = NULL,
var_labels = NULL,
ignore = NULL
)
Arguments
data |
A data frame. |
var_stem |
A character string of a variable stem or the full name of a variable in
|
escape_stem |
A logical value indicating whether to escape |
ignore_stem_case |
A logical value indicating whether the search for columns
matching the supplied |
na_removal |
A character string specifying how to remove missing values. Should be
one of |
only |
A character string or vector of character strings of the kinds of summary
statistics to return. Default is |
var_labels |
An optional named character vector or list where each element maps
labels to variable names. If any element is unnamed or if any labels do not match
variables returned from |
ignore |
An optional vector that contains values to exclude from the data. Default
is |
Value
A tibble presenting summary statistics for a series of continuous variables with the same variable stem.
Author(s)
Ama Nyame-Mensah
Examples
mean_tbl(data = social_psy_data,
var_stem = "belong")
mean_tbl(data = social_psy_data,
var_stem = "belong",
na_removal = "pairwise",
var_labels = c(belong_1 = "I feel like I belong at this institution",
belong_2 = "I feel like part of the community",
belong_3 = "I feel valued by this institution"))
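The difference between listwise and pairwise removal of missing values can be seen in a small base R sketch. The data frame `df` is invented for illustration, and the code shows the general technique rather than the package's internals.

```r
# Toy data (hypothetical): two stem variables with different missing rows
df <- data.frame(
  belong_1 = c(1, 2, NA, 4),
  belong_2 = c(5, NA, 3, 2),
  age      = c(20, 21, 22, 23)
)

# Select the variables that share the stem
cols <- grep("^belong", names(df), value = TRUE)

# Pairwise: each variable uses all of its own non-missing values
pairwise_means <- colMeans(df[cols], na.rm = TRUE)

# Listwise: keep only rows that are complete across all stem variables
complete_rows <- stats::complete.cases(df[cols])
listwise_means <- colMeans(df[cols][complete_rows, , drop = FALSE])

pairwise_means
listwise_means
```

Here pairwise removal uses three values per variable, while listwise removal keeps only the two rows with no missing values on either variable, so the two sets of means differ.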
National Longitudinal Survey of Youth (NLSY) Data
Description
These data are a subset from the National Longitudinal Survey of Youth (NLSY) 1979 Children and Young Adults. The dataset contains 2,976 observations and 11 variables.
For more information about the National Longitudinal Survey of Youth, visit https://www.nlsinfo.org/.
Usage
nlsy
Format
A tibble with 2,976 rows and 11 columns:
- CID
Child identification number
- race
race of child (Hispanic, Black, Non-Black,Non-Hispanic)
- gender
gender of child (1 = male, 0 = female)
- birthord
birth order of child
- magebirth
Age of mother at birth of child
- bthwht
whether child was born low birth weight (1 = yes, 0 = no)
- breastfed
whether child was breastfed (1 = yes, 0 = no)
- medu
Highest grade completed by child’s mother
- math
PIAT Math Standard Score
- read
PIAT Reading Recognition Standard Score
- hhnum
Number of household members in household
Summarize multiple response variables by group
Description
select_group_tbl() presents frequency counts and percentages (count, percent) for binary (e.g., Unselected/Selected) and ordinal (e.g., strongly disagree to strongly agree) variables with the same variable stem by some grouping variable. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables belong_belongStem_w1 and belong_belongStem_w2 share the variable stem belong_belongStem (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.
Usage
select_group_tbl(
data,
var_stem,
group,
escape_stem = FALSE,
ignore_stem_case = FALSE,
group_type = "variable",
group_name = NULL,
escape_group = FALSE,
ignore_group_case = FALSE,
remove_group_non_alnum = TRUE,
na_removal = "listwise",
pivot = "longer",
only = NULL,
var_labels = NULL,
ignore = NULL
)
Arguments
data |
A data frame. |
var_stem |
A character string of a variable stem or the full name of a
variable in |
group |
A character string of a variable in |
escape_stem |
A logical value indicating whether to escape |
ignore_stem_case |
A logical value indicating whether the search for columns
matching the supplied |
group_type |
A character string that defines the type of grouping variable.
Should be one of |
group_name |
A character string piped to the final table to replace the name
of |
escape_group |
A logical value indicating whether to escape string supplied
to |
ignore_group_case |
A logical value indicating whether |
remove_group_non_alnum |
A logical value indicating whether to remove all
non-alphanumeric characters (anything that is not a letter or number) from |
na_removal |
A character string specifying how to remove missing values.
Should be one of |
pivot |
A character string specifying the format of the returned summary table.
The default is |
only |
A character string or vector of character strings of the kinds of summary
data to return. Default is |
var_labels |
An optional named character vector or list where each element
maps labels to variable names. If any element is unnamed or if any labels do not
match variables returned from |
ignore |
An optional named vector or list specifying values to exclude from
the dataset and analysis. By default, |
Value
A tibble displaying frequency counts and/or percentages for each value of a set of variables sharing the same variable stem. The results are grouped by either a grouping variable in the data or by a pattern matched with variable names. When the output is in the wider format, columns prefixed with count_value and percent_value report the count and percentage, respectively, for each distinct response value of the variable within each group.
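To show how the long and wider layouts relate, the base R sketch below builds both from the same toy counts. The data frame `toy` is hypothetical, and the wide column names are only modelled on the count_value prefix mentioned above; the exact names the package produces are an assumption.

```r
# Toy data (hypothetical): one recoded item and a grouping variable
toy <- data.frame(
  grp  = c("f", "f", "m", "m", "m"),
  item = c("selected", "unselected", "selected", "selected", "unselected"),
  stringsAsFactors = FALSE
)

# Long format: one row per group-by-value combination
long_tbl <- as.data.frame(
  table(grp = toy$grp, values = toy$item),
  responseName = "count",
  stringsAsFactors = FALSE
)

# Wider format: one row per group, one count column per response value
wide_mat <- unclass(table(toy$grp, toy$item))
wide_tbl <- data.frame(
  grp = rownames(wide_mat),
  count_value_selected = unname(wide_mat[, "selected"]),
  count_value_unselected = unname(wide_mat[, "unselected"]),
  stringsAsFactors = FALSE
)

long_tbl
wide_tbl
```

Both tables carry the same counts; the wider layout trades rows for columns, which tends to be easier to read in a report.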
Author(s)
Ama Nyame-Mensah
Examples
select_group_tbl(data = stem_social_psych,
var_stem = "belong_belong",
group = "\\d",
group_type = "pattern",
group_name = "wave",
na_removal = "pairwise",
pivot = "wider",
only = "count")
tas_recoded <-
  tas |>
  dplyr::mutate(sex = dplyr::case_when(
    sex == 1 ~ "female",
    sex == 2 ~ "male",
    TRUE ~ NA)) |>
  dplyr::mutate(dplyr::across(
    .cols = dplyr::starts_with("involved_"),
    .fns = ~ dplyr::case_when(
      .x == 1 ~ "selected",
      .x == 0 ~ "unselected",
      TRUE ~ NA)
  ))
select_group_tbl(data = tas_recoded,
var_stem = "involved_",
group = "sex",
group_type = "variable",
na_removal = "pairwise",
pivot = "wider")
depressive_recoded <-
  depressive |>
  dplyr::mutate(sex = dplyr::case_when(
    sex == 1 ~ "male",
    sex == 2 ~ "female",
    TRUE ~ NA)) |>
  dplyr::mutate(dplyr::across(
    .cols = dplyr::starts_with("dep_"),
    .fns = ~ dplyr::case_when(
      .x == 1 ~ "often",
      .x == 2 ~ "sometimes",
      .x == 3 ~ "hardly",
      TRUE ~ NA
    )
  ))
select_group_tbl(data = depressive_recoded,
var_stem = "dep",
group = "sex",
group_type = "variable",
na_removal = "listwise",
pivot = "wider",
only = "percent",
var_labels =
c("dep_1" = "how often child feels sad and blue",
"dep_2" = "how often child feels nervous, tense, or on edge",
"dep_3" = "how often child feels happy",
"dep_4" = "how often child feels bored",
"dep_5" = "how often child feels lonely",
"dep_6" = "how often child feels tired or worn out",
"dep_7" = "how often child feels excited about something",
"dep_8" = "how often child feels too busy to get everything"))
Summarize multiple response variables
Description
select_tbl() presents frequency counts and percentages (count, percent) for binary (e.g., Unselected/Selected) and ordinal (e.g., strongly disagree to strongly agree) variables with the same variable stem. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables belong_belongStem_w1 and belong_belongStem_w2 share the variable stem belong_belongStem (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.
Usage
select_tbl(
data,
var_stem,
escape_stem = FALSE,
ignore_stem_case = FALSE,
na_removal = "listwise",
pivot = "longer",
only = NULL,
var_labels = NULL,
ignore = NULL
)
Arguments
data |
A data frame. |
var_stem |
A character string of a variable stem or the full name of a variable
in |
escape_stem |
A logical value indicating whether to escape |
ignore_stem_case |
A logical value indicating whether the search for columns
matching the supplied |
na_removal |
A character string specifying how to remove missing values. Should
be one of |
pivot |
A character string specifying the format of the returned summary table.
The default is |
only |
A character string or vector of character strings of the kinds of summary
data to return. Default is |
var_labels |
An optional named character vector or list where each element maps
labels to variable names. If any element is unnamed or if any labels do not match
variables returned from |
ignore |
An optional vector that contains values to exclude from the data. Default
is |
Value
A tibble displaying frequency counts and/or percentages for each value of a set of variables sharing the same variable stem. When the output is in the wider format, columns prefixed with count_value and percent_value report the count and percentage, respectively, for each distinct response value of the variable.
Author(s)
Ama Nyame-Mensah
Examples
select_tbl(data = tas,
var_stem = "involved_",
na_removal = "pairwise")
select_tbl(data = depressive,
var_stem = "dep",
na_removal = "listwise",
pivot = "wider",
only = "percent")
var_label_example <-
c("dep_1" = "how often child feels sad and blue",
"dep_2" = "how often child feels nervous, tense, or on edge",
"dep_3" = "how often child feels happy",
"dep_4" = "how often child feels bored",
"dep_5" = "how often child feels lonely",
"dep_6" = "how often child feels tired or worn out",
"dep_7" = "how often child feels excited about something",
"dep_8" = "how often child feels too busy to get everything")
select_tbl(data = depressive,
var_stem = "dep",
na_removal = "pairwise",
pivot = "longer",
var_labels = var_label_example)
select_tbl(data = depressive,
var_stem = "dep",
na_removal = "pairwise",
pivot = "wider",
only = "count",
var_labels = var_label_example)
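The idea of matching variables by stem and then tabulating each one can be sketched in base R. The toy data frame `df` is hypothetical, the percentages are on a 0 to 1 scale by assumption, and missing values are dropped per variable; none of this is taken from the package's source.

```r
# Toy data (hypothetical): two lower-case stem variables, one upper-case, one unrelated
df <- data.frame(
  dep_1 = c(1, 2, 2, 3),
  dep_2 = c(3, 3, 1, NA),
  DEP_3 = c(1, 1, 1, 1),
  yob   = c(1990, 1991, 1992, 1993)
)

stem <- "dep"

# Case-sensitive prefix match
cols_cs <- grep(paste0("^", stem), names(df), value = TRUE)

# Case-insensitive prefix match also picks up DEP_3
cols_ci <- grep(paste0("^", stem), names(df), value = TRUE, ignore.case = TRUE)

# Count each response value per matched variable (NAs dropped per variable)
count_list <- lapply(cols_cs, function(v) {
  tab <- table(df[[v]])
  data.frame(
    variable = v,
    values = names(tab),
    count = as.vector(tab),
    percent = as.vector(tab) / sum(tab),
    stringsAsFactors = FALSE
  )
})
long_tbl <- do.call(rbind, count_list)

long_tbl
```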
Social Psychological (Generated) Data
Description
These data were generated to resemble social psychological data from real-world contexts.
Usage
social_psy_data
Format
A data.frame with 10,200 rows and 17 columns:
- id
participant id number
- belong_1
I feel like I belong at this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- belong_2
I feel like part of the community (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- belong_3
I feel valued by this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- identity_1
This institution is a big part of who I am (1=Strongly Disagree,2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- identity_2
I feel comfortable being myself in this setting (1=Strongly Disagree,2=Disagree,3=Neither agree nor disagree,4=Agree, 5=Strongly Agree)
- identity_3
This institution is a big part of who I am (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- identity_4
I care about doing well at this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_1
I am confident about A (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_2
I am confident about B (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_3
I am confident about C (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_4
I am confident about D (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_5
I am confident about E (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_6
I am confident about F (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- selfEfficacy_7
I am confident about G (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
- gender
Participant's gender identity (1=Woman,2=Man,3=Non-binary, 4=Self-identify,5=Transgender,6=Gender-queer/non-conforming)
- citizen
Participant's citizenship status (1=U.S. citizen,2=Non-U.S. citizen with permanent residency,3=Non-U.S. citizen with temporary visa,4=Other)
STEM Social Psychological (Generated) Data
Description
These data were generated to resemble social psychological data from a subset of college students participating in a Science, Technology, Engineering, and Mathematics (STEM) intervention program.
Usage
stem_social_psych
Format
A data.frame with 786 rows and 37 columns:
- id
student id number
- belong_belongStem_w1
I feel like I belong in STEM (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- belong_outsiderStem_w1
I feel like an outsider in STEM (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- identity_identityStem_w1
STEM is a big part of who I am. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- belong_welcomedStem_w1
I feel welcomed in STEM workplaces (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- identity_noCommonStem_w1
I do not have much in common with the other students in my STEM classes. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_passStemCourses_w1
pass my STEM courses. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_learnConcepts_w1
learn the foundations and concepts of scientific thinking. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_stemField_w1
do well in a stem-related field. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- selfEfficacy_learnScience_w1
quickly learn new science areas, systems, techniques or concepts on my own. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_contributeProject_w1
contribute to a science project. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_commScience_w1
clearly communicate scientific problems and findings to varied audiences (1=Strongly disagree,2=Somewhat disagree, 3=Neither disagree nor agree, 4=Somewhat agree,5=Strongly agree)
- selfEfficacy_scientist_w1
become a scientist. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- selfEfficacy_completeUG_w1
complete an undergraduate STEM degree. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_admitGrad_w1
get admitted to a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_successGrad_w1
be successful in a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- belong_belongStem_w2
I feel like I belong in STEM (1=Strongly disagree, 2=Somewhat disagree, 3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- belong_outsiderStem_w2
I feel like an outsider in STEM. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- identity_identityStem_w2
STEM is a big part of who I am. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- belong_welcomedStem_w2
I feel welcomed in STEM workplaces. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- identity_noCommonStem_w2
I do not have much in common with the other students in my STEM classes. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_passStemCourses_w2
pass my STEM courses. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_learnConcepts_w2
learn the foundations and concepts of scientific thinking. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_stemField_w2
do well in a stem-related field. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- selfEfficacy_learnScience_w2
quickly learn new science areas, systems, techniques or concepts on my own. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_contributeProject_w2
contribute to a science project. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_commScience_w2
clearly communicate scientific problems and findings to varied audiences (1=Strongly disagree,2=Somewhat disagree, 3=Neither disagree nor agree, 4=Somewhat agree,5=Strongly agree)
- selfEfficacy_scientist_w2
become a scientist. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
- selfEfficacy_completeUG_w2
complete an undergraduate STEM degree. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_admitGrad_w2
get admitted to a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- selfEfficacy_successGrad_w2
be successful in a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
- is_male
Participant's current sex (0=Not Male,1=Male)
- has_disability
Whether participant has a disability (0=No, 1=Yes)
- firstGen
Whether participant is a first generation college student (0=No, 1=Yes)
- stemMajor
Whether participant is a STEM Major (0=No, 1=Yes)
- expLearning
Whether student has participated in an experiential learning program, such as an internship, research, or leadership opportunity. (0=No, 1=Yes)
- urm
Whether participant is Asian, Middle Eastern/Arab or White (0) vs. Black, Indigenous, Hispanic/Latino, or Mixed Race (1)
Panel Study of Income Dynamics (PSID) Transition into Adulthood Supplement (TAS) Data
Description
These data are a subset from the Panel Study of Income Dynamics (PSID) Transition into Adulthood Supplement. The dataset contains 2,526 observations and 8 variables.
For more information about the Panel Study of Income Dynamics, visit https://psidonline.isr.umich.edu/CDS/default.aspx.
Usage
tas
Format
A tibble with 2,526 rows and 8 columns:
- pid
personal identification number
- sex
sex of individual (1 = female, 2 = male)
- involved_arts
whether the individual participated in any organized activities related to art, music, or the theater in the last 12 months (1 = yes, 0 = no)
- involved_sports
whether the individual was a member of any athletic or sports teams in the last 12 months (1 = yes, 0 = no)
- involved_schoolClubs
whether the individual was involved with any high school or college clubs or student government in the last 12 months (1 = yes, 0 = no)
- involved_election
whether the individual voted in the national election in November 2016 that was held to elect the President (1 = yes, 0 = no)
- involved_socialActionGrps
whether the individual was involved in any political groups, solidarity or ethnic-support groups or social-action groups in the last 12 months (1 = yes, 0 = no)
- involved_volunteer
whether the individual was involved in any unpaid volunteer or community service work in the last 12 months (1 = yes, 0 = no)