Title: | US General Social Survey (GSS) Data for R |
---|---|
Description: | The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains the GSS Cumulative Data and GSS Panel Data files packaged for R. Its companion package, gssrdoc, provides the codebook integrated into R's help system. For more information on the GSS see \url{http://gss.norc.org}. |
Authors: | Kieran Healy [aut, cre] |
Maintainer: | Kieran Healy <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.6 |
Built: | 2024-11-11 23:16:11 UTC |
Source: | https://github.com/kjhealy/gssr |
A tibble containing Release 2a of the GSS Cumulative Data (1972-2022) file.
data(gss_all)
An object of class tibble with 72,390 rows and 6,694 columns. Variables are encoded as labelled vectors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Summary information is available in gss_doc, a tibble supplied with this package.
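Because the variables are labelled vectors, numeric codes usually need to be turned into readable categories before tabulating. The following is a minimal base R sketch on a toy vector, assuming the common layout in which value labels are stored in a named `labels` attribute (names are the labels, values are the codes). The vector `sex` and its codes are illustrative, not taken from the data; with the real tibble, a helper such as `haven::as_factor()` is the more usual route.

```r
# Toy stand-in for a labelled GSS variable: numeric codes plus a
# "labels" attribute mapping label text to codes (illustrative values).
sex <- structure(c(1, 2, 2, 1, NA), labels = c(male = 1, female = 2))

# Pull out the code-to-label mapping.
labs <- attr(sex, "labels")

# Convert the codes to a factor whose levels are the label text.
sex_f <- factor(as.vector(sex), levels = unname(labs), labels = names(labs))

print(table(sex_f, useNA = "ifany"))
```

Missing values stay missing after the conversion, so they can still be counted or dropped explicitly.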
National Opinion Research Center, http://gss.norc.org.
Use gss_get_yr() to get GSS data for a single year from NORC's GSS website (where it is available as a zipped Stata file) and put it directly into a tibble.
gss_get_yr( year = 2022, url = "https://gss.norc.org/documents/stata/", fname = "_stata", ext = "zip", dest = "data-raw/", save_file = c("n", "y") )
year |
The desired GSS survey year, as a number (i.e., not in quotes). Defaults to 2022. |
url |
Location of the file. Defaults to the current NORC URL for Stata files. |
fname |
Non-year filename component. Defaults to '_stata'. Usually should not be changed. |
ext |
File name extension. Defaults to 'zip'. Usually should not be changed. |
dest |
If |
save_file |
Save the data file as well as loading it as an object. Defaults to 'n'. |
A tibble with the requested year's GSS data.
gss80 <- gss_get_yr(1980)
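To see how the documented defaults could fit together, the sketch below assembles a download location from `url`, `year`, `fname`, and `ext`. The helper `build_gss_url()` is hypothetical and not part of gssr, and the assumed pattern (base URL, then year, then the non-year filename component, then the extension) is an inference from the argument descriptions rather than a confirmed implementation detail.

```r
# Hypothetical helper, NOT part of gssr: illustrates one plausible way the
# documented arguments combine into a file location.
build_gss_url <- function(year = 2022,
                          url = "https://gss.norc.org/documents/stata/",
                          fname = "_stata",
                          ext = "zip") {
  # The year is expected as a single number, as the documentation states.
  stopifnot(is.numeric(year), length(year) == 1)
  paste0(url, year, fname, ".", ext)
}

gss_url_1980 <- build_gss_url(1980)
print(gss_url_1980)
```

If NORC changes its file layout, passing a different `url` or `fname` is presumably how the real function would be pointed at the new location.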
A tibble containing the General Social Survey 2006 Three Wave Panel Data File, in long format.
data(gss_panel06_long)
A tibble with 6,000 rows and 1,572 columns. Variables are encoded as numerics or factors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Summary information is available in gss_panel_doc, a tibble supplied with this package. Respondent ids are contained in the variable firstid (from the GSS id_1 variable). Survey waves (years 2006, 2008, 2010) are indicated by the wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
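The long format means one row per respondent per wave, keyed by firstid and wave. A minimal base R sketch on toy data shows how the wave codes might be mapped to survey years and how respondents observed in all three waves could be identified. The data frame `panel` is a small stand-in rather than the real tibble, and the sketch assumes wave is stored as the plain numbers 1, 2, and 3.

```r
# Toy stand-in for gss_panel06_long; the real tibble has many more columns.
panel <- data.frame(
  firstid = c(1, 1, 1, 2, 2, 3),
  wave    = c(1, 2, 3, 1, 2, 1)
)

# Waves 1, 2, 3 correspond to survey years 2006, 2008, 2010.
wave_years <- c(2006, 2008, 2010)
panel$year <- wave_years[panel$wave]

# Count how many waves each respondent appears in.
waves_per_id <- table(panel$firstid)

# Respondents present in all three waves.
complete_ids <- as.numeric(names(waves_per_id)[as.vector(waves_per_id) == 3])
print(complete_ids)
```

The same pattern would apply to the 2008 and 2010 panels with the year vector changed to match their waves.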
National Opinion Research Center, http://gss.norc.org.
A tibble containing the General Social Survey 2008 Three Wave Panel Data File, in long format.
data(gss_panel08_long)
A tibble with 6,069 rows and 1,243 columns. Variables are encoded as numerics or factors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Summary information is available in gss_panel_doc, a tibble supplied with this package. Respondent ids are contained in the variable firstid (from the GSS id_1 variable). Survey waves (years 2008, 2010, 2012) are indicated by the wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
National Opinion Research Center, http://gss.norc.org.
A tibble containing the General Social Survey 2010 Three Wave Panel Data File, in long format.
data(gss_panel10_long)
A tibble with 6,132 rows and 1,191 columns. Variables are encoded as numerics or factors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Summary information is available in gss_panel_doc, a tibble supplied with this package. Respondent ids are contained in the variable firstid (from the GSS id_1 variable). Survey waves (years 2010, 2012, 2014) are indicated by the wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
National Opinion Research Center, http://gss.norc.org.
A tibble containing the General Social Survey 2020 Panel Data File, in wide format.
data(gss_panel20)
A tibble with 5,215 rows and 4,296 columns. Variables are encoded as labelled vectors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Due to the COVID-19 pandemic, in 2020 the GSS was conducted as two studies: (1) a panel re-interview of past respondents from the 2016 and 2018 cross-sectional GSS studies (referred to as the 2016-2020 GSS Panel), and (2) an independent fresh cross-sectional address-based sampling push-to-web study (referred to as the 2020 cross-sectional survey). This data object is for the first study; namely, the study empaneling former 2016 and 2018 GSS respondents to answer a GSS questionnaire in 2020 (i.e., the 2016-2020 GSS panel).
This data focuses on Wave 2 of the 2016-2020 GSS Panel – i.e. the panel reinterviews with 2018 GSS respondents and a randomly selected subset of 2016 GSS respondents. The GSS has used a panel format previously, as parts of the 2006-2014 GSS. In the 2016-2020 GSS Panel, variables only contain data from one of the three years. To differentiate between versions of each variable, they have been appended with suffixes. Variables from 2016 (Wave 1a) have _1a appended, variables from 2018 (Wave 1b) have _1b appended, and variables from 2020 (Wave 2) have _2 appended. Users can also track cases from 2016 and 2018, and reinterviews from 2020 with the variable `samptype`. Because of its relatively complex nature, users are strongly encouraged to consult the official [GSS documentation for this dataset](https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf).
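The suffix convention (_1a for 2016, _1b for 2018, _2 for 2020) makes it possible to reshape the wide file into one row per respondent, variable, and wave. Below is a minimal base R sketch on toy data. The data frame `wide`, the identifier column `id`, and the `age_*` columns are illustrative stand-ins, not the real column names, so the official codebook should be checked before applying this to the actual tibble.

```r
# Toy wide-format data using the documented suffixes (illustrative columns).
wide <- data.frame(
  id     = 1:3,
  age_1a = c(30, NA, 45),   # 2016 (Wave 1a)
  age_1b = c(NA, 52, NA),   # 2018 (Wave 1b)
  age_2  = c(34, 54, 49)    # 2020 (Wave 2)
)

# Find the wave-specific columns and split names into stem and suffix.
wave_cols <- grep("_(1a|1b|2)$", names(wide), value = TRUE)
stems     <- sub("_(1a|1b|2)$", "", wave_cols)
suffix    <- sub("^.*_(1a|1b|2)$", "\\1", wave_cols)

# Documented mapping from suffix to survey year.
wave_year <- c("1a" = 2016, "1b" = 2018, "2" = 2020)

# Stack one block of rows per wave-specific column.
long <- do.call(rbind, lapply(seq_along(wave_cols), function(i) {
  data.frame(
    id       = wide$id,
    variable = stems[i],
    wave     = suffix[i],
    year     = unname(wave_year[suffix[i]]),
    value    = wide[[wave_cols[i]]]
  )
}))

# Drop cells with no data for that wave.
long <- long[!is.na(long$value), ]
print(long)
```

With thousands of real columns, a tidyverse reshaping function would likely be more convenient, but the name-splitting logic would be the same.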
National Opinion Research Center, http://gss.norc.org.
A tibble containing just a few variables from the GSS Cumulative Data File. See http://gss.norc.org/Get-Documentation for full documentation of the variables.
data(gss_sub)
A tibble with 72,390 rows and 19 columns.
year
Year of the survey.
id
Respondent id.
ballot
Survey ballot
age
Age of respondent
race
Race of respondent
sex
Sex of respondent
degree
Highest level of education obtained
padeg
Father's education
madeg
Mother's education
relig
Religion (simple coding)
polviews
Political views
fefam
Response to a statement that it is better for a man to go out to work, and for a woman to tend the home
vpsu
Variance primary sampling unit
vstrat
Variance stratum
oversamp
Weights for black oversamples
formwt
Survey weight for experimental randomization
wtssall
Survey weight (1972-2018)
wtssps
Poststratification survey weight (1972-2022)
sampcode
Sampling error code
sample
Sampling frame and method
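The weight columns listed above matter for any population-level estimate. As a minimal sketch, the base R code below computes a weighted proportion by year on toy data, using wtssps as the weight. The data frame `toy` and the 0/1 indicator `fefam_agree` are hypothetical (the real fefam is a multi-category item), and a point estimate like this ignores vpsu and vstrat, which a design-based analysis would need for standard errors.

```r
# Toy data: a hypothetical 0/1 indicator plus the wtssps weight.
toy <- data.frame(
  year        = c(2018, 2018, 2018, 2022, 2022, 2022),
  fefam_agree = c(1, 0, 0, 0, 1, 0),
  wtssps      = c(0.5, 1.5, 1.0, 2.0, 1.0, 1.0)
)

# Weighted proportion agreeing, computed separately for each year.
by_year <- sapply(
  split(toy, toy$year),
  function(d) weighted.mean(d$fefam_agree, d$wtssps)
)
print(by_year)
```

Which weight is appropriate depends on the years covered, as the descriptions of wtssall and wtssps indicate.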
National Opinion Research Center, http://gss.norc.org.
See which years a particular question was asked in the GSS.
gss_which_years(data, variable)
data |
A tibble of data, usually gss_all |
variable |
The variable or variables we want to check. Provide variables in tidyselect style, i.e. unquoted, and for multiple variables enclose unquoted in c() |
What years was a particular question asked in the GSS?
A tibble showing whether the question or questions were asked in each of the GSS years
## Not run:
data(gss_all)
gss_all %>% gss_which_years(fefam)
gss_all %>% gss_which_years(c(industry, indus80, wrkgovt, commute))
## End(Not run)
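The idea behind this check can be illustrated without the package: a question counts as asked in a year if at least one respondent in that year has a non-missing value. The base R sketch below applies that rule to toy data. It is an illustration of the concept under that assumption, not gssr's implementation, and the data frame `toy` is made up.

```r
# Toy data: fefam is entirely missing in 1972, so it was "not asked" then.
toy <- data.frame(
  year  = c(1972, 1972, 1977, 1977, 1985, 1985),
  fefam = c(NA, NA, 1, 3, 2, NA)
)

# For each year, was there at least one non-missing answer?
asked <- tapply(toy$fefam, toy$year, function(x) any(!is.na(x)))

result <- data.frame(
  year  = as.numeric(names(asked)),
  fefam = as.vector(asked)
)
print(result)
```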