| Title: | US General Social Survey (GSS) Data for R |
|---|---|
| Description: | The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains the GSS Cumulative Data and GSS Panel Data files packaged for R. Its companion package, gssrdoc, provides the codebook integrated into R's help system. For more information on the GSS see http://gss.norc.org. |
| Authors: | Kieran Healy [aut, cre] (ORCID: <https://orcid.org/0000-0001-9114-981X>) |
| Maintainer: | Kieran Healy <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.9 |
| Built: | 2026-05-11 09:23:55 UTC |
| Source: | https://github.com/kjhealy/gssr |
A tibble containing the General Social Survey Cumulative Data file.
data(gss_all)
An object of class tibble with 75,699 rows and 6,942 columns. Variables are encoded as labelled vectors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Summary information is available in gss_doc, a tibble supplied with this package.
This is Release 3 of the 1972-2024 GSS cumulative data file. See the release notes and documentation from NORC for further details.
National Opinion Research Center, http://gss.norc.org.
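Because the variables are stored as labelled vectors, it can help to see what that encoding looks like. The following is a minimal base R sketch on toy data, assuming a labelled column is a vector of numeric codes carrying a `labels` attribute that maps label text to codes. The helper `labels_to_factor()` and the values are hypothetical and are not part of this package; in practice `haven::as_factor()` is the usual tool for this conversion.

```r
# Toy stand-in for a labelled GSS column: numeric codes plus a "labels"
# attribute mapping label text to codes (hypothetical values).
sex_codes <- structure(c(1, 2, 2, 1, NA), labels = c(male = 1, female = 2))

# Hypothetical helper: turn coded values into a factor using the labels.
# Codes without a matching label become NA.
labels_to_factor <- function(x) {
  labs <- attr(x, "labels")
  if (is.null(labs)) stop("x has no 'labels' attribute")
  factor(as.vector(x), levels = unname(labs), labels = names(labs))
}

sex_factor <- labels_to_factor(sex_codes)
table(sex_factor, useNA = "ifany")
```

Keeping the numeric codes until the point of analysis preserves the link back to the codebook, which is one reason to convert to factors only when needed.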
Use gss_get_yr() to get GSS data for a single year from
NORC's GSS website (where it is available as a zipped Stata file)
and put it directly into a tibble.
gss_get_yr( year = 2024, url = "https://gss.norc.org/documents/stata/", fname = "_stata", ext = "zip", dest = "data-raw/", save_file = c("n", "y") )
year |
The desired GSS survey year, as a number (i.e., not in quotes). Defaults to 2024. |
url |
Location of the file. Defaults to the current NORC URL for Stata files. |
fname |
Non-year filename component. Defaults to '_stata'. Usually should not be changed. |
ext |
File name extension. Defaults to 'zip'. Usually should not be changed. |
dest |
If |
save_file |
Save the data file as well as loading it as an object. Defaults to 'n'. |
A tibble with the requested year's GSS data.
gss80 <- gss_get_yr(1980)
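To make the roles of the `url`, `fname`, and `ext` arguments concrete, here is a small sketch of how they might combine into a download location. This assumes the file address is simply `url`, then the year, then `fname`, then a dot and `ext`; that is a guess about the internals and is not confirmed by this documentation. The helper `build_gss_url()` is hypothetical.

```r
# Hypothetical helper mirroring the documented arguments of gss_get_yr().
# ASSUMPTION: the file location is url + year + fname + "." + ext.
build_gss_url <- function(year = 2024,
                          url = "https://gss.norc.org/documents/stata/",
                          fname = "_stata",
                          ext = "zip") {
  stopifnot(is.numeric(year), length(year) == 1)
  paste0(url, year, fname, ".", ext)
}

build_gss_url(1980)
```

If that assumption holds, it would explain why `fname` and `ext` usually should not be changed: they only need adjusting if NORC renames its files.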
A tibble containing the General Social Survey 2006 Three Wave Panel Data File, in long format.
data(gss_panel06_long)
A tibble with 6,000 rows and 1,572 columns. Variables are encoded as numerics or factors. The GSS
Codebook is the authoritative source for the variables in this
dataset. It is available at
http://gss.norc.org/Get-Documentation. Summary
information is available in gss_panel_doc, a tibble
supplied with this package. Respondent ids are contained in the
variable firstid (from the GSS id_1 variable).
Survey waves (years 2006, 2008, 2010) are contained in the
wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
National Opinion Research Center, http://gss.norc.org.
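In long format, `firstid` and `wave` together identify a row. The sketch below uses toy data (hypothetical values, with `happy` as a stand-in measure) to find respondents who gave a non-missing answer in all three waves. It assumes one row per respondent per wave, which should be checked against the real data.

```r
# Toy long-format panel: one row per respondent per wave (hypothetical values).
panel_long <- data.frame(
  firstid = rep(1:3, each = 3),
  wave    = rep(1:3, times = 3),
  happy   = c(1, 2, 1, 3, NA, 2, 2, 2, 1)
)

# Count non-missing answers per respondent across waves.
answered <- tapply(!is.na(panel_long$happy), panel_long$firstid, sum)

# Keep respondents who answered in all three waves.
complete_ids <- as.integer(names(answered)[answered == 3])
balanced <- panel_long[panel_long$firstid %in% complete_ids, ]
balanced
```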
A tibble containing the General Social Survey 2008 Three Wave Panel Data File, in long format.
data(gss_panel08_long)
A tibble with 6,069 rows and 1,243 columns. Variables are encoded as numerics or factors. The GSS
Codebook is the authoritative source for the variables in this
dataset. It is available at
http://gss.norc.org/Get-Documentation. Summary
information is available in gss_panel_doc, a tibble
supplied with this package. Respondent ids are contained in the
variable firstid (from the GSS id_1 variable).
Survey waves (years 2008, 2010, 2012) are indicated by the
wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
National Opinion Research Center, http://gss.norc.org.
A tibble containing the General Social Survey 2010 Three Wave Panel Data File, in long format.
data(gss_panel10_long)
A tibble with 6,132 rows and 1,191 columns. Variables are encoded as numerics or factors.
The GSS Codebook is the authoritative source for the
variables in this dataset. It is available at
http://gss.norc.org/Get-Documentation. Summary
information is available in gss_panel_doc, a tibble supplied
with this package. Respondent ids are contained in the variable
firstid (from the GSS id_1 variable). Survey
waves (years 2010, 2012, 2014) are indicated by the wave variable as 1, 2, and 3. See also the gss_panel_doc object in this package.
National Opinion Research Center, http://gss.norc.org.
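Across the three long-format panels, waves 1, 2, and 3 correspond to survey years two years apart, starting from the panel's first year (2006, 2008, or 2010). A minimal sketch of that mapping follows; the helper `wave_to_year()` is hypothetical and not part of this package.

```r
# Hypothetical helper: map a panel's wave number (1, 2, 3) to its survey year.
# first_year is the panel's starting year: 2006, 2008, or 2010.
wave_to_year <- function(wave, first_year) {
  stopifnot(all(wave %in% 1:3), first_year %in% c(2006, 2008, 2010))
  first_year + 2 * (wave - 1)
}

# Survey years for the three waves of the 2010 panel.
wave_to_year(1:3, first_year = 2010)
```

Adding a survey-year column this way can make it easier to compare panel results with the cumulative file, which is indexed by year.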
A tibble containing the General Social Survey 2020 Panel Data File, in wide format.
data(gss_panel20)
A tibble with 5,215 rows and 4,296 columns. Variables are encoded as labelled vectors. The GSS Codebook is the authoritative source for the variables in this dataset. It is available at http://gss.norc.org/Get-Documentation. Due to the COVID-19 pandemic, the 2020 GSS was conducted as two studies: (1) a panel re-interview of past respondents from the 2016 and 2018 cross-sectional GSS studies (referred to as the 2016-2020 GSS Panel), and (2) an independent fresh cross-sectional address-based sampling push-to-web study (referred to as the 2020 cross-sectional survey). This data object is for the first study; namely, the study empaneling former 2016 and 2018 GSS respondents to answer a GSS questionnaire in 2020 (i.e., the 2016-2020 GSS panel).
This data focuses on Wave 2 of the 2016-2020 GSS Panel – i.e. the panel reinterviews with 2018 GSS respondents and a randomly selected subset of 2016 GSS respondents. The GSS has used a panel format previously, as parts of the 2006-2014 GSS. In the 2016-2020 GSS Panel, variables only contain data from one of the three years. To differentiate between versions of each variable, they have been appended with suffixes. Variables from 2016 (Wave 1a) have _1a appended, variables from 2018 (Wave 1b) have _1b appended, and variables from 2020 (Wave 2) have _2 appended. Users can also track cases from 2016 and 2018, and reinterviews from 2020 with the variable `samptype`. Because of its relatively complex nature, users are strongly encouraged to consult the official [GSS documentation for this dataset](https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf).
National Opinion Research Center, http://gss.norc.org.
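The suffix scheme described above (`_1a`, `_1b`, `_2`) can be parsed from the column names. The following base R sketch works on made-up column names; the helper `split_wave_suffix()` is hypothetical, and it assumes that no base variable name itself ends in one of these suffixes, which should be checked against the codebook.

```r
# Hypothetical column names following the suffix scheme described above.
cols <- c("samptype", "polviews_1a", "polviews_1b", "polviews_2", "age_2")

# Hypothetical helper: split names into base variable and wave suffix.
# Columns without a recognised suffix keep their name and get wave = NA.
split_wave_suffix <- function(nms) {
  pattern <- "^(.*)_(1a|1b|2)$"
  has_suffix <- grepl(pattern, nms)
  data.frame(
    column   = nms,
    variable = ifelse(has_suffix, sub(pattern, "\\1", nms), nms),
    wave     = ifelse(has_suffix, sub(pattern, "\\2", nms), NA_character_),
    stringsAsFactors = FALSE
  )
}

lookup <- split_wave_suffix(cols)

# Names of all columns measured in 2020 (Wave 2).
wave2_cols <- lookup$column[!is.na(lookup$wave) & lookup$wave == "2"]
wave2_cols
```

A lookup table like this is a convenient starting point for selecting one wave's columns or for reshaping the wide data to long format.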
A tibble containing just a few variables from the GSS Cumulative Data File. See http://gss.norc.org/Get-Documentation for full documentation of the variables.
data(gss_sub)
A tibble with 72,390 rows and 19 columns.
year: Year of the survey
id: Respondent id
ballot: Survey ballot
age: Age of respondent
race: Race of respondent
sex: Sex of respondent
degree: Highest level of education obtained
padeg: Father's education
madeg: Mother's education
relig: Religion (simple coding)
polviews: Political views
fefam: Response to a statement that it is better for a man to go out to work, and for a woman to tend the home
vpsu: Variance primary sampling unit
vstrat: Variance stratum
oversamp: Weights for black oversamples
formwt: Survey weight for experimental randomization
wtssall: Survey weight (1972-2018)
wtssps: Poststratification survey weight (1972-2022)
sampcode: Sampling error code
sample: Sampling frame and method
National Opinion Research Center, http://gss.norc.org.
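The weight columns exist so that estimates can be adjusted for the survey design. As a rough illustration, the sketch below computes a weighted share per year on toy data using `wtssall`. The values are made up, and the `fefam` coding (1-2 agree, 3-4 disagree) is an assumption to verify in the codebook. This gives point estimates only; design-based standard errors would also need `vpsu` and `vstrat`, typically through a dedicated package such as survey.

```r
# Toy data standing in for a few gss_sub columns (hypothetical values).
# ASSUMPTION: fefam codes 1-2 mean agree, 3-4 mean disagree.
toy <- data.frame(
  year    = c(2016, 2016, 2016, 2018, 2018, 2018),
  fefam   = c(1, 3, 4, 2, 3, NA),
  wtssall = c(0.5, 1.0, 1.5, 2.0, 1.0, 1.0)
)

# Drop rows with a missing answer, then compute the weighted share
# agreeing within each survey year.
answered <- toy[!is.na(toy$fefam), ]
weighted_share <- sapply(split(answered, answered$year), function(d) {
  weighted.mean(d$fefam %in% c(1, 2), d$wtssall)
})
weighted_share
```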
See which years a particular question was asked in the GSS.
gss_which_years(data, variable, year = year)
data |
A tibble of data, usually |
variable |
The variable or variables we want to check. Provide variables in tidyselect style, i.e. unquoted, and for multiple variables enclose unquoted in c() |
year |
The grouping variable; defaults to and should always be |
What years was a particular question asked in the GSS?
A tibble showing whether the question or questions were asked in each of the GSS years
## Not run:
data(gss_all)
gss_all %>% gss_which_years(fefam)
gss_all %>% gss_which_years(c(industry, indus80, wrkgovt, commute))
## End(Not run)
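For intuition about what such a check involves, here is a base R sketch on toy data. It assumes that a question counts as "asked" in a year when at least one respondent has a non-missing value for it in that year; this is an illustrative definition and not necessarily how `gss_which_years()` is implemented. The helper `asked_in_year()` is hypothetical and takes column names as strings instead of the tidyselect style used by the package.

```r
# Toy data: fefam has answers in 1977 and 1985 but none in 1978
# (hypothetical values).
toy <- data.frame(
  year  = c(1977, 1977, 1978, 1978, 1985, 1985),
  fefam = c(1, 3, NA, NA, 2, NA)
)

# Hypothetical helper: TRUE for a year if any respondent has a
# non-missing value for the variable in that year.
asked_in_year <- function(data, variable, year = "year") {
  asked <- tapply(!is.na(data[[variable]]), data[[year]], any)
  data.frame(year = as.numeric(names(asked)), asked = as.vector(asked))
}

result <- asked_in_year(toy, "fefam")
result
```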