| Title: | Utilities and Data Sets for Data Visualization |
|---|---|
| Description: | Supporting materials for a course and book on data visualization. It contains utility functions for graphs and several sample data sets. See Healy (2019) <ISBN 978-0691181622>. |
| Authors: | Kieran Healy [aut, cre] |
| Maintainer: | Kieran Healy |
| License: | MIT + file LICENSE |
| Version: | 2.0.0 |
| Built: | 2026-05-22 05:44:01 UTC |
| Source: | https://github.com/kjhealy/socviz |
Convenience 'not-in' operator
x %nin% y
x |
vector of items |
y |
vector of all values |
Complement of the built-in operator %in%. Returns a logical vector that is TRUE for each element of x that is not in y.
Logical vector indicating which items in x are not in y.
Kieran Healy
fruit <- c("apples", "oranges", "banana")
"apples" %nin% fruit
"pears" %nin% fruit
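As a rough illustration of what a "not-in" operator does, the sketch below defines a stand-in using only base R. The name `%not_in%` is hypothetical and chosen to avoid clashing with the package's operator; this is an assumption about the general technique, not the package's source code.

```r
# Hypothetical stand-in for a 'not-in' operator: negate the result of %in%.
`%not_in%` <- function(x, y) {
  !(x %in% y)
}

fruit <- c("apples", "oranges", "banana")
in_basket <- "apples" %not_in% fruit     # "apples" is in fruit
not_in_basket <- "pears" %not_in% fruit  # "pears" is not in fruit

# The logical result can be used to subset x to the items missing from y.
wanted <- c("apples", "pears", "kiwi")
missing_items <- wanted[wanted %not_in% fruit]
```

Because the result is a logical vector, it works directly inside subsetting brackets or a filtering step.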
A dataset of US poverty rates by selected age groups within counties.
acs_poverty
A tibble with 9,666 rows and 4 columns.
fips. County FIPS code.
age_group. Adults 18-64, Children <18, Seniors 65+.
age_rate. Poverty rate (percent) for 'age_group' in county.
total_rate. Poverty rate (percent) for all ages in county.
American Community Survey 2023 ACS 5-year estimates, Table B17018.
A dataset of US poverty rates by selected age groups within counties.
acs_poverty_lon
A tibble with 12,888 rows and 3 columns.
geoid. County FIPS code.
age_group. Adults 18-64, Children <18, Seniors 65+, All Ages.
prop_poor. Proportion (0-1) of 'age_group' in poverty within county.
American Community Survey 2023 ACS 5-year estimates, Table B17018.
Membership and some financial information for sections of the American Sociological Association in 2014/15.
asasec
A data frame with 52 rows and 9 columns:
Section name.
Short name.
Cash on hand at beginning of year (2015).
Membership revenues.
Section expenses.
Cash on hand at end of year (2015).
Does the Section run a journal?
Membership year.
Number of members (2014).
Data from the American Sociological Association.
Kieran Healy
ASA Annual Report 2016.
Scale and/or center the numeric columns of a data frame or tibble
center_df(data, sc = FALSE, cen = TRUE)
data |
A data frame or tibble |
sc |
Scale the variables (default FALSE) |
cen |
Center the variables on their means (default TRUE) |
Takes a data frame or tibble as input and scales and/or centers the numeric columns. By default, it centers the columns but does not scale them.
An object of the same class as 'data', with the numeric columns scaled or centered as requested
Kieran Healy
head(center_df(organdata))
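To make the idea of centering only the numeric columns concrete, here is a minimal base R sketch. The helper name `center_numeric` is hypothetical, and the use of `scale()` is an assumption about one reasonable way to do this, not a description of the package's internals.

```r
# Hypothetical helper: center and/or scale numeric columns, leave others alone.
center_numeric <- function(data, sc = FALSE, cen = TRUE) {
  is_num <- vapply(data, is.numeric, logical(1))
  # scale() returns a one-column matrix; as.numeric() converts it back
  # to a plain vector so the column type stays simple.
  data[is_num] <- lapply(data[is_num], function(col) {
    as.numeric(scale(col, center = cen, scale = sc))
  })
  data
}

centered <- center_numeric(iris)
col_means <- colMeans(centered[1:4])  # each mean is approximately zero
```

Non-numeric columns such as `Species` pass through unchanged, and the object keeps the class of the input.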
Plot a table of color hex values as a table of colors
color_comp(df)
df |
data frame of color hex values |
Given a data frame of color values, plot them as swatches
Plot of table of colors
Kieran Healy
color_table
color_comp(color_table)
Draw a palette of colors
color_pal(col, border = "gray70", ...)
col |
vector of colors |
border |
color of the swatch borders (default "gray70") |
... |
other arguments |
Borrowed from the colorspace library
Plot of a color palette
colorspace library authors
color_pal(c("#66C2A5", "#FC8D62", "#8DA0CB"))
Hex values for five default ggplot colors, with corresponding approximations for three kinds of color blindness. Produced by the 'dichromat' package.
color_table
A tibble with five rows and four columns.
Kieran Healy
US County map data layer with selected Census Bureau demographic variables.
counties_sf
A simple features object with 3,144 rows and 16 columns:
FIPS code.
County name.
Area in square miles.
N White population.
N Black population.
N Asian population.
N Non-Hispanic White population.
N Hispanic population.
Total population.
Percent Black, discretized.
Percent Hispanic, discretized.
Percent Non-Hispanic White, discretized.
Percent Asian, discretized.
Population density per square mile.
Population density per square mile, discretized.
Firearm-related suicides per 100,000 population, 1999-2015. Factor variable cut into six categories. Note that the values in this variable contain an inaccurate bottom-quartile coding by construction. Do not present this variable as an accurate measure of the firearm-related suicide rate.
Population density per square mile, discretized into six categories, 2014 estimates.
Geometry.
A simple features object. Load the 'sf' package before using. Alaska and Hawaii have had their geometries scaled and shifted to the bottom left of the map area. Alaska's Aleutian islands are not included. Except where noted, population counts and other demographic information are from the 2024 5-year ACS estimates.
Kieran Healy
US Census Bureau.
A dataset of components of population change (rates only) for US Counties in 2023.
county_comp
A tibble with 3,144 rows and 8 columns.
fips. County FIPS code.
county. County name.
state. State abbreviation.
rbirth. Birth rate.
rdeath. Death rate.
rnatchg. Natural change rate.
rintl. International migration rate.
rdom. Domestic migration rate.
rnetmig. Net migration rate.
US Census Bureau Components of Population Change 2023 estimates.
Selected county data (including US and state-level observations on some variables). Preserved for use with the first edition of the book only.
county_data
A data frame with 3195 rows and 13 columns.
The variables are as follows:
id. FIPS State and County code (character)
name. State or County Name
state. State abbreviation
census_region. Census region
pop_dens. Population density per square mile, 2014 estimate (seven categories).
pct_black. Percent black population, 2014 estimate (seven category factor)
pop_dens6. Population density per square mile, 2014 estimate (six categories)
su_gun6. Firearm-related suicides per 100,000 population, 1999-2015. Factor variable cut into six categories. Note that the values in this variable contain an inaccurate bottom-quartile coding by construction. Do not present this variable as an accurate measure of the firearm-related suicide rate.
US Census Bureau, Centers for Disease Control
US county map data
county_map
A data frame with 191,372 rows and 7 columns.
long. Longitude
lat. Latitude
order. Order
hole. Hole (true/false)
piece. Piece
group. Group
id. FIPS code
Eric Celeste
Counts of educational attainment (in thousands) from 1940 to 2016
edu
A tibble with 366 rows and 11 columns.
The variables are as follows:
age Character. Cut into 25-34, 35-54, 55>
sex Character. Male, Female.
year Integer.
total Integer. Total in thousands.
elem4 Double. 0 to 4 years of Elementary School completed.
elem8 Double. 5 to 8 years of Elementary School completed.
hs3 Double. 1 to 3 years of High School completed.
hs4 Double. 4 years of High School completed.
coll3 Double. 1 to 3 years of College completed.
coll4 Double. 4 or more years of College completed.
median Double. Median years of education.
US Census Bureau
State-level vote totals and shares for the 2016 US Presidential election. The variables are as follows:
state. State name.
st. State abbreviation.
fips. State FIPS code
total_vote. Total votes cast.
vote_margin. Winner's vote margin
winner. Winning candidate.
party. Winning party.
pct_margin. Winner's percentage margin (proportion of total vote)
r_points. Percentage point difference between Trump share and Clinton share.
d_points. Percentage point difference between Clinton share and Trump share.
pct_clinton. Clinton vote share (proportion)
pct_trump. Trump vote share (proportion)
pct_johnson. Johnson vote share (proportion)
pct_other. Other vote share (proportion)
clinton_vote. Clinton vote total
trump_vote. Trump vote total
johnson_vote. Johnson vote total
other_vote. Other vote total
ev_dem. Electoral votes for Clinton
ev_rep. Electoral votes for Trump
ev_oth. Electoral votes for Other
census. Census region.
election
A (tibble) data frame with 51 rows and 22 columns.
Vote data from Dave Leip, US Election Atlas, http://uselectionatlas.org.
State-level vote totals and shares for the 2024 US Presidential election.
election24
A data frame with 51 rows and 20 columns:
State name.
State abbreviation.
State FIPS code (character).
Total votes cast.
Vote margin (Trump positive values; Harris negative.)
Winning candidate.
Winning party.
Winner's percentage margin (proportion of total vote)
Percentage point difference between Trump vote percent and Harris vote percent
Percentage point difference between Harris vote percent and Trump vote percent
Harris vote share (proportion)
Trump vote share (proportion)
Other vote share (proportion)
Harris vote total
Trump vote total
Other vote total
Electoral votes for Harris
Electoral votes for Trump
Electoral votes for Others
Census region
Kieran Healy
Vote data from Wikipedia, https://en.wikipedia.org/wiki/2024_United_States_presidential_election
A tibble with US presidential election data
election24_county_df
A tibble with 3,153 rows and 7 columns:
County FIPS code.
State name abbreviation
Votes for Harris/Walz ticket.
Votes for Trump/Vance ticket.
Total votes cast.
Winning party.
Did the party winner change from the winner in 2020? (Yes/No)
A tibble.
Kieran Healy
Election data derived from https://doi.org/10.7910/DVN/VOQCHQ
A dataset of US presidential elections from 1824 to 2024, with information on the winner, runner up, and various measures of vote share. The variables are as follows:
elections_historic
A (tibble) data frame with 51 rows and 19 columns.
election. Number of the election counting from the first US presidential election. 1824 is the 10th election.
year. Year.
winner. Full name of winner.
win_party. Party affiliation of winner.
ec_votes. Electoral college votes for winner.
ec_denom. Number of votes in the electoral college.
ec_pct. Winner's share of electoral college vote. (A proportion. Range is 0 to 1.)
popular_pct. Winner's share of popular vote. (A proportion. Range is 0 to 1.)
popular_margin. Winner's margin of the popular vote, expressed as a proportion. Can be positive or negative.
votes. Total votes cast in the election.
margin. Winner's vote margin in the popular vote.
runner_up. Runner up candidate.
ru_part. Party affiliation of runner up candidate.
turnout_pct. Voter turnout as a proportion of eligible voters. (A proportion. Range is 0 to 1.)
winner_lname. Last name of winner.
winner_label. Winner's last name and election year.
ru_lastname. Runner up's last name.
ru_label. Runner up's last name and election year.
two_term. Is this a two term presidency? (TRUE/FALSE.) Note that F.D. Roosevelt was elected four times.
https://en.wikipedia.org/wiki/List_of_United_States_presidential_elections_by_popular_vote_margin.
Daily data on child pedestrians (aged 0-17 years) involved in a motor vehicle crash that resulted in a fatality.
farsinvolved
A data frame with 5,490 rows and 4 columns:
Month (character)
Day of the month (character)
Year (character)
Number of pedestrians
Each row is a day of the year between January 1st 2009 and December 31st 2023. The 'n' column is the number of pedestrians in the United States who were involved in a motor vehicle crash that day, where the event resulted in a fatality and where the pedestrian was aged between 0 and 17 years old. The person killed is not necessarily the pedestrian.
Kieran Healy
National Highway Traffic Safety Administration (NHTSA) Motor Vehicle Crash Data Querying and Reporting
Two time series of financial data from FRED. The _i suffix on a variable name means the series is indexed to 100 in the base observation.
fredts
A data frame with 5 columns and 357 rows.
FRED data.
A dataset containing an extract from the General Social Survey. See http://gss.norc.org/Get-Documentation for full documentation of the variables. This data contains many of the same variables as 'gss_sm', but for all available years from 1972-2024.
gss_lon
A data frame with 75,699 rows and 25 columns.
year. GSS year for this respondent.
id. Respondent id number.
ballot. Ballot used for interview.
age. Age of respondent.
degree. R's highest degree.
race. Race of respondent.
sex. Respondent's sex.
siblings. Number of brothers and sisters (recoded from SIBS).
kids. Number of children (recoded from CHILDS).
bigregion. Region of interview (identical with REGION).
region. Region of interview.
income16. Total family income.
religion. R's religious preference (recoded from RELIGION)
marital. Marital status.
padeg. Father's highest degree.
madeg. Mother's highest degree.
partyid. Political party affiliation.
polviews. Think of self as liberal or conservative.
happy. General happiness.
partners_rc. How many sex partners r had in last year. (Recoded from PARTNERS)
grass. Should marijuana be made legal.
zodiac. Respondent's astrological sign.
wtssall. Person weight variable (1972-2018).
wtssps. Person weight variable (1972-2024).
vpsu. Sampling unit
vstrat. Stratification unit
National Opinion Research Center, http://gss.norc.org.
A dataset containing an extract from the 2016 General Social Survey. See http://gss.norc.org/Get-Documentation for full documentation of the variables.
gss_sm
A data frame with 2538 rows and 26 columns.
year. gss year for this respondent.
id. respondent id number.
ballot. ballot used for interview.
age. age of respondent.
childs. number of children.
sibs. number of brothers and sisters.
degree. r's highest degree.
race. race of respondent.
sex. respondent's sex.
region. region of interview.
income16. total family income.
relig. r's religious preference.
marital. marital status.
padeg. father's highest degree.
madeg. mother's highest degree.
partyid. political party affiliation.
polviews. think of self as liberal or conservative.
happy. general happiness.
partners. how many sex partners r had in last year.
grass. should marijuana be made legal.
zodiac. respondent's astrological sign.
pres12. raw variable for whether the Respondent voted for Obama. Recoded to obama in this dataset.
wtssall. weight variable.
income_rc. Recoded income variable.
agegrp. Age variable recoded into age categories
ageq. Age recoded into quartiles.
siblings. Top-coded sibs variable.
kids. Top-coded childs variable.
bigregion. Region variable (Census divisions) recoded to four Census regions.
religion. relig variable recoded to six categories.
partners_rc. partners variable recoded to five categories.
obama. Respondent says they voted for Obama in 2012. 1 = yes; 0 = all other non-design options (Romney, other candidate, did not vote, refused, etc.)
National Opinion Research Center, http://gss.norc.org.
Convert an integer to a date.
int_to_year(x, month = "06", day = "15")
x |
An integer or vector of integers. |
month |
The month to be added to the year. Months 1 to 9 should be given as character strings, i.e. "01", "02", etc., and not 1 or 2, etc. |
day |
The day to be added to the year. Days should be given as character strings, i.e., "01" or "02", etc, and not 1 or 2, etc. |
A vector of dates where the input integer forms the year component. The day and month components added will by default be the 15th of June, so that tick marks will appear in the middle of the series on plots. For input, only years 0:9999 are accepted.
Kieran Healy
int_to_year(1960)
class(int_to_year(1960))
int_to_year(1960:1965)
int_to_year(1990, month = "01", day = "30")
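The conversion described above can be approximated in base R by pasting the year together with zero-padded month and day strings. The helper name `year_to_date` is hypothetical, and the sketch assumes four-digit years; it is not the package's implementation.

```r
# Hypothetical helper: build a Date from an integer year plus fixed month/day.
# Assumes 'month' and 'day' are zero-padded character strings.
year_to_date <- function(x, month = "06", day = "15") {
  as.Date(paste(x, month, day, sep = "-"))
}

mid_years <- year_to_date(1960:1962)
jan_date <- year_to_date(1990, month = "01", day = "30")
```

Placing the date in mid-June means that, on a date axis, each yearly observation sits in the middle of its year rather than at the January boundary.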
Annual enrollments in US Law Schools.
lawschools
A tibble with 53 rows and 11 columns.
The variables are as follows:
ay. Academic year. character.
year. Year. integer.
n_schools. Number of law schools. integer.
fy_enrollment. First year enrollment. integer.
fy_male. First year enrollment, men. integer.
fy_female. First year enrollment, women. integer.
jd_total. Total JD enrollment. integer.
jd_male. Total JD enrollment, men. integer.
jd_female. Total JD enrollment, women. integer.
tot_enrolled. Total enrolled. integer.
jd_llb_awarded. JD/LLB degrees awarded. integer.
American Bar Association
A subset of the co2 data in base R's 'datasets' package, in a ggplot2-friendly format.
maunaloa
A data frame with 4 columns and 271 rows.
R base datasets; Cleveland (1993).
Life expectancy data for individual countries.
oecd_le
A tibble with 2,203 rows and 4 columns.
The variables are as follows:
country. Country. (Character)
year. Year. (Integer.)
lifeexp. Life Expectancy at Birth, measured in years.
is_usa. Indicator for USA or Other country.
OECD
Life expectancy data summary table.
oecd_sum
A tibble with 64 rows and 5 columns.
The variables are as follows:
year. Year. (Integer.)
other. Life Expectancy at birth in OECD countries excluding the USA. Measured in years.
usa. Life Expectancy at birth in the USA. Measured in years.
diff. Difference between usa and other.
hi_lo. Is usa above or below the oecd average?
OECD
Births by month, 1933-2015, with decomposition components.
okboomer
A data frame with 996 rows and 11 columns:
Date in date format
Year as ordered factor
Month as ordered factor
N of days in this month
Total births in this month
Population
Births as a proportion of total population
Average daily births per million population
Seasonal component from an STL decomposition of 'births_pct_day'
Trend component from an STL decomposition of 'births_pct_day'
Remainder component from an STL decomposition of 'births_pct_day'
Dataset originally constructed to reproduce a visualization exercise by Aaron Penne.
Kieran Healy
U.S. Census Bureau.
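The seasonal, trend, and remainder columns come from an STL decomposition. As a hedged illustration of that technique, the sketch below decomposes `nottem`, a monthly series that ships with base R, rather than the births data itself. The `s.window = "periodic"` setting is an assumption; the settings used to build this dataset are not stated here.

```r
# STL decomposition of a monthly series from base R's datasets package.
fit <- stl(nottem, s.window = "periodic")
ts_parts <- fit$time.series

# Collect the three components into an ordinary data frame.
components <- data.frame(
  seasonal  = as.numeric(ts_parts[, "seasonal"]),
  trend     = as.numeric(ts_parts[, "trend"]),
  remainder = as.numeric(ts_parts[, "remainder"])
)

# The three components add back up to the original series.
rebuilt <- rowSums(components)
```

Storing the components as columns next to the original values, as this dataset does, makes it easy to plot each one in its own panel.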
State-level data on opiate related deaths in the US, from the CDC WONDER database.
opiates
A tibble with 1,122 rows and 8 columns:
State FIPS code.
State abbreviation.
Year.
N opiate-related deaths.
Crude death rate per 100,000 population.
Adjusted death rate.
Census region.
Census division.
Dataset is Multiple Cause of Death, 1999-2020. Standard Population: 2000 U.S. Std. Population. Rates per 100,000. Default intercensal populations for years 2001-2009. MCD ICD-10 Codes selected: T40.0 (Opium), T40.1 (Heroin), T40.2 (Other opioids), T40.3 (Methadone), T40.4 (Other synthetic narcotics), T40.6 (Other and unspecified narcotics). UCD ICD-10 Codes selected: X40-X44, X60-X64, X85, Y10-Y14.
Kieran Healy
CDC WONDER, http://wonder.cdc.gov/mcd-icd10.html
A dataset containing data on rates of organ donation for seventeen OECD countries between 1991 and 2002. The variables are as follows:
organdata
A (tibble) data frame with 237 rows and 21 columns.
country. Country name.
year. Year.
donors. Organ Donation rate per million population.
pop. Population in thousands.
pop_dens. Population density per square mile.
gdp. Gross Domestic Product in thousands of PPP dollars.
gdp_lag. Lagged Gross Domestic Product in thousands of PPP dollars.
health. Health spending, thousands of PPP dollars per capita.
health_lag. Lagged health spending, thousands of PPP dollars per capita.
pubhealth. Public health spending as a percentage of total expenditure.
roads. Road accident fatalities per 100,000 population.
cerebvas. Cerebrovascular deaths per 100,000 population (rounded).
assault. Assault deaths per 100,000 population (rounded).
external. Deaths due to external causes per 100,000 population.
txp_pop. Transplant programs per million population.
world. Welfare state world (Esping-Andersen).
opt. Opt-in policy or Opt-out policy.
consent_law. Consent law, informed or presumed.
consent_practice. Consent practice, informed or presumed.
consistent. Law consistent with practice, yes or no.
ccode. Abbreviated country code.
Macro-economic and spending data: OECD. Other data: Kieran Healy.
Replace series of characters (usually variable names) at the beginning of a character vector.
prefix_replace(var_names, prefixes, replacements, toTitle = TRUE, ...)
var_names |
A character vector, usually variable names |
prefixes |
A character vector, usually variable prefixes |
replacements |
A character vector of replacements for the 'prefixes', in the same order as them. |
toTitle |
Convert results to Title Case? Defaults to TRUE. |
... |
Other arguments to 'gsub' |
Takes a character vector (usually vector of variable names from a summarized or tidied model object), along with a vector of character terms (usually the prefix of a dummy or categorical variable added by R when creating model terms) and strips the latter away from the former. Useful for quickly cleaning variable names for a plot.
A character vector with 'prefixes' terms in 'var_names' replaced with the content of the 'replacements' terms.
Kieran Healy
prefix_replace(iris$Species, c("set", "ver", "vir"), c("sat", "ber", "bar"))
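A minimal sketch of prefix replacement with base R is shown here, assuming each prefix is matched only at the start of a string and paired with the replacement in the same position. The helper name `replace_prefixes` is hypothetical, Title Case conversion is left out, and this is not the package's code.

```r
# Hypothetical helper: swap each leading prefix for its paired replacement.
replace_prefixes <- function(var_names, prefixes, replacements) {
  stopifnot(length(prefixes) == length(replacements))
  for (i in seq_along(prefixes)) {
    # "^" anchors the match to the beginning of each string.
    var_names <- sub(paste0("^", prefixes[i]), replacements[i], var_names)
  }
  var_names
}

terms <- c("regionSouth", "sexFemale", "age")
relabeled <- replace_prefixes(terms, c("region", "sex"), c("Region: ", "Sex: "))
```

Note that prefixes are treated as regular expressions in this sketch, and replacements are applied in sequence, so an earlier replacement could in principle be matched by a later prefix.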
Strip a series of characters from the beginning of a character vector.
prefix_strip(var_string, prefixes, toTitle = TRUE, ...)
var_string |
A character vector, usually variable names |
prefixes |
A character vector, usually variable prefixes |
toTitle |
Convert results to Title Case? Defaults to TRUE. |
... |
Other arguments to 'gsub' |
Takes a character vector (usually vector of variable names from a summarized or tidied model object), along with a vector of character terms (usually the prefix of a dummy or categorical variable added by R when creating model terms) and strips the latter away from the former. Useful for quickly cleaning variable names for a plot.
A character vector with 'prefixes' terms stripped from the beginning of 'var_string' terms.
Kieran Healy
prefix_strip(iris$Species, c("set", "v"))
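The typical use case, cleaning model term names that R builds by gluing a variable name onto a factor level, can be sketched with base R as follows. The helper name `strip_prefixes` and the example terms are hypothetical, and Title Case conversion is omitted; this illustrates the general approach rather than the package's implementation.

```r
# Hypothetical helper: remove any of several prefixes from the start of strings.
strip_prefixes <- function(var_string, prefixes) {
  # Build one pattern anchored at the start: ^(prefix1|prefix2|...)
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  sub(pattern, "", var_string)
}

# Term names of the kind produced for dummy variables in a model summary.
terms <- c("regionNortheast", "regionSouth", "sexFemale", "age")
cleaned <- strip_prefixes(terms, c("region", "sex"))
```

Terms with no matching prefix, such as `age`, are returned unchanged.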
Round numeric columns of a data frame or tibble
round_df(data, dig = 2)
data |
A data frame or tibble |
dig |
The number of digits to round to |
Takes a data frame or tibble as input, rounds the numeric columns to the specified number of digits.
An object of the same class as 'data', with the numeric columns rounded off to 'dig'
Kieran Healy
head(round_df(iris, 0))
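Rounding only the numeric columns of a data frame can be sketched in base R as below. The helper name `round_numeric` is hypothetical; the sketch shows the general pattern, not the package's source.

```r
# Hypothetical helper: round numeric columns, leave other columns untouched.
round_numeric <- function(data, dig = 2) {
  is_num <- vapply(data, is.numeric, logical(1))
  data[is_num] <- lapply(data[is_num], round, digits = dig)
  data
}

rounded <- round_numeric(head(iris), 0)
```

This is mainly useful for display, for example when printing a table of summary statistics that mixes numeric and categorical columns.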
US State map data layer
states_sf
A simple features object with 51 rows and 5 columns:
State FIPS code
State name abbreviation
State name
Census region
Geometry
A simple features object. Load the 'sf' package before using. Alaska and Hawaii have had their geometries scaled and shifted to the bottom left of the map area. Alaska's Aleutian islands are not included.
Kieran Healy
US Census Bureau.
Outstanding student debts in 2016 across 8 income categories, by percent of all borrowers and percent of all balances.
studebt
A data frame with 16 rows and 4 columns:
Debt categories (character)
Pct in terms of Borrowers or Balances
Percentage of all type
Debt categories (ordered factor)
Federal Reserve Bank of New York.
A ggplot theme with defaults for axis styling, legends, panels, strips, and plot chrome. Requires ggplot2 >= 4.0.0.
theme_socviz(
  base_size = 12,
  base_family = "Source Sans 3",
  header_family = "Source Sans 3",
  base_line_size = base_size/24,
  base_rect_size = base_size/24,
  ink = "black",
  paper = "white",
  accent = "#0072B2"
)
base_size |
Base font size in points. Default is 12. |
base_family |
Base font family. Default is '"Source Sans 3"'. |
header_family |
Font family for plot titles. Default is '"Source Sans 3"'. |
base_line_size |
Base line width, scaled from 'base_size'. |
base_rect_size |
Base rect border width, scaled from 'base_size'. |
ink |
Color used for text, lines, and foreground elements. Default is '"black"'. |
paper |
Color used for backgrounds. Default is '"white"'. |
accent |
Accent color for geom defaults. Default is '"#0072B2"'. |
The theme uses Source Sans 3 (regular weight) as the base font family and Source Sans 3 Semibold as the header family. If the fonts are not installed, they will be downloaded automatically from Google Fonts via 'systemfonts::require_font()'.
A ggplot2 theme object.
## Not run:
library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  theme_socviz()
## End(Not run)
A theme based on 'theme_socviz' with all axes, grids, and borders removed, suitable for plotting maps.
theme_socviz_map(
  base_size = 12,
  base_family = "Source Sans 3",
  header_family = "Source Sans 3",
  base_line_size = base_size/24,
  base_rect_size = base_size/24,
  ink = "black",
  paper = "white",
  accent = "#0072B2"
)
base_size |
Base font size in points. Default is 12. |
base_family |
Base font family. Default is '"Source Sans 3"'. |
header_family |
Font family for plot titles. Default is '"Source Sans 3"'. |
base_line_size |
Base line width, scaled from 'base_size'. |
base_rect_size |
Base rect border width, scaled from 'base_size'. |
ink |
Color used for text, lines, and foreground elements. Default is '"black"'. |
paper |
Color used for backgrounds. Default is '"white"'. |
accent |
Accent color for geom defaults. Default is '"#0072B2"'. |
A ggplot2 theme object.
## Not run:
library(ggplot2)
ggplot(map_data("state"), aes(long, lat, group = group)) +
  geom_polygon(fill = "gray90", colour = "white") +
  coord_map() +
  theme_socviz_map()
## End(Not run)
A small table of survival rates from the Titanic, by sex
titanic
A data frame with four rows and four columns.
Titanic data
Quickly make a two-way table of proportions (percentages)
tw_tab(x, y, margin = NULL, digs = 1, dnn = NULL, ...)
x |
Row variable |
y |
Column variable |
margin |
See 'prop.table'. Default is the joint distribution (all cells sum to 100). Use 1 for row margins (each row sums to 100) or 2 for column margins (each column sums to 100). |
digs |
Number of digits to round percentages to. Defaults to 1. |
dnn |
See 'table'. The names to be given to the dimensions in the result (the dimnames names). Defaults to NULL for none. |
... |
Other arguments to be passed to 'table'. |
A wrapper for 'table' and 'prop.table' with the margin labels set by default to NULL and the cells rounded to percents at 1 decimal place.
A contingency table of percentage values.
Kieran Healy
with(gss_sm, tw_tab(bigregion, religion, useNA = "ifany", digs = 1))
with(gss_sm, tw_tab(bigregion, religion, margin = 2, useNA = "ifany", digs = 1))
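Since the function is described as a wrapper around 'table' and 'prop.table', the underlying computation can be sketched directly with those two base R functions. The helper name `pct_table` is hypothetical, and `mtcars` stands in for the survey data; this is an illustration of the idea, not the package's code.

```r
# Hypothetical helper: cross-tabulate, convert to proportions, scale to percent.
pct_table <- function(x, y, margin = NULL, digs = 1, ...) {
  counts <- table(x, y, ...)
  round(prop.table(counts, margin = margin) * 100, digits = digs)
}

# Joint distribution: all cells together sum to roughly 100.
joint <- pct_table(mtcars$cyl, mtcars$gear)

# Row margins: each row sums to roughly 100.
by_row <- pct_table(mtcars$cyl, mtcars$gear, margin = 1)
```

Because each cell is rounded independently, the totals may differ from 100 by a small rounding error.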
Data on Revenue and Employees at Yahoo before and during Marissa Mayer's tenure as CEO.
yahoo
A tibble with 4 columns and 12 rows.
QZ.com