Title: | COVID-19 Data |
---|---|
Description: | COVID-19 related data from the ECDC, the COVID-19 Tracking Project, the New York Times, the Human Mortality Database, and Apple. Packaged for R. |
Authors: | Kieran Healy [aut, cre] |
Maintainer: | Kieran Healy <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.01 |
Built: | 2025-02-20 03:47:34 UTC |
Source: | https://github.com/kjhealy/covdata |
%nin%
Convenience 'not-in' operator
x %nin% y
x |
vector of items |
y |
vector of all values |
Complement of the built-in operator %in%. Returns a logical vector that is TRUE for each element of x that is not in y.
logical vector of items in x not in y
Kieran Healy
fruit <- c("apples", "oranges", "banana")
"apples" %nin% fruit
"pears" %nin% fruit
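As a hedged sketch of typical use, the logical result can subset a vector down to the items that are absent from a reference set. The local definition of the operator below is an assumption that mirrors the documented behaviour (the negation of %in%); it is not needed when the package is loaded. The basket vector is made up for illustration.

```r
# Assumed stand-in for the packaged operator: the negation of %in%.
`%nin%` <- function(x, y) !(x %in% y)

fruit  <- c("apples", "oranges", "banana")
basket <- c("apples", "pears", "kiwi")   # hypothetical items to check

# Logical vector: TRUE where an item of basket is absent from fruit.
missing_from_fruit <- basket %nin% fruit

# Use the logical vector to keep only the items not found in fruit.
basket[missing_from_fruit]
```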
Data from Apple Maps on relative changes in mobility in various cities and countries.
apple_mobility
A data frame with 2,254,515 rows and 7 variables:
country
character Country name (not provided for all countries)
sub_region
character Subregion names
subregion_and_city
character Subregion and city names
geo_type
character Type of geographical unit. Values: city, country/region, sub-region
transportation_type
character Mode of transport. Values: driving, transit, or walking
date
double Date in yyyy-mm-dd format
score
double Activity score. Indexed to 100 on the first date of observation for a given mode of transport.
Table: Data summary
Name | apple_mobility |
Number of rows | 2254515 |
Number of columns | 7 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 5 |
numeric | 1 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-01-13 | 2022-04-12 | 2021-02-26 | 819 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
country | 0 | 1 | 5 | 20 | 0 | 63 | 0 |
sub_region | 0 | 1 | 4 | 46 | 0 | 606 | 0 |
subregion_and_city | 0 | 1 | 4 | 46 | 0 | 853 | 0 |
geo_type | 0 | 1 | 4 | 14 | 0 | 3 | 0 |
transportation_type | 0 | 1 | 7 | 7 | 0 | 3 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
score | 608041 | 0.73 | 122.59 | 66.81 | 2.43 | 83.79 | 113.72 | 148.8 | 2228.83 | ▇▁▁▁▁ |
Data made available by Apple, Inc. at https://www.apple.com/covid19/mobility, showing relative volume of directions requests per country/region or city compared to a baseline volume on January 13th, 2020. Apple defines the day as midnight-to-midnight, Pacific time. Cities represent usage in greater metropolitan areas and are stably defined during this period. In many countries/regions and cities, relative volume has increased since January 13th, consistent with normal, seasonal usage of Apple Maps. Day of week effects are important to normalize as you use this data. Data that is sent from users’ devices to the Apple Maps service is associated with random, rotating identifiers so Apple does not have a profile of individual movements and searches. Apple Maps has no demographic information about its users, and so cannot make any statements about the representativeness of its usage against the overall population.
Kieran Healy
https://www.apple.com/covid19/mobility
See https://www.apple.com/covid19/mobility for detailed terms of use.
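The note above says day-of-week effects should be normalized. One common way to do that, offered here only as a sketch and not as the package's or Apple's method, is a centered 7-day moving average. The toy data frame below is made up; real use would first subset apple_mobility to a single place and transportation_type.

```r
# Toy stand-in for one place/transport series with a repeating weekly pattern.
toy <- data.frame(
  date  = seq(as.Date("2020-01-13"), by = "day", length.out = 21),
  score = rep(c(100, 102, 104, 106, 120, 130, 90), times = 3)
)

# Centered 7-day moving average: every window contains each weekday once,
# so the weekly cycle averages out. The first and last three days are NA.
toy$score_7d <- as.numeric(stats::filter(toy$score, rep(1 / 7, 7), sides = 2))

head(toy, 10)
```

Because the toy series repeats exactly every seven days, the smoothed column is constant wherever it is defined, which makes the removal of the weekly cycle easy to see.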
What the CDC surveillance network covers
cdc_catchments
A data frame with 17 rows and 2 variables:
name
character Network name
area
character Area
Table: Data summary
Name | cdc_catchments |
Number of rows | 17 |
Number of columns | 2 |
_______________________ | |
Column type frequency: | |
character | 2 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
name | 0 | 1 | 3 | 9 | 0 | 3 | 0 |
area | 0 | 1 | 4 | 14 | 0 | 15 | 0 |
The Coronavirus Disease 2019 (COVID-19)-Associated Hospitalization Surveillance Network (COVID-NET) conducts population-based surveillance for laboratory-confirmed COVID-19-associated hospitalizations in children (persons younger than 18 years) and adults. The current network covers nearly 100 counties in the 10 Emerging Infections Program (EIP) states (CA, CO, CT, GA, MD, MN, NM, NY, OR, and TN) and four additional states through the Influenza Hospitalization Surveillance Project (IA, MI, OH, and UT). The network represents approximately 10% of US population (~32 million people). Cases are identified by reviewing hospital, laboratory, and admission databases and infection control logs for patients hospitalized with a documented positive SARS-CoV-2 test. Data gathered are used to estimate age-specific hospitalization rates on a weekly basis and describe characteristics of persons hospitalized with COVID-19. Laboratory confirmation is dependent on clinician-ordered SARS-CoV-2 testing. Therefore, the unadjusted rates provided are likely to be underestimated as COVID-19-associated hospitalizations can be missed due to test availability and provider or facility testing practices. COVID-NET hospitalization data are preliminary and subject to change as more data become available. All incidence rates are unadjusted. Please use the following citation when referencing these data: “COVID-NET: COVID-19-Associated Hospitalization Surveillance Network, Centers for Disease Control and Prevention. WEBSITE. Accessed on DATE”.
name | area |
COVID-NET | Entire Network |
EIP | California |
EIP | Colorado |
EIP | Connecticut |
EIP | Entire Network |
EIP | Georgia |
EIP | Maryland |
EIP | Minnesota |
EIP | New Mexico |
EIP | New York |
EIP | Oregon |
EIP | Tennessee |
IHSP | Entire Network |
IHSP | Iowa |
IHSP | Michigan |
IHSP | Ohio |
IHSP | Utah |
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html
Provisional Death Counts for Coronavirus Disease (COVID-19)
cdc_deaths_by_age
A data frame with 12 rows and 10 variables:
data_as_of
date When the data were most recently recorded
age_group
character Age range
start_week
date Start week
end_week
date End week
covid_deaths
integer COVID deaths
total_deaths
integer Total deaths
percent_expected_deaths
double Percent of expected deaths
pneumonia_deaths
integer Pneumonia deaths
pneumonia_and_covid_deaths
integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11
integer Influenza deaths (ICD-10 codes J09-J11)
Table: Data summary
Name | cdc_deaths_by_age |
Number of rows | 12 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
Date | 3 |
character | 1 |
numeric | 6 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2020-04-30 | 2020-04-30 | 2020-04-30 | 1 |
start_week | 0 | 1 | 2020-02-01 | 2020-02-01 | 2020-02-01 | 1 |
end_week | 0 | 1 | 2020-04-25 | 2020-04-25 | 2020-04-25 | 1 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
age_group | 0 | 1 | 5 | 10 | 0 | 12 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
covid_deaths | 0 | 1 | 5753.50 | 9877.31 | 2.00 | 30.25 | 1211.50 | 7918.25 | 34521.00 | ▇▃▁▁▁ |
total_deaths | 0 | 1 | 118897.67 | 202377.07 | 712.00 | 5675.25 | 28460.00 | 149341.50 | 713386.00 | ▇▂▁▁▁ |
percent_expected_deaths | 0 | 1 | 0.97 | 0.00 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | ▁▁▇▁▁ |
pneumonia_deaths | 0 | 1 | 10454.17 | 18036.25 | 33.00 | 109.00 | 1799.50 | 14114.25 | 62725.00 | ▇▃▁▁▁ |
pneumonia_and_covid_deaths | 0 | 1 | 2550.17 | 4387.93 | 0.00 | 12.50 | 491.50 | 3515.75 | 15301.00 | ▇▃▁▁▁ |
all_influenza_deaths_j09_j11 | 0 | 1 | 970.17 | 1618.90 | 11.00 | 40.75 | 358.50 | 1222.75 | 5821.00 | ▇▃▁▁▁ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true
Provisional Death Counts for Coronavirus Disease (COVID-19)
cdc_deaths_by_sex
A data frame with 3 rows and 10 variables:
data_as_of
date Date most recently updated
sex
character Sex
start_week
date Beginning week
end_week
date Ending week
covid_deaths
integer COVID deaths
total_deaths
integer Total deaths
percent_expected_deaths
double Percent of expected deaths
pneumonia_deaths
integer Pneumonia deaths
pneumonia_and_covid_deaths
integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11
integer Influenza deaths (ICD-10 codes J09-J11)
Table: Data summary
Name | cdc_deaths_by_sex |
Number of rows | 3 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
Date | 3 |
character | 1 |
numeric | 6 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2020-04-30 | 2020-04-30 | 2020-04-30 | 1 |
start_week | 0 | 1 | 2020-02-01 | 2020-02-01 | 2020-02-01 | 1 |
end_week | 0 | 1 | 2020-04-25 | 2020-04-25 | 2020-04-25 | 1 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
sex | 0 | 1 | 4 | 7 | 0 | 3 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
covid_deaths | 0 | 1 | 11507.33 | 10231.40 | 1.00 | 7470.50 | 14940.00 | 17260.50 | 19581.00 | ▇▁▁▇▇ |
total_deaths | 0 | 1 | 237795.00 | 206241.06 | 25.00 | 172555.00 | 345085.00 | 356680.00 | 368275.00 | ▃▁▁▁▇ |
percent_expected_deaths | 0 | 1 | 0.97 | 0.00 | 0.97 | 0.97 | 0.97 | 0.97 | 0.97 | ▁▁▇▁▁ |
pneumonia_deaths | 0 | 1 | 20908.33 | 18248.40 | 1.00 | 14545.00 | 29089.00 | 31362.00 | 33635.00 | ▃▁▁▁▇ |
pneumonia_and_covid_deaths | 0 | 1 | 5100.33 | 4559.67 | 1.00 | 3258.00 | 6515.00 | 7650.00 | 8785.00 | ▇▁▁▇▇ |
all_influenza_deaths_j09_j11 | 0 | 1 | 1940.33 | 1682.21 | 0.00 | 1416.00 | 2832.00 | 2910.50 | 2989.00 | ▃▁▁▁▇ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true
CDC Surveillance Network provisional death counts
cdc_deaths_by_state
A data frame with 53 rows and 10 variables:
data_as_of
date Date most recently updated
state
character State name
start_week
date Start week
end_week
date End week
covid_deaths
integer COVID Deaths
total_deaths
integer Total deaths
percent_expected_deaths
double Percent of expected deaths
pneumonia_deaths
integer Pneumonia deaths
pneumonia_and_covid_deaths
integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11
integer Influenza deaths (ICD-10 codes J09-J11)
Table: Data summary
Name | cdc_deaths_by_state |
Number of rows | 53 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
Date | 3 |
character | 1 |
numeric | 6 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2020-04-30 | 2020-04-30 | 2020-04-30 | 1 |
start_week | 0 | 1 | 2020-02-01 | 2020-02-01 | 2020-02-01 | 1 |
end_week | 0 | 1 | 2020-04-25 | 2020-04-25 | 2020-04-25 | 1 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1 | 4 | 20 | 0 | 53 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
covid_deaths | 6 | 0.89 | 735.02 | 1801.11 | 0 | 54.50 | 153.00 | 519.00 | 10978.00 | ▇▁▁▁▁ |
total_deaths | 0 | 1.00 | 13557.43 | 13996.83 | 856 | 3813.00 | 10721.00 | 17624.00 | 69341.00 | ▇▂▁▁▁ |
percent_expected_deaths | 0 | 1.00 | 0.93 | 0.27 | 0 | 0.86 | 0.95 | 0.99 | 2.19 | ▁▂▇▁▁ |
pneumonia_deaths | 0 | 1.00 | 1197.26 | 1453.17 | 41 | 277.00 | 769.00 | 1306.00 | 6076.00 | ▇▁▁▁▁ |
pneumonia_and_covid_deaths | 10 | 0.81 | 355.81 | 759.51 | 0 | 30.50 | 65.00 | 296.00 | 4019.00 | ▇▁▁▁▁ |
all_influenza_deaths_j09_j11 | 3 | 0.94 | 116.58 | 142.24 | 14 | 30.50 | 87.50 | 125.50 | 850.00 | ▇▁▁▁▁ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.
Kieran Healy
https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true
Provisional Death Counts for Coronavirus Disease (COVID-19)
cdc_deaths_by_week
A data frame with 13 rows and 10 variables:
data_as_of
date When the data were most recently recorded
start_week
date Start week
end_week
date End week
covid_deaths
integer COVID deaths
total_deaths
integer Total deaths
percent_expected_deaths
double Percent of expected deaths
pneumonia_deaths
integer Pneumonia deaths
pneumonia_and_covid_deaths
integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11
integer Influenza deaths (ICD-10 codes J09-J11)
pneumonia_influenza_and_covid_19_deaths
integer COLUMN_DESCRIPTION
Table: Data summary
Name | cdc_deaths_by_week |
Number of rows | 13 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
Date | 3 |
numeric | 7 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2020-04-30 | 2020-04-30 | 2020-04-30 | 1 |
start_week | 0 | 1 | 2020-02-01 | 2020-04-25 | 2020-03-14 | 13 |
end_week | 0 | 1 | 2020-02-01 | 2020-04-25 | 2020-03-14 | 13 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
covid_deaths | 0 | 1 | 2655.46 | 4194.37 | 0.00 | 0.00 | 49.00 | 2659.00 | 11864.00 | ▇▁▁▂▁ |
total_deaths | 0 | 1 | 54875.85 | 9864.46 | 24387.00 | 53940.00 | 56831.00 | 57299.00 | 65676.00 | ▁▁▁▇▂ |
percent_expected_deaths | 0 | 1 | 0.97 | 0.17 | 0.45 | 0.97 | 0.97 | 0.99 | 1.19 | ▁▁▁▇▂ |
pneumonia_deaths | 0 | 1 | 4825.00 | 2217.19 | 2219.00 | 3671.00 | 3692.00 | 5598.00 | 9580.00 | ▇▃▁▁▂ |
pneumonia_and_covid_deaths | 0 | 1 | 1177.00 | 1863.76 | 0.00 | 0.00 | 25.00 | 1220.00 | 5281.00 | ▇▁▁▂▁ |
all_influenza_deaths_j09_j11 | 0 | 1 | 447.77 | 156.19 | 58.00 | 427.00 | 494.00 | 536.00 | 619.00 | ▁▁▁▇▇ |
pneumonia_influenza_and_covid_19_deaths | 0 | 1 | 6690.23 | 4292.62 | 3553.00 | 4165.00 | 4275.00 | 7397.00 | 16272.00 | ▇▁▁▂▁ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true
Convenience table of country names and their abbreviated names
countries
A data frame with 213 rows and 4 variables:
cname
character Country name
iso3
character ISO 3 designation
iso2
character ISO 2 designation
continent
character Continent
Table: Data summary
Name | dplyr::ungroup(countries) |
Number of rows | 213 |
Number of columns | 4 |
_______________________ | |
Column type frequency: | |
character | 4 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
cname | 0 | 1.00 | 4 | 42 | 0 | 213 | 0 |
iso3 | 0 | 1.00 | 3 | 3 | 0 | 213 | 0 |
iso2 | 2 | 0.99 | 2 | 2 | 0 | 211 | 0 |
continent | 0 | 1.00 | 4 | 13 | 0 | 6 | 0 |
Produced from the ECDC tables in the covdata package.
Kieran Healy
ISO 2: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
ISO 3: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3
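A lookup table of this shape is usually joined onto other data by country code. The following is a minimal base R sketch, assuming a lookup with the four documented columns; the rows and the obs data frame are invented for illustration and are not taken from the package.

```r
# Invented two-row stand-in for the countries table.
lookup <- data.frame(
  cname     = c("Ireland", "France"),
  iso3      = c("IRL", "FRA"),
  iso2      = c("IE", "FR"),
  continent = "Europe",
  stringsAsFactors = FALSE
)

# Hypothetical observations keyed by ISO3 code.
obs <- data.frame(
  iso3  = c("FRA", "IRL", "FRA"),
  cases = c(10, 3, 7),
  stringsAsFactors = FALSE
)

# Left join: keep every observation and attach names, ISO2 codes, and continent.
joined <- merge(obs, lookup, by = "iso3", all.x = TRUE)
joined
```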
A dataset containing daily national-level ECDC data on COVID-19. Archived as of December 14th 2020. ECDC switched to a weekly reporting schedule for the COVID-19 situation worldwide and in the EU/EEA and the UK on 17 December 2020. Daily updates have been discontinued from 14 December 2020.
covnat_daily
A tibble with 61,836 rows and 8 columns
date
date in YYYY-MM-DD format
cname
Name of country (character)
iso3
ISO3 country code (character)
cases
N reported COVID-19 cases for this day
deaths
N reported COVID-19 deaths for this day
pop
Country population from Eurostat or UN data
cu_cases
Cumulative N reported COVID-19 cases up to and including this day
cu_deaths
Cumulative N reported COVID-19 deaths up to and including this day
Table: Data summary
Name | dplyr::ungroup(covnat_dai... |
Number of rows | 61836 |
Number of columns | 8 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 2 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2019-12-31 | 2020-12-14 | 2020-07-21 | 350 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
cname | 0 | 1 | 4 | 42 | 0 | 213 | 0 |
iso3 | 0 | 1 | 3 | 3 | 0 | 213 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 0 | 1 | 1156.33 | 6782.63 | -8261 | 0 | 15 | 275.00 | 234633 | ▇▁▁▁▁ |
deaths | 0 | 1 | 26.08 | 131.29 | -1918 | 0 | 0 | 4.00 | 4928 | ▁▇▁▁▁ |
pop | 59 | 1 | 40987698.23 | 153129379.34 | 815 | 1293120 | 7169456 | 28515829.00 | 1433783692 | ▇▁▁▁▁ |
cu_cases | 0 | 1 | 100686.99 | 607743.06 | 0 | 129 | 2055 | 24650.00 | 16256754 | ▇▁▁▁▁ |
cu_deaths | 0 | 1 | 3104.89 | 15545.84 | 0 | 1 | 42 | 464.25 | 299177 | ▇▁▁▁▁ |
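The summary above shows negative minimum values for daily cases and deaths, presumably downward revisions by reporting countries (an assumption, not something the source states). The sketch below uses invented data to show how cumulative counts relate to daily counts within each country, and how a per-capita figure could be derived. Column names follow the skim summary; the country codes and numbers are hypothetical.

```r
# Invented daily series for two hypothetical countries.
toy <- data.frame(
  date  = rep(seq(as.Date("2020-03-01"), by = "day", length.out = 4), times = 2),
  iso3  = rep(c("AAA", "BBB"), each = 4),   # hypothetical country codes
  cases = c(5, 10, -2, 7, 0, 3, 4, 1),      # -2 stands in for a downward revision
  pop   = rep(c(1e6, 5e5), each = 4),
  stringsAsFactors = FALSE
)

# Order by country and date, then accumulate daily cases within each country.
toy <- toy[order(toy$iso3, toy$date), ]
toy$cu_cases <- ave(toy$cases, toy$iso3, FUN = cumsum)

# Cumulative cases per 100,000 population, for cross-country comparison.
toy$cu_cases_per_100k <- toy$cu_cases / toy$pop * 1e5
toy
```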
A dataset containing weekly national-level ECDC data on COVID-19
covnat_weekly
A tibble with 4,966 rows and 11 columns
date
date in YYYY-MM-DD format
year_week
Year and week of reporting (character, YYYY-WW)
cname
Name of country (character)
pop
Country population from Eurostat or UN data
iso3
ISO3 country code (character)
cases
N reported COVID-19 cases for this week
deaths
N reported COVID-19 deaths for this week
cu_cases
Cumulative N reported COVID-19 cases up to and including this week
cu_deaths
Cumulative N reported COVID-19 deaths up to and including this week
r14_cases
14-day notification rate of reported COVID-19 cases per 100,000 population
r14_deaths
14-day notification rate of reported COVID-19 deaths (ECDC publishes this rate per 1,000,000 population)
Table: Data summary
Name | dplyr::ungroup(covnat_wee... |
Number of rows | 4966 |
Number of columns | 11 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 3 |
numeric | 7 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2019-12-30 | 2023-01-09 | 2021-07-05 | 159 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
year_week | 0 | 1.00 | 7 | 7 | 0 | 159 | 0 |
cname | 0 | 1.00 | 5 | 14 | 0 | 31 | 0 |
iso3 | 196 | 0.96 | 3 | 3 | 0 | 30 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
pop | 0 | 1.00 | 31613614.13 | 85253844.55 | 39055 | 2108977.00 | 6916548.00 | 17475415.00 | 453006705.00 | ▇▁▁▁▁ |
cases | 222 | 0.96 | 77511.62 | 374657.80 | 0 | 1127.00 | 5487.00 | 28342.00 | 9023067.00 | ▇▁▁▁▁ |
deaths | 279 | 0.94 | 514.14 | 2005.64 | 0 | 8.00 | 46.00 | 250.50 | 28380.00 | ▇▁▁▁▁ |
cu_cases | 222 | 0.96 | 4188407.63 | 16969793.99 | 0 | 43400.25 | 485047.50 | 2117551.00 | 183857564.00 | ▇▁▁▁▁ |
cu_deaths | 279 | 0.94 | 44362.78 | 142967.65 | 0 | 651.00 | 6268.00 | 28807.00 | 1204878.00 | ▇▁▁▁▁ |
r14_cases | 263 | 0.95 | 557.34 | 1044.46 | 0 | 51.61 | 216.74 | 576.99 | 13728.65 | ▇▁▁▁▁ |
r14_deaths | 321 | 0.94 | 34.08 | 50.74 | 0 | 3.81 | 14.21 | 42.57 | 435.28 | ▇▁▁▁▁ |
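To make the 14-day notification rate concrete, here is a small worked sketch with invented numbers. It assumes the rate is the sum of the current and previous week's cases, scaled to 100,000 population; ECDC's own calculation may differ in detail.

```r
# Invented weekly counts for a single hypothetical country of 2 million people.
toy <- data.frame(
  year_week = c("2021-01", "2021-02", "2021-03", "2021-04"),
  cases     = c(1000, 1500, 500, 2000),
  pop       = 2e6,
  stringsAsFactors = FALSE
)

# Cases in the current week plus the previous week (NA for the first week,
# which has no predecessor).
cases_14d <- toy$cases + c(NA, head(toy$cases, -1))

# Scale to a rate per 100,000 population.
toy$r14_cases <- cases_14d / toy$pop * 1e5
toy
```

For week 2021-02 this gives (1000 + 1500) / 2,000,000 × 100,000 = 125 cases per 100,000.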
A dataset containing US state-level data on COVID-19
covus
A tibble with 664,960 rows and 7 columns
date
Date in YYYY-MM-DD format (date)
state
Two letter State abbreviation (character)
fips
State FIPS code (character)
data_quality_grade
character Data quality as assessed by COVID Tracking Project staff
measure
Outcome measure for this date
count
Count of measure
measure_label
character Outcome measure, suitable for use as a plot label
Table: Data summary
Name | covus |
Number of rows | 664960 |
Number of columns | 7 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 4 |
logical | 1 |
numeric | 1 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-01-13 | 2021-03-07 | 2020-09-03 | 420 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1 | 2 | 2 | 0 | 56 | 0 |
fips | 0 | 1 | 2 | 2 | 0 | 56 | 0 |
measure | 0 | 1 | 5 | 30 | 0 | 31 | 0 |
measure_label | 0 | 1 | 6 | 54 | 0 | 32 | 0 |
Variable type: logical
skim_variable | n_missing | complete_rate | mean | count |
data_quality_grade | 664960 | 0 | NaN | : |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
count | 434365 | 0.35 | 387436.8 | 1638507 | 0 | 498 | 7782 | 134223 | 49646014 | ▇▁▁▁▁ |
The measures tracked by the COVID tracking project are as follows:
measure | measure_label |
positive | Positive Tests |
probable_cases | Probable Cases |
negative | Negative Tests |
pending | Pending Tests |
hospitalized_currently | Currently Hospitalized |
hospitalized_cumulative | Cumulative Hospitalized |
in_icu_currently | Currently in ICU |
in_icu_cumulative | Cumulative in ICU |
on_ventilator_currently | Currently on Ventilator |
on_ventilator_cumulative | Cumulative on Ventilator |
recovered | Recovered |
death | Deaths |
hospitalized_discharged | Total Discharged from Hospital |
total_tests_viral | Total number of PCR tests performed |
positive_tests_viral | Total number of positive PCR tests |
negative_tests_viral | Total number of negative PCR tests |
positive_cases_viral | Total number of positive cases measured with PCR tests |
death_confirmed | Deaths Confirmed |
death_probable | Deaths Probable |
total_test_encounters_viral | Total Test Encounters (PCR) |
total_tests_people_viral | Total PCR Tests (People) |
total_tests_antibody | Total Antibody Tests |
positive_tests_antibody | Positive Antibody Tests |
negative_tests_antibody | Total number of negative antibody tests |
negative_tests_antibody | Negative Antibody Tests |
total_tests_people_antibody | Total Antibody Tests (People) |
positive_tests_people_antibody | Positive Antibody Tests (People) |
negative_tests_people_antibody | Negative Antibody Tests (People) |
total_tests_people_antigen | Total Antigen Tests (People) |
positive_tests_people_antigen | Positive Antigen Tests (People) |
total_tests_antigen | Total Antigen Tests |
positive_tests_antigen | Positive Antigen Tests |
Not all measures are reported by all states.
The positive, negative, death, death_confirmed, probable_cases, and death_probable measures are cumulative counts.
death_confirmed is the total number of deaths of individuals with COVID-19 infection confirmed by a laboratory test. In states where the information is available, it tracks only those laboratory-confirmed deaths where COVID also contributed to the death according to the death certificate. death_probable is the total number of deaths where COVID was listed as a cause of death and there is not a laboratory test confirming COVID-19 infection.
For further information on the COVID Tracking Project's measures, see https://covidtracking.com/about-data/data-definitions
The COVID-19 Tracking Project https://covidtracking.com
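Because the table is in long format (one row per date, state, and measure) and several measures are cumulative, a typical first step is to pick one measure and difference it to get daily counts. The sketch below uses invented numbers in the documented column layout; it is one possible approach, not the package's own code.

```r
# Invented long-format rows for one state and two cumulative measures.
toy <- data.frame(
  date    = rep(seq(as.Date("2020-04-01"), by = "day", length.out = 4), times = 2),
  state   = "NY",
  measure = rep(c("positive", "death"), each = 4),
  count   = c(100, 150, 175, 260, 2, 5, 5, 9),
  stringsAsFactors = FALSE
)

# Keep one measure for one state, ordered by date.
pos <- toy[toy$measure == "positive" & toy$state == "NY", ]
pos <- pos[order(pos$date), ]

# First differences turn the running total into daily new counts;
# the first day has no previous value, so it is NA.
pos$new_count <- c(NA, diff(pos$count))
pos[, c("date", "count", "new_count")]
```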
The COVID Racial Data Tracker advocates for, collects, publishes, and analyzes racial data on the pandemic across the United States. It’s a collaboration between the COVID Tracking Project and the Boston University Center for Antiracist Research.
covus_ethnicity
A tibble with 15,960 rows and 7 columns
date
date Data reported as of this date
state
character State
group
character Ethnic group
cases
integer Total cases, count
deaths
integer Total deaths, count
hosp
integer Total hospitalizations, count
tests
integer Total tests, count
Table: Data summary
Name | covus_ethnicity |
Number of rows | 15960 |
Number of columns | 7 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 2 |
numeric | 4 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-04-12 | 2021-03-07 | 2020-09-23 | 95 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1 | 2 | 2 | 0 | 56 | 0 |
group | 0 | 1 | 7 | 12 | 0 | 3 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 3080 | 0.81 | 73357.18 | 166184.31 | 0 | 5529 | 21920.5 | 70265.5 | 2619476 | ▇▁▁▁▁ |
deaths | 3144 | 0.80 | 1645.64 | 3463.93 | -1 | 63 | 291.5 | 1401.0 | 32664 | ▇▁▁▁▁ |
hosp | 11662 | 0.27 | 5079.37 | 8831.52 | 0 | 556 | 1556.0 | 4959.5 | 56406 | ▇▁▁▁▁ |
tests | 14271 | 0.11 | 892566.44 | 2376098.22 | 0 | 58933 | 224156.0 | 537668.0 | 21633943 | ▇▁▁▁▁ |
The group variable is coded as "Hispanic", "Non-Hispanic", or "Unknown". Hispanics may be of any race. State-level counts should be handled with care, given the widely varying population distribution of people of different ethnic backgrounds by state.
Kieran Healy
https://covidtracking.com/race
The COVID Racial Data Tracker advocates for, collects, publishes, and analyzes racial data on the pandemic across the United States. It’s a collaboration between the COVID Tracking Project and the Boston University Center for Antiracist Research.
covus_race
A tibble with 47,880 rows and 7 columns
date
date Data reported as of this date
state
character State
group
character Racial group
cases
integer Total cases, count
deaths
integer Total deaths, count
hosp
integer Total hospitalizations, count
tests
integer Total tests, count
Table: Data summary
Name | covus_race |
Number of rows | 47880 |
Number of columns | 7 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 2 |
numeric | 4 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-04-12 | 2021-03-07 | 2020-09-23 | 95 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1 | 2 | 2 | 0 | 56 | 0 |
group | 0 | 1 | 5 | 11 | 0 | 9 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 15684 | 0.67 | 30240.68 | 103176.64 | 0 | 568 | 3661 | 21026 | 2619476 | ▇▁▁▁▁ |
deaths | 17686 | 0.63 | 708.93 | 1836.84 | -1 | 12 | 68 | 440 | 24402 | ▇▁▁▁▁ |
hosp | 37253 | 0.22 | 2077.78 | 4654.37 | 0 | 67 | 345 | 1716 | 41099 | ▇▁▁▁▁ |
tests | 43549 | 0.09 | 349773.42 | 1269936.08 | 0 | 6298 | 36108 | 199214 | 18567612 | ▇▁▁▁▁ |
The group variable is coded as follows:
groups |
White |
Black |
Latino |
Asian |
AI/AN |
NH/PI |
Multiracial |
Other |
Unknown |
AI/AN is American Indian/Alaska Native. NH/PI is Native Hawaiian/Pacific Islander. State-level counts should be handled with care, given the widely varying population distribution of people of different racial backgrounds by state.
Kieran Healy
https://covidtracking.com/race
Format fmt_nc in df
fmt_nc(x)
x |
df |
use in fn documentation
formatted string
Kieran Healy
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
Format fmt_nr in df
fmt_nr(x)
x |
df |
use in fn documentation
formatted string
Kieran Healy
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
FUNCTION_DESCRIPTION
mmwr_week_to_date(year, week, day = NULL)
year |
PARAM_DESCRIPTION |
week |
PARAM_DESCRIPTION |
day |
PARAM_DESCRIPTION, Default: NULL |
DETAILS
OUTPUT_DESCRIPTION
Kieran Healy
http://
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
FUNCTION_DESCRIPTION
MMWRweek2Date(MMWRyear, MMWRweek, MMWRday = NULL)
MMWRyear |
PARAM_DESCRIPTION |
MMWRweek |
PARAM_DESCRIPTION |
MMWRday |
PARAM_DESCRIPTION, Default: NULL |
DETAILS
OUTPUT_DESCRIPTION
Kieran Healy
http://
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
FUNCTION_DESCRIPTION
MMWRweekday(date)
date |
PARAM_DESCRIPTION |
DETAILS
OUTPUT_DESCRIPTION
Kieran Healy
http://
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
Deaths involving coronavirus disease (COVID-19), pneumonia, and influenza reported to NCHS by sex and age group and state.
nchs_sas
A tibble with 115,668 rows and 15 variables:
data_as_of
date Date of data release
start_date
date First date of data period
end_date
date Last date of data period
group
character Unit of time observation: whether data in this row are measured By month, By total, or By year
year
integer Year of observation
month
integer Month of observation
state
character Jurisdiction of occurrence. One of: United States total, a US State, District of Columbia, and New York City, separate from New York state.
sex
character Sex
age_group
character Age group
covid_19_deaths
integer Deaths involving COVID-19 (ICD-10 code U07.1)
total_deaths
integer Deaths from all causes of death
pneumonia_deaths
integer Pneumonia Deaths (ICD-10 codes J12.0-J18.9)
pneumonia_and_covid_19_deaths
integer Deaths with Pneumonia and COVID-19 (ICD-10 codes J12.0-J18.9 and U07.1)
influenza_deaths
integer Influenza Deaths (ICD-10 codes J09-J11)
pneumonia_influenza_or_covid_19_deaths
integer Deaths with Pneumonia, Influenza, or COVID-19 (ICD-10 codes U07.1 or J09-J18.9)
Table: Data summary
Name | nchs_sas |
Number of rows | 115668 |
Number of columns | 15 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 6 |
numeric | 8 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2023-01-18 | 2023-01-18 | 2023-01-18 | 1 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
start_date | 0 | 1 | 10 | 10 | 0 | 37 | 0 |
end_date | 0 | 1 | 10 | 10 | 0 | 37 | 0 |
group | 0 | 1 | 7 | 8 | 0 | 3 | 0 |
state | 0 | 1 | 4 | 20 | 0 | 54 | 0 |
sex | 0 | 1 | 4 | 9 | 0 | 3 | 0 |
age_group | 0 | 1 | 8 | 17 | 0 | 17 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
year | 2754 | 0.98 | 2021.10 | 0.91 | 2020 | 2020 | 2021 | 2022 | 2023 | ▇▇▁▇▁ |
month | 13770 | 0.88 | 6.35 | 3.52 | 1 | 3 | 6 | 9 | 12 | ▇▅▅▅▇ |
covid_19_deaths | 31823 | 0.72 | 351.76 | 6263.51 | 0 | 0 | 10 | 60 | 1094723 | ▇▁▁▁▁ |
total_deaths | 17146 | 0.85 | 2812.18 | 52269.95 | 0 | 41 | 148 | 648 | 10144808 | ▇▁▁▁▁ |
pneumonia_deaths | 36293 | 0.69 | 349.71 | 6016.66 | 0 | 0 | 17 | 76 | 1030983 | ▇▁▁▁▁ |
pneumonia_and_covid_19_deaths | 30476 | 0.74 | 174.88 | 3162.39 | 0 | 0 | 0 | 26 | 550128 | ▇▁▁▁▁ |
influenza_deaths | 22407 | 0.81 | 4.94 | 103.26 | 0 | 0 | 0 | 0 | 18477 | ▇▁▁▁▁ |
pneumonia_influenza_or_covid_19_deaths | 35678 | 0.69 | 535.21 | 9239.91 | 0 | 0 | 25 | 112 | 1591892 | ▇▁▁▁▁ |
The numbers of deaths reported in this table are the total numbers of deaths received and coded as of the date of analysis, and do not represent all deaths that occurred in that period. Data during this period are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS, and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more. Missing values may indicate that a category has between 1 and 9 observed cases and has been suppressed in accordance with NCHS confidentiality standards.
As of September 2, 2020, this data file includes the following age groups in addition to the age groups that are routinely included: 0-17, 18-29, 30-49, and 50-64. The new age groups are consistent with categories used across CDC COVID-19 surveillance pages. When analyzing the file, select only the desired age groups, because summing across all age categories provided will double count deaths from certain age groups. Similarly, the state variable includes the United States as a whole, and New York City is counted separately from the rest of New York State. The temporal unit of observation also varies, with totals given by year, by month, and overall. Filter the data by the desired time unit, region, and age group first to ensure there is no double counting in subsequent calculations.
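The filtering step described in this note can be sketched as follows. This is a minimal illustration on made-up rows, not output from the package. The category labels used here ("By Total", "All Sexes", "All Ages") are assumptions, so check the values actually present in nchs_sas before reusing the filter.

```r
# Toy rows that mimic the layout of nchs_sas. All values are made up,
# and the category labels are assumed for illustration.
toy_sas <- data.frame(
  group = c("By Total", "By Total", "By Total", "By Year", "By Year"),
  year = c(NA, NA, NA, 2020, 2021),
  state = "New York",
  sex = "All Sexes",
  age_group = c("All Ages", "0-17 years", "Under 1 year", "All Ages", "All Ages"),
  covid_19_deaths = c(100, 3, 1, 60, 40),
  stringsAsFactors = FALSE
)

# A naive sum mixes overlapping age groups and overlapping time units.
naive_total <- sum(toy_sas$covid_19_deaths)

# Pick one time unit, one jurisdiction, one sex category, and one age category
# before doing any arithmetic.
one_slice <- subset(
  toy_sas,
  group == "By Total" & state == "New York" &
    sex == "All Sexes" & age_group == "All Ages"
)
clean_total <- sum(one_slice$covid_19_deaths)
```

In this toy example the naive sum is roughly double the filtered total, which is the kind of error the note warns about.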
Kieran Healy
National Center for Health Statistics https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku
Final counts of deaths by the week the deaths occurred, by state of occurrence, and by select causes of death for 2014-2018, and provisional counts of deaths by the week the deaths occurred, by state of occurrence, and by select underlying causes of death for 2019-2020. The dataset also includes weekly provisional counts of deaths for COVID-19, coded to ICD-10 code U07.1 as an underlying or multiple cause of death.
nchs_wdc
A data frame with 347,706 rows and 7 variables:
jurisdiction
character Jurisdiction of Occurrence
year
double MMWR Year
week
double MMWR Week
week_ending_date
double MMWR Week ending date
cause_detailed
character Cause with ICD Codes
n
double Count of deaths
cause
character Cause of death
For 2014-2019, death counts in this dataset were derived from the National Vital Statistics System database that provides the most timely access to the data. Therefore, counts may differ slightly from final data due to differences in processing, recoding, and imputation. For 2019-2021, the dataset also includes weekly provisional counts of deaths for COVID-19, coded to ICD-10 code U07.1 as an underlying or multiple cause of death. The numbers of deaths reported in this table are the total numbers of deaths received and coded as of the date of analysis, and do not represent all deaths that occurred in that period. Data for 2020 and 2021 are provisional and may be incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS, and processed for reporting purposes. Causes of death included in this dataset are tabulated by underlying cause of death ICD-10 codes. COVID-19 deaths by underlying cause and multiple cause of death are also included.
Kieran Healy
2014-2019: https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr. 2020-2021: https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/muzy-jte6
This report provides a weekly summary of deaths with coronavirus disease 2019 (COVID-19) by select geographic and demographic variables. In this release, counts of deaths are provided by the race and Hispanic origin of the decedent.
nchs_wss
A tibble with 15,582 rows and 12 variables:
data_as_of
date Date of analysis
start_date
character Start date of coverage, stored as a 10-character string
end_date
character End date of coverage, stored as a 10-character string
year
character Year. One of "2020", "2021", or "2020/2021".
month
double Month
obs_unit
character Unit of observation. One of: By Total, By Year, By Month.
state
character Geographical unit. One of: the United States, a U.S. State, the District of Columbia, or New York City. New York state measures do not include New York City.
race_ethnicity
character Race and ethnic group. One of: Non-Hispanic White, Non-Hispanic Black or African American, Non-Hispanic American Indian or Alaska Native, Non-Hispanic Asian, Non-Hispanic Native Hawaiian or Other Pacific Islander, Non-Hispanic more than one race, Hispanic or Latino.
deaths
integer Count of deaths
dist_pct
double Distribution of COVID-19 deaths (%): Deaths for each group as a percent of the total number of COVID-19 deaths reported.
uw_dist_pop_pct
double Unweighted distribution of population (%): Population of each group as a percent of the total population.
wt_dist_pop_pct
double Weighted distribution of population (%): Population of each group as percent of the total population after accounting for how the race and Hispanic origin population is distributed in relation to the geographic areas impacted by COVID-19.
Table: Data summary
Name | nchs_wss |
Number of rows | 15582 |
Number of columns | 12 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 6 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
data_as_of | 0 | 1 | 2023-01-18 | 2023-01-18 | 2023-01-18 | 1 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
start_date | 0 | 1 | 10 | 10 | 0 | 37 | 0 |
end_date | 0 | 1 | 10 | 10 | 0 | 37 | 0 |
year | 0 | 1 | 4 | 9 | 0 | 5 | 0 |
obs_unit | 0 | 1 | 7 | 8 | 0 | 3 | 0 |
state | 0 | 1 | 4 | 20 | 0 | 53 | 0 |
race_ethnicity | 0 | 1 | 18 | 54 | 0 | 7 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
month | 1855 | 0.88 | 6.35 | 3.52 | 1 | 3.0 | 6.0 | 9.0 | 12.0 | ▇▅▅▅▇ |
deaths | 4625 | 0.70 | 596.40 | 8680.87 | 0 | 0.0 | 14.0 | 100.0 | 718968.0 | ▇▁▁▁▁ |
dist_pct | 4625 | 0.70 | 17.59 | 29.22 | 0 | 0.0 | 1.1 | 19.7 | 100.0 | ▇▁▁▁▁ |
uw_dist_pop_pct | 0 | 1.00 | 14.28 | 23.57 | 0 | 0.9 | 3.1 | 12.7 | 92.7 | ▇▁▁▁▁ |
wt_dist_pop_pct | 0 | 1.00 | 13.68 | 21.60 | 0 | 0.5 | 3.2 | 14.4 | 93.6 | ▇▁▁▁▁ |
The deaths reported in this table are the total number of deaths received and coded as of the date of analysis and do not represent all deaths that occurred in that period. Data are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS, and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction, age, and cause of death. Provisional counts reported here track approximately 1–2 weeks behind other published data sources on the number of COVID-19 deaths in the U.S. COVID-19 deaths are defined as having confirmed or presumed COVID-19, and are coded to ICD–10 code U07.1. Unweighted population percentages are based on the Single-Race Population Estimates from the U.S. Census Bureau, for the year 2018 (available from: https://wonder.cdc.gov/single-race-population.html). Weighted population percentages are computed by multiplying county-level population counts by the count of COVID deaths for each county, summing to the state level, and then estimating the percent of the population within each racial and ethnic group. These weighted population distributions therefore more accurately reflect the geographic locations where COVID outbreaks are occurring. Jurisdictions are included in this table if more than 100 deaths were received and processed by NCHS as of the date of analysis.
Race and Hispanic-origin categories are based on the 1997 Office of Management and Budget (OMB) standards (1,2), allowing for the presentation of data by single race and Hispanic origin. These race and Hispanic-origin groups—non-Hispanic single-race white, non-Hispanic single-race black or African American, non-Hispanic single-race American Indian or Alaska Native (AIAN), non-Hispanic single-race Asian, and non-Hispanic single-race Native Hawaiian and Other Pacific Islander —differ from the bridged-race categories shown in most reports using mortality data.
New York State totals exclude New York City (provided in table separately).
Missing values may indicate that a category has between 1 and 9 observed cases and has been suppressed in accordance with NCHS confidentiality standards.
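The difference between the unweighted and weighted population distributions described above can be illustrated with a small worked example. This is only a sketch of the stated recipe (county population times county COVID deaths, summed to the state, then converted to percentages). The county names, group labels, and counts are invented, and the exact NCHS procedure may differ in its details.

```r
# Made-up county-level inputs for one state: two counties, two groups.
counties <- data.frame(
  county = c("A", "A", "B", "B"),
  group = c("g1", "g2", "g1", "g2"),
  pop = c(900, 100, 500, 500),
  covid_deaths = c(10, 10, 90, 90)  # county totals, repeated for each group row
)

# Unweighted: each group's share of the total state population.
uw <- tapply(counties$pop, counties$group, sum)
uw_dist_pop_pct <- 100 * uw / sum(uw)

# Weighted: scale each county's population by its COVID death count first,
# so counties with more deaths contribute more to the distribution.
w <- tapply(counties$pop * counties$covid_deaths, counties$group, sum)
wt_dist_pop_pct <- 100 * w / sum(w)
```

Here group g2 makes up 30 percent of the state population but 46 percent of the weighted distribution, because most deaths occur in county B, where g2 is half of the population.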
Kieran Healy
National Center for Health Statistics https://data.cdc.gov/NCHS/Provisional-Death-Counts-for-Coronavirus-Disease-C/pj7m-y5uh
National Syndromic Surveillance Program (NSSP): Emergency Department Visits and Percentage of Visits for COVID-19-Like Illness (CLI) or Influenza-like Illness (ILI)
nssp_covid_er_nat
A data frame with 54 rows and 9 variables:
week
integer COLUMN_DESCRIPTION
num_fac
integer COLUMN_DESCRIPTION
total_ed_visits
character COLUMN_DESCRIPTION
visits
integer COLUMN_DESCRIPTION
pct_visits
double COLUMN_DESCRIPTION
visit_type
character COLUMN_DESCRIPTION
region
character COLUMN_DESCRIPTION
source
character COLUMN_DESCRIPTION
year
integer COLUMN_DESCRIPTION
Table: Data summary
Name | nssp_covid_er_nat |
Number of rows | 54 |
Number of columns | 9 |
_______________________ | |
Column type frequency: | |
character | 4 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
total_ed_visits | 0 | 1 | 7 | 7 | 0 | 27 | 0 |
visit_type | 0 | 1 | 3 | 3 | 0 | 2 | 0 |
region | 0 | 1 | 8 | 8 | 0 | 1 | 0 |
source | 0 | 1 | 21 | 21 | 0 | 1 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
week | 0 | 1 | 26.04 | 19.81 | 1.00 | 7.25 | 14.00 | 45.75 | 52.00 | ▇▂▁▂▇ |
num_fac | 0 | 1 | 3346.89 | 48.97 | 3249.00 | 3329.50 | 3352.00 | 3389.50 | 3406.00 | ▃▁▆▃▇ |
visits | 0 | 1 | 41521.67 | 16344.25 | 17639.00 | 31216.00 | 39183.50 | 50532.00 | 86088.00 | ▅▇▃▂▁ |
pct_visits | 0 | 1 | 0.02 | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 | 0.05 | ▇▆▂▁▂ |
year | 0 | 1 | 2019.52 | 0.50 | 2019.00 | 2019.00 | 2020.00 | 2020.00 | 2020.00 | ▇▁▁▁▇ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html).
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/04102020/nssp-regions.html
National Syndromic Surveillance Program (NSSP) regional data: Emergency Department Visits and Percentage of Visits for COVID-19-Like Illness (CLI) or Influenza-like Illness (ILI)
nssp_covid_er_reg
A tibble with 538 rows and 9 variables:
week
integer COLUMN_DESCRIPTION
num_fac
integer COLUMN_DESCRIPTION
total_ed_visits
character COLUMN_DESCRIPTION
visits
integer COLUMN_DESCRIPTION
pct_visits
double COLUMN_DESCRIPTION
visit_type
character COLUMN_DESCRIPTION
region
character COLUMN_DESCRIPTION
source
character COLUMN_DESCRIPTION
year
integer COLUMN_DESCRIPTION
Table: Data summary
Name | nssp_covid_er_reg |
Number of rows | 538 |
Number of columns | 9 |
_______________________ | |
Column type frequency: | |
character | 4 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
total_ed_visits | 0 | 1 | 5 | 6 | 0 | 269 | 0 |
visit_type | 0 | 1 | 3 | 3 | 0 | 2 | 0 |
region | 0 | 1 | 8 | 9 | 0 | 10 | 0 |
source | 0 | 1 | 21 | 21 | 0 | 1 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
week | 0 | 1 | 25.99 | 19.66 | 1 | 7.00 | 14.00 | 46.00 | 52.00 | ▇▂▁▂▇ |
num_fac | 0 | 1 | 335.18 | 234.58 | 135 | 190.00 | 222.00 | 343.00 | 884.00 | ▇▃▁▂▂ |
visits | 0 | 1 | 4164.87 | 4028.53 | 279 | 1596.00 | 2780.00 | 4723.75 | 23345.00 | ▇▂▁▁▁ |
pct_visits | 0 | 1 | 0.02 | 0.01 | 0 | 0.01 | 0.02 | 0.02 | 0.11 | ▇▂▁▁▁ |
year | 0 | 1 | 2019.52 | 0.50 | 2019 | 2019.00 | 2020.00 | 2020.00 | 2020.00 | ▇▁▁▁▇ |
The U.S. Centers for Disease Control provides weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html).
Kieran Healy
Courtesy of Bob Rudis's cdccovidview package
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/04102020/nssp-regions.html
A dataset containing US county-level data on COVID-19, collected by the New York Times.
nytcovcounty
A tibble with 2,502,832 rows and 6 columns:
date
date Date in YYYY-MM-DD format
county
character County name
state
character State name
fips
character County FIPS code
cases
numeric Cumulative N reported cases
deaths
numeric Cumulative N reported deaths
Table: Data summary
Name | nytcovcounty |
Number of rows | 2502832 |
Number of columns | 6 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 3 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-01-21 | 2022-05-13 | 2021-04-23 | 844 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
county | 0 | 1.00 | 3 | 35 | 0 | 1932 | 0 |
state | 0 | 1.00 | 4 | 24 | 0 | 56 | 0 |
fips | 23678 | 0.99 | 5 | 5 | 0 | 3220 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 0 | 1.00 | 10033.80 | 47525.22 | 0 | 382 | 1773 | 5884 | 2908425 | ▇▁▁▁▁ |
deaths | 57605 | 0.98 | 161.61 | 820.33 | 0 | 6 | 33 | 101 | 40267 | ▇▁▁▁▁ |
The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data. For county data, note in particular:
New York: All cases for the five boroughs of New York City (New York, Kings, Queens, Bronx and Richmond counties) are assigned to a single area called New York City. There is a large jump in the number of deaths on April 6th due to switching from data from New York City to data from New York state for deaths. For all New York state counties, starting on April 8th we are reporting deaths by place of fatality instead of residence of individual.
Kansas City, Mo: Four counties (Cass, Clay, Jackson and Platte) overlap the municipality of Kansas City, Mo. The cases and deaths that we show for these four counties are only for the portions exclusive of Kansas City. Cases and deaths for Kansas City are reported as their own line.
Alameda County, Calif: Counts for Alameda County include cases and deaths from Berkeley and the Grand Princess cruise ship.
Douglas County, Neb.: Counts for Douglas County include cases brought to the state from the Diamond Princess cruise ship.
Chicago: All cases and deaths for Chicago are reported as part of Cook County.
Guam: Counts for Guam include cases reported from the USS Theodore Roosevelt.
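Because some areas in these notes (such as New York City and Kansas City) are reported as their own line rather than as standard counties, it can help to set aside rows without a county FIPS code before joining to other county-level tables. The sketch below uses made-up rows. It assumes that such special areas are the rows with a missing fips value, which should be checked against the actual data.

```r
# Toy rows in the layout of nytcovcounty. Counts are made up, and the
# missing fips values for the two special areas are an assumption.
toy_county <- data.frame(
  date = as.Date("2020-04-06"),
  county = c("New York City", "Kansas City", "Cook"),
  state = c("New York", "Missouri", "Illinois"),
  fips = c(NA, NA, "17031"),
  cases = c(500, 40, 300),
  deaths = c(20, 1, 10),
  stringsAsFactors = FALSE
)

# Split rows by whether they carry a usable county FIPS code.
has_fips <- !is.na(toy_county$fips)
county_rows <- toy_county[has_fips, ]    # safe to join on fips
special_rows <- toy_county[!has_fips, ]  # handle separately, e.g. by name
```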
A dataset containing US state-level data on COVID-19, collected by the New York Times.
nytcovstate
A tibble with 58,526 rows and 5 columns:
date
date Date in YYYY-MM-DD format
state
character State name
fips
character State FIPS code
cases
numeric Cumulative N reported cases
deaths
numeric Cumulative N reported deaths
Table: Data summary
Name | nytcovstate |
Number of rows | 58526 |
Number of columns | 5 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 2 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-01-21 | 2023-01-21 | 2021-08-16 | 1097 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1 | 4 | 24 | 0 | 56 | 0 |
fips | 0 | 1 | 2 | 2 | 0 | 56 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 0 | 1 | 834511.91 | 1394631.70 | 1 | 64160 | 324958 | 985279.8 | 11955605 | ▇▁▁▁▁ |
deaths | 0 | 1 | 11294.84 | 16797.98 | 0 | 1080 | 4790 | 14373.0 | 101982 | ▇▁▁▁▁ |
The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data.
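Since cases and deaths are cumulative, daily new counts have to be derived by differencing within each state. The following is a minimal base R sketch on made-up rows in the same layout. The new_cases column name is introduced here for illustration and is not part of the dataset.

```r
# Toy rows in the layout of nytcovstate. All counts are made up.
toy_state <- data.frame(
  date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                   "2020-03-01", "2020-03-02", "2020-03-03")),
  state = rep(c("Alabama", "Alaska"), each = 3),
  cases = c(1, 4, 9, 0, 2, 2)
)

# Differencing only makes sense on rows sorted by date within state.
toy_state <- toy_state[order(toy_state$state, toy_state$date), ]

# First observation keeps its cumulative value; later rows get the daily change.
toy_state$new_cases <- ave(
  toy_state$cases, toy_state$state,
  FUN = function(x) c(x[1], diff(x))
)
```

If a cumulative series is ever revised downward, the differenced value for that day will be negative, so it is worth inspecting the result before plotting or averaging.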
A dataset containing US national-level data on COVID-19, collected by the New York Times.
nytcovus
A tibble with 1,097 rows and 3 columns:
date
date Date in YYYY-MM-DD format
cases
numeric Cumulative N reported cases
deaths
numeric Cumulative N reported deaths
Table: Data summary
Name | nytcovus |
Number of rows | 1097 |
Number of columns | 3 |
_______________________ | |
Column type frequency: | |
Date | 1 |
numeric | 2 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
date | 0 | 1 | 2020-01-21 | 2023-01-21 | 2021-07-22 | 1097 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
cases | 0 | 1 | 44522009.0 | 35239239.4 | 1 | 8404635 | 34364829 | 80836264 | 101726588 | ▇▆▃▂▆ |
deaths | 0 | 1 | 602590.7 | 370532.5 | 0 | 222195 | 609870 | 989584 | 1111011 | ▆▂▅▃▇ |
The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data.
All-cause mortality is widely used by demographers and other researchers to understand the full impact of deadly events, including epidemics, wars and natural disasters. The totals in this data include deaths from Covid-19 as well as those from other causes, likely including people who could not be treated or did not seek treatment for other conditions.
nytexcess
A tibble with 7,258 rows and 12 columns
country
character Country Name
placename
character Place Name
frequency
character Reporting period. Weekly or monthly, depending on how the data is recorded.
start_date
date The first date included in the period.
end_date
date The last date included in the period.
year
character Year of data. Note that this variable is of type character and not integer because several observations are notes to the effect that the year is an average of two years.
month
integer Numerical month.
week
integer Numerical week.
deaths
integer The total number of confirmed deaths recorded from any cause.
expected_deaths
integer The baseline number of expected deaths, calculated from a historical average. See details below.
excess_deaths
integer The number of deaths minus the expected deaths.
baseline
character The years used to calculate expected_deaths.
Table: Data summary
Name | nytexcess |
Number of rows | 7258 |
Number of columns | 12 |
_______________________ | |
Column type frequency: | |
Date | 2 |
character | 5 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
start_date | 768 | 0.89 | 2010-01-09 | 2020-12-23 | 2018-02-05 | 1267 |
end_date | 768 | 0.89 | 2010-01-15 | 2020-12-29 | 2018-02-11 | 1267 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
country | 0 | 1.00 | 4 | 14 | 0 | 35 | 0 |
placename | 6883 | 0.05 | 6 | 8 | 0 | 4 | 0 |
frequency | 0 | 1.00 | 6 | 7 | 0 | 2 | 0 |
year | 0 | 1.00 | 4 | 17 | 0 | 15 | 0 |
baseline | 5990 | 0.17 | 20 | 25 | 0 | 7 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
month | 0 | 1.00 | 6.60 | 3.36 | 1 | 4.00 | 7.0 | 9.0 | 12 | ▇▆▆▆▇ |
week | 666 | 0.91 | 26.77 | 14.58 | 2 | 14.00 | 27.0 | 39.0 | 52 | ▇▇▇▇▇ |
deaths | 0 | 1.00 | 7968.24 | 14334.14 | 455 | 1460.00 | 2395.5 | 10486.0 | 141292 | ▇▁▁▁▁ |
expected_deaths | 5990 | 0.17 | 9237.09 | 15850.00 | 548 | 1443.00 | 2423.0 | 10771.5 | 139343 | ▇▁▁▁▁ |
excess_deaths | 5990 | 0.17 | 1195.43 | 3242.72 | -6721 | -42.25 | 76.5 | 926.0 | 30400 | ▇▂▁▁▁ |
Expected deaths for each area are based on historical data for the same time of year. These expected deaths are the basis for our excess death calculations, which estimate how many more people have died this year than in an average year.
The number of years used in the historical averages changes depending on what data is available, whether it is reliable and underlying demographic changes. See Data Sources for the years used to calculate the baselines. The baselines do not adjust for changes in age or other demographics, and they do not account for changes in total population.
The number of expected deaths is not adjusted for how non-Covid-19 deaths may change during the outbreak, which will take some time to figure out. As countries impose control measures, deaths from causes like road accidents and homicides may decline. And people who die from Covid-19 cannot die later from other causes, which may reduce other causes of death. Both of these factors, if they play a role, would lead these baselines to understate, rather than overstate, the number of excess deaths.
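The relationship between deaths, expected_deaths, and excess_deaths can be sketched as below. The example assumes the baseline is a plain mean of the same week across a few earlier years. The actual baseline years and method vary by place, so treat this as an illustration of the arithmetic only, on made-up counts.

```r
# Made-up weekly death counts for one place.
toy_deaths <- data.frame(
  year = c(2017, 2018, 2019, 2020, 2017, 2018, 2019, 2020),
  week = c(14, 14, 14, 14, 15, 15, 15, 15),
  deaths = c(1000, 1100, 1050, 1500, 980, 1020, 1000, 1400)
)

# Assumed baseline: the mean for the same week over 2017-2019.
baseline <- subset(toy_deaths, year %in% 2017:2019)
expected <- aggregate(deaths ~ week, data = baseline, FUN = mean)
names(expected)[2] <- "expected_deaths"

# Excess deaths are observed deaths minus the expected deaths for that week.
current <- merge(subset(toy_deaths, year == 2020), expected, by = "week")
current$excess_deaths <- current$deaths - current$expected_deaths
```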
Kieran Healy
The New York Times https://github.com/nytimes/covid-19-data/tree/master/excess-deaths.
For further details on these data see https://github.com/nytimes/covid-19-data/tree/master/excess-deaths
FUNCTION_DESCRIPTION
start_date(year)
year |
PARAM_DESCRIPTION |
DETAILS
OUTPUT_DESCRIPTION
AUTHOR_NAME
http://
## Not run: if(interactive()){ #EXAMPLE1 } ## End(Not run)
Human Mortality Database (HMD) series of weekly death counts across countries.
stmf
A tibble with 580,395 rows and 17 variables:
country_code
character Mortality database country code
cname
character Country name
iso2
character ISO2 country code
continent
character Continent
iso3
character ISO3 country code
year
double Year
week
double Week number. Each year in the STMF refers to 52 weeks, each week has 7 days. In some cases, the first week of a year may include several days from the previous year or the last week of a year may include days (and, respectively, deaths) of the next year. In particular, it means that a statistical year in the STMF is equal to the statistical year in annual country-specific statistics.
sex
character Sex. m = Males. f = Females. b = Both combined.
split
double Indicates if data were split from aggregated age groups (0 if the original data have the necessary detailed age scale). For example, if the original age scale was 0-4, 5-29, 30-65, 65+, then split will be equal to 1.
split_sex
double Indicates if the original data are available by sex (0) or data are interpolated (1).
forecast
double Equals 1 for all years where forecasted population exposures were used to calculate weekly death rates.
approx_date
double Approximate date (derived from the year and week number).
age_group
character Age group for death counts and rates
death_count
double Weekly death count. This number need not be an integer, because the age categories may be aggregated or split across the source national data.
death_rate
double Weekly death rate.
deaths_total
double Count of deaths for all ages combined.
rate_total
double Crude death rate.
For further details on the construction of this dataset see the codebook at https://www.mortality.org/Public/STMF_DOC/STMFNote.pdf. For the original input data files in standardized form, see https://www.mortality.org/Public/STMF/Inputs/STMFinput.zip.
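Because each country-week appears once per sex code and once per age group, weekly totals need a filter on sex before any summing. The sketch below uses made-up rows with only two age groups. The age group labels and counts are invented for illustration.

```r
# Toy rows in the layout of stmf for a single country-week. Values are made up.
toy_stmf <- data.frame(
  cname = "Finland",
  year = 2020,
  week = 1,
  sex = rep(c("m", "f", "b"), each = 2),
  age_group = rep(c("0-14", "15-64"), times = 3),
  death_count = c(1, 20, 2, 15, 3, 35),
  deaths_total = rep(c(21, 17, 38), each = 2),
  stringsAsFactors = FALSE
)

# Keep only the combined-sex rows so males and females are not counted twice.
both <- subset(toy_stmf, sex == "b")

# Summing age-specific counts reproduces the all-ages total for that week.
weekly <- aggregate(death_count ~ cname + year + week, data = both, FUN = sum)
```

Note that deaths_total is repeated on every age-group row, so it should be taken once per country-week rather than summed.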
Countries and years covered in the dataset:
cname | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
Australia | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y |
Austria | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Belgium | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Bulgaria | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Canada | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Chile | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y |
Croatia | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Czech Republic | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Denmark | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
England and Wales | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Estonia | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Finland | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
France | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Germany | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Greece | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y |
Hungary | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Iceland | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Israel | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Italy | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Korea, Republic of | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Latvia | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Lithuania | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Luxembourg | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Netherlands | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
New Zealand | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Northern Ireland | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y |
Norway | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Poland | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Portugal | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Russian Federation | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | - | - |
Scotland | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Slovakia | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Slovenia | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Spain | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Sweden | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Switzerland | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
Taiwan, Province of China | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | - |
United States | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | Y | Y | Y | Y | Y | Y | Y | Y |
Variables
Table: Data summary
Name | stmf |
Number of rows | 580395 |
Number of columns | 17 |
_______________________ | |
Column type frequency: | |
Date | 1 |
character | 7 |
numeric | 9 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
approx_date | 0 | 1 | 1990-01-07 | 2023-01-01 | 2012-10-07 | 1722 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
country_code | 0 | 1.00 | 3 | 7 | 0 | 38 | 0 |
cname | 0 | 1.00 | 5 | 25 | 0 | 38 | 0 |
iso2 | 34380 | 0.94 | 2 | 2 | 0 | 35 | 0 |
continent | 35850 | 0.94 | 4 | 13 | 0 | 5 | 0 |
iso3 | 34380 | 0.94 | 3 | 3 | 0 | 35 | 0 |
sex | 0 | 1.00 | 1 | 1 | 0 | 3 | 0 |
age_group | 0 | 1.00 | 3 | 5 | 0 | 5 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
year | 0 | 1 | 2011.58 | 6.88 | 1990 | 2006.00 | 2012.00 | 2017.00 | 2022.00 | ▁▂▆▆▇ |
week | 0 | 1 | 26.50 | 15.03 | 1 | 13.00 | 26.00 | 39.00 | 53.00 | ▇▇▇▇▇ |
split | 0 | 1 | 0.12 | 0.32 | 0 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▁ |
split_sex | 0 | 1 | 0.00 | 0.07 | 0 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▁ |
forecast | 0 | 1 | 0.10 | 0.30 | 0 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▁ |
death_count | 0 | 1 | 617.60 | 1585.49 | 0 | 39.00 | 162.00 | 449.75 | 26362.00 | ▇▁▁▁▁ |
death_rate | 0 | 1 | 0.05 | 0.07 | 0 | 0.00 | 0.02 | 0.07 | 0.57 | ▇▂▁▁▁ |
deaths_total | 0 | 1 | 3088.00 | 6498.29 | 2 | 472.00 | 998.00 | 2543.00 | 87413.00 | ▇▁▁▁▁ |
rate_total | 0 | 1 | 0.01 | 0.00 | 0 | 0.01 | 0.01 | 0.01 | 0.04 | ▅▇▁▁▁ |
Kieran Healy
Human Mortality Database, http://mortality.org
"Short-term Mortality Fluctuations Dataseries" n.d., https://www.mortality.org/Public/STMF_DOC/STMFNote.pdf
Make a table of stmf country years
stmf_country_years(df = stmf)
df |
The stmf data frame |
Get a table of country x year coverage for stmf
A tibble
Kieran Healy
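The idea of a country-by-year coverage table can be sketched in base R. This is an illustration only, not the package's implementation: `coverage_sketch()` and the `toy_stmf` data frame are hypothetical, the rows are made up, and the sketch returns a character matrix where `stmf_country_years()` is documented to return a tibble. The column names `cname` and `year` are taken from the stmf variable summary.

```r
# Toy stand-in for stmf: made-up country/year rows, not HMD data.
toy_stmf <- data.frame(
  cname = c("Spain", "Spain", "Spain", "Sweden", "Sweden", "United States"),
  year  = c(2000, 2000, 2001, 2000, 2001, 2001),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per country, one column per year,
# "Y" where at least one observation exists and "-" otherwise.
coverage_sketch <- function(df) {
  counts <- table(df$cname, df$year)
  marks <- matrix(
    "-",
    nrow = nrow(counts),
    ncol = ncol(counts),
    dimnames = list(rownames(counts), colnames(counts))
  )
  marks[counts > 0] <- "Y"
  marks
}

coverage <- coverage_sketch(toy_stmf)
print(coverage)
```

With the package attached, calling `stmf_country_years(df = stmf)` as in the usage line is the documented way to get the real table.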
Make an Rd table from a data frame
tabular(df, ...)
df |
Data frame |
... |
Other args |
Rd table
Kieran Healy
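As a rough illustration of what turning a data frame into an Rd table involves, the sketch below builds a `\tabular{}` string in base R. It is not the package's code: `rd_table_sketch()` and the `toy` data frame are hypothetical, and passing the extra arguments on to `format()` is an assumption about what "Other args" might mean.

```r
# Hypothetical helper: build an Rd \tabular{} block from a data frame.
rd_table_sketch <- function(df, ...) {
  stopifnot(is.data.frame(df))

  # Right-align numeric columns, left-align everything else.
  col_align <- vapply(
    df,
    function(x) if (is.numeric(x)) "r" else "l",
    character(1)
  )

  # Convert every column to character; extra arguments go to format()
  # (an assumption, see the lead-in).
  cols <- lapply(df, format, ...)

  # Rd separates cells with \tab and rows with \cr.
  contents <- do.call(
    paste,
    c(unname(cols), list(sep = " \\tab ", collapse = "\\cr\n  "))
  )

  paste0(
    "\\tabular{", paste(col_align, collapse = ""), "}{\n  ",
    contents, "\n}\n"
  )
}

toy <- data.frame(name = c("a", "b"), n = c(1, 20))
rd <- rd_table_sketch(toy)
cat(rd)
```

The resulting string could be pasted into the documentation of a dataset, which is presumably how the variable tables in this manual were produced.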
Population estimates for US States as of July 1st 2018
uspop
A tibble with 459 rows and 17 variables:
state
character State Name
state_abbr
character State Abbreviation
statefips
character 2-digit FIPS code
region_name
character Census region
division_name
character Census Division
sex_id
character Sex id
sex
character Sex label
hisp_id
character Ethnicity: Hispanic id
hisp_label
character Hispanic label
fips
character Full FIPS code
pop
double Total population
white
double Race alone: White
black
double Race alone: Black or African-American
amind
double Race alone: American Indian and Alaska Native
asian
double Race alone: Asian
nhopi
double Race alone: Native Hawaiian and Other Pacific Islander
tom
double Two or more races
Table: Data summary
Name | uspop |
Number of rows | 459 |
Number of columns | 17 |
_______________________ | |
Column type frequency: | |
character | 10 |
numeric | 7 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
state | 0 | 1.00 | 4 | 20 | 0 | 51 | 0 |
state_abbr | 9 | 0.98 | 2 | 2 | 0 | 50 | 0 |
statefips | 0 | 1.00 | 2 | 2 | 0 | 51 | 0 |
region_name | 9 | 0.98 | 4 | 9 | 0 | 4 | 0 |
division_name | 9 | 0.98 | 7 | 18 | 0 | 9 | 0 |
sex_id | 0 | 1.00 | 4 | 6 | 0 | 3 | 0 |
sex | 0 | 1.00 | 4 | 10 | 0 | 3 | 0 |
hisp_id | 0 | 1.00 | 4 | 7 | 0 | 3 | 0 |
hisp_label | 0 | 1.00 | 5 | 12 | 0 | 3 | 0 |
fips | 0 | 1.00 | 11 | 11 | 0 | 51 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
pop | 0 | 1 | 2851132.32 | 4198641.26 | 6154 | 386961.5 | 1349442 | 3558480.0 | 39557045 | ▇▁▁▁▁ |
white | 0 | 1 | 2179861.40 | 3116129.25 | 5120 | 296294.0 | 1088503 | 2759335.5 | 28531740 | ▇▁▁▁▁ |
black | 0 | 1 | 381736.98 | 644380.66 | 260 | 11907.0 | 80714 | 486281.5 | 3673855 | ▇▁▁▁▁ |
amind | 0 | 1 | 36143.97 | 65036.83 | 161 | 6103.5 | 15273 | 35770.5 | 651076 | ▇▁▁▁▁ |
asian | 0 | 1 | 168458.39 | 515557.14 | 79 | 5045.5 | 26484 | 140424.5 | 6063600 | ▇▁▁▁▁ |
nhopi | 0 | 1 | 6966.61 | 18657.18 | 23 | 669.0 | 2029 | 5063.5 | 199872 | ▇▁▁▁▁ |
tom | 0 | 1 | 77964.97 | 131251.16 | 455 | 12091.0 | 33757 | 98669.5 | 1554757 | ▇▁▁▁▁ |
U.S. Census estimates. Be aware of the US Census classifications of Race and Ethnicity. For the estimated total population for each State, jointly filter on totsex in sex_id and tothisp in hisp_id and then select pop.
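The joint filter described above can be sketched in base R. The `toy_uspop` data frame is a hypothetical stand-in with made-up numbers (not Census estimates). It borrows only the column names and the `totsex` and `tothisp` codes from the documentation; the other codes (`male`, `female`, `hisp`) are illustrative guesses.

```r
# Toy stand-in for uspop: same column names, made-up numbers.
toy_uspop <- data.frame(
  state   = rep(c("State A", "State B"), each = 4),
  sex_id  = rep(c("totsex", "totsex", "male", "female"), times = 2),
  hisp_id = rep(c("tothisp", "hisp", "tothisp", "tothisp"), times = 2),
  pop     = c(1000, 200, 490, 510, 3000, 900, 1480, 1520),
  stringsAsFactors = FALSE
)

# Keep only rows that are totals on BOTH dimensions, then select pop.
state_totals <- subset(
  toy_uspop,
  sex_id == "totsex" & hisp_id == "tothisp",
  select = c(state, pop)
)
print(state_totals)
```

Filtering on only one of the two conditions would leave several rows per state, so summing `pop` afterwards would count people more than once. Applied to `uspop` itself, the same condition should leave one row per state.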
Kieran Healy
https://www.census.gov/data/datasets/time-series/demo/popest/2010s-state-detail.html
https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/asrh/PEPSR6H.pdf