Package 'covdata' reference manual

Title:	COVID-19 Data
Description:	COVID-19 related data from the ECDC, the COVID-19 Tracking Project, the New York Times, the Human Mortality Database, and Apple. Packaged for R.
Authors:	Kieran Healy [aut, cre]
Maintainer:	Kieran Healy <[email protected]>
License:	MIT + file LICENSE
Version:	1.01
Built:	2025-03-22 04:02:47 UTC
Source:	https://github.com/kjhealy/covdata

`%nin%`

Description

Convenience 'not-in' operator

Usage

x %nin% y

Arguments

`x`	vector of items
`y`	vector of all values

Details

Complement of the built-in operator %in%. Returns a logical vector indicating which elements of x are not in y.
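
A minimal sketch of how a not-in operator of this kind can be used for filtering. The operator `%notin%` here is a hypothetical stand-in defined locally as the negation of %in%; it is an assumption for illustration, not the package's source code.

```r
# Hypothetical stand-in for illustration; the package's own definition may differ.
`%notin%` <- function(x, y) !(x %in% y)

fruit <- c("apples", "oranges", "banana")
basket <- c("apples", "pears", "plums")

# One logical value per element of the left-hand side
not_fruit <- basket %notin% fruit
print(not_fruit)

# Use the logical vector to keep only the unmatched items
print(basket[not_fruit])
```

Because the result is a logical vector, it can be used directly inside subsetting or filtering expressions.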

Value

logical vector, TRUE for each item in x that is not in y

Author(s)

Kieran Healy

Examples

fruit <- c("apples", "oranges", "banana")
"apples" %nin% fruit
"pears" %nin% fruit

Apple Mobility Data

Description

Data from Apple Maps on relative changes in mobility in various cities and countries.

Usage

apple_mobility

Format

A data frame with 2,254,515 rows and 7 variables:

country: character Country name (not provided for all countries)
sub_region: character Subregion names
subregion_and_city: character Subregion and city names
geo_type: character Type of geographical unit. Values: city, country/region, sub-region
transportation_type: character Mode of transport. Values: driving, transit, or walking
date: double Date in yyyy-mm-dd format
score: double Activity score. Indexed to 100 on the first date of observation for a given mode of transport.

Details

Table: Data summary


Name	apple_mobility
Number of rows	2254515
Number of columns	7
_______________________
Column type frequency:
Date	1
character	5
numeric	1
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-01-13	2022-04-12	2021-02-26	819

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
country	0	1	5	20	0	63	0
sub_region	0	1	4	46	0	606	0
subregion_and_city	0	1	4	46	0	853	0
geo_type	0	1	4	14	0	3	0
transportation_type	0	1	7	7	0	3	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
score	608041	0.73	122.59	66.81	2.43	83.79	113.72	148.8	2228.83	▇▁▁▁▁

Data made available by Apple, Inc. at https://www.apple.com/covid19/mobility, showing relative volume of directions requests per country/region or city compared to a baseline volume on January 13th, 2020. Apple defines the day as midnight-to-midnight, Pacific time. Cities represent usage in greater metropolitan areas and are stably defined during this period. In many countries/regions and cities, relative volume has increased since January 13th, consistent with normal, seasonal usage of Apple Maps. Day of week effects are important to normalize as you use this data. Data that is sent from users’ devices to the Apple Maps service is associated with random, rotating identifiers so Apple does not have a profile of individual movements and searches. Apple Maps has no demographic information about its users, and so cannot make any statements about the representativeness of its usage against the overall population.
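
One common way to dampen day-of-week effects is a centered 7-day moving average, so that every window contains each weekday exactly once. The sketch uses made-up scores for a single hypothetical series rather than the packaged data. With apple_mobility itself, the same step would presumably be applied separately within each region and transportation type, and missing scores would need handling first.

```r
# Toy data standing in for one region/mode series; the values are made up.
dates <- seq(as.Date("2020-01-13"), by = "day", length.out = 28)

# Baseline of 100 with a weekend bump, to mimic a day-of-week pattern.
# "%w" gives the weekday as a number, with 0 = Sunday and 6 = Saturday.
is_weekend <- format(dates, "%w") %in% c("0", "6")
score <- 100 + ifelse(is_weekend, 20, 0)

# Centered 7-day moving average: equal weights over a 7-day window
weights <- rep(1 / 7, 7)
score_7day <- as.numeric(stats::filter(score, weights, sides = 2))

smoothed <- data.frame(date = dates, score = score, score_7day = score_7day)
print(head(smoothed, 10))
```

The first and last three days have no complete window, so their smoothed values are NA.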

Author(s)

Kieran Healy

Source

https://www.apple.com/covid19/mobility

References

See https://www.apple.com/covid19/mobility for detailed terms of use.

CDC surveillance network and network catchment area

Description

What the CDC surveillance network covers

Usage

cdc_catchments

Format

A data frame with 17 rows and 2 variables:

name: character Network name
area: character Area

Details

Table: Data summary


Name	cdc_catchments
Number of rows	17
Number of columns	2
_______________________
Column type frequency:
character	2
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
name	0	1	3	9	0	3	0
area	0	1	4	14	0	15	0

The Coronavirus Disease 2019 (COVID-19)-Associated Hospitalization Surveillance Network (COVID-NET) conducts population-based surveillance for laboratory-confirmed COVID-19-associated hospitalizations in children (persons younger than 18 years) and adults. The current network covers nearly 100 counties in the 10 Emerging Infections Program (EIP) states (CA, CO, CT, GA, MD, MN, NM, NY, OR, and TN) and four additional states through the Influenza Hospitalization Surveillance Project (IA, MI, OH, and UT). The network represents approximately 10% of the US population (~32 million people). Cases are identified by reviewing hospital, laboratory, and admission databases and infection control logs for patients hospitalized with a documented positive SARS-CoV-2 test. Data gathered are used to estimate age-specific hospitalization rates on a weekly basis and describe characteristics of persons hospitalized with COVID-19. Laboratory confirmation is dependent on clinician-ordered SARS-CoV-2 testing. Therefore, the unadjusted rates provided are likely to be underestimated, as COVID-19-associated hospitalizations can be missed due to test availability and provider or facility testing practices. COVID-NET hospitalization data are preliminary and subject to change as more data become available. All incidence rates are unadjusted. Please use the following citation when referencing these data: “COVID-NET: COVID-19-Associated Hospitalization Surveillance Network, Centers for Disease Control and Prevention. WEBSITE. Accessed on DATE”.

name	area
COVID-NET	Entire Network
EIP	California
EIP	Colorado
EIP	Connecticut
EIP	Entire Network
EIP	Georgia
EIP	Maryland
EIP	Minnesota
EIP	New Mexico
EIP	New York
EIP	Oregon
EIP	Tennessee
IHSP	Entire Network
IHSP	Iowa
IHSP	Michigan
IHSP	Ohio
IHSP	Utah

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

CDC Surveillance Network Death Counts by Age

Description

Provisional Death Counts for Coronavirus Disease (COVID-19)

Usage

cdc_deaths_by_age

Format

A data frame with 12 rows and 10 variables:

data_as_of: date When the data were most recently recorded
age_group: character Age range
start_week: date Start week
end_week: date End week
covid_deaths: integer COVID deaths
total_deaths: integer Total deaths
percent_expected_deaths: double Percent of expected deaths
pneumonia_deaths: integer Pneumonia deaths
pneumonia_and_covid_deaths: integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11: integer Influenza deaths (ICD-10 codes J09-J11)

Details

Table: Data summary


Name	cdc_deaths_by_age
Number of rows	12
Number of columns	10
_______________________
Column type frequency:
Date	3
character	1
numeric	6
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2020-04-30	2020-04-30	2020-04-30	1
start_week	0	1	2020-02-01	2020-02-01	2020-02-01	1
end_week	0	1	2020-04-25	2020-04-25	2020-04-25	1

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
age_group	0	1	5	10	0	12	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
covid_deaths	0	1	5753.50	9877.31	2.00	30.25	1211.50	7918.25	34521.00	▇▃▁▁▁
total_deaths	0	1	118897.67	202377.07	712.00	5675.25	28460.00	149341.50	713386.00	▇▂▁▁▁
percent_expected_deaths	0	1	0.97	0.00	0.97	0.97	0.97	0.97	0.97	▁▁▇▁▁
pneumonia_deaths	0	1	10454.17	18036.25	33.00	109.00	1799.50	14114.25	62725.00	▇▃▁▁▁
pneumonia_and_covid_deaths	0	1	2550.17	4387.93	0.00	12.50	491.50	3515.75	15301.00	▇▃▁▁▁
all_influenza_deaths_j09_j11	0	1	970.17	1618.90	11.00	40.75	358.50	1222.75	5821.00	▇▃▁▁▁

The U.S. Centers for Disease Control provides a weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true

CDC provisional death counts by sex

Description

Provisional Death Counts for Coronavirus Disease (COVID-19)

Usage

cdc_deaths_by_sex

Format

A data frame with 3 rows and 10 variables:

data_as_of: date Date most recently updated
sex: character Sex
start_week: date Beginning week
end_week: date Ending week
covid_deaths: integer COVID deaths
total_deaths: integer Total deaths
percent_expected_deaths: double Percent of expected deaths
pneumonia_deaths: integer Pneumonia deaths
pneumonia_and_covid_deaths: integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11: integer Influenza deaths (ICD-10 codes J09-J11)

Details

Table: Data summary


Name	cdc_deaths_by_sex
Number of rows	3
Number of columns	10
_______________________
Column type frequency:
Date	3
character	1
numeric	6
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2020-04-30	2020-04-30	2020-04-30	1
start_week	0	1	2020-02-01	2020-02-01	2020-02-01	1
end_week	0	1	2020-04-25	2020-04-25	2020-04-25	1

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
sex	0	1	4	7	0	3	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
covid_deaths	0	1	11507.33	10231.40	1.00	7470.50	14940.00	17260.50	19581.00	▇▁▁▇▇
total_deaths	0	1	237795.00	206241.06	25.00	172555.00	345085.00	356680.00	368275.00	▃▁▁▁▇
percent_expected_deaths	0	1	0.97	0.00	0.97	0.97	0.97	0.97	0.97	▁▁▇▁▁
pneumonia_deaths	0	1	20908.33	18248.40	1.00	14545.00	29089.00	31362.00	33635.00	▃▁▁▁▇
pneumonia_and_covid_deaths	0	1	5100.33	4559.67	1.00	3258.00	6515.00	7650.00	8785.00	▇▁▁▇▇
all_influenza_deaths_j09_j11	0	1	1940.33	1682.21	0.00	1416.00	2832.00	2910.50	2989.00	▃▁▁▁▇

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true

CDC provisional death counts by state

Description

CDC Surveillance Network provisional death counts

Usage

cdc_deaths_by_state

Format

A data frame with 53 rows and 10 variables:

data_as_of: date Date most recently updated
state: character State name
start_week: date Start week
end_week: date End week
covid_deaths: integer COVID deaths
total_deaths: integer Total deaths
percent_expected_deaths: double Percent of expected deaths
pneumonia_deaths: integer Pneumonia deaths
pneumonia_and_covid_deaths: integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11: integer Influenza deaths (ICD-10 codes J09-J11)

Details

Table: Data summary


Name	cdc_deaths_by_state
Number of rows	53
Number of columns	10
_______________________
Column type frequency:
Date	3
character	1
numeric	6
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2020-04-30	2020-04-30	2020-04-30	1
start_week	0	1	2020-02-01	2020-02-01	2020-02-01	1
end_week	0	1	2020-04-25	2020-04-25	2020-04-25	1

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1	4	20	0	53	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
covid_deaths	6	0.89	735.02	1801.11	0	54.50	153.00	519.00	10978.00	▇▁▁▁▁
total_deaths	0	1.00	13557.43	13996.83	856	3813.00	10721.00	17624.00	69341.00	▇▂▁▁▁
percent_expected_deaths	0	1.00	0.93	0.27	0	0.86	0.95	0.99	2.19	▁▂▇▁▁
pneumonia_deaths	0	1.00	1197.26	1453.17	41	277.00	769.00	1306.00	6076.00	▇▁▁▁▁
pneumonia_and_covid_deaths	10	0.81	355.81	759.51	0	30.50	65.00	296.00	4019.00	▇▁▁▁▁
all_influenza_deaths_j09_j11	3	0.94	116.58	142.24	14	30.50	87.50	125.50	850.00	▇▁▁▁▁

The U.S. Centers for Disease Control provides a weekly summary and interpretation of key indicators that have been adapted to track the COVID-19 pandemic in the United States. Data is retrieved using the cdccovidview package from both COVIDView (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html) and COVID-NET (https://gis.cdc.gov/grasp/COVIDNet/COVID19_3.html). Please see the indicated reference for all the caveats and precise meanings for each field.

Author(s)

Kieran Healy

References

https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true

CDC provisional death counts by week

Description

Provisional Death Counts for Coronavirus Disease (COVID-19)

Usage

cdc_deaths_by_week

Format

A data frame with 13 rows and 10 variables:

data_as_of: date When the data were most recently recorded
start_week: date Start week
end_week: date End week
covid_deaths: integer COVID deaths
total_deaths: integer Total deaths
percent_expected_deaths: double Percent of expected deaths
pneumonia_deaths: integer Pneumonia deaths
pneumonia_and_covid_deaths: integer Deaths with pneumonia and COVID-19
all_influenza_deaths_j09_j11: integer Influenza deaths (ICD-10 codes J09-J11)
pneumonia_influenza_and_covid_19_deaths: integer Pneumonia, influenza, and COVID-19 deaths

Details

Table: Data summary


Name	cdc_deaths_by_week
Number of rows	13
Number of columns	10
_______________________
Column type frequency:
Date	3
numeric	7
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2020-04-30	2020-04-30	2020-04-30	1
start_week	0	1	2020-02-01	2020-04-25	2020-03-14	13
end_week	0	1	2020-02-01	2020-04-25	2020-03-14	13

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
covid_deaths	0	1	2655.46	4194.37	0.00	0.00	49.00	2659.00	11864.00	▇▁▁▂▁
total_deaths	0	1	54875.85	9864.46	24387.00	53940.00	56831.00	57299.00	65676.00	▁▁▁▇▂
percent_expected_deaths	0	1	0.97	0.17	0.45	0.97	0.97	0.99	1.19	▁▁▁▇▂
pneumonia_deaths	0	1	4825.00	2217.19	2219.00	3671.00	3692.00	5598.00	9580.00	▇▃▁▁▂
pneumonia_and_covid_deaths	0	1	1177.00	1863.76	0.00	0.00	25.00	1220.00	5281.00	▇▁▁▂▁
all_influenza_deaths_j09_j11	0	1	447.77	156.19	58.00	427.00	494.00	536.00	619.00	▁▁▁▇▇
pneumonia_influenza_and_covid_19_deaths	0	1	6690.23	4292.62	3553.00	4165.00	4275.00	7397.00	16272.00	▇▁▁▂▁

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://data.cdc.gov/api/views/hc4f-j6nb/rows.csv?accessType=DOWNLOAD&bom=true&format=true

Country Names and ISO codes

Description

Convenience table of country names and their abbreviated names

Usage

countries

Format

A data frame with 213 rows and 4 variables:

cname: character Country name
iso3: character ISO 3 designation
iso2: character ISO 2 designation
continent: character Continent

Details

Table: Data summary


Name	dplyr::ungroup(countries)
Number of rows	213
Number of columns	4
_______________________
Column type frequency:
character	4
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
cname	0	1.00	4	42	0	213	0
iso3	0	1.00	3	3	0	213	0
iso2	2	0.99	2	2	0	211	0
continent	0	1.00	4	13	0	6	0

Produced from the ECDC tables in the covdata package.
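
A lookup table of this shape is typically joined onto other data by its ISO code. The sketch uses two made-up miniature tables (`lookup` and `obs` are hypothetical names, and the case counts are invented) to show a left join with base R's merge(); the same pattern would presumably apply to countries itself.

```r
# Miniature stand-in for a country lookup table with the same columns
lookup <- data.frame(
  cname = c("France", "Japan"),
  iso3 = c("FRA", "JPN"),
  iso2 = c("FR", "JP"),
  continent = c("Europe", "Asia")
)

# Made-up observations keyed by ISO3 code
obs <- data.frame(
  iso3 = c("JPN", "FRA", "JPN"),
  cases = c(5, 7, 2)
)

# Left join: keep every observation, attach name and continent by iso3
joined <- merge(obs, lookup, by = "iso3", all.x = TRUE)
print(joined)
```

Note that merge() sorts the result by the key column, so row order may differ from the input.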

Author(s)

Kieran Healy

References

ISO 2: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
ISO 3: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3

Daily international COVID-19 cases and deaths for 2020

Description

A dataset containing daily national-level ECDC data on COVID-19. Archived as of December 14th 2020. ECDC switched to a weekly reporting schedule for the COVID-19 situation worldwide and in the EU/EEA and the UK on 17 December 2020. Daily updates have been discontinued from 14 December 2020.

Usage

covnat_daily

Format

A tibble with 61,836 rows and 8 columns

date: date in YYYY-MM-DD format
cname: Name of country (character)
iso3: ISO3 country code (character)
cases: N reported COVID-19 cases for this day
deaths: N reported COVID-19 deaths for this day
pop: Country population from Eurostat or UN data
cu_cases: Cumulative N reported COVID-19 cases up to and including this day
cu_deaths: Cumulative N reported COVID-19 deaths up to and including this day
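
The relationship between the daily and cumulative columns can be sketched as a running sum within each country, which can then be scaled by population. The data frame here is a made-up toy with the same column names, not the packaged data, and the derived column names (`cu_cases_sketch`, `cu_cases_per_100k`) are hypothetical. How the package itself derives its cumulative columns is not stated here.

```r
# Toy daily data for two fictional countries; all values are made up
toy <- data.frame(
  date = rep(seq(as.Date("2020-03-01"), by = "day", length.out = 4), times = 2),
  iso3 = rep(c("AAA", "BBB"), each = 4),
  cases = c(1, 3, 0, 6, 10, 20, 30, 40),
  pop = rep(c(1e6, 5e6), each = 4)
)

# Order by country and date before accumulating
toy <- toy[order(toy$iso3, toy$date), ]

# ave() applies cumsum() separately within each country
toy$cu_cases_sketch <- ave(toy$cases, toy$iso3, FUN = cumsum)

# Cumulative cases per 100,000 population
toy$cu_cases_per_100k <- toy$cu_cases_sketch / toy$pop * 1e5
print(toy)
```

Daily counts in the real data can be negative, presumably reflecting reporting corrections, so a running sum is not guaranteed to be monotone.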

Details

Table: Data summary


Name	dplyr::ungroup(covnat_dai...
Number of rows	61836
Number of columns	8
_______________________
Column type frequency:
Date	1
character	2
numeric	5
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2019-12-31	2020-12-14	2020-07-21	350

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
cname	0	1	4	42	0	213	0
iso3	0	1	3	3	0	213	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	0	1	1156.33	6782.63	-8261	0	15	275.00	234633	▇▁▁▁▁
deaths	0	1	26.08	131.29	-1918	0	0	4.00	4928	▁▇▁▁▁
pop	59	1	40987698.23	153129379.34	815	1293120	7169456	28515829.00	1433783692	▇▁▁▁▁
cu_cases	0	1	100686.99	607743.06	0	129	2055	24650.00	16256754	▇▁▁▁▁
cu_deaths	0	1	3104.89	15545.84	0	1	42	464.25	299177	▇▁▁▁▁

Source

https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide

Weekly International COVID-19 cases and deaths, current as of Sunday, January 22, 2023

Description

A dataset containing weekly national-level ECDC data on COVID-19

Usage

covnat_weekly

Format

A tibble with 4,966 rows and 11 columns

date: date in YYYY-MM-DD format
year_week: Year and week of reporting (character, YYYY-WW)
cname: Name of country (character)
pop: Country population from Eurostat or UN data
iso3: ISO3 country code (character)
cases: N reported COVID-19 cases for this week
deaths: N reported COVID-19 deaths for this week
cu_cases: Cumulative N reported COVID-19 cases up to and including this week
cu_deaths: Cumulative N reported COVID-19 deaths up to and including this week
r14_cases: 14-day notification rate of reported COVID-19 cases per 100,000 population
r14_deaths: 14-day notification rate of reported COVID-19 deaths per 1,000,000 population
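
As a rough sketch of what a 14-day notification rate means with weekly data, the example sums the current and previous week's cases and scales by population. This is an assumption about the general form of the calculation, not ECDC's exact procedure; the data and the column names `cases_14d` and `r14_sketch` are made up for illustration.

```r
# Toy weekly data for one fictional country; all values are made up
weekly <- data.frame(
  year_week = c("2021-01", "2021-02", "2021-03", "2021-04"),
  cases = c(500, 700, 900, 400),
  pop = 2e6
)

# Cases over the current and the previous week (14 days in total)
prev_cases <- c(NA, head(weekly$cases, -1))
weekly$cases_14d <- weekly$cases + prev_cases

# Rate per 100,000 population
weekly$r14_sketch <- weekly$cases_14d / weekly$pop * 1e5
print(weekly)
```

The first week has no preceding week, so its rate is NA in this sketch.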

Details

Table: Data summary


Name	dplyr::ungroup(covnat_wee...
Number of rows	4966
Number of columns	11
_______________________
Column type frequency:
Date	1
character	3
numeric	7
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2019-12-30	2023-01-09	2021-07-05	159

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
year_week	0	1.00	7	7	0	159	0
cname	0	1.00	5	14	0	31	0
iso3	196	0.96	3	3	0	30	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
pop	0	1.00	31613614.13	85253844.55	39055	2108977.00	6916548.00	17475415.00	453006705.00	▇▁▁▁▁
cases	222	0.96	77511.62	374657.80	0	1127.00	5487.00	28342.00	9023067.00	▇▁▁▁▁
deaths	279	0.94	514.14	2005.64	0	8.00	46.00	250.50	28380.00	▇▁▁▁▁
cu_cases	222	0.96	4188407.63	16969793.99	0	43400.25	485047.50	2117551.00	183857564.00	▇▁▁▁▁
cu_deaths	279	0.94	44362.78	142967.65	0	651.00	6268.00	28807.00	1204878.00	▇▁▁▁▁
r14_cases	263	0.95	557.34	1044.46	0	51.61	216.74	576.99	13728.65	▇▁▁▁▁
r14_deaths	321	0.94	34.08	50.74	0	3.81	14.21	42.57	435.28	▇▁▁▁▁

Source

http://ecdc.europa.eu/

COVID-19 data for the USA, current as of Sunday, January 22, 2023

Description

A dataset containing US state-level data on COVID-19

Usage

covus

Format

A tibble with 664,960 rows and 7 columns

date: Date in YYYY-MM-DD format (date)
state: Two letter State abbreviation (character)
fips: State FIPS code (character)
data_quality_grade: character Data quality as assessed by COVID Tracking Project staff
measure: Outcome measure for this date
count: Count of measure
measure_label: character Outcome measure, suitable for use as a plot label

Details

Table: Data summary


Name	covus
Number of rows	664960
Number of columns	7
_______________________
Column type frequency:
Date	1
character	4
logical	1
numeric	1
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-01-13	2021-03-07	2020-09-03	420

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1	2	2	0	56	0
fips	0	1	2	2	0	56	0
measure	0	1	5	30	0	31	0
measure_label	0	1	6	54	0	32	0

Variable type: logical

skim_variable	n_missing	complete_rate	mean	count
data_quality_grade	664960	0	NaN	:

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
count	434365	0.35	387436.8	1638507	0	498	7782	134223	49646014	▇▁▁▁▁

The measures tracked by the COVID Tracking Project are as follows:

measure	measure_label
positive	Positive Tests
probable_cases	Probable Cases
negative	Negative Tests
pending	Pending Tests
hospitalized_currently	Currently Hospitalized
hospitalized_cumulative	Cumulative Hospitalized
in_icu_currently	Currently in ICU
in_icu_cumulative	Cumulative in ICU
on_ventilator_currently	Currently on Ventilator
on_ventilator_cumulative	Cumulative on Ventilator
recovered	Recovered
death	Deaths
hospitalized_discharged	Total Discharged from Hospital
total_tests_viral	Total number of PCR tests performed
positive_tests_viral	Total number of positive PCR tests
negative_tests_viral	Total number of negative PCR tests
positive_cases_viral	Total number of positive cases measured with PCR tests
death_confirmed	Deaths Confirmed
death_probable	Deaths Probable
total_test_encounters_viral	Total Test Encounters (PCR)
total_tests_people_viral	Total PCR Tests (People)
total_tests_antibody	Total Antibody Tests
positive_tests_antibody	Positive Antibody Tests
negative_tests_antibody	Total number of negative antibody tests
negative_tests_antibody	Negative Antibody Tests
total_tests_people_antibody	Total Antibody Tests (People)
positive_tests_people_antibody	Positive Antibody Tests (People)
negative_tests_people_antibody	Negative Antibody Tests (People)
total_tests_people_antigen	Total Antigen Tests (People)
positive_tests_people_antigen	Positive Antigen Tests (People)
total_tests_antigen	Total Antigen Tests
positive_tests_antigen	Positive Antigen Tests

Not all measures are reported by all states. The positive, negative, death, death_confirmed, probable_cases and death_probable measures are cumulative counts. death_confirmed is the total number of deaths of individuals with COVID-19 infection confirmed by a laboratory test. In states where the information is available, it tracks only those laboratory-confirmed deaths where COVID also contributed to the death according to the death certificate. death_probable is the total number of deaths where COVID was listed as a cause of death and there is not a laboratory test confirming COVID-19 infection.

For further information on the COVID Tracking Project's measures, see https://covidtracking.com/about-data/data-definitions
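
Because the table is in long format, with one row per date, state, and measure, a typical first step is to filter to a single measure; for the cumulative measures, day-over-day differences then give new counts. The sketch uses a made-up toy data frame with the same column names rather than the packaged data, and `new_deaths` is a hypothetical column name.

```r
# Toy long-format data for one state; all counts are made up
toy <- data.frame(
  date = rep(seq(as.Date("2020-04-01"), by = "day", length.out = 4), times = 2),
  state = "NY",
  measure = rep(c("death", "positive"), each = 4),
  count = c(10, 15, 15, 22, 100, 150, 210, 300)
)

# Keep one measure and make sure rows are in date order
deaths <- toy[toy$measure == "death", ]
deaths <- deaths[order(deaths$date), ]

# Convert the cumulative count to daily new deaths
deaths$new_deaths <- c(NA, diff(deaths$count))
print(deaths)
```

With more than one state, the differencing would need to be done within each state.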

Source

The COVID-19 Tracking Project https://covidtracking.com

COVID-19 case and death counts for the USA by Hispanic/Non-Hispanic ethnicity and state, current as of Sunday, January 22, 2023

Description

The COVID Racial Data Tracker advocates for, collects, publishes, and analyzes racial data on the pandemic across the United States. It’s a collaboration between the COVID Tracking Project and the Boston University Center for Antiracist Research.

Usage

covus_ethnicity

Format

A tibble with 15,960 rows and 7 columns

date: date Data reported as of this date
state: character State
group: character Ethnic group
cases: integer Total cases, count
deaths: integer Total deaths, count
hosp: integer Total hospitalizations, count
tests: integer Total tests, count

Details

Table: Data summary


Name	covus_ethnicity
Number of rows	15960
Number of columns	7
_______________________
Column type frequency:
Date	1
character	2
numeric	4
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-04-12	2021-03-07	2020-09-23	95

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1	2	2	0	56	0
group	0	1	7	12	0	3	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	3080	0.81	73357.18	166184.31	0	5529	21920.5	70265.5	2619476	▇▁▁▁▁
deaths	3144	0.80	1645.64	3463.93	-1	63	291.5	1401.0	32664	▇▁▁▁▁
hosp	11662	0.27	5079.37	8831.52	0	556	1556.0	4959.5	56406	▇▁▁▁▁
tests	14271	0.11	892566.44	2376098.22	0	58933	224156.0	537668.0	21633943	▇▁▁▁▁

The group variable is coded as "Hispanic", "Non-Hispanic", or "Unknown". Hispanics may be of any race. State-level counts should be handled with care, given the widely varying population distribution of people of different ethnic backgrounds by state.

Author(s)

Kieran Healy

Source

https://covidtracking.com/race

COVID-19 case and death counts for the USA by race and state, current as of Sunday, January 22, 2023

Description

Usage

covus_race

Format

A tibble with 47,880 rows and 7 columns

date: date Data reported as of this date
state: character State
group: character Racial group
cases: integer Total cases, count
deaths: integer Total deaths, count
hosp: integer Total hospitalizations, count
tests: integer Total tests, count

Details

Table: Data summary


Name	covus_race
Number of rows	47880
Number of columns	7
_______________________
Column type frequency:
Date	1
character	2
numeric	4
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-04-12	2021-03-07	2020-09-23	95

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1	2	2	0	56	0
group	0	1	5	11	0	9	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	15684	0.67	30240.68	103176.64	0	568	3661	21026	2619476	▇▁▁▁▁
deaths	17686	0.63	708.93	1836.84	-1	12	68	440	24402	▇▁▁▁▁
hosp	37253	0.22	2077.78	4654.37	0	67	345	1716	41099	▇▁▁▁▁
tests	43549	0.09	349773.42	1269936.08	0	6298	36108	199214	18567612	▇▁▁▁▁

The group variable is coded as follows:

groups
White
Black
Latino
Asian
AI/AN
NH/PI
Multiracial
Other
Unknown

AI/AN is American Indian/Alaska Native. NH/PI is Native Hawaiian/Pacific Islander. State-level counts should be handled with care, given the widely varying population distribution of people of different racial backgrounds by state.

Author(s)

Kieran Healy

Source

https://covidtracking.com/race

fmt_nc

Description

Format fmt_nc in df

Usage

fmt_nc(x)

Arguments

x

Details

use in fn documentation

Value

formatted string

Author(s)

Kieran Healy

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

fmt_nr

Description

Format fmt_nr in df

Usage

fmt_nr(x)

Arguments

x

Details

use in fn documentation

Value

formatted string

Author(s)

Kieran Healy

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

mmwr_week_to_date

Description

FUNCTION_DESCRIPTION

Usage

mmwr_week_to_date(year, week, day = NULL)

Arguments

`year`	PARAM_DESCRIPTION
`week`	PARAM_DESCRIPTION
`day`	PARAM_DESCRIPTION, Default: NULL

Details

DETAILS

Value

OUTPUT_DESCRIPTION

Author(s)

Kieran Healy

Source

http://

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

MMWRweek2Date

Description

FUNCTION_DESCRIPTION

Usage

MMWRweek2Date(MMWRyear, MMWRweek, MMWRday = NULL)

Arguments

`MMWRyear`	PARAM_DESCRIPTION
`MMWRweek`	PARAM_DESCRIPTION
`MMWRday`	PARAM_DESCRIPTION, Default: NULL

Details

DETAILS

Value

OUTPUT_DESCRIPTION

Author(s)

Kieran Healy

Source

http://

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

MMWRweekday

Description

FUNCTION_DESCRIPTION

Usage

MMWRweekday(date)

Arguments

`date`	PARAM_DESCRIPTION

Details

DETAILS

Value

OUTPUT_DESCRIPTION

Author(s)

Kieran Healy

Source

http://

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

Provisional COVID-19 Death Counts by Sex, Age, and State

Description

Deaths involving coronavirus disease (COVID-19), pneumonia, and influenza reported to NCHS by sex and age group and state.

Usage

nchs_sas

Format

A tibble with 115,668 rows and 15 variables:

data_as_of: date Date of data release
start_date: date First date of data period
end_date: date Last date of data period
group: character Unit of time observation: whether data in this row are measured By month, By total, or By year
year: integer Year of observation
month: integer Month of observation
state: character Jurisdiction of occurrence. One of: United States total, a US State, District of Columbia, and New York City, separate from New York state.
sex: character Sex
age_group: character Age group
covid_19_deaths: integer Deaths involving COVID-19 (ICD-10 code U07.1)
total_deaths: integer Deaths from all causes of death
pneumonia_deaths: integer Pneumonia Deaths (ICD-10 codes J12.0-J18.9)
pneumonia_and_covid_19_deaths: integer Deaths with Pneumonia and COVID-19 (ICD-10 codes J12.0-J18.9 and U07.1)
influenza_deaths: integer Influenza Deaths (ICD-10 codes J09-J11)
pneumonia_influenza_or_covid_19_deaths: integer Deaths with Pneumonia, Influenza, or COVID-19 (ICD-10 codes U07.1 or J09-J18.9)

Details

Table: Data summary


Name	nchs_sas
Number of rows	115668
Number of columns	15
_______________________
Column type frequency:
Date	1
character	6
numeric	8
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2023-01-18	2023-01-18	2023-01-18	1

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
start_date	0	1	10	10	0	37	0
end_date	0	1	10	10	0	37	0
group	0	1	7	8	0	3	0
state	0	1	4	20	0	54	0
sex	0	1	4	9	0	3	0
age_group	0	1	8	17	0	17	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
year	2754	0.98	2021.10	0.91	2020	2020	2021	2022	2023	▇▇▁▇▁
month	13770	0.88	6.35	3.52	1	3	6	9	12	▇▅▅▅▇
covid_19_deaths	31823	0.72	351.76	6263.51	0	0	10	60	1094723	▇▁▁▁▁
total_deaths	17146	0.85	2812.18	52269.95	0	41	148	648	10144808	▇▁▁▁▁
pneumonia_deaths	36293	0.69	349.71	6016.66	0	0	17	76	1030983	▇▁▁▁▁
pneumonia_and_covid_19_deaths	30476	0.74	174.88	3162.39	0	0	0	26	550128	▇▁▁▁▁
influenza_deaths	22407	0.81	4.94	103.26	0	0	0	0	18477	▇▁▁▁▁
pneumonia_influenza_or_covid_19_deaths	35678	0.69	535.21	9239.91	0	0	25	112	1591892	▇▁▁▁▁

The numbers of deaths reported in this table are the total numbers of deaths received and coded as of the date of analysis, and do not represent all deaths that occurred in that period. Data during this period are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more. Missing values may indicate that a category has between 1 and 9 observed cases and has been suppressed in accordance with NCHS confidentiality standards.

As of September 2, 2020, this data file includes the following age groups in addition to the age groups that are routinely included: 0-17, 18-29, 30-49, and 50-64. The new age groups are consistent with categories used across CDC COVID-19 surveillance pages. When analyzing the file, the user should make sure to select only the desired age groups. Summing across all age categories provided will result in double counting deaths from certain age groups. Similarly, the state variable includes the United States as a whole, and New York City counted separately from the rest of New York State. The temporal unit of observation also varies, with totals given by year, by month, and overall. It is necessary to first filter the data by desired time unit, region, and age group to ensure there is no double-counting in subsequent calculations.
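
The filtering step described above can be sketched on a made-up toy table with the same column names. The category labels used here ("By total", "All Ages", "All Sexes", and so on) are assumptions for illustration; the exact strings in nchs_sas should be checked, for example with unique(), before filtering.

```r
# Toy rows mixing overlapping time units, regions, and age groups;
# labels and counts are made up for illustration
toy <- data.frame(
  group = c("By total", "By total", "By total", "By total", "By month"),
  state = c("United States", "United States", "United States",
            "New York", "United States"),
  sex = "All Sexes",
  age_group = c("All Ages", "0-17 years", "Under 1 year",
                "All Ages", "All Ages"),
  covid_19_deaths = c(1000, 5, 2, 80, 100)
)

# A naive sum counts the same deaths several times
naive_total <- sum(toy$covid_19_deaths)

# Pick one time unit, one region, and one age grouping first
keep <- toy$group == "By total" &
  toy$state == "United States" &
  toy$age_group == "All Ages"
filtered_total <- sum(toy$covid_19_deaths[keep])

print(c(naive = naive_total, filtered = filtered_total))
```

In the real data, suppressed cells appear as missing values, so sums would also need a decision about na.rm.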

Author(s)

Kieran Healy

Source

National Center for Health Statistics https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku

References

https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku

Weekly Counts of Deaths by State and Select Causes 2014-2021

Description

Final counts of deaths by the week the deaths occurred, by state of occurrence, and by select causes of death for 2014-2018, and provisional counts of deaths by the week the deaths occurred, by state of occurrence, and by select underlying causes of death for 2019-2020. The dataset also includes weekly provisional counts of death for COVID-19, coded to ICD-10 code U07.1 as an underlying or multiple cause of death.

Usage

nchs_wdc

Format

A data frame with 347,706 rows and 7 variables:

jurisdiction: character Jurisdiction of Occurrence
year: double MMWR Year
week: double MMWR Week
week_ending_date: double MMWR Week ending date
cause_detailed: character Cause with ICD Codes
n: double Count of deaths
cause: character Cause of death

Details

For 2014-2019, death counts in this dataset were derived from the National Vital Statistics System database that provides the most timely access to the data. Therefore, counts may differ slightly from final data due to differences in processing, recoding, and imputation. For 2019-2021, the dataset also includes weekly provisional counts of death for COVID-19, coded to ICD-10 code U07.1 as an underlying or multiple cause of death. Number of deaths reported in this table are the total number of deaths received and coded as of the date of analysis, and do not represent all deaths that occurred in that period. Data for 2020 and 2021 are provisional and may be incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. Causes of death included in this dataset are tabulated by underlying cause of death ICD-10 codes. COVID-19 deaths by underlying cause and multiple cause of death are also included.
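
Because each row is a count for one jurisdiction, week, and cause, yearly totals need a filter on jurisdiction and cause before summing. A minimal sketch on toy data follows; the cause label "All Cause" and all counts are invented for illustration and may not match the values in `nchs_wdc`.

```r
# Toy rows using the column names listed under Format.
toy_wdc <- data.frame(
  jurisdiction = c("Alabama", "Alabama", "Alabama", "Alabama", "Alaska"),
  year = c(2019, 2019, 2020, 2020, 2020),
  week = c(1, 2, 1, 2, 1),
  cause = "All Cause",
  n = c(1000, 1100, 1200, NA, 80),
  stringsAsFactors = FALSE
)

# Keep one jurisdiction and one cause.
al <- toy_wdc[toy_wdc$jurisdiction == "Alabama" & toy_wdc$cause == "All Cause", ]

# Sum weekly counts by year. The formula interface drops rows where n is NA,
# so weeks with missing counts are silently left out of the total.
yearly <- aggregate(n ~ year, data = al, FUN = sum)
print(yearly)
```

Totals for provisional years should be read as lower bounds, since recent weeks may be incomplete.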

Author(s)

Kieran Healy

Source

2014-2019: https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr. 2020-2021: https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/muzy-jte6

Provisional Death Counts for Coronavirus Disease (COVID-19): Weekly State-Specific Data Updates

Description

This report provides a weekly summary of deaths with coronavirus disease 2019 (COVID-19) by select geographic and demographic variables. In this release, counts of deaths are provided by the race and Hispanic origin of the decedent.

Usage

nchs_wss

Format

A tibble with 15,582 rows and 12 variables:

data_as_of: date Date of analysis
start_date: character Start date of coverage (stored as a 10-character string)
end_date: character End date of coverage (stored as a 10-character string)
year: character Year. One of "2020", "2021", or "2020/2021".
month: double Month
obs_unit: character Unit of observation. One of: By Total, By Year, By Month.
state: character Geographical unit. One of: the United States, a U.S. State, the District of Columbia, or New York City. New York state measures do not include New York City.
race_ethnicity: character Race and ethnic group. One of: Non-Hispanic White, Non-Hispanic Black or African American, Non-Hispanic American Indian or Alaska Native, Non-Hispanic Asian, Non-Hispanic Native Hawaiian or Other Pacific Islander, Non Hispanic more than one race, Hispanic or Latino.
deaths: integer Count of deaths
dist_pct: double Distribution of COVID-19 deaths (%): Deaths for each group as a percent of the total number of COVID-19 deaths reported.
uw_dist_pop_pct: double Unweighted distribution of population (%): Population of each group as a percent of the total population.
wt_dist_pop_pct: double Weighted distribution of population (%): Population of each group as percent of the total population after accounting for how the race and Hispanic origin population is distributed in relation to the geographic areas impacted by COVID-19.

Details

Table: Data summary


Name	nchs_wss
Number of rows	15582
Number of columns	12
_______________________
Column type frequency:
Date	1
character	6
numeric	5
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
data_as_of	0	1	2023-01-18	2023-01-18	2023-01-18	1

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
start_date	0	1	10	10	0	37	0
end_date	0	1	10	10	0	37	0
year	0	1	4	9	0	5	0
obs_unit	0	1	7	8	0	3	0
state	0	1	4	20	0	53	0
race_ethnicity	0	1	18	54	0	7	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
month	1855	0.88	6.35	3.52	1	3.0	6.0	9.0	12.0	▇▅▅▅▇
deaths	4625	0.70	596.40	8680.87	0	0.0	14.0	100.0	718968.0	▇▁▁▁▁
dist_pct	4625	0.70	17.59	29.22	0	0.0	1.1	19.7	100.0	▇▁▁▁▁
uw_dist_pop_pct	0	1.00	14.28	23.57	0	0.9	3.1	12.7	92.7	▇▁▁▁▁
wt_dist_pop_pct	0	1.00	13.68	21.60	0	0.5	3.2	14.4	93.6	▇▁▁▁▁

The numbers of deaths reported in this table are the total numbers of deaths received and coded as of the date of analysis and do not represent all deaths that occurred in that period. Data are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS, and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction, age, and cause of death. Provisional counts reported here track approximately 1 to 2 weeks behind other published data sources on the number of COVID-19 deaths in the U.S. COVID-19 deaths are defined as having confirmed or presumed COVID-19, and are coded to ICD-10 code U07.1.

Unweighted population percentages are based on the Single-Race Population Estimates from the U.S. Census Bureau for the year 2018 (available from: https://wonder.cdc.gov/single-race-population.html). Weighted population percentages are computed by multiplying county-level population counts by the count of COVID deaths for each county, summing to the state level, and then estimating the percent of the population within each racial and ethnic group. These weighted population distributions therefore more accurately reflect the geographic locations where COVID outbreaks are occurring.

Jurisdictions are included in this table if more than 100 deaths were received and processed by NCHS as of the date of analysis.
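
The weighting of population percentages described above can be illustrated with a small calculation. This is only a sketch of the stated procedure on invented county data, not NCHS code; the data frame `county` and its columns `pop` and `covid_deaths` are hypothetical.

```r
# Two toy counties in one state, two groups each. County B has most deaths.
county <- data.frame(
  county = c("A", "A", "B", "B"),
  race_ethnicity = c("Group 1", "Group 2", "Group 1", "Group 2"),
  pop = c(900, 100, 500, 500),
  covid_deaths = c(10, 10, 90, 90),  # county-level death count, repeated per group
  stringsAsFactors = FALSE
)

# Multiply county-level population counts by the county's COVID death count.
county$weighted_pop <- county$pop * county$covid_deaths

# Sum to the state level within each group.
state <- aggregate(cbind(pop, weighted_pop) ~ race_ethnicity,
                   data = county, FUN = sum)

# Express each group as a percent of the (unweighted or weighted) total.
state$uw_dist_pop_pct <- 100 * state$pop / sum(state$pop)
state$wt_dist_pop_pct <- 100 * state$weighted_pop / sum(state$weighted_pop)
print(state)
```

In the toy data, Group 2 makes up a larger share of the hard-hit county, so its weighted share is higher than its unweighted share.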

Race and Hispanic-origin categories are based on the 1997 Office of Management and Budget (OMB) standards (1,2), allowing for the presentation of data by single race and Hispanic origin. These race and Hispanic-origin groups—non-Hispanic single-race white, non-Hispanic single-race black or African American, non-Hispanic single-race American Indian or Alaska Native (AIAN), non-Hispanic single-race Asian, and non-Hispanic single-race Native Hawaiian and Other Pacific Islander —differ from the bridged-race categories shown in most reports using mortality data.

New York State totals exclude New York City (provided in table separately).

Missing values may indicate that a category has between 1 and 9 observed cases, which have been suppressed in accordance with NCHS confidentiality standards.

Author(s)

Kieran Healy

Source

National Center for Health Statistics https://data.cdc.gov/NCHS/Provisional-Death-Counts-for-Coronavirus-Disease-C/pj7m-y5uh

NSSP National COVID-related ER Visits

Description

National Syndromic Surveillance Program (NSSP): Emergency Department Visits and Percentage of Visits for COVID-19-Like Illness (CLI) or Influenza-like Illness (ILI)

Usage

nssp_covid_er_nat

Format

A data frame with 54 rows and 9 variables:

week: integer
num_fac: integer
total_ed_visits: character
visits: integer
pct_visits: double
visit_type: character
region: character
source: character
year: integer

Details

Table: Data summary


Name	nssp_covid_er_nat
Number of rows	54
Number of columns	9
_______________________
Column type frequency:
character	4
numeric	5
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
total_ed_visits	0	1	7	7	0	27	0
visit_type	0	1	3	3	0	2	0
region	0	1	8	8	0	1	0
source	0	1	21	21	0	1	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
week	0	1	26.04	19.81	1.00	7.25	14.00	45.75	52.00	▇▂▁▂▇
num_fac	0	1	3346.89	48.97	3249.00	3329.50	3352.00	3389.50	3406.00	▃▁▆▃▇
visits	0	1	41521.67	16344.25	17639.00	31216.00	39183.50	50532.00	86088.00	▅▇▃▂▁
pct_visits	0	1	0.02	0.01	0.01	0.01	0.02	0.02	0.05	▇▆▂▁▂
year	0	1	2019.52	0.50	2019.00	2019.00	2020.00	2020.00	2020.00	▇▁▁▁▇

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/04102020/nssp-regions.html

NSSP Regional COVID ER Visits

Description

National Syndromic Surveillance Program (NSSP), regional data: Emergency Department Visits and Percentage of Visits for COVID-19-Like Illness (CLI) or Influenza-like Illness (ILI)

Usage

nssp_covid_er_reg

Format

A tibble with 538 rows and 9 variables:

week: integer
num_fac: integer
total_ed_visits: character
visits: integer
pct_visits: double
visit_type: character
region: character
source: character
year: integer

Details

Table: Data summary


Name	nssp_covid_er_reg
Number of rows	538
Number of columns	9
_______________________
Column type frequency:
character	4
numeric	5
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
total_ed_visits	0	1	5	6	0	269	0
visit_type	0	1	3	3	0	2	0
region	0	1	8	9	0	10	0
source	0	1	21	21	0	1	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
week	0	1	25.99	19.66	1	7.00	14.00	46.00	52.00	▇▂▁▂▇
num_fac	0	1	335.18	234.58	135	190.00	222.00	343.00	884.00	▇▃▁▂▂
visits	0	1	4164.87	4028.53	279	1596.00	2780.00	4723.75	23345.00	▇▂▁▁▁
pct_visits	0	1	0.02	0.01	0	0.01	0.02	0.02	0.11	▇▂▁▁▁
year	0	1	2019.52	0.50	2019	2019.00	2020.00	2020.00	2020.00	▇▁▁▁▇

Author(s)

Kieran Healy

Source

Courtesy of Bob Rudis's cdccovidview package

References

https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/04102020/nssp-regions.html

NYT COVID-19 data for US counties, current as of Sunday, January 22, 2023

Description

A dataset containing US county-level data on COVID-19, collected by the New York Times.

Usage

nytcovcounty

Format

A tibble with 2,502,832 rows and 6 columns

date: Date in YYYY-MM-DD format (date)
county: County name (character)
state: State name (character)
fips: County FIPS code (character)
cases: Cumulative N reported cases
deaths: Cumulative N reported deaths

Details

Table: Data summary


Name	nytcovcounty
Number of rows	2502832
Number of columns	6
_______________________
Column type frequency:
Date	1
character	3
numeric	2
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-01-21	2022-05-13	2021-04-23	844

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
county	0	1.00	3	35	0	1932	0
state	0	1.00	4	24	0	56	0
fips	23678	0.99	5	5	0	3220	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	0	1.00	10033.80	47525.22	0	382	1773	5884	2908425	▇▁▁▁▁
deaths	57605	0.98	161.61	820.33	0	6	33	101	40267	▇▁▁▁▁

Source

The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data. For county data, note in particular:

New York: All cases for the five boroughs of New York City (New York, Kings, Queens, Bronx and Richmond counties) are assigned to a single area called New York City. There is a large jump in the number of deaths on April 6th due to switching from data from New York City to data from New York state for deaths. For all New York state counties, starting on April 8th we are reporting deaths by place of fatality instead of residence of individual.
Kansas City, Mo: Four counties (Cass, Clay, Jackson and Platte) overlap the municipality of Kansas City, Mo. The cases and deaths that we show for these four counties are only for the portions exclusive of Kansas City. Cases and deaths for Kansas City are reported as their own line.
Alameda County, Calif: Counts for Alameda County include cases and deaths from Berkeley and the Grand Princess cruise ship.
Douglas County, Neb.: Counts for Douglas County include cases brought to the state from the Diamond Princess cruise ship.
Chicago: All cases and deaths for Chicago are reported as part of Cook County.
Guam: Counts for Guam include cases reported from the USS Theodore Roosevelt.
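
Since `cases` and `deaths` are cumulative, daily counts have to be derived by differencing within each area. A minimal sketch on toy data follows (all counts are invented). It groups on state and county name rather than `fips`, because the summary above shows `fips` is missing for some rows; the toy "New York City" row imitates such a case.

```r
# Toy cumulative counts for two areas, using the column names listed under Format.
toy_county <- data.frame(
  date = as.Date(c("2020-04-01", "2020-04-02", "2020-04-03",
                   "2020-04-01", "2020-04-02", "2020-04-03")),
  county = c("Cook", "Cook", "Cook",
             "New York City", "New York City", "New York City"),
  state = c("Illinois", "Illinois", "Illinois",
            "New York", "New York", "New York"),
  fips = c("17031", "17031", "17031", NA, NA, NA),
  cases = c(10, 15, 25, 100, 130, 190),
  stringsAsFactors = FALSE
)

# Order by area and date so differences run forward in time.
toy_county <- toy_county[order(toy_county$state, toy_county$county, toy_county$date), ]

# Difference the cumulative series within each area; the first day has no
# previous value, so it is set to NA.
key <- paste(toy_county$state, toy_county$county, sep = " / ")
toy_county$new_cases <- ave(toy_county$cases, key,
                            FUN = function(x) c(NA, diff(x)))
print(toy_county[, c("date", "county", "cases", "new_cases")])
```

Reporting changes such as those noted above can produce jumps or negative differences in real data, so derived daily counts are worth inspecting before use.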

NYT COVID-19 data for the US states, current as of Sunday, January 22, 2023

Description

A dataset containing US state-level data on COVID-19, collected by the New York Times.

Usage

nytcovstate

Format

A tibble with 58,526 rows and 5 columns

date: Date in YYYY-MM-DD format (date)
state: State name (character)
fips: State FIPS code (character)
cases: Cumulative N reported cases
deaths: Cumulative N reported deaths

Details

Table: Data summary


Name	nytcovstate
Number of rows	58526
Number of columns	5
_______________________
Column type frequency:
Date	1
character	2
numeric	2
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-01-21	2023-01-21	2021-08-16	1097

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1	4	24	0	56	0
fips	0	1	2	2	0	56	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	0	1	834511.91	1394631.70	1	64160	324958	985279.8	11955605	▇▁▁▁▁
deaths	0	1	11294.84	16797.98	0	1080	4790	14373.0	101982	▇▁▁▁▁

Source

The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data.

NYT COVID-19 data for the US, current as of Sunday, January 22, 2023

Description

A dataset containing US national-level data on COVID-19, collected by the New York Times.

Usage

nytcovus

Format

A tibble with 1,097 rows and 3 columns

date: Date in YYYY-MM-DD format (date)
cases: Cumulative N reported cases
deaths: Cumulative N reported deaths

Details

Table: Data summary


Name	nytcovus
Number of rows	1097
Number of columns	3
_______________________
Column type frequency:
Date	1
numeric	2
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
date	0	1	2020-01-21	2023-01-21	2021-07-22	1097

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
cases	0	1	44522009.0	35239239.4	1	8404635	34364829	80836264	101726588	▇▆▃▂▆
deaths	0	1	602590.7	370532.5	0	222195	609870	989584	1111011	▆▂▅▃▇

Source

The New York Times https://github.com/nytimes/covid-19-data. For details on the methods and limitations see https://github.com/nytimes/covid-19-data.
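
A common first step with a cumulative national series is to derive daily new cases and smooth them. The sketch below uses a toy ten-day series (values invented) and a trailing seven-day mean from `stats::filter`; the seven-day window is a choice made for illustration, not something prescribed by the data.

```r
# Toy cumulative series for ten consecutive days.
toy_us <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 10),
  cases = cumsum(1:10)
)

# Daily new cases: difference of the cumulative count (first day unknown).
toy_us$new_cases <- c(NA, diff(toy_us$cases))

# Trailing 7-day mean: each value averages the current day and the six before it.
# Windows that are incomplete or contain an NA yield NA.
toy_us$avg7 <- as.numeric(
  stats::filter(toy_us$new_cases, rep(1 / 7, 7), sides = 1)
)
print(toy_us)
```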

NYT Excess Mortality Estimates, current as of Sunday, January 22, 2023

Description

All-cause mortality is widely used by demographers and other researchers to understand the full impact of deadly events, including epidemics, wars and natural disasters. The totals in this data include deaths from Covid-19 as well as those from other causes, likely including people who could not be treated or did not seek treatment for other conditions.

Usage

nytexcess

Format

A tibble with 7,258 rows and 12 columns

country: character Country Name
placename: character Place Name
frequency: character Reporting period. Weekly or monthly, depending on how the data is recorded.
start_date: date The first date included in the period.
end_date: date The last date included in the period.
year: character Year of data. Note that this variable is of type character and not integer because several observations are notes to the effect that the year is an average of two years.
month: integer Numerical month.
week: integer Numerical week.
deaths: integer The total number of confirmed deaths recorded from any cause.
expected_deaths: integer The baseline number of expected deaths, calculated from a historical average. See details below.
excess_deaths: integer The number of deaths minus the expected deaths.
baseline: character The years used to calculate expected_deaths.

Details

Table: Data summary


Name	nytexcess
Number of rows	7258
Number of columns	12
_______________________
Column type frequency:
Date	2
character	5
numeric	5
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
start_date	768	0.89	2010-01-09	2020-12-23	2018-02-05	1267
end_date	768	0.89	2010-01-15	2020-12-29	2018-02-11	1267

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
country	0	1.00	4	14	0	35	0
placename	6883	0.05	6	8	0	4	0
frequency	0	1.00	6	7	0	2	0
year	0	1.00	4	17	0	15	0
baseline	5990	0.17	20	25	0	7	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
month	0	1.00	6.60	3.36	1	4.00	7.0	9.0	12	▇▆▆▆▇
week	666	0.91	26.77	14.58	2	14.00	27.0	39.0	52	▇▇▇▇▇
deaths	0	1.00	7968.24	14334.14	455	1460.00	2395.5	10486.0	141292	▇▁▁▁▁
expected_deaths	5990	0.17	9237.09	15850.00	548	1443.00	2423.0	10771.5	139343	▇▁▁▁▁
excess_deaths	5990	0.17	1195.43	3242.72	-6721	-42.25	76.5	926.0	30400	▇▂▁▁▁

Expected deaths for each area are based on historical data for the same time of year. These expected deaths are the basis for the excess death calculations, which estimate how many more people have died this year than in an average year.

The number of years used in the historical averages changes depending on what data is available, whether it is reliable and underlying demographic changes. See Data Sources for the years used to calculate the baselines. The baselines do not adjust for changes in age or other demographics, and they do not account for changes in total population.

The number of expected deaths is not adjusted for how non-Covid-19 deaths may change during the outbreak, which will take some time to determine. As countries impose control measures, deaths from causes like road accidents and homicides may decline. People who die from Covid-19 also cannot die later from other causes, which may reduce deaths from those causes. Both of these factors, if they play a role, would lead these baselines to understate, rather than overstate, the number of excess deaths.
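
The relationship between the three count columns can be checked directly. The sketch below uses invented numbers and the column names listed under Format; `excess_check` and `pct_above_expected` are names introduced here for illustration.

```r
# Toy rows: one place with a baseline, one without (expected_deaths missing).
toy_excess <- data.frame(
  country = c("A", "A", "B"),
  deaths = c(1200, 1500, 900),
  expected_deaths = c(1000, 1000, NA),
  stringsAsFactors = FALSE
)

# Excess deaths are observed deaths minus expected deaths.
toy_excess$excess_check <- toy_excess$deaths - toy_excess$expected_deaths

# Excess as a percentage of the baseline; NA where no baseline is available.
toy_excess$pct_above_expected <-
  100 * toy_excess$excess_check / toy_excess$expected_deaths

# Total excess for one country, skipping rows without a baseline.
total_excess_a <- sum(toy_excess$excess_check[toy_excess$country == "A"],
                      na.rm = TRUE)
print(toy_excess)
```

The summary table above shows `expected_deaths` and `excess_deaths` missing for most rows, so any total should state which periods it covers.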

Author(s)

Kieran Healy

Source

The New York Times https://github.com/nytimes/covid-19-data/tree/master/excess-deaths.

References

For further details on these data see https://github.com/nytimes/covid-19-data/tree/master/excess-deaths

start_date

Usage

start_date(year)

Arguments

year

Short Term Mortality Fluctuations (STMF) data series

Description

Human Mortality Database (HMD) series of weekly death counts across countries.

Usage

stmf

Format

A tibble with 580,395 rows and 17 variables:

country_code: character Mortality database country code
cname: character Country name
iso2: character ISO2 country code
continent: character Continent
iso3: character ISO3 country code
year: double Year
week: double Week number. Each year in the STMF refers to 52 weeks, each week has 7 days. In some cases, the first week of a year may include several days from the previous year or the last week of a year may include days (and, respectively, deaths) of the next year. In particular, it means that a statistical year in the STMF is equal to the statistical year in annual country-specific statistics.
sex: character Sex. m = Males. f = Females. b = Both combined.
split: double Indicates if data were split from aggregated age groups (0 if the original data has necessary detailed age scale). For example, if the original age scale was 0-4, 5-29, 30-65, 65+, then split will be equal to 1.
split_sex: double Indicates if the original data are available by sex (0) or data are interpolated (1).
forecast: double Equals 1 for all years where forecasted population exposures were used to calculate weekly death rates.
approx_date: double Approximate date (derived from the year and week number).
age_group: character Age group for death counts and rates
death_count: double Weekly death count. This number need not be an integer, because the age categories may be aggregated or split across the source national data.
death_rate: double Weekly death rate.
deaths_total: double Count of deaths for all ages combined.
rate_total: double Crude death rate.

Details

For further details on the construction of this dataset see the codebook at https://www.mortality.org/Public/STMF_DOC/STMFNote.pdf. For the original input data files in standardized form, see https://www.mortality.org/Public/STMF/Inputs/STMFinput.zip.

Countries and years covered in the dataset:

cname	1990	1991	1992	1993	1994	1995	1996	1997	1998	1999	2000	2001	2002	2003	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021	2022
Australia	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y
Austria	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Belgium	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Bulgaria	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Canada	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Chile	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y
Croatia	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Czech Republic	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Denmark	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
England and Wales	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Estonia	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Finland	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
France	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Germany	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Greece	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y
Hungary	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Iceland	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Israel	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Italy	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Korea, Republic of	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Latvia	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Lithuania	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Luxembourg	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Netherlands	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
New Zealand	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Northern Ireland	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y
Norway	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Poland	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Portugal	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Russian Federation	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	-	-
Scotland	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Slovakia	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Slovenia	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Spain	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Sweden	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Switzerland	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y
Taiwan, Province of China	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	Y	-
United States	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	Y	Y	Y	Y	Y	Y	Y	Y

Variables

Table: Data summary


Name	stmf
Number of rows	580395
Number of columns	17
_______________________
Column type frequency:
Date	1
character	7
numeric	9
________________________
Group variables	None

Variable type: Date

skim_variable	n_missing	complete_rate	min	max	median	n_unique
approx_date	0	1	1990-01-07	2023-01-01	2012-10-07	1722

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
country_code	0	1.00	3	7	0	38	0
cname	0	1.00	5	25	0	38	0
iso2	34380	0.94	2	2	0	35	0
continent	35850	0.94	4	13	0	5	0
iso3	34380	0.94	3	3	0	35	0
sex	0	1.00	1	1	0	3	0
age_group	0	1.00	3	5	0	5	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
year	0	1	2011.58	6.88	1990	2006.00	2012.00	2017.00	2022.00	▁▂▆▆▇
week	0	1	26.50	15.03	1	13.00	26.00	39.00	53.00	▇▇▇▇▇
split	0	1	0.12	0.32	0	0.00	0.00	0.00	1.00	▇▁▁▁▁
split_sex	0	1	0.00	0.07	0	0.00	0.00	0.00	1.00	▇▁▁▁▁
forecast	0	1	0.10	0.30	0	0.00	0.00	0.00	1.00	▇▁▁▁▁
death_count	0	1	617.60	1585.49	0	39.00	162.00	449.75	26362.00	▇▁▁▁▁
death_rate	0	1	0.05	0.07	0	0.00	0.02	0.07	0.57	▇▂▁▁▁
deaths_total	0	1	3088.00	6498.29	2	472.00	998.00	2543.00	87413.00	▇▁▁▁▁
rate_total	0	1	0.01	0.00	0	0.01	0.01	0.01	0.04	▅▇▁▁▁
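
When totalling deaths from this table, note that each country-week appears once per sex and age group. The sketch below uses toy rows (counts invented, only two age groups) and assumes that `deaths_total` repeats on every age-group row, which follows from its description as the all-ages count but should be confirmed on the real data.

```r
# Toy rows using the column names listed under Format.
toy_stmf <- data.frame(
  cname = "Finland",
  year = 2020,
  week = c(1, 1, 2, 2, 1, 1),
  sex = c("b", "b", "b", "b", "m", "m"),
  age_group = c("0-14", "85+", "0-14", "85+", "0-14", "85+"),
  death_count = c(2, 98, 3, 107, 1, 49),
  deaths_total = c(100, 100, 110, 110, 50, 50),
  stringsAsFactors = FALSE
)

# Keep both sexes combined so males and females are not added on top.
both <- toy_stmf[toy_stmf$sex == "b", ]

# Option 1: sum the age-specific counts.
from_age_groups <- sum(both$death_count)

# Option 2: take the all-ages total once per country-week.
weekly <- unique(both[, c("cname", "year", "week", "deaths_total")])
from_totals <- sum(weekly$deaths_total)

# Summing deaths_total over every age-group row counts each week repeatedly.
double_counted <- sum(both$deaths_total)
print(c(from_age_groups, from_totals, double_counted))
```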

Author(s)

Kieran Healy

Source

Human Mortality Database, http://mortality.org

References

"Short-term Mortality Fluctuations Dataseries" n.d., https://www.mortality.org/Public/STMF_DOC/STMFNote.pdf

Make a table of stmf country years

Description

Make a table of stmf country years

Usage

stmf_country_years(df = stmf)

Arguments

`df`	The stmf data frame

Details

Get a table of country x year coverage for stmf

Value

A tibble
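
For orientation, a country-by-year coverage table of this kind could be built in base R roughly as follows. This is a sketch under stated assumptions, not the package's implementation: it returns a plain data frame rather than a tibble, the toy data frame is invented, and `coverage_sketch` is a name introduced here.

```r
# Toy input with the two columns the coverage table depends on.
toy_stmf <- data.frame(
  cname = c("Finland", "Finland", "Finland", "Chile", "Chile"),
  year = c(1990, 1990, 1991, 1991, 1991),
  stringsAsFactors = FALSE
)

# Cross-tabulate country by year; a positive count means the year is covered.
counts <- table(cname = toy_stmf$cname, year = toy_stmf$year)

# Recode counts as "Y" (covered) or "-" (not covered).
coverage <- ifelse(unclass(counts) > 0, "Y", "-")

# One row per country, one column per year.
coverage_sketch <- data.frame(
  cname = rownames(coverage),
  coverage,
  check.names = FALSE,
  stringsAsFactors = FALSE
)
print(coverage_sketch)
```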

Author(s)

Kieran Healy

tabular

Description

Make an Rd table from a data frame

Usage

tabular(df, ...)

Arguments

`df`	Data frame
`...`	Other args

Value

Rd table
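
A helper of this kind typically formats each column as text and joins the cells with Rd's `\tab` and `\cr` separators inside a `\tabular{}` block. The sketch below follows that widely used pattern; it is not necessarily how the package's `tabular()` is written, and the name `rd_table_sketch` is introduced here to make that explicit.

```r
rd_table_sketch <- function(df, ...) {
  stopifnot(is.data.frame(df))

  # Right-align numeric columns, left-align everything else.
  align <- function(x) if (is.numeric(x)) "r" else "l"
  col_align <- vapply(df, align, character(1))

  # Format each column as character; extra arguments go to format().
  cols <- lapply(df, format, ...)

  # Join cells with \tab and rows with \cr.
  contents <- do.call(
    "paste",
    c(cols, list(sep = " \\tab ", collapse = "\\cr\n  "))
  )

  paste0("\\tabular{", paste(col_align, collapse = ""), "}{\n  ",
         contents, "\n}\n")
}

rd <- rd_table_sketch(data.frame(state = c("Ohio", "Utah"), pop = c(10, 3)))
cat(rd)
```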

Author(s)

Kieran Healy

State population estimates for US States

Description

Population estimates for US States as of July 1st 2018

Usage

uspop

Format

A tibble with 459 rows and 17 variables:

state: character State Name
state_abbr: character State Abbreviation
statefips: character 2-digit FIPS code
region_name: character Census region
division_name: character Census Division
sex_id: character Sex id
sex: character Sex label
hisp_id: character Ethnicity: Hispanic id
hisp_label: character Hispanic label
fips: character Full FIPS code
pop: double Total population
white: double Race alone: White
black: double Race alone: Black or African-American
amind: double Race alone: American Indian and Alaska Native
asian: double Race alone: Asian
nhopi: double Race alone: Native Hawaiian and Other Pacific Islander
tom: double Race alone: Two or more races

Details

Table: Data summary


Name	uspop
Number of rows	459
Number of columns	17
_______________________
Column type frequency:
character	10
numeric	7
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
state	0	1.00	4	20	0	51	0
state_abbr	9	0.98	2	2	0	50	0
statefips	0	1.00	2	2	0	51	0
region_name	9	0.98	4	9	0	4	0
division_name	9	0.98	7	18	0	9	0
sex_id	0	1.00	4	6	0	3	0
sex	0	1.00	4	10	0	3	0
hisp_id	0	1.00	4	7	0	3	0
hisp_label	0	1.00	5	12	0	3	0
fips	0	1.00	11	11	0	51	0

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
pop	0	1	2851132.32	4198641.26	6154	386961.5	1349442	3558480.0	39557045	▇▁▁▁▁
white	0	1	2179861.40	3116129.25	5120	296294.0	1088503	2759335.5	28531740	▇▁▁▁▁
black	0	1	381736.98	644380.66	260	11907.0	80714	486281.5	3673855	▇▁▁▁▁
amind	0	1	36143.97	65036.83	161	6103.5	15273	35770.5	651076	▇▁▁▁▁
asian	0	1	168458.39	515557.14	79	5045.5	26484	140424.5	6063600	▇▁▁▁▁
nhopi	0	1	6966.61	18657.18	23	669.0	2029	5063.5	199872	▇▁▁▁▁
tom	0	1	77964.97	131251.16	455	12091.0	33757	98669.5	1554757	▇▁▁▁▁

U.S. Census estimates. Be aware of the US Census classifications of Race and Ethnicity. For the estimated total population for each State, jointly filter on totsex in sex_id and tothisp in hisp_id and then select pop.
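
The joint filter described above looks like this on a toy data frame. The `totsex` and `tothisp` codes come from the text; the other codes and all population figures are invented for illustration.

```r
# Toy rows using the column names listed under Format.
toy_uspop <- data.frame(
  state = c("Alabama", "Alabama", "Alabama", "Alaska", "Alaska"),
  sex_id = c("totsex", "male", "totsex", "totsex", "female"),
  hisp_id = c("tothisp", "tothisp", "hisp", "tothisp", "tothisp"),
  pop = c(1000, 480, 50, 700, 330),
  stringsAsFactors = FALSE
)

# Keep the row for all sexes and all Hispanic-origin categories in each state.
totals <- toy_uspop[
  toy_uspop$sex_id == "totsex" & toy_uspop$hisp_id == "tothisp",
  c("state", "pop")
]
print(totals)
```

Filtering on only one of the two identifiers would leave several rows per state and inflate any sum of `pop`.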

Author(s)

Kieran Healy

Source

https://www.census.gov/data/datasets/time-series/demo/popest/2010s-state-detail.html

References

https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/asrh/PEPSR6H.pdf

`⁠%nin%⁠`