Data & Tools

Here's data from various federal organizations to help build context around your project (check out the Suggested Projects). All participants are also welcome to work on their own project, in teams or independently. Analysis of the Behavioral Risk Factor Surveillance System (BRFSS) data must be incorporated in all projects.

Behavioral Risk Factor Surveillance System (BRFSS)

**This dataset must be incorporated in all submitted projects.**

Sample Population: adults across 50 states (via phone survey)

Time frame: 2011 - 2012

Aggregation levels: geography, age, sex, demographics

Measures: behaviors resulting in leading causes of premature mortality and morbidity among adults:

  • cigarette smoking
  • alcohol use
  • physical activity
  • diet
  • hypertension
  • safety belt use


Medicare Provider Utilization and Payment Data

Sample Population: Medicare fee-for-service beneficiaries across 50 states

Time frame: 2011 - 2012

Aggregation levels: cost, geography, age, sex, demographics, procedures, medical services, physicians

Measures: utilization and payments for the 100 most common inpatient services, 30 common outpatient services, and physician and other supplier procedures and services performed on 11 or more Medicare beneficiaries

Contact: Kevin Hodges, Expert in CMS Data


Here's a link to an interactive data tool that helps to organize the data above.


U.S. Census Bureau 

Sample Population: communities across the US

Time frame: 2010-2012

Aggregation levels: demographic, socioeconomic, geography, age, families, health insurance, disability, fertility

Measures: information about entire populations of communities, including cross-tabulations of age, sex, households, families, relationship to householder, housing units, detailed race and Hispanic or Latino origin groups, and group quarters

Contact: Derick C. Moore, Logan Powell, experts in Census data



The QuickFacts visualization tool provides fast, easy access to facts about people, communities, business, and geography drawn from Census data.

Structured Product Labeling


The FDA Structured Product Labeling database, "DailyMed", provides high-quality information about marketed drugs, including FDA labels (package inserts). The website gives health information providers and the public a standard, comprehensive, up-to-date look-up and download resource for medication content and labeling as found in medication package inserts.

Aggregation levels: drug name, pharmacologic class, ingredients

Measures: indications and usage, contraindications, adverse reactions, warnings and precautions, boxed warning

Point of contact: Randy Levin, expert in Structured Product Labeling data

In addition to the XML files, Randy Levin (listed mentor) can provide some reports of the data in CSV format. An example of the CSV format his team could produce can be found here. Feel free to reach out to him with requests or questions.


National Health and Nutrition Examination Survey (NHANES)

Sample Population: adults and children in the US

Time frame: 2004-2014

Aggregation levels: demographic, socioeconomic, geography

Measures: health and nutritional status, medical, dental, and physiological measurements, and laboratory tests

Contact: Yinong Chong, liaison for NHANES data


The Nationwide Emergency Department Sample (NEDS)

Sample Population: Adults and children admitted to Emergency Departments (EDs)

Time frame: 2011

Aggregation levels: demographic, socioeconomic, geography, age, cost

Measures:


  • Discharge data for ED visits from over 950 hospitals located in 30 States, approximating a 20-percent stratified sample of U.S. hospital-based EDs
  • Demographic data such as hospital and patient characteristics, geographic area, and the nature of ED visits (e.g., common reasons for ED visits, including injuries)
  • ED charge information for over 85 percent of patients, including individuals covered by Medicare, Medicaid, or private insurance, as well as those who are uninsured

Point of contact: Raynard Washington, liaison for HCUP data

**This dataset is available in summarized form on HCUPnet. For questions about the data, or to access the complete NEDS dataset, please contact the HCUP data liaison listed above.

National Inpatient Sample (NIS)


Sample Population: Inpatient adults and children 

Time frame: 2012

Aggregation levels: demographic, socioeconomic, geography, age, cost, hospital characteristics

Measures: clinical and resource-use information that is included in a typical discharge abstract, including:

  • Primary and secondary diagnoses and procedures
  • Expected payment source, Total charges
  • Discharge status, Length of stay

Point of contact: Raynard Washington, liaison for HCUP data

**This dataset is available in summarized form on HCUPnet. To access the complete NIS dataset for free, please fill out these files and email your request, along with a 100-300 word statement explaining why you're requesting this data, to

Sample Population: School-age youth

Time frame: 1991-2013

Aggregation levels: demographic, socioeconomic, geography, age

Measures: behaviors that contribute to the leading causes of death and disability among youth and adults

  • Behaviors that contribute to unintentional injuries and violence
  • Sexual behaviors that contribute to unintended pregnancy and sexually transmitted diseases, including HIV infection
  • Alcohol and other drug use
  • Tobacco use
  • Unhealthy dietary behaviors
  • Inadequate physical activity



Sample Population: communities across the US

Time frame: 2000-2014

Aggregation levels: geography

Measures: environmental data on air, waste, facility, land, toxics, compliance, water, radiation, and more for regions across the U.S. There are several choices for downloading data. Users may customize datasets by content and build their own search to download the results to a .csv file. The Geospatial Download feature enables a user to download spatial data files for use in mapping and reporting applications.
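Once a search result has been downloaded as a .csv file, a first pass at the "geography" aggregation level might look like the sketch below. It is only an illustration: the column names (`state`, `reading`) and the sample rows are hypothetical stand-ins, since the actual columns depend on the search you build.

```python
import csv
import io
from collections import defaultdict


def mean_by_region(csv_file, region_col, value_col):
    """Average a numeric column for each distinct value of a geography column."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        raw = (row.get(value_col) or "").strip()
        if not raw:
            continue  # skip rows with no measurement
        try:
            value = float(raw)
        except ValueError:
            continue  # skip non-numeric entries such as "N/A"
        region = row[region_col]
        totals[region] += value
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical sample standing in for a downloaded file;
# the column names and values are assumptions, not real data.
sample = io.StringIO(
    "state,pollutant,reading\n"
    "MD,ozone,0.06\n"
    "MD,ozone,0.08\n"
    "VA,ozone,0.05\n"
    "VA,ozone,\n"
)
averages = mean_by_region(sample, "state", "reading")
```

For a real download, replace the sample with `open("your_download.csv", newline="")` (the file name here is a placeholder) and pass the column names that appear in your file's header row.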


The Eco-Health Relationship Browser illustrates scientific evidence for linkages between human health and ecosystems. This interactive tool provides information about several of our nation's major ecosystems, the services they provide, and how those services, or their degradation and loss, may affect people.

Click HERE for the Health Data Consortium's directory of data showing file formats, dataset significance, and more!