HCUP State Inpatient Databases (SID)

Availability and description of data elements, sampling frame, methods
Dataset Summary

The State Inpatient Databases (SID) are one of a family of databases and software tools developed as part of the Healthcare Cost and Utilization Project (HCUP). The SID contain the universe of inpatient discharge abstracts in participating States, with data encoded in a uniform format that allows for analyses across and between States. Together, the SID encompass more than 95 percent of all U.S. hospital discharges. Core data elements in the SID include patient demographics, insurance and cost information, diagnoses, procedures, admission and discharge status, and length of stay. The SID can be linked to the American Hospital Association Annual Survey File and the Area Resource File (both described elsewhere in the SGIM Research Dataset Compendium), except in those States that do not allow the release of hospital identifiers. Data can be accessed after completion of a Data Use Agreement, online training, and an application kit. The SID range in price from $35 to more than $3,000 per State per year, depending on the State.

Expert comments
The HCUP SID are well suited for research that requires complete enumeration of hospitals and discharges within market areas or a particular State. Researchers and policymakers use the SID to investigate questions unique to one State, compare data from two or more States, conduct market area research or small area variation analyses, investigate hospitalizations for special populations, and identify State-specific trends in inpatient care. The SID provide the building blocks of the Nationwide Inpatient Sample (NIS), a nationwide database of hospital inpatient stays.

Dataset Details

Dataset owner / manager

Agency for Healthcare Research and Quality (AHRQ), under the Healthcare Cost and Utilization Project (HCUP) in collaboration with participating States

Study and sample characteristics 

The State Inpatient Databases are State-specific files that contain all inpatient care records in participating States. The SID are released annually and are available from 1990 through the present.

Participating States and number of discharges per State year:

Major foci

The State Inpatient Databases include data elements routinely captured in discharge abstracts from inpatient hospital stays, including:

• Principal and secondary diagnoses
• Principal and secondary procedures
• Admission and discharge status
• Patient demographics (e.g., gender, age, and, for some States, race)
• Expected payment source (e.g., Medicare, Medicaid, private insurance, self-pay; for some States, additional discrete payer categories, such as managed care)
• Total charges
• Length of stay


The SID data files also include measures of disease severity and information on diagnosis and procedure groups designed to facilitate analyses.

A detailed list of data elements can be found at:

Special supplements and resources

The HCUP Cost-to-Charge Ratio Files allow conversion of billed charges (available in the main SID database) to hospital costs.
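The conversion described above amounts to multiplying each discharge's billed charges by a hospital-specific ratio. The following is a minimal sketch of that arithmetic, not the official HCUP procedure. The field names (`hospital_id`, `total_charges`), the ratio values, and the helper functions are hypothetical and chosen only for illustration; the actual variable names and any recommended adjustments should be taken from the HCUP documentation.

```python
from typing import Optional


def estimate_cost(total_charges: Optional[float], ccr: Optional[float]) -> Optional[float]:
    """Estimate hospital cost as billed charges times a cost-to-charge ratio.

    Returns None when either input is missing, so that records without a
    usable ratio are not silently assigned a cost.
    """
    if total_charges is None or ccr is None:
        return None
    return total_charges * ccr


def add_estimated_costs(discharges, ccr_by_hospital):
    """Return copies of the discharge records with an 'estimated_cost' field added."""
    return [
        {
            **d,
            "estimated_cost": estimate_cost(
                d["total_charges"], ccr_by_hospital.get(d["hospital_id"])
            ),
        }
        for d in discharges
    ]


# Hypothetical ratios keyed by a hypothetical hospital identifier.
ccr_by_hospital = {"H001": 0.40, "H002": 0.25}

# Hypothetical discharge records; real SID files use their own element names.
discharges = [
    {"hospital_id": "H001", "total_charges": 10000.0},
    {"hospital_id": "H002", "total_charges": 8000.0},
    {"hospital_id": "H999", "total_charges": 5000.0},  # no ratio available
]

costed = add_estimated_costs(discharges, ccr_by_hospital)
for row in costed:
    print(row["hospital_id"], row["estimated_cost"])
```

Records from a hospital with no ratio keep a missing cost in this sketch; how such records should be handled in practice is an analytic decision that the sketch does not settle.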

The Hospital Market Structure File is a hospital-level file that contains measures of hospital market competition, allowing investigators to characterize the intensity of competition faced by specific hospitals under user-specified definitions of market area.

The HCUP Supplemental Files for Revisit Analyses are discharge-level files designed to facilitate analyses that need to track sequential visits for a patient within a State and across facilities and hospital settings (inpatient, emergency department, ambulatory surgery) while adhering to strict privacy guidelines.
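As an illustration of what tracking sequential visits can look like, the sketch below flags pairs of consecutive visits by the same patient that fall within a chosen window. It assumes, purely for illustration, that each record carries a de-identified patient linkage key (`patient_link`), a relative timing value in days in place of a calendar date (`days_to_event`), and a length of stay (`length_of_stay`). These field names, the sample records, and the 30-day window are hypothetical; the actual variables and their definitions should be checked against the HCUP documentation.

```python
from collections import defaultdict


def flag_revisits(records, window_days=30):
    """Return (index_record_id, revisit_record_id, gap_days) for consecutive
    visits by the same patient that begin within `window_days` of the prior
    visit's end.

    The gap is computed from relative timing values only, so no calendar
    dates are needed.
    """
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_link"]].append(rec)

    flagged = []
    for visits in by_patient.values():
        # Order each patient's visits by their relative start day.
        visits.sort(key=lambda r: r["days_to_event"])
        for prev, curr in zip(visits, visits[1:]):
            prev_end = prev["days_to_event"] + prev["length_of_stay"]
            gap = curr["days_to_event"] - prev_end
            if 0 <= gap <= window_days:
                flagged.append((prev["record_id"], curr["record_id"], gap))
    return flagged


# Hypothetical records for two patients.
records = [
    {"record_id": 1, "patient_link": "A", "days_to_event": 100, "length_of_stay": 3},
    {"record_id": 2, "patient_link": "A", "days_to_event": 113, "length_of_stay": 2},
    {"record_id": 3, "patient_link": "B", "days_to_event": 50, "length_of_stay": 1},
    {"record_id": 4, "patient_link": "A", "days_to_event": 200, "length_of_stay": 5},
]

revisits = flag_revisits(records)
print(revisits)
```

Only consecutive visits are compared here. An analysis that defines index events differently (for example, by diagnosis or by care setting) would need additional filtering before the pairing step.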

Links to other datasets

The SID can be linked to the American Hospital Association Annual Survey File and the Area Resource File, except in those States that do not allow the release of hospital identifiers. Information on linking to AHA Annual Survey files is provided at:

Papers published

Examples of papers published using SID include:

Dataset accessibility and cost

Data are deidentified and available for purchase after completion of a Data Use Agreement, HCUP online training, and the SID Application Kit. Data are priced per State per year, with costs ranging from $35 to more than $3,000 per State per year (depending in part on whether the purchaser is a student, a non-profit institution, or a for-profit entity). Investigators can select which States and which years to purchase. Prices are listed in the application kit.

