
Social Determinants of Health

Analysis of healthcare costs and utilization focusing on inpatient samples

Introduction


The US Agency for Healthcare conducted a nationwide survey of hospital costs based on inpatient samples. The data analyzed here is restricted to the state of Wisconsin and to patients aged 0-17 years. The agency wants to analyze the data to study healthcare costs and utilization.


Questions


  1. To record patient statistics, the agency wants to find the age category of patients who visit the hospital most often and have the maximum expenditure.

  2. To gauge the severity of diagnoses and treatments and to identify the most expensive treatments, the agency wants to find the diagnosis-related group that has the maximum hospitalization and expenditure.

  3. To make sure that there is no malpractice, the agency needs to analyze if the race of the patient is related to hospitalization costs.

  4. To allocate resources properly, the agency has to analyze how hospital costs vary by age and gender.

  5. Since the length of stay is a crucial factor for inpatients, the agency wants to find if the length of stay can be predicted by age, gender, and race.

  6. To perform a complete analysis, the agency wants to find the variable that mainly affects hospital costs.


Analysis Results


Data cleaning and analysis were conducted in RStudio, with visualizations created using ggplot2. Questions 1-3 were answered using visualizations, and the remaining questions were addressed through null hypothesis testing with linear regression models.
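The null hypothesis testing mentioned above can be illustrated with a minimal sketch. The original analysis was done in R; the Python below is only a stand-in using the standard library, and the variable names and the ten data points are made up for illustration, not taken from the survey. It fits a one-predictor least-squares line and tests the null hypothesis that the slope is zero by comparing the t statistic with the two-sided 5% critical value for 8 degrees of freedom (2.306).

```python
import math
from statistics import fmean


def slope_t_statistic(x, y):
    """Fit y = a + b*x by ordinary least squares; return (b, t) for H0: b = 0."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, using n - 2 residual degrees of freedom
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Made-up illustrative values (NOT survey data): length of stay in days, total cost
length_of_stay = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
total_cost = [1200, 1900, 3100, 3900, 5200, 5800, 7100, 8200, 8800, 10100]

T_CRIT_DF8 = 2.306  # two-sided 5% critical value of Student's t with 8 df

slope, t_stat = slope_t_statistic(length_of_stay, total_cost)
reject_null = abs(t_stat) > T_CRIT_DF8
print(f"slope={slope:.1f}, t={t_stat:.1f}, reject H0: {reject_null}")
```

A model with several predictors, such as age, gender, and race together, follows the same logic, with one t test per coefficient.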


  1. Average length of stay decreased with age while average expenditure increased, with the highest expenditure found in the 6-15 age range.

  2. Diagnosis 20195 had the highest expenditure, with an average cost of around 20,000.

  3. Visual analysis demonstrated that race 2 had an average cost of ~4,250, almost twice the average cost of all races, but null hypothesis testing found that race was not a significant predictor of cost.

  4. Visual analysis of hospitalization costs by age and gender showed that costs increased with age for males. Costs for females were lower and increased with age until the 16-20 age range, where they reached their lowest point.

  5. Null hypothesis testing found that age and gender were both significant predictors of hospitalization costs (p-value ≤ 0.05), while gender and race were statistically insignificant predictors of length of stay (p-value > 0.05).

  6. Length of stay and diagnosis were the most significant predictors of hospital costs according to null hypothesis testing, with both p-values < 2e-16.
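The group averages behind the visual results above can be sketched as follows. This is a hedged illustration in standard-library Python rather than the R code used for the project. The band edges are an assumption (the bins used in the original plots are not stated), and the records are made-up values, not survey data.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical age bands; the original analysis may have used different bins.
AGE_BANDS = [(0, 5), (6, 10), (11, 15), (16, 20)]


def age_band(age):
    """Return the label of the band containing the given age."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside the defined bands")


def mean_cost_by_band(records):
    """Average the cost of (age, cost) records within each age band."""
    costs = defaultdict(list)
    for age, cost in records:
        costs[age_band(age)].append(cost)
    return {band: fmean(values) for band, values in costs.items()}


# Made-up (age, total cost) pairs for illustration only
records = [(0, 1500.0), (3, 2500.0), (7, 3000.0),
           (12, 5000.0), (14, 3000.0), (17, 2000.0)]

band_means = mean_cost_by_band(records)
top_band = max(band_means, key=band_means.get)  # band with the highest mean cost
print(band_means, top_band)
```

The same grouping, keyed on diagnosis group, race, or gender instead of age band, would give the other averages described in the results.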



