
Predicting length of stay in beneficiaries using county-level profiling and multiple linear regression on Medicare claims data

Nguyen, Thuy-Nhi

Background

In the United States, there have been questions about the variability in healthcare.1 Many concerns revolve around cost differences based on the severity of illness.2 From 2008 to 2010, total hospital expenditures for Medicare beneficiaries rose from $205 billion to $220 billion according to the Centers for Medicare and Medicaid Services (CMS) historical national health expenditure data. As hospital expenditures continue to climb, these alarming numbers have prompted healthcare providers and policy makers to search for ways to minimize excessive costs. A typical approach to controlling costs is to examine a hospital's resource utilization.

Average inpatient hospital prices per admission rose from $12,747 to $15,111 in three years.3 A common resource utilization parameter is length of stay, defined as the difference between the date of admission and the date of discharge. Although providers cannot determine how long a beneficiary will stay at a hospital, since many factors are involved, they can be supported with county-level profiling. County-level profiling is a method of measuring counties' performance and reporting the information to the public. In other words, the method can provide evidence of, and accountability for, how well hospitals are performing in a specific county. Studies have shown that a modest approach to county-level profiling can reduce length of stay without negatively impacting healthcare providers' behaviors.4 The resources needed to proceed with county-level profiling are claims and administrative data. By repurposing accessible and inexpensive claims and administrative data, data mining techniques such as the CRISP-DM methodology and predictive analyses can be applied to gain valuable insights in healthcare.5 These techniques could help manage bed occupancy at inpatient hospitals efficiently by predicting how long a beneficiary will occupy a bed based on their demographics, chronic conditions, or comorbidities. It is a cost-effective approach to minimizing resource utilization, healthcare expenditures, and the severity of chronic conditions.6 Findings suggest that obtaining estimates of a beneficiary's length of stay can help healthcare providers manage hospital beds more efficiently and effectively by freeing up space in admission scheduling and bed occupancy.7

This study focused on county-level profiling in Texas by predicting average length of stay in Medicare beneficiaries while risk adjusting for respiratory diseases, using the CMS 2008–2010 de-identified synthetic public use file claims data, referred to hereafter as DE-SynPUF for simplicity. The variables used to predict average length of stay were comorbidities and demographic traits found in the inpatient claims and beneficiary summary data sets. To calculate the average length of stay, or observed rate, of a county, the lengths of stay of all beneficiaries in that county were averaged. Profiling the counties allowed characterization of typical length of stay behavior and identification of length of stay deviations across the state and its counties. Texas was chosen because it is densely populated and comprises 234 counties; the large number of counties helps provide more reliable models. To profile the typical length of stay of beneficiaries admitted to an inpatient hospital, the observed average length of stay was computed and used to assess the expected values. A multiple linear regression model was constructed to predict the expected values. Since the outcome of interest was a continuous variable, a multiple linear regression model was suitable for the analyses, though the model has known limitations and is prone to high variance and overfitting. The predicted values were used to determine the expected rate, the calculated value most likely to occur.8 Together, the observed and expected rates established the county-level profiling ratings.

It is important to note that the counties were formatted as numeric codes. CMS did not specify the names or details of the counties in its codebook, as the DE-SynPUF is not recommended for research purposes; this study was conducted mainly for educational purposes. Additionally, the study was performed in SAS Studio, a web application in which data files can be stored and accessed in libraries and manipulated using programs. The SAS Studio programs followed the CRISP-DM framework, which encompasses six steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.9

As stated earlier, the business understanding phase determined the objective of the study: county-level profiling and predicting average length of stay. The next phase, data understanding, was addressed through descriptive statistics. Following that, the data was prepared and transformed into a final data set by risk adjusting. Multiple linear regression modeling was performed in the modeling phase to predict expected rates. Lastly, the models were evaluated and deployed through intervention recommendations.

Methods

Following the CRISP-DM methodology, the study process was divided into three components using SAS Studio. First, at the data preparation stage, the Medicare data was standardized by risk-adjustment methods and transformed using the Clinical Classifications Software (CCS), a commercial grouper software. Second, the observed rates, or actual values, were computed by defining the numerator as the number of beneficiaries who stayed in hospitals after being admitted for a respiratory disease and the denominator as the population at risk of a respiratory disease. Exclusion rules removed claims from the outpatient, carrier, and prescription drug event files. Lastly, at the modeling stage, the expected rates, or predicted values, were computed by splitting the data into training and validation data sets, creating and running predictive models and statistical analyses, and scoring the data. The statistical analyses included comparing the observed and expected rates using the O/E ratio with upper and lower 95% confidence intervals to determine the risk adjusted rate and how well a county performed, and computing the root mean squared error and standard error to measure the distance and accuracy of the expected values relative to the observed values. All executed SAS Studio code can be found in Appendix A.
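A minimal SAS sketch of the observed-rate calculation described above is shown here. The data set and variable names (work.resp_cohort, bene_id, county_cd, los) are illustrative assumptions, not the study's actual code; the executed programs are in Appendix A.

```sas
/* Hedged sketch: per-county observed average length of stay and the
   population-at-risk denominator. Names are assumed for illustration. */
proc sql;
  create table work.observed_rates as
  select county_cd,
         count(distinct bene_id) as denominator,    /* beneficiaries at risk in the county */
         mean(los)               as observed_alos   /* observed average length of stay     */
  from work.resp_cohort                             /* risk-adjusted respiratory cohort    */
  group by county_cd;
quit;
```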

The characteristics of the study population were shown through descriptive statistics using the univariate and means procedures. Since this study focused on counties in Texas using the beneficiary summary and inpatient claims data, the SAS procedures showed a reduction in the number of entries from 687,502 to 39,393, which included multiple events or duplicates for the same beneficiaries. The number of unique records went from 232,747 to 13,321. The merged summary and inpatient data were filtered using the CCS multiple diagnosis grouper to adjust for patients at risk of respiratory system diseases, which further reduced the data from 39,393 to 10,275 entries. After combining beneficiaries with multiple events and duplicates into one record each, the data was narrowed down to a cohort of 986 unique records.
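A minimal sketch of this data-understanding step, under assumed data set and variable names (work.tx_inpatient, bene_id, clm_utlztn_day_cnt), might look like the following:

```sas
/* Hedged sketch: descriptive statistics and a duplicate count for the
   Texas inpatient extract. Names are assumptions for illustration.   */
proc means data=work.tx_inpatient n mean median mode std range;
  var clm_utlztn_day_cnt;   /* precursor of average length of stay */
run;

proc univariate data=work.tx_inpatient;
  var clm_utlztn_day_cnt;
run;

/* Quantify duplicates: total rows versus unique beneficiaries */
proc sql;
  select count(*)                as total_rows,
         count(distinct bene_id) as unique_benes
  from work.tx_inpatient;
quit;
```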

Five of the 81 variables listed in the DE-SynPUF inpatient claims data were used: beneficiary code, inpatient admission date, claim utilization day count, inpatient discharge date, and claim diagnosis code 1. Since the claim utilization day count was analogous to length of stay, this variable was used as the precursor to the outcome of interest, average length of stay. Eighteen of the 32 variables listed in the DE-SynPUF beneficiary summary data were used: beneficiary code, date of birth, date of death, sex, beneficiary race code, state code, county code, and chronic condition indicators for Alzheimer's disease, related disorders, or senile dementia; congestive heart failure (CHF); chronic kidney disease; cancer; chronic obstructive pulmonary disease; depression; diabetes; ischemic heart disease; osteoporosis; rheumatoid arthritis and osteoarthritis (RA/OA); and stroke/transient ischemic attack.

The data sets were merged using the beneficiary code as the primary key, and the chronic condition and demographic variables from the beneficiary summary data were assigned as predictors in the regression analyses. Observed rates were computed by averaging the lengths of stay of all beneficiaries belonging to each county. To attain this, a table was created for each of the 234 counties, and the tables were aggregated to conveniently display all the county averages and to later compute the state average for statistical analyses.

Expected rates were determined from multiple linear regression modeling. A stepwise regression was performed as an automated method to add or delete variables in the model according to whether they met statistical significance. After compiling the data into a cohort of 986 unique records, the data was checked for duplicates to confirm that no beneficiary had more than one record. The cohort was then split evenly, with 50% of the samples going to a training data set and 50% to a validation data set; each held 493 samples. Multiple linear regression was performed on the comorbidities and demographic traits using the regression procedure; a sketch of the merge, split, and stepwise regression appears below. The comorbidity model generated the predicted values, or expected rates, of average length of stay for each county. Both the observed and expected rates were necessary to analyze the significance of the O/E ratio and difference. The O/E ratio was computed by dividing the observed rate by the expected rate.
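The following is a minimal, hedged sketch of that workflow. The data set names (work.bene_summary, work.ip_claims), the DE-SynPUF-style variable names, and the seed are assumptions for illustration; the actual executed programs are in Appendix A.

```sas
/* Merge the beneficiary summary and inpatient claims data on the beneficiary code */
proc sql;
  create table work.cohort as
  select b.*, i.clm_utlztn_day_cnt as los
  from work.bene_summary as b
       inner join work.ip_claims as i
       on b.desynpuf_id = i.desynpuf_id;
quit;

/* 50/50 split into training and validation data sets */
proc surveyselect data=work.cohort out=work.split
                  samprate=0.5 outall seed=20100101;
run;

data work.train work.valid;
  set work.split;
  if selected then output work.train;
  else output work.valid;
run;

/* Stepwise multiple linear regression on the comorbidity indicators
   (entry/stay thresholds match the 0.15 significance level used in the study) */
proc reg data=work.train;
  model los = sp_alzhdmta sp_chf sp_chrnkidn sp_cncr sp_copd sp_depressn
              sp_diabetes sp_ischmcht sp_osteoprs sp_ra_oa sp_strketia
        / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;
```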

Statistical analyses such as root mean squared error (RMSE), standard error (SE), and confidence intervals were calculated manually using the following four formulas:

(1) RMSE = County Risk Adjusted Rate × (1 − County Risk Adjusted Rate)
(2) SE = RMSE / √Denominator
(3) Upper 95% Confidence Interval = Risk Adjusted Rate + 1.96 × √(Risk Adjusted Rate × (1 − Risk Adjusted Rate) / 500)
(4) Lower 95% Confidence Interval = Risk Adjusted Rate − 1.96 × √(Risk Adjusted Rate × (1 − Risk Adjusted Rate) / 500)

The county risk adjusted rate was calculated by multiplying the O/E ratio by the state rate. The state rate was determined by averaging all the counties in the Texas county average table. The denominator was calculated as a distinct count of beneficiaries in each county.
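A minimal sketch of these manual calculations is shown below, assuming a county-level table work.county_profile with observed_alos, expected_alos, and bene_count columns (all names are illustrative, not the study's actual code):

```sas
/* Compute the state rate as the average of the county observed rates */
proc sql noprint;
  select mean(observed_alos) into :state_rate
  from work.county_profile;
quit;

/* Apply the O/E ratio, risk adjusted rate, and formulas (1)-(4) above */
data work.county_stats;
  set work.county_profile;
  oe_ratio      = observed_alos / expected_alos;        /* O/E ratio                */
  risk_adj_rate = oe_ratio * &state_rate;               /* O/E ratio x state rate   */
  rmse          = risk_adj_rate * (1 - risk_adj_rate);  /* formula (1)              */
  se            = rmse / sqrt(bene_count);              /* formula (2)              */
  ci_upper      = risk_adj_rate
                  + 1.96 * sqrt(risk_adj_rate * (1 - risk_adj_rate) / 500); /* (3)  */
  ci_lower      = risk_adj_rate
                  - 1.96 * sqrt(risk_adj_rate * (1 - risk_adj_rate) / 500); /* (4)  */
run;
```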

Results

As part of the data preparation stage, descriptive statistics using the univariate and means procedures were performed to compute the mean, median, mode, standard deviation, and range of the variables for the cohort at risk for respiratory diseases in Texas counties. As previously stated, the outcome of interest, average length of stay, was derived for each beneficiary from the claim utilization day count variable. The descriptive statistics are summarized below.

| Variable | Mean | Median | Mode | Std. dev. | Range |
| --- | --- | --- | --- | --- | --- |
| Average length of stay | 6.16 | 4.5 | 3.0 | 6.42 | 77 |
| Beneficiary race code | 1.48 | 1 | 1 | – | 4 |
| Beneficiary sex | 1.58 | 2 | 2 | 0.49 | 1 |
| Congestive heart failure indicator | 1.08 | 1 | 1 | 0.27 | 1 |
| Chronic obstructive pulmonary disease indicator | 1.19 | 1 | 1 | 0.39 | 1 |
| Alzheimer's indicator | 1.23 | 1 | 1 | 0.42 | 1 |
| Chronic kidney disease indicator | 1.21 | 1 | 1 | 0.41 | 1 |
| Cancer indicator | 1.7 | 2 | 2 | 0.46 | 1 |
| Depression indicator | 1.22 | 1 | 1 | 0.42 | 1 |
| Diabetes indicator | 1.07 | 1 | 1 | 0.25 | 1 |
| Ischemic heart disease indicator | 1.04 | 1 | 1 | 0.21 | 1 |
| Osteoporosis indicator | 1.37 | 1 | 1 | 0.48 | 1 |
| Rheumatoid arthritis and osteoarthritis indicator | 1.43 | 1 | 1 | 0.49 | 1 |
| Stroke/transient ischemic attack indicator | 1.7 | 2 | 2 | 0.46 | 1 |

Descriptive statistics were not computed for the other provided variables in the study, but those variables are still important to discuss. Although beneficiary and payer reimbursement and responsibility amounts were not predictors in the study, their descriptive statistics are notable for understanding the financial aspects of the data. Using the SAS means and univariate procedures, the average inpatient annual Medicare reimbursement amount was $2,184.02; the lowest annual reimbursement amount recorded for a beneficiary was $990 and the highest was $167,840. The average inpatient annual beneficiary responsibility amount was $255.49, ranging from zero dollars up to $22,200. The average inpatient annual primary payer reimbursement amount was $103.14, ranging from zero dollars up to $68,000. Other variables such as beneficiary birth date and death date were not included in this study due to complications in computing age from the birth dates and the large amount of missing data in the death dates: the DE-SynPUF had only 611 of 39,393 death dates available, and collapsing multiple entries per beneficiary would lower that number further. Death dates accounted for less than 2% of the data, which was later reduced based on the beneficiary's county and chronic conditions; therefore, the variable would have had minimal contributing power.

After further examination of the data, it was important to note the variation between counties. The counties with the largest frequencies were 610, 130, 910, and 390. The multiple linear regression model for comorbidities reported rheumatoid arthritis and osteoarthritis (p=0.03) and Alzheimer's disease (p=0.08) as the only comorbidities with potential for predicting average length of stay in beneficiaries with respiratory-related disease, as shown in Table 1. No other variable met statistical significance at the 0.15 p-value level. The results for the multiple linear regression model on demographic traits indicated that no variable met statistical significance at the 0.15 p-value level, as shown in Table 2. When the same comorbidities model was run again at a later date, it instead showed statistical significance for diabetes (p=0.002) and osteoporosis (p=0.08). The contradiction between the second run and the first raised questions about the reliability of the modeling technique. In contrast, when the demographic traits model was run a second time, the variables remained the same in terms of statistical significance.

Table 1: Multiple linear regression model on comorbidities: CHF, chronic kidney disease, cancer, depression, ischemic heart disease, osteoporosis, stroke/transient ischemic attack, RA/OA, and Alzheimer’s disease.

Table 2: Multiple linear regression model on demographic traits: beneficiary sex and race.

To compare the expected length of stay rates with the observed rates, the frequency procedure was used to determine the number of counties with a shorter or longer length of stay than expected; a minimal sketch of this comparison follows. The procedure showed that 46 of the 119 counties had a longer length of stay than expected (Table 3a). Correspondingly, Table 3b describes the performance of the counties with a longer than expected length of stay as worse than expected. There were no counties where the observed values exactly matched the expected length of stay; therefore, longer or shorter than expected were the only possible outcomes.
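A hedged sketch of that comparison, reusing the assumed work.county_stats table from the earlier sketch, might look like this:

```sas
/* Flag each county by whether its observed average LOS exceeds the expected value */
data work.county_flags;
  set work.county_stats;
  length performance $24;
  if observed_alos > expected_alos then performance = 'Worse than expected';
  else performance = 'Better than expected';
run;

/* Count counties in each category (basis for Tables 3a and 3b) */
proc freq data=work.county_flags;
  tables performance;
run;
```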

Table 3a: Frequency procedure on describing the number of counties with longer or shorter length of stay than expected.

Table 3b: Frequency procedure describing the number of counties with better or worse than expected performance.

Table 4 describes each county's O/E ratio, risk adjusted rate, expected and observed rates, beneficiary count, root mean squared error, standard error, and upper and lower 95% confidence intervals. Of all the counties, county 70 had the beneficiary with the highest observed length of stay, 42 days. County 610 remained the county with the largest number of beneficiaries, 116. Counties 180 and 190 had no reported observed values, so the SAS output showed the missing value symbol, denoted by a period. The counties with the lowest length of stay of zero were 312, 583, and 822, with one beneficiary each. The difference between observed and expected rates was calculated to determine the counties with the largest differences. Of the 122 counties, county 70 again ranked highest with a difference of 35. Other counties with large differences between observed and expected rates were 391, 591, 751, and 790.

Table 4: Profile of each county in Texas. Statistical analyses were performed to establish the confidence and accuracy of the results.

Discussion

Minimizing resource utilization by reducing length of stay is a typical method used across hospitals. Predicting length of stay is important for estimating how long a beneficiary could stay at an inpatient hospital in order to reduce future healthcare expenditures and costs. Optimizing a beneficiary's length of stay can yield substantial cost savings and avoid unnecessary treatments. It is also important to emphasize that a beneficiary's quality of care should not be compromised in the pursuit of reduced length of stay or costs. To achieve this, county-level profiling was necessary to provide healthcare providers and policy makers with information on their counties' length of stay performance. The aim of this study was to predict the average length of stay for each county and provide feedback to counties on their ratings.

Results from the study indicated that counties with longer length of stay could have improper or unnecessary resource utilization. In fact, 39% of the counties in the study had a longer than expected length of stay. These poor ratings should be investigated further. County 70, with a beneficiary whose length of stay was 42 days, was the most noticeable poor rating. Though it is likely that this length of stay was an outlier or a possible fraud case, more information about it is needed to improve the county's performance. Future studies regarding super-utilizers and fraud should consider such outliers in their analyses, as they can affect the average length of stay of each county and the overall state average rate. County-level profiling would not have been possible without the aid of descriptive statistics.

Descriptive statistics summarized the data in a few terms: mean, median, mode, standard deviation, and range. The mean, median, and mode gauged the typical value of the data, while the standard deviation and range described how widely the data were distributed. The descriptive statistics on average length of stay reported different values for the mean, median, and mode, which indicated high variation in the data. Additionally, the high standard deviation of 6.42 days and range of 77 days indicated that most lengths of stay fell roughly between zero and 12 days, but some cases, or outliers, went as far as 77 days. Table 4 showed counties 70, 391, 591, 751, and 790 with observed average lengths of stay deviating from this distribution, which should be investigated further and could contribute to those counties' poor performance. Descriptive statistics on race and sex identified that the data was mostly comprised of white and female beneficiaries. Descriptive statistics were also examined for the comorbidity predictor variables: stroke and cancer had typical values of 2, which meant those comorbidities were not present in the majority of the data, whereas the remaining independent variables were mostly present. Following the descriptive statistics, modeling was performed.

The multiple linear regression model inferred that the comorbidity variables RA/OA and Alzheimer's had statistical significance, meaning that there was at least 90% confidence that these variables can predict average length of stay. Multiple linear regression was used for its simplicity and applicability to other datasets; for this reason, other modeling techniques should be considered in future studies. The multiple linear regression model produced different results each time it was run: some runs showed more or fewer variables with statistical significance and differing comorbidity outcomes. As a result, multiple linear regression modeling was not considered reliable in this study, and future studies should consider other modeling methods before implementing any interventions. To check the reliability of the models, statistical analyses were calculated.

The O/E ratio, risk adjusted rates, confidence intervals, standard error, root mean square error, and observed and expected rate differences were manually calculated to assess each county's performance. Comparing the counties' observed and expected rates was done with the O/E ratio. All counties had an O/E ratio close to zero, which indicated a desirable outcome; however, it was important to note that most of the counties had low beneficiary counts, which may have contributed to the low O/E ratios. Next, the O/E ratio was used to determine the risk adjusted rate. The purpose of the risk adjusted rates was to compare counties' performance fairly based on a related disease; risk adjusted rates were essential to producing reliable county-level profiles. Each county was risk adjusted to respiratory-related diseases so that counties could be compared justly. Next, the upper and lower 95% confidence intervals were calculated to determine the confidence in the results. Using their risk adjusted rates, 32 counties fell within their confidence intervals while the other 90 counties did not; only 26% of the counties were confidently supported by the statistics, while the other 74% were not. The standard error indicated the accuracy of the sample distribution relative to the sample population; counties 590 and 221 had values close to 1, which meant they had a wider spread in their sample distributions than their counterparts. To measure the error of the predictions, a root mean square error between 0.2 and 0.5 was taken to indicate good and accurate model prediction. Counties 582, 844, 420, 170, and 220 were the only counties within that range and thus the only ones with accurate predictive power. Fewer than 5% of the counties had a good RMSE value, which suggested that the multiple linear regression modeling technique may not be suitable for predicting average length of stay. The differences between observed and expected values also reflected the predictive power of the models: county 70 had a difference of 35 days. Recognizing that county 70 could contain an outlier, the observed and expected rate difference confirmed the discrepancy.

The significance of the statistical analyses was to measure the likelihood that associations arose by chance rather than as a result of the outcome of interest. They quantify the predictions so decision makers can decide what quality and financial improvements should be made in their county. Although the financial aspect of the data was not studied thoroughly due to the complexity of continuous predictors in modeling, it is worth mentioning for future studies, as valuable insights into cost containment methods could be gained. Other difficulties besides the modeling approach surfaced during the study. As previously stated, birth and death dates could not be used as predictors due to complications with computing age and the lack of data. There were only 611 death date entries out of 39,393 records in the DE-SynPUF; less than 2% of the data had death dates while the other 98% were missing. Therefore, implications for the older-age population and county comparisons on mortality could not be reasonably analyzed, and there was no evidence to show which counties had sicker patients. Such limitations in the data should be recognized before implementing interventions.

Counties could improve their ratings and lower their expenditures through evidence-based interventions. An intervention administered by a Canadian hospital achieved successful results by introducing a new role into the healthcare system, the patient navigator.10 The study described patient navigators as "dedicated patient-care facilitators"; in other words, these navigators act as liaisons between the patient, their family, and the medical staff. The study showed improvements in communication and patient experience from the implementation, and the results indicated a return on investment that outweighed the cost of the patient navigator through increased patient satisfaction, decreased length of stay, and lower than expected costs. Other related interventions include developing multidisciplinary team care, improving discharge planning, and implementing early supported discharge programs and clinical care pathways.11 These interventions could be established as part of the patient navigator's role; more specifically, a patient navigator could be in charge of improving early discharge planning and coordinating with healthcare providers and patients to promote better structural processes. To successfully implement these interventions, county-level profiling should be coupled with process improvement initiatives.12

Conclusion

All in all, hospital expenditures continue to rise over time. It is necessary to act on these alarming numbers through county-level profiling and predictive modeling. County-level profiling could reveal aspects of a hospital's performance that it may never have known existed until shown through data mining. Predictive modeling could help hospitals better manage their resource utilization to reduce present and future expenditures. Although this study did not produce models as reliable as expected, it did identify the counties with the highest variability in the outcome of interest. These insights could inform decision makers about possible interventions that poorly rated counties may need. Data mining provides evidence and suggestions to decision makers that the healthcare system is flawed due to its variability and rising expenditures, but county-level profiling and predictive analysis techniques can serve as steps on the road to improvement.

References

  1. Goldfield, N., Gnani, S., Majeed, A. Profiling performance in primary care in the United States. BMJ. 2003 Apr; 326(7392):744-747.
  2. Thomas, J. W. Should Episode-Based Economic Profiles Be Risk Adjusted to Account for Differences in Patients’ Health Risks? Health Serv Res. 2006 Apr; 41(2): 581-598.
  3. Lemieux, J., Mulligan, T. Trends in Inpatient Hospital Prices, 2008 to 2010. The American Journal of Managed Care. 2013 Mar; 19: n3.
  4. Zemencuk, J. K., Hofer, T.P., Hayward, R. A. et al. What effect does physician “profiling” have on inpatient physician satisfaction and hospital length of stay? BMC Health Serv. 2006; 6: 45.
  5. Caetano, N., Cortez, P., Laureano, R. Using Data Mining for Prediction of Hospital Length of Stay: An Application of the CRISP-DM Methodology. Enterprise Information Systems. 2015; 227: 149-166.
  6. Hachesu, P. R., Ahmadi, M., Somayyeh, A., Sadoughi, F. Use of Data Mining Techniques to Determine and Predict Length of Stay of Cardiac Patients. Healthc Inform Res. 2013 Jun; 19(2): 121-129.
  7. Rinne, S. T., Graves, M. C., Bastian, L. A., Lindenauer, P. K., Wong, E. S., Hebert, P. L., Liu, C. Association Between Length of Stay and Readmission for COPD. Am J Manag Care. 2017 Aug; 23(8): e253-e258.
  8. Provost, F., Fawcett, T. Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking. Sebastopol, CA. O’Reilly Media. 2013.
  9. Niaksu, O. CRISP Data Mining Methodology Extension for Medical Domain. Baltic J. Modern Computing. 2015; 3(2):92-109.
  10. Kwan, J. L., Morgan, M. W., Stewart, T. E., Bell, C. M. Impact of an Inpatient PN Program. J. Hosp. Med. 2015; 12:799-803.
  11. Miani, C., Ball, S., Pitchforth, E., et al. Organisational interventions to reduce length of stay in hospital: a rapid evidence assessment. Southampton (UK): NIHR Journals Library; 2014 Dec; 2(52).
  12. Evans, J. H., Hwang, Y., Nagarajan, N. Physicians’ Response to Length-of-Stay Profiling. Medical Care. 1995 Nov; 33(11): 1106-1119.
