This is my first capstone project of the Springboard curriculum. My dataset comes from the adult 2017 California Health Interview Survey (CHIS): http://healthpolicy.ucla.edu/chis/Pages/default.aspx
My goal is to get a better understanding of the variables that contribute to psychological distress and to see how well it can be predicted.
The main code is located in Capstone1_PsychologicalDistress.ipynb. This notebook contains all data wrangling, EDA, and machine learning models, including Random Forest, Logistic Regression, K-Nearest Neighbors, and Gradient Boosting.
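
For readers who want a feel for this kind of model comparison before opening the notebook, here is a minimal sketch assuming scikit-learn. It is not the notebook's code: the synthetic data from `make_classification`, the hyperparameters, the feature scaling, and the ROC AUC metric are all illustrative assumptions standing in for the CHIS features and a binary distress label.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey features and a binary distress label
# (assumption: the real data is not bundled with this repository).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

# The four model families named above; hyperparameters are illustrative only.
# Logistic Regression and KNN are scaled because both are sensitive to feature scale.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "K-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Score each model with 5-fold cross-validated ROC AUC.
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()

# Print the models from best to worst mean score.
for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Cross-validation is used here so that each model is scored on data it was not fitted on; the notebook may use a different evaluation scheme.
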
The main presentation is located in Capstone1_PredictingPsychologicalDistress.pdf.
Other documentation:
- Capstone 1 Consolidated Report.docx describes in detail the project proposal, data wrangling steps, statistical analysis, and in-depth analysis
- CHIS 2017 ADULT (FINAL).pdf is the actual questionnaire administered
- CHIS 2017 Data Dictionary_PUF_Adult Oct 2018.pdf is the data dictionary, which describes the collection methods and each question
- select_cols.csv is a list of columns I manually selected
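
A column list like select_cols.csv can be used to load only the chosen variables from the survey file. The sketch below assumes pandas and uses small in-memory stand-ins so it runs without the real data. The layout of select_cols.csv (a single `column` header with one name per row) and the variable names `VAR_A`, `VAR_B`, `VAR_C` are hypothetical, not taken from the actual files.

```python
import io

import pandas as pd

# Hypothetical stand-in for select_cols.csv: assumed to hold one column name per row.
select_cols_file = io.StringIO("column\nVAR_A\nVAR_C\n")

# Hypothetical stand-in for the downloaded survey data (placeholder variable names).
survey_file = io.StringIO("VAR_A,VAR_B,VAR_C\n1,2,3\n4,5,6\n")

# Read the list of selected column names.
selected = pd.read_csv(select_cols_file)["column"].tolist()

# Load only those columns from the survey data; VAR_B is skipped at read time.
df = pd.read_csv(survey_file, usecols=selected)
print(df.columns.tolist())
```

With the real files, the two `io.StringIO` objects would be replaced by the paths to select_cols.csv and the downloaded data file.
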
The data is not published here, but it can be downloaded for free from the CHIS website.