Emma and I wanted to build a model that predicts a woman's chance of dying given the country she lives in, that country's GDP, its expenditure on healthcare, its fertility rate, and whether women there receive additional benefits such as paid maternal health care.
Our data came from the World Bank. We collected several CSV files and compiled them into one for our baseline. Our target was the mortality rate, and the remaining columns were our independent variables.
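One way to compile several indicator files into a single table is to join them on the country column. The following is only a sketch using the standard library: the column names (`country`, `gdp_per_capita`, `fertility_rate`) and every value in the two toy files are made up for illustration and are not the real World Bank layout or data.

```python
import csv
import io

# Toy stand-ins for two downloaded indicator files (all values are made up).
GDP_CSV = """country,gdp_per_capita
A,1200
B,45000
C,8000
"""
FERTILITY_CSV = """country,fertility_rate
A,5.1
B,1.7
D,3.0
"""


def read_indicator(text):
    """Parse one indicator file into {country: {column: value}}."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["country"]: {k: float(v) for k, v in row.items() if k != "country"}
        for row in reader
    }


def merge_indicators(*tables):
    """Outer-join the tables on country; values a country lacks become None."""
    columns = []
    for table in tables:
        for values in table.values():
            for col in values:
                if col not in columns:
                    columns.append(col)
    countries = sorted(set().union(*[table.keys() for table in tables]))
    merged = []
    for country in countries:
        row = {"country": country}
        for col in columns:
            row[col] = None
        for table in tables:
            row.update(table.get(country, {}))
        merged.append(row)
    return merged


merged = merge_indicators(read_indicator(GDP_CSV), read_indicator(FERTILITY_CSV))
for row in merged:
    print(row)
```

Countries that appear in only one file are kept with `None` in the missing column, so the gaps stay visible and can be handled in the cleaning step.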
After building the full CSV and loading it into a dataframe, we did some basic exploratory data analysis. First we removed the rows with null or missing values. We also scaled and adjusted the features to get a better-fitting model.
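The cleaning and scaling steps can be sketched roughly as below. This assumes the scaling was a z-score standardization (mean 0, standard deviation 1), which is only one common choice. The rows and column names are made up for illustration.

```python
from statistics import mean, pstdev


def drop_missing(rows):
    """Keep only the rows in which every value is present (not None)."""
    return [r for r in rows if all(v is not None for v in r.values())]


def standardize(rows, columns):
    """Return copies of the rows with the given columns rescaled to z-scores."""
    stats = {}
    for col in columns:
        values = [r[col] for r in rows]
        stats[col] = (mean(values), pstdev(values))
    scaled = []
    for r in rows:
        new_row = dict(r)
        for col, (mu, sigma) in stats.items():
            # A constant column has sigma == 0; map it to 0.0 instead of dividing.
            new_row[col] = (r[col] - mu) / sigma if sigma else 0.0
        scaled.append(new_row)
    return scaled


# Made-up rows; country B is missing its mortality value.
rows = [
    {"country": "A", "fertility_rate": 2.0, "mortality": 100.0},
    {"country": "B", "fertility_rate": 4.0, "mortality": None},
    {"country": "C", "fertility_rate": 6.0, "mortality": 300.0},
]

clean = drop_missing(rows)
scaled = standardize(clean, ["fertility_rate", "mortality"])
for row in scaled:
    print(row)
```

Dropping rows is the simplest way to handle missing values, but it also throws away countries that are missing only one indicator, so it reduces the sample size.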
This is what our dataframe consisted of:
Here is also a correlation matrix:
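A correlation matrix of this kind holds the Pearson correlation coefficient for every pair of columns. Here is a minimal, dependency-free sketch of how one could be computed. The column names and numbers are invented for illustration and are not our dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented values, one entry per country.
data = {
    "fertility_rate": [1.5, 2.0, 3.5, 5.0],
    "health_spend": [9.0, 7.0, 4.0, 2.0],
    "mortality": [10.0, 30.0, 200.0, 500.0],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near +1 mean two columns rise together, values near -1 mean one falls as the other rises, and the diagonal is always 1 because each column is perfectly correlated with itself.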
For our first test, we chose to look at the effect of fertility rate on mortality rate. We hypothesized that there would be a strong positive correlation. The test supported that:
Then we wanted to compare a country's GDP with its expenditure on healthcare, and to see their effect on the maternal mortality rate. This time we found an inverse relationship (a negative correlation). This makes sense, since we would expect lower maternal mortality in countries that can provide healthcare for women.
Finally, we built a model with some scaling and feature tweaks. We found that our model can predict whether a woman will survive childbirth in a given country with an adjusted score of about 75%.
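For readers unfamiliar with "adjusted" scores: assuming the 75% above is an adjusted R², which the text does not spell out, it is the ordinary R² penalized for the number of features in the model. The sketch below shows the standard formula on made-up numbers. It is not our model or our results.

```python
def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalize R^2 for the number of features used to fit the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Made-up targets and predictions, pretending the model used 3 features.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 2.0, 3.0, 4.0, 4.0]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=len(y_true), n_features=3)
print(f"R^2 = {r2:.2f}, adjusted R^2 = {adj_r2:.2f}")
```

With few samples and several features the penalty is large, which is why the adjusted value can sit well below the plain R².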
Of course, there are a few issues here, even after scaling. Our p-values are relatively high for some features. This is because some features have fewer samples than the rest. In the future, we would like to run PCA and an additional GridSearch-based model to find selection criteria that would improve this model even more.
Here is our last picture graphing our model's ability to predict mortality rate:
There are some outliers, but these are the extreme cases, not the norm. As you can see, even with those issues our model came close to predicting a woman's chance of survival given the country's GDP, healthcare expenditure, and fertility rate.
Thank you for reading.