hadley-dixon / fertilitygdpregression

Performs EDA on fertility rates and national income, fits a simple linear regression model, diagnoses its validity, and uses it to predict future fertility from income.

r confidence-bands confidence-intervals data-transformation data-visualization diagnostic-plots hypothesis-testing matrix-manipulation simple-linear-regression

fertilitygdpregression's Introduction

Project Description

This project analyzes a dataset from start to finish using the simple linear regression model.

Data Description

The file “UN.txt” contains PPgdp, the 2001 gross national product per person in US dollars, and Fertility, the birth rate per 1000 females in the population in the year 2000. The data cover 184 localities, mostly UN member countries, but also other areas such as Hong Kong that are not independent countries. In this problem, we study the relationship between Fertility and PPgdp.

Data visualization and pre-processing

  1. Draw the scatterplot of Fertility on the vertical axis versus PPgdp on the horizontal axis and summarize the information in this graph. Does a simple linear regression model seem to be a plausible summary of this graph?
  2. To get a better fit, we seek to transform the variables. What transformations would you apply so that a simple linear regression model is appropriate? State why you chose these transformations. Draw the scatterplot of the transformed variables and comment on it.
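
One candidate transformation for strictly positive, right-skewed variables such as income is the natural logarithm. The sketch below is only an illustration of that idea, written in Python rather than the R used by the project. The helper name `log_transform` and the toy values are hypothetical and are not taken from “UN.txt”; whether the log is the right choice for this dataset is for the scatterplots to decide.

```python
import math

# Hypothetical toy values for illustration only; NOT rows from UN.txt.
ppgdp = [250.0, 1200.0, 4800.0, 21000.0, 36000.0]
fertility = [6.1, 4.2, 2.9, 1.7, 1.5]


def log_transform(values):
    """Return the natural log of each value; inputs must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


log_ppgdp = log_transform(ppgdp)
log_fertility = log_transform(fertility)

# On the raw scale the largest income is many times the smallest;
# on the log scale the same spread becomes a modest additive range.
raw_ratio = max(ppgdp) / min(ppgdp)
log_range = max(log_ppgdp) - min(log_ppgdp)
print(f"raw max/min ratio: {raw_ratio:.1f}, log-scale range: {log_range:.2f}")
```

The transformed lists can then be plotted against each other to check whether the relationship looks closer to a straight line.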

Model fitting and diagnostics

  1. Fit the simple linear model on the transformed data in three ways. Report the least squares estimates of the coefficients and R². Add the fitted line to the scatterplot of the transformed data and comment on the fit.
  • Plain coding (not using the ‘lm’ function or matrix manipulation)
  • Using the ‘lm’ function
  • Through matrix manipulation
  2. Draw the diagnostic plots and comment.
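
Two of the three fitting routes (plain coding and matrix manipulation) can be sketched without any regression library. The example below is a minimal Python illustration, not the project's R code; the ‘lm’ route is specific to R and is not reproduced here. The function names and the toy data are hypothetical.

```python
def fit_plain(x, y):
    """Least squares via the closed-form sums: slope = Sxy / Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_matrix(x, y):
    """Solve the normal equations (X'X) b = X'y for the design matrix X = [1, x]."""
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # X'X = [[n, sx], [sx, sxx]] and X'y = [sy, sxy]; invert the 2x2 matrix by hand.
    det = n * sxx - sx * sx
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """R^2 = 1 - SSE / SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical toy data standing in for the transformed variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_plain, b1_plain = fit_plain(x, y)
b0_mat, b1_mat = fit_matrix(x, y)
r2 = r_squared(x, y, b0_plain, b1_plain)
print(b0_plain, b1_plain, b0_mat, b1_mat, r2)
```

Both routes solve the same least squares problem, so their estimates should agree up to floating-point rounding; comparing them is a quick check on the hand-coded version.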

Inference

  1. Test whether there is a linear relationship between the transformed variables.
  2. Provide a 99% confidence interval on the expected Fertility for a region with PPgdp 20,000 US dollars in 2001.
  3. Provide a 95% confidence band for the relation between the expected Fertility and PPgdp. Add the bands to the scatter plot of the original data.
  4. Assuming that the same relationship between Fertility and PPgdp holds, give a 99% prediction interval for Fertility for a region with PPgdp 25,000 US dollars in 2018.
  5. Based on the diagnostic plots from the model diagnostics step, do you have any concerns about the hypothesis test and inferences above? If so, what are they?
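
The test and interval calculations in this section all rest on the same few quantities: the fitted coefficients, the residual standard error, and Sxx. The sketch below is a Python illustration under stated assumptions, not the project's R code. The function names and toy data are hypothetical, and the critical value is supplied by the caller because the Python standard library has no t distribution. If the model is fitted on transformed variables, x0 must be given on the transformed scale and the interval endpoints back-transformed afterwards.

```python
import math


def slr_summary(x, y):
    """Fit y = b0 + b1 * x by least squares and keep the pieces inference needs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))  # residual standard error, n - 2 df
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "intercept": intercept, "slope": slope, "sigma_hat": sigma_hat}


def slope_t_statistic(fit):
    """t statistic for H0: slope = 0; compare with a t distribution on n - 2 df."""
    se_slope = fit["sigma_hat"] / math.sqrt(fit["sxx"])
    return fit["slope"] / se_slope


def interval_at(fit, x0, multiplier, prediction=False):
    """Interval at x0: for the mean response, or for a new observation if prediction=True."""
    center = fit["intercept"] + fit["slope"] * x0
    leverage = 1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    extra = 1.0 if prediction else 0.0  # a new observation adds its own error variance
    half_width = multiplier * fit["sigma_hat"] * math.sqrt(extra + leverage)
    return center - half_width, center + half_width


# Hypothetical toy data standing in for the transformed variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = slr_summary(x, y)

t_stat = slope_t_statistic(fit)
t_crit_99 = 5.841  # t quantile at 0.995 with 3 df, read from a t table
ci = interval_at(fit, 3.0, t_crit_99)
pi = interval_at(fit, 3.0, t_crit_99, prediction=True)
print(t_stat, ci, pi)
```

A pointwise band repeats the mean-response interval over a grid of x0 values. For a simultaneous (Working-Hotelling) band, the multiplier would instead be the square root of twice the F quantile with 2 and n - 2 degrees of freedom.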
