DATA CHALLENGE

Thank you for your interest in Data Science at Brightside! The next step in the process is a data challenge. Our goal is to understand how you approach and think about problems, and how you work with data. While the deliverable includes a machine learning model, the evaluation goes much deeper than that -- we care about how you get to that final state, your logic, and your code.

This repository has two years' worth of Lending Club loan files stored in the data/ directory. The files are quarterly and contain data on loans that Lending Club has issued (date, amount, term, interest rate), metadata about the customer who took them out (such as employment, annual income, FICO), and the loan status. There is a data dictionary stored in the docs/ directory.
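As a starting point, the quarterly files can be read into one combined collection. The sketch below is only an illustration: it assumes the files are plain CSV with a header row and match `*.csv`, which this README does not confirm, and `load_quarterly_files` and the `source_file` key are hypothetical names. It uses only the standard library so it runs without extra dependencies; a dataframe library would be the more usual choice for real analysis.

```python
import csv
from pathlib import Path


def load_quarterly_files(data_dir, pattern="*.csv"):
    """Read every matching file in data_dir into one list of row dicts.

    Assumption: each file is a CSV with a single header row.
    """
    rows = []
    for path in sorted(Path(data_dir).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Remember which quarterly file each row came from.
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `load_quarterly_files("data")` from the repository root would then return all rows across quarters, each tagged with its file of origin.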

Goal: build a model that predicts a new loan's probability of default, using the data provided.
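Predicting a probability of default first requires a binary target derived from the loan status. The following is a minimal sketch under stated assumptions, not a prescribed labelling: the column name `loan_status` and the status groupings are assumptions to check against the data dictionary in docs/, and dropping loans without a final outcome is one possible modelling choice among several. Rows are assumed to be dictionaries, for example as produced by `csv.DictReader`.

```python
# Assumed groupings of status values; confirm against the data dictionary in docs/.
DEFAULT_STATUSES = {"Charged Off", "Default"}
REPAID_STATUSES = {"Fully Paid"}


def default_label(loan_status):
    """Return 1 for a defaulted loan, 0 for a repaid loan, None if the outcome is unknown."""
    status = (loan_status or "").strip()
    if status in DEFAULT_STATUSES:
        return 1
    if status in REPAID_STATUSES:
        return 0
    return None  # e.g. a loan still being repaid has no final outcome yet


def labelled_rows(rows, status_column="loan_status"):
    """Keep only rows with a known outcome and attach a binary 'is_default' label."""
    labelled = []
    for row in rows:
        label = default_label(row.get(status_column))
        if label is not None:
            labelled.append({**row, "is_default": label})
    return labelled
```

How in-progress and late loans are treated changes the observed default rate, so that choice is worth stating explicitly in the write-up.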

Model Usage: this model will be used to determine which new loans an investor should invest in. This means: I am going to Lending Club and ready to invest $100. There is a list of loans (which have not yet been funded) that I get to choose from, and I want to know which ones are the best to invest in. Keep that goal in mind as you build your feature set and final solution.
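One way to connect a predicted default probability to the investor's choice is to rank the candidate loans by a rough expected return. This is a deliberately naive sketch: it assumes a single period, interest earned only if the loan does not default, and a configurable loss on default (the full amount by default). The function names and the dictionary keys `loan_id`, `p_default`, and `interest_rate` are hypothetical, not taken from the data.

```python
def expected_return(p_default, interest_rate, loss_given_default=1.0):
    """Naive one-period expected return per dollar invested (illustrative assumption)."""
    return (1.0 - p_default) * interest_rate - p_default * loss_given_default


def rank_loans(candidates, top_n=None):
    """Sort candidate loans from highest to lowest expected return.

    candidates: iterable of dicts with hypothetical keys 'loan_id',
    'p_default' (0..1) and 'interest_rate' (a fraction, e.g. 0.12 for 12%).
    """
    scored = [
        {**loan, "expected_return": expected_return(loan["p_default"], loan["interest_rate"])}
        for loan in candidates
    ]
    scored.sort(key=lambda loan: loan["expected_return"], reverse=True)
    return scored if top_n is None else scored[:top_n]
```

Under these assumptions a loan with a high interest rate can still rank last when its default probability is high, which is why the probability estimate, and not the rate alone, drives the decision.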

To get started, fork this repository, make the repository you're working on private, and add me as a collaborator.

There is no time limit on this challenge -- it is up to you to balance taking the time to try the methods you choose against taking so long that other applicants reach a final interview first. When you have completed the data challenge, send me an email at [email protected] to let me know it is ready for review. You can use the same email address if you have any questions.

NOTE: the immediate need for our team is more data visualization work (building out core company dashboards, in an effort to limit our future ad-hoc requests). Therefore, it would be beneficial for you to focus more of your energy on the early stages of the data science flow (cleaning, EDA, etc.), and less on the modeling.
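A typical first cleaning step is to measure how much of each column is missing. The sketch below is one possible approach, assuming rows are dictionaries of strings (for example as read by `csv.DictReader`); the function name `missing_rates` and the set of strings treated as missing are assumptions to adjust after inspecting the files.

```python
def missing_rates(rows, missing_values=("", "n/a", "NA")):
    """Return {column: share of rows where the value is missing}, highest share first."""
    if not rows:
        return {}
    columns = sorted({column for row in rows for column in row})
    total = len(rows)
    rates = {}
    for column in columns:
        missing = sum(
            1
            for row in rows
            if row.get(column) is None or str(row.get(column)).strip() in missing_values
        )
        rates[column] = missing / total
    # Sort so the columns needing the most attention come first.
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))
```

Columns with a high missing share are candidates for imputation, for a separate "missing" indicator, or for exclusion, and the reasoning behind each choice is part of what the cleaning stage should document.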

Have fun!

Contributors: bnebeker
