-SB-Capstone-Project-II-Quora-insincere-questions

Capstone Project II for Springboard - Kaggle Quora Insincere Questions Classification challenge

Quora Insincere Questions Classification

Springboard Capstone Project II

Overview

Quora is a service that helps people learn from each other by asking and answering questions. A key challenge in providing this type of service is filtering out insincere questions. Quora is attempting to filter out toxic and divisive content to uphold its policy of “Be Nice, Be Respectful”.

Data Source

https://www.kaggle.com/c/quora-insincere-questions-classification/data

Goals

  • Identify and flag insincere questions using machine learning.
  • Maximize F1 score by accurately predicting whether a question is sincere or not.

Specializations

  • Advanced NLP
  • TensorFlow and Keras

Value of Solution

An accurate solution can help Quora develop more scalable methods to detect toxic and misleading content and combat online trolls at scale.

This solution will help Quora uphold its policy of “Be Nice, Be Respectful”.

Baseline Models Used

  • Logistic Regression
  • K-means clustering
  • XGBoost
  • Voting Classifier

Deep Learning Models Used

  • Convolutional Neural Network
      • Self-Trained Embedding
      • Google News Vectors
  • Long Short-Term Memory (LSTM) Network
      • Google News Vectors: here we allow the pre-trained embeddings to be updated during training and use an LSTM model. To date, this is the best performing model.
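The repository text does not show how the Google News vectors are loaded, so the following is only a minimal sketch of one common approach: build an initial weight matrix in which each vocabulary word found in the pre-trained vectors gets its pre-trained row, and every other word gets a small random row. The function name, the toy 3-dimensional vectors, and the `word_index` mapping are all hypothetical. In Keras, a matrix like this would typically initialize an `Embedding` layer that is left trainable, which is what lets the embeddings keep updating during training.

```python
import random


def build_embedding_matrix(word_index, pretrained, dim, seed=0):
    """Return one row per token id, initialized from pre-trained vectors.

    word_index: dict mapping word -> integer id starting at 1 (hypothetical).
    pretrained: dict mapping word -> list of floats of length `dim`.
    """
    rng = random.Random(seed)
    # Row 0 is reserved for padding and stays all zeros.
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vector = pretrained.get(word)
        if vector is None:
            # Out-of-vocabulary word: small random start, learned during training.
            vector = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
        matrix[idx] = list(vector)
    return matrix


# Toy stand-ins for the real pre-trained vectors and tokenizer vocabulary.
pretrained = {"why": [0.1, 0.2, 0.3], "people": [0.4, 0.5, 0.6]}
word_index = {"why": 1, "people": 2, "quora": 3}

matrix = build_embedding_matrix(word_index, pretrained, dim=3)
print(len(matrix), "rows;", "row for 'why':", matrix[1])
```

Starting from pre-trained rows and then fine-tuning them is a design choice: it gives the model a useful starting point while still letting it adapt the vectors to the vocabulary of the questions.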

Evaluation of Models

The best performing model is the LSTM with pre-trained embeddings that continue to be updated during training, as determined by comparing F1 scores. All neural network models, including the CNN with self-trained and with pre-trained embeddings, outperform the shallow scikit-learn and XGBoost models.
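As a reminder of what is being compared, here is a small sketch of the F1 computation for the positive class. It assumes labels are encoded as 1 for insincere and 0 for sincere, and that insincere questions are the minority class; the toy labels below are made up for illustration and are not results from this project.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (assumed here to be 1 = insincere)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 8 sincere questions, 2 insincere ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)                 # 0.8
print("F1:", f1_score(y_true, y_pred))       # 0.5
```

In this toy case accuracy looks good because most questions are sincere, while F1 reflects that only half of the insincere questions were caught and half of the flags were wrong.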

Production and Beyond

In a production environment, this model can be used to evaluate new questions as they are asked. When a user submits a new question on Quora, the model predicts whether the question is sincere. If the question is judged sincere, it is posted online; if it is judged insincere, the user is prompted to edit and resubmit it. To keep improving the model, new questions and their labels will be added to the training data and the model will be updated with the new data.
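The flow described above can be sketched roughly as follows. Everything here is hypothetical: the `QuestionGate` class, the 0.5 threshold, and the keyword-based `toy_model`, which merely stands in for the trained LSTM so that the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class QuestionGate:
    """Routes a submitted question based on a predicted insincerity score."""

    predict_insincere: Callable[[str], float]  # returns a score in [0, 1]
    threshold: float = 0.5                     # assumed cut-off, to be tuned
    feedback: List[Tuple[str, int]] = field(default_factory=list)

    def submit(self, question: str) -> str:
        score = self.predict_insincere(question)
        if score >= self.threshold:
            # Judged insincere: ask the user to edit and resubmit.
            return "edit_and_resubmit"
        return "posted"

    def record_label(self, question: str, label: int) -> None:
        # Collect (question, label) pairs to add to the training data later.
        self.feedback.append((question, label))


def toy_model(question: str) -> float:
    """Placeholder scorer; a real deployment would call the trained model."""
    return 0.9 if "stupid" in question.lower() else 0.1


gate = QuestionGate(predict_insincere=toy_model)
print(gate.submit("How do LSTM networks handle long sentences?"))
print(gate.submit("Why are people so stupid?"))
gate.record_label("Why are people so stupid?", 1)
```

Keeping the scorer behind a plain callable means the gate does not depend on any particular framework, so an updated model can be swapped in without changing the submission logic.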
