
prakharjadaun / feature-extraction-for-spam-email-detection


Preprocessing steps, feature extraction techniques, and a Naive Bayes classifier implemented in C++. All steps are also implemented in Python for comparative analysis.

License: MIT License

Languages: C++ 71.67%, Objective-C 15.35%, Jupyter Notebook 12.47%, CMake 0.52%
Topics: stemming, nlp-machine-learning, text-classification, naive-bayes-classifier-cpp, email-spam-classifier, bag-of-words-cpp


Feature Extraction for Spam Email Detection 📧

The proposed system detects spam emails using machine learning algorithms, aiming for high accuracy and good performance. It saves the user's time and reduces the risk posed by spam emails.

📋 Project Description

Email is a popular and preferred form of written communication in everyday life. The problem with email is spam. Over the past decade, unsolicited bulk email has become a major problem for email users, and a huge amount of spam flows into users' mailboxes every day.

As the volume of spam grows day by day, many important emails get lost in a sea of junk mail. To reduce this problem, we implement ways to differentiate spam emails from important ones.

Doing this reduces the time spent looking for an important email, which in turn reduces the hassle of the process. The result we expect is filtering that separates spam emails from ham as accurately as possible.

🗃️ Project Features

The main feature of our project is determining whether a received email is spam or ham. This is useful for students and working professionals who deal with emails every day. The project also aims to help prevent phishing attempts by filtering spam from ham emails.

A. Pre-processing

  • Removal of Special Characters
  • Removal of Numbers
  • Lowercase Conversion
  • Tokenization
  • Removal of Stop words
  • Stemming
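
The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the repository's implementation: `STOP_WORDS`, `SUFFIXES`, `simple_stem`, and `preprocess` are hypothetical names, the stop-word list is deliberately tiny, and the suffix stripper is a toy stand-in for a real stemmer such as Porter's algorithm.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (real lists are far longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you", "your", "for", "in"}

# Toy suffix list standing in for a real stemmer such as Porter's algorithm.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def simple_stem(token: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> List[str]:
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # 1. remove special characters
    text = re.sub(r"\d+", " ", text)             # 2. remove numbers
    text = text.lower()                          # 3. lowercase conversion
    tokens = text.split()                        # 4. whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # 5. stop-word removal
    return [simple_stem(t) for t in tokens]      # 6. stemming


print(preprocess("Congratulations!!! You WON 1000 dollars, claim your prize now."))
```

With these assumptions, the sample sentence reduces to the tokens `congratulation`, `won`, `dollar`, `claim`, `prize`, `now`. The steps run in the order listed above, so stop words are dropped before stemming.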

B. Feature Extraction

  • Bag of words
  • TF-IDF
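
To make the two representations concrete, here is a small sketch that builds both from tokenized emails. It assumes the textbook definitions, term frequency as count divided by document length and inverse document frequency as ln(N / df); libraries often use smoothed variants, and the source does not say which formula the project uses. The function names `build_vocabulary`, `bag_of_words`, and `tf_idf` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def build_vocabulary(docs: List[List[str]]) -> List[str]:
    """Sorted list of every distinct token across all documents."""
    return sorted({tok for doc in docs for tok in doc})


def bag_of_words(doc: List[str], vocab: List[str]) -> List[int]:
    """Raw count of each vocabulary term in one document."""
    counts = Counter(doc)
    return [counts[term] for term in vocab]


def tf_idf(docs: List[List[str]], vocab: List[str]) -> List[List[float]]:
    """Unsmoothed TF-IDF: (count / doc length) * ln(N / document frequency)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = {term: sum(1 for doc in docs if term in doc) for term in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vectors


docs = [["free", "prize", "free"], ["free", "meeting"]]
vocab = build_vocabulary(docs)
print(vocab)
print([bag_of_words(d, vocab) for d in docs])
print(tf_idf(docs, vocab))
```

A term that appears in every document (here `free`) gets an IDF of zero under this formula, so TF-IDF suppresses it even though its bag-of-words count is the highest.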

C. Classification

  • Naive Bayes Algorithm (also implemented in C++)
  • Random Forest Classifier
  • Support Vector Machine
  • MLP Classifier
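
As an illustration of the first classifier in the list, the sketch below implements a multinomial Naive Bayes model over token lists. The source does not state which Naive Bayes variant the project uses, so the multinomial form with Laplace (add-one) smoothing is an assumption, and the class name `MultinomialNaiveBayes` and the training emails are hypothetical.

```python
import math
from collections import Counter
from typing import List


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing (assumed variant)."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, docs: List[List[str]], labels: List[str]):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.n_docs = len(docs)
        self.class_counts = Counter(labels)
        # Per-class token counts, e.g. token_counts["spam"]["free"].
        self.token_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        self.total_tokens = {
            c: sum(self.token_counts[c].values()) for c in self.classes
        }
        return self

    def _log_score(self, doc: List[str], c: str) -> float:
        # log P(c) + sum over tokens of log P(token | c), in log space
        # to avoid underflow when multiplying many small probabilities.
        score = math.log(self.class_counts[c] / self.n_docs)
        denom = self.total_tokens[c] + self.alpha * len(self.vocab)
        for tok in doc:
            if tok in self.vocab:  # tokens unseen in training are ignored
                score += math.log((self.token_counts[c][tok] + self.alpha) / denom)
        return score

    def predict(self, doc: List[str]) -> str:
        return max(self.classes, key=lambda c: self._log_score(doc, c))


train_docs = [
    ["free", "prize", "claim"],
    ["win", "free", "money"],
    ["meeting", "tomorrow", "agenda"],
    ["project", "report", "attached"],
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["free", "money"]))
print(model.predict(["meeting", "report"]))
```

Smoothing keeps a token that never appeared in one class from driving that class's probability to zero, which matters for small training sets like this one.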

📊 Dataset Preparation

Note: The datasets created for this project have been uploaded here: Datasets

Contributors

  • prakharjadaun
  • prakratisingh
  • sanidhyajadaun
