spampy's Introduction

spampy

Spam filtering module with Machine Learning using SVM. spampy is a classifier that uses Support Vector Machines to classify raw emails as spam or not spam.

Support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
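To make the description above concrete, here is a minimal sketch of a linear SVM trained by stochastic subgradient descent on the hinge loss with an L2 penalty. It is not spampy's implementation (spampy depends on scikit_learn for that); the function names, the toy two-feature data, and the hyperparameters are all assumptions chosen for illustration.

```python
def train_linear_svm(samples, labels, epochs=50, eta=0.1, lam=0.01):
    """Fit a linear SVM with subgradient descent on the regularised hinge loss.

    samples: list of equal-length feature lists.
    labels:  +1 (spam) or -1 (not spam) for each sample.
    Returns the weight vector and the bias.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Gradient of the L2 penalty: shrink the weights slightly each step.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin: move toward it.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return True when the sample falls on the spam side of the hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b > 0


# Made-up toy data: [count of spam-like words, count of ordinary words].
SAMPLES = [[3, 0], [0, 3], [2, 0], [1, 4], [4, 1], [0, 2], [3, 1], [1, 3]]
LABELS = [1, -1, 1, -1, 1, -1, 1, -1]

weights, bias = train_linear_svm(SAMPLES, LABELS)
is_spam = predict(weights, bias, [6, 0])
```

The classifier only reports which side of the separating hyperplane a sample lies on, which is why the text calls it non-probabilistic: there is no confidence score attached to the answer.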

Many email services today provide spam filters that classify emails into spam and non-spam with high accuracy. spampy is a learning project that you can use to filter spam emails.

spampy uses two different datasets for classification. One of the datasets is already included in the project under the spampy/datasets/ folder. The second is the enron-spam dataset; the spampy folder contains a shell script that downloads and extracts it for you.

Project tree

  • email_processor: Helper to collect features and labels from datasets.
  • spam_classifier: Classifies given raw emails.
  • dataset_downloader: Enron dataset downloader which uses dataset_downloader.sh.
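As an illustration of what "collecting features" from raw emails can look like, the sketch below builds a word vocabulary and turns an email into a binary word-presence vector. This is an assumption about the general technique, not the actual email_processor code: the function names are hypothetical, and no stemming (for example with nltk) is applied.

```python
import re
from collections import Counter


def tokenize(raw_email):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", raw_email.lower())


def build_vocabulary(emails, max_words=1000):
    """Map the most frequent tokens across all emails to column indices."""
    counts = Counter(token for email in emails for token in tokenize(email))
    most_common = counts.most_common(max_words)
    return {word: idx for idx, (word, _) in enumerate(most_common)}


def extract_features(raw_email, vocabulary):
    """Return a binary vector: 1 where a vocabulary word occurs in the email."""
    features = [0] * len(vocabulary)
    for token in set(tokenize(raw_email)):
        idx = vocabulary.get(token)
        if idx is not None:
            features[idx] = 1
    return features


# Hypothetical usage with two made-up training emails.
vocabulary = build_vocabulary(["Buy cheap pills now", "Meeting agenda for Monday"])
vector = extract_features("cheap pills, cheap!", vocabulary)
```

Vectors of this shape, paired with spam / not-spam labels, are the kind of input an SVM classifier can be trained on.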

Dependency List

  • scikit_learn
  • scipy
  • numpy
  • nltk
  • click (for CLI)

The two main functions of spam_classifier classify a given raw email:

  • classify_email
  • classify_email_with_enron

Installing

You can install spampy using Python Package Index:

$ pip install spampy

Install with conda from the Anaconda conda-forge channel:

$ conda install -c conda-forge spampy

Install from its source repository on GitHub:

$ pip install -e git+https://github.com/abdullahselek/spampy#egg=spampy

CLI

For available commands, run python -m spampy -h

Spam filtering module with Machine Learning using SVM.
Usage
  $ python spampy [<options>]
Options
  --help, -h              Display help message
  --download, -d          Download enron dataset
  --eclassify, -ec        Classify given raw email with enron dataset, prompts for raw email
  --classify, -c          Classify given raw email, prompts for raw email
  --version, -v           Display installed version
Examples
  $ python spampy --help
  $ python spampy --download
  $ python spampy --eclassify
  $ python spampy --classify

spampy's People

Contributors

abdullahselek

spampy's Issues

classify & eclassify

When I run python spampy --classify, it prompts for a raw email. For example:
Raw Mail : dob meos with hgh my energy level has gone up ! stukm
The output is True.
Running python spampy --classify again with:
Raw Mail : enron online desk to desk id and password
The output is also True.

Every input I try returns True.
Please suggest a raw email input that returns False.

Dataset Downloader

dataset_downloader.py is unable to create the process that downloads the dataset. Running dataset_downloader.sh (the shell script) directly does download the file.

The download does not work when started from dataset_downloader.py.
