email_prediction's Introduction

Email Predictor

Given these 3 pieces of information, predict an advisor's most likely email address:

  • An advisor's name
  • The domain name of the company they work for
  • A set of names and emails of other advisors who work for the same company

For this scenario there are 4 potential patterns:

  1. first_name_dot_last_name: "[email protected]"
  2. first_name_dot_last_initial: "[email protected]"
  3. first_initial_dot_last_name: "[email protected]"
  4. first_initial_dot_last_initial: "[email protected]"
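
As a rough illustration of the four patterns, the sketch below builds the address each one would produce for a given name and domain. It is not the project's code: `PATTERN_BUILDERS` and `candidate_addresses` are hypothetical names, and it assumes a name made of exactly a first and a last name, lower-cased before use.

```ruby
# Hypothetical helper, not part of EmailPredictor: maps each pattern name
# from the list above to a lambda that builds the local part of the address.
PATTERN_BUILDERS = {
  first_name_dot_last_name:       ->(first, last) { "#{first}.#{last}" },
  first_name_dot_last_initial:    ->(first, last) { "#{first}.#{last[0]}" },
  first_initial_dot_last_name:    ->(first, last) { "#{first[0]}.#{last}" },
  first_initial_dot_last_initial: ->(first, last) { "#{first[0]}.#{last[0]}" }
}.freeze

# Returns a hash of pattern name => candidate address.
# Assumption: full_name is "First Last" separated by a single space.
def candidate_addresses(full_name, domain)
  first, last = full_name.downcase.split(" ", 2)
  PATTERN_BUILDERS.transform_values do |build|
    "#{build.call(first, last)}@#{domain}"
  end
end

candidates = candidate_addresses("Jane Doe", "example.com")
candidates[:first_initial_dot_last_name] # => "j.doe@example.com"
```

Keeping the patterns in one hash means the same table can be reused both to generate candidates and to recognise which pattern a known address follows.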

And given a sample dataset:

{
  "John Ferguson" => "[email protected]",
  "Damon Aw" => "[email protected]",
  "Linda Li" => "[email protected]",
  "Larry Page" => "[email protected]",
  "Sergey Brin" => "[email protected]",
  "Steve Jobs" => "[email protected]"
}

We have to work out the most likely email address for the following:

  1. "Peter Wong", "alphasights.com"
  2. "Craig Silverstein", "google.com"
  3. "Steve Wozniak", "apple.com"
  4. "Barack Obama", "whitehouse.gov"

Usage

To run the test suite, which covers all 4 cases above:

$ bundle exec rspec

To open the application's console, set CONSOLE to true when running the test suite:

$ CONSOLE=true bundle exec rspec

which will boot pry with the app loaded. Sample execution:

prediction = EmailPredictor::Prediction.new("Peter Wong", "alphasights.com")
=> #<EmailPredictor::Prediction:0x00000101e295f8
 @domain=#<EmailPredictor::Domain:0x00000101e29580 @full_domain="alphasights.com">,
 @name=#<EmailPredictor::Name:0x00000101e295a8 @full_name="Peter Wong">>

email = prediction.predicted_email
=> #<EmailPredictor::PredictedEmail:0x00000101d2c2e0 @address="[email protected]", @pattern=:first_name_dot_last_name>

email.address
=> "[email protected]"
email.pattern
=> :first_name_dot_last_name

Pattern handling

  • When one pattern predominates: the pattern with the most occurrences in the dataset for that domain is always used.

  • When several patterns occur and none predominates: any of them may be used.

  • When no pattern is found for the domain: a NotImplementedError is raised.
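
The three rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: `local_part_for`, `detect_pattern` and `predominant_pattern` are hypothetical names, the dataset is assumed to be a Hash of name => email, names are assumed to be "First Last", and `Array#tally` requires Ruby 2.7 or newer.

```ruby
PATTERNS = %i[
  first_name_dot_last_name
  first_name_dot_last_initial
  first_initial_dot_last_name
  first_initial_dot_last_initial
].freeze

# Builds the local part (text before "@") that a pattern would produce.
def local_part_for(pattern, first, last)
  case pattern
  when :first_name_dot_last_name       then "#{first}.#{last}"
  when :first_name_dot_last_initial    then "#{first}.#{last[0]}"
  when :first_initial_dot_last_name    then "#{first[0]}.#{last}"
  when :first_initial_dot_last_initial then "#{first[0]}.#{last[0]}"
  end
end

# Returns the first pattern that reproduces the known email, or nil.
def detect_pattern(full_name, email)
  first, last = full_name.downcase.split(" ", 2)
  local = email.downcase.split("@").first
  PATTERNS.find { |pattern| local_part_for(pattern, first, last) == local }
end

# Picks the most frequent pattern among known emails for the domain.
# On a tie, whichever max_by returns is used ("any of them").
def predominant_pattern(known_emails, domain)
  patterns = known_emails
    .select { |_name, email| email.downcase.end_with?("@#{domain}") }
    .map { |name, email| detect_pattern(name, email) }
    .compact
  raise NotImplementedError, "no known pattern for #{domain}" if patterns.empty?

  patterns.tally.max_by { |_pattern, count| count }.first
end

# Made-up sample data for illustration only.
known = {
  "Jane Doe" => "jane.doe@example.com",
  "John Roe" => "john.roe@example.com",
  "Ann Lee"  => "a.lee@example.com"
}
predominant_pattern(known, "example.com") # => :first_name_dot_last_name
```

Note that `NotImplementedError` descends from `ScriptError` rather than `StandardError` in Ruby, so a bare `rescue` will not catch it; callers need to rescue it by name.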
