
JobFunnel


Automated tool for scraping job postings into a .csv file.

Benefits over job search sites:

  • Never see the same job twice!
  • Browse all search results at once, in an easy to read/sort spreadsheet.
  • Keep track of all explicitly new job postings in your area.
  • See jobs from multiple job search sites all in one place.

The spreadsheet for managing your job search:

masterlist.csv

Dependencies

JobFunnel requires Python 3.6 or later.
All dependencies are listed in setup.py, and can be installed automatically with pip when installing JobFunnel.

Installing JobFunnel

pip install git+https://github.com/PaulMcInnis/JobFunnel.git
funnel --help

If you want to develop JobFunnel, you may want to install it in-place:

git clone git@github.com:PaulMcInnis/JobFunnel.git jobfunnel
pip install -e ./jobfunnel
funnel --help

Using JobFunnel

  1. Set your job search preferences in the yaml configuration file (or use -kw).
  2. Run funnel to scrape all available job listings.
  3. Review jobs in the master-list and update each job's status to a value such as interview or offer.
  4. Set the status of any undesired job to archive; these jobs will be removed from the .csv the next time you run funnel.
  5. Check out demo/readme.md if you want to try the demo.

Note: rejected jobs will be filtered out and will disappear from the output .csv.
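
The status-based filtering described above can be sketched in a few lines of Python. This is only an illustration, not JobFunnel's own code: the function name filter_masterlist, the column name status, and the sample rows are assumptions made for the example, and the real master-list may use different column names.

```python
import csv
import io

# Statuses that cause a row to be dropped, taken from the workflow text above.
REMOVE_STATUSES = {"archive", "rejected"}


def filter_masterlist(csv_text, status_column="status"):
    """Return csv_text without rows whose status is archive or rejected.

    The column name "status" is an assumption for this sketch.
    Any other status (new, applied, interview, ...) is preserved.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [
        row for row in reader
        if (row.get(status_column) or "").strip().lower() not in REMOVE_STATUSES
    ]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue()


# Hypothetical sample data, for illustration only.
sample = (
    "status,title,company\n"
    "new,Python Dev,Acme\n"
    "archive,Data Eng,Initech\n"
    "applied,ML Eng,Globex\n"
    "rejected,QA,Umbrella\n"
)
print(filter_masterlist(sample))
```

Custom states such as applied pass through untouched, which matches the behaviour described in the usage notes below.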

Usage Notes

  • Custom Status
    Note that any custom states (e.g. applied) are preserved in the spreadsheet.

  • Running Filters
    To update active filters and see any new jobs going forward, just run funnel again and review the .csv file.

  • Recovering Lost Master-list
    If your master-list is ever deleted, you still have the historic pickle files.
    Simply run funnel --recover to generate a new master-list.

  • Managing Multiple Searches
    You can keep multiple search results across multiple .csv files:

    funnel -kw Python -o python_search
    funnel -kw AI Machine Learning -o ML_search
    
  • Filtering Undesired Companies
    Filter undesired companies by providing your own yaml configuration and adding them to the black list (see JobFunnel/jobfunnel/config/settings.yaml).

  • Automating Searches
    JobFunnel can easily be automated to run nightly with crontab.
    For more information see the crontab document.

  • Reviewing Jobs in Terminal
    You can review the job list in the command line:

    column -s, -t < master_list.csv | less -#2 -N -S
    
  • Saving Duplicates
    You can save removed duplicates in a separate file, which is stored in the same place as your master list:

    funnel --save_dup
    
  • Respectful Delaying
    Respectfully scrape your job posts with our built-in delaying algorithm, which can be configured using a config file (see JobFunnel/jobfunnel/config/settings.yaml) or with command line arguments:

    • -d lets you set your max delay value: funnel -s demo/settings.yaml -kw AI -d 15
    • -r turns on random delaying, with -d controlling the range the random values are drawn from:
      funnel -s demo/settings.yaml -kw AI -r
    • -c specifies converging random delay, an alternative mode of random delay. Random delay must also be turned on for it to work, so proper usage looks something like this:
      funnel -s demo/settings.yaml -kw AI -r -c
    • -md lets you set a minimum delay value:
      funnel -s demo/settings.yaml -d 15 -md 5
    • --fun can be used to set which mathematical function (constant, linear, or sigmoid) is used to calculate delay:
      funnel -s demo/settings.yaml --fun sigmoid
    • --no_delay turns off delaying, but its usage is not recommended.

    To better understand how to configure delaying, check out this Jupyter Notebook breaking down the algorithm step by step with code and visualizations.
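
As a rough illustration of how the delay options above might interact, here is a minimal Python sketch. It is not JobFunnel's implementation: the function name delay_schedule, its parameters, the sigmoid steepness, and the way the random delay is drawn are all assumptions made for this example. The Jupyter Notebook remains the reference for the real algorithm.

```python
import math
import random


def delay_schedule(n, max_delay, min_delay=0.0, fun="sigmoid",
                   use_random=False, seed=None):
    """Return a list of n delays in seconds, one per request.

    Hypothetical sketch: max_delay mirrors -d, min_delay mirrors -md,
    fun mirrors --fun, and use_random mirrors -r.
    """
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    rng = random.Random(seed)
    delays = []
    for i in range(n):
        # x runs from 0.0 (first request) to 1.0 (last request).
        x = i / (n - 1) if n > 1 else 1.0
        if fun == "constant":
            base = max_delay
        elif fun == "linear":
            base = min_delay + (max_delay - min_delay) * x
        elif fun == "sigmoid":
            # Logistic curve centred at x = 0.5; the steepness 10 is arbitrary.
            s = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
            base = min_delay + (max_delay - min_delay) * s
        else:
            raise ValueError(f"unknown delay function: {fun}")
        if use_random:
            # Assumed behaviour: draw uniformly between the minimum and the curve.
            base = rng.uniform(min_delay, base)
        # Clamp so every delay stays within [min_delay, max_delay].
        delays.append(max(min_delay, min(base, max_delay)))
    return delays


print(delay_schedule(5, 15, min_delay=5, fun="linear"))
```

In this sketch the delay grows with the position of the request in the list, so later requests wait longer, and the minimum delay acts as a floor in every mode.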
