Giter VIP home page Giter VIP logo

labonneboite's Introduction

 _       _                            _           _ _
| | __ _| |__   ___  _ __  _ __   ___| |__   ___ (_) |_ ___
| |/ _` | '_ \ / _ \| '_ \| '_ \ / _ \ '_ \ / _ \| | __/ _ \
| | (_| | |_) | (_) | | | | | | |  __/ |_) | (_) | | ||  __/
|_|\__,_|_.__/ \___/|_| |_|_| |_|\___|_.__/ \___/|_|\__\___|

Présentation du projet

La Bonne Boite (LBB) est un service lancé par Pôle emploi pour permettre aux chercheurs d’emploi de cibler plus efficacement leurs candidatures spontanées : l'utilisateur accède à la liste des entreprises à « haut potentiel d'embauche ». Le « potentiel d'embauche » est un indicateur exclusif inventé par Pôle emploi pour prédire le nombre de recrutements (CDI et CDD de plus de un mois) d’une entreprise donnée dans les 6 prochains mois.

En contactant des entreprises à « haut potentiel d'embauche », le chercheur d'emploi concentre ses efforts uniquement sur les entreprises qui sont le plus susceptibles de l'embaucher. La Bonne Boite lui permet ainsi de réduire drastiquement le nombre d'entreprises à contacter et d'être plus efficace dans sa recherche.

Le « potentiel d'embauche » est un indicateur basé une technique d'intelligence artificielle (apprentissage automatique ou "machine learning"), en l'occurence un algorithme de régression. Pour calculer un potentiel d’embauche, La Bonne Boite analyse des millions de recrutements de toutes les entreprises de France depuis plusieurs années.

La Bonne Boite a été déployée en France avec des premiers résultats encourageants, et est en cours de développement pour d'autres pays (Luxembourg).

La Bonne Boite c’est un site web mais aussi une API

La Bonne Boite, on en parle dans la presse

Project overview

La Bonne Boite (LBB) is a service launched by Pole Emploi (french national employment agency) to offer a new way for job seekers to look for a new job. Instead of searching for job offers, the job seeker can look directly for companies that have a high "hiring potential". The "hiring potential" is an algorithm exclusivity created by Pole Emploi that estimates how many contracts a given company is likely to hire in the next 6 months.

By only contacting companies with a high "hiring potential", job seekers can focus their efforts only on companies that are likely to hire them. Instead of targeting every and any company that might potentially be interested by their profile, La Bonne Boite drastically reduces the number of companies a job seeker needs to have in mind when looking for a job.

The "hiring potential" is an indicator based on a machine learning model, in this case a regression. La Bonne Boite processes millions of recrutements of all french companies over years to compute this "hiring potential".

It has already been deployed in France with early results that are very promising. Early development is being made for new countries (Luxembourg).

La Bonne Boite is a web site and an API.

Press Coverage on La Bonne Boite

Install a new development environment

  • Install Ansible:

    • either in your global site-packages, e.g. on macOS: brew install ansible

    • or create an isolated Python environments using virtualenvwrapper:

      $ mkvirtualenv --python=`which python2.7` lbb
      $ workon lbb
      $ pip install ansible==2.3.0.0
      
  • Install VirtualBox

  • Install Vagrant

  • Fetch labonneboite repository git clone https://github.com/StartupPoleEmploi/labonneboite.git

    • ensure your ssh public key has been properly setup

    • note that you'll be using the user git, unlike other services which use nominative usernames

  • cd labonneboite/vagrant

  • vagrant up

Go get coffee and look into the LBB code in the meantime, it's going to take a while.

Most likely, vagrant up will throw some error and not finish the first time, we need to finish the provisioning process by running:

  • vagrant provision

until it does.

Create a settings file

Create a labonneboite/labonneboite/conf/local_settings.py file with the following values (ask your colleagues for some sensitive values):

USER = 'labonneboite'
PASSWORD = 'labonneboite'
DB = 'labonneboite'
DEBUG = True

OFFICE_TABLE = 'etablissements'
MANDRILL_API_KEY = '<set it>'
LOCALE = 'fr_FR'

GLOBAL_STATIC_PATH = '/tmp'

FLASK_SECRET_KEY = 'bbbbbbbbbbbbbbbbbccccccccccccc'
ADMIN_EMAIL = '<set it>'
CONTACT_EMAIL = '<set it>'
FORM_EMAIL = '<set it>'
GMAIL_USER = '<set it>'
GMAIL_PASSWORD = '<set it>'
STAGING_SERVER_URL = 'http://localhost:5000'

API_KEYS = {
    'labonneboite': '1234567890abcdef',
    'emploi_store_dev': '1234567890abcdef',
}

LOG_LEVEL = 'DEBUG'

PEAM_CLIENT_ID = '<set it>'
PEAM_CLIENT_SECRET = '<set it>'
PEAM_AUTH_BASE_URL = '<set it>'
PEAM_API_BASE_URL = '<set it>'
PEAM_USERINFO_URL = '<set it>'

VERSION_PRO_ALLOWED_IPS = ['<set it>', '<set it>', '<set it>']
VERSION_PRO_ALLOWED_EMAILS = ['<set it>', '<set it>', '<set it>']
VERSION_PRO_ALLOWED_EMAIL_SUFFIXES = ['<set it>', '<set it>', '<set it>']
VERSION_PRO_ALLOWED_EMAIL_REGEXPS = ['<set it>', '<set it>', '<set it>']

# Values below are *fake* and should be used in development and test environments only.
# The real values are confidential, stored outside of github repository
# and are only used in production+staging.
SCORE_50_HIRINGS = 10.0
SCORE_60_HIRINGS = 50.0
SCORE_80_HIRINGS = 100.0
SCORE_100_HIRINGS = 500.0

Launch LBB web app

If your vagrant environment is not running, run:

make vagrant_start

Then run:

make serve_web_app

The app is available on port 8090 on host machine. Open a web browser, load http://localhost:8090 and start browsing.

Accessing your local MySQL

To access your local MySQL in your MySQL GUI, for example using Sequel Pro:

  • new connection / select "SSH" tab
  • MySQL host: 127.0.0.1
  • Username: root
  • Password: leave empty
  • Database: labonneboite
  • SSH Host: 127.0.0.1
  • SSH User: vagrant
  • SSH Password: vagrant
  • SSH Port: 2222

You can also access staging and production DBs using a similar way, however with great power comes great responsiblity...

Elasticsearch

Accessing your local Elasticsearch

Vagrant already forward port 9200 from your host to your guest VM.

Simply open http://localhost:9200 in your web browser, or, better, install the chrome extension "Sense".

You can also use curl to explore your cluster.

Examples:

Locally:

# Cluster health check.
curl 'localhost:9200/_cat/health?v'

# List of nodes in the cluster.
curl 'localhost:9200/_cat/nodes?v'

# List of all indexes (indices).
curl 'localhost:9200/_cat/indices?v'

# Get information about one index.
curl 'http://localhost:9200/labonneboite/?pretty'

# Retrieve mapping definitions for an index or type.
curl 'http://localhost:9200/labonneboite/_mapping/?pretty'
curl 'http://localhost:9200/labonneboite/_mapping/office?pretty'

# Search explicitly for documents of a given type within the labonneboite index.
curl 'http://localhost:9200/labonneboite/office/_search?pretty'
curl 'http://localhost:9200/labonneboite/ogr/_search?pretty'
curl 'http://localhost:9200/labonneboite/location/_search?pretty'

DB content in the development environment

Note that we only have data in Metz region.

Any search on another region than Metz will give zero results.

Running tests

We are using Nose.

Tests which can run in development:

$ make test_importer
$ make test_app
$ make test_api
$ make test_integration
$ make test_selenium

Tests which for some reason do not run in dev yet, only jenkins:

$ make test_integration

You can run all tests with:

$ make test_all

About selenium tests: note that your local server must be running for them to pass.

Running scripts

For example create_index:

$ make vagrant_ssh
$ python /srv/lbb/labonneboite/scripts/create_index.py

Running pylint

You can run pylint on the whole project:

$ make pylint_all

Or on a specific python file:

$ make pylint ARGS=labonneboite/web/app.py

We recommend you use a pylint git pre-commit hook:

$ pip install git-pylint-commit-hook
$ vim .git/hooks/pre-commit
#!/bin/bash
# (...) previous content which was already present (e.g. nosetests)
# add the following line at the end of your pre-commit hook file
git-pylint-commit-hook

Debugging

# anywhere in the code
logger.info("message")

# for an interactive debugger, use one of these,
# depending on which place of the code you are

# if you are inside the web app code
raise # then you can use the console on the error page web interface

# if you are inside a test code
from nose.tools import set_trace; set_trace()

# if you are inside a script code (e.g. scripts/create_city_file.py)
# also works inside the web app code
from IPython import embed; embed()
# and/or
import ipdb; ipdb.set_trace()

Process workflow

Here is the workflow of our data processes managed by jenkins:

check_etab => extract_etab => check_dpae => extract_dpae => compute_scores => validate_scores => geocode => populate_flags

Single-ROME vs Multi-ROME search

The company search on the frontend only allows searching for a single ROME (a.k.a. rome_code). However, historically the API allowed for multi-ROME search. This is no longer the case as of mid-2017, and the three reasons why we dropped support for multi-ROME search are:

  • so that frontend and API behaviors are as similar as possible.
  • tailoring search results to the requested rome_code adds some complexity and would be quite difficult to generalize to a multi-ROME search.
  • nobody was actually using API multi-ROME search anyway.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.