Resume Classifier

This project is a resume classification system built on a fine-tuned BERT model. The model assigns each resume to one of a set of predefined job categories and is trained on a dataset of more than 2,400 resumes. The project also includes a Streamlit web application for uploading a resume in PDF format and predicting its job category.

Introduction

Accurate classification of resumes is crucial for efficient talent acquisition and human resource management. This project leverages a fine-tuned BERT model to classify resumes into specified job categories, providing a scalable solution integrated with a user-friendly web application.

Features

  • PDF Resume Upload: Upload resumes in PDF format.
  • Resume Classification: Predict job categories using a fine-tuned BERT model.
  • Web Application: Streamlit-based interface for easy interaction.
  • Real-time Processing: Handle resume uploads and provide instant classification results.

Installation

Follow these steps to set up the project locally:

  1. Clone the repository::
    git clone https://github.com/dvtushar/Resume-Classifier.git
    cd Resume-Classifier
  2. Create and activate a virtual environment::
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install the required packages::
    pip install -r requirements.txt

Usage

To run the Streamlit web application:

  1. Activate the virtual environment::
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Start the Streamlit application::
    streamlit run app.py --server.enableXsrfProtection false
  3. Upload a PDF resume: open http://localhost:8501 in your browser, upload a PDF resume, and view the predicted job category.
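
Before an uploaded resume can be classified, its PDF has to be turned into plain text. The sketch below shows one way that step might look with PyPDF2, the library the training pipeline uses for the same conversion. The function names `pdf_to_text` and `normalize_whitespace` are hypothetical and are not taken from `app.py`; the whitespace cleanup is an assumption, not a documented step of this project.

```python
import re
from typing import BinaryIO


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def pdf_to_text(stream: BinaryIO) -> str:
    """Extract the text of every page from a PDF file-like object.

    Hypothetical helper: PyPDF2 is imported lazily so that
    normalize_whitespace stays usable without PyPDF2 installed.
    """
    from PyPDF2 import PdfReader  # assumes PyPDF2 >= 2.0

    reader = PdfReader(stream)
    # extract_text() can yield nothing for image-only pages, hence the fallback.
    pages = [page.extract_text() or "" for page in reader.pages]
    return normalize_whitespace(" ".join(pages))
```

A file opened with `open("resume.pdf", "rb")` can be passed directly. The object returned by Streamlit's `st.file_uploader` is also file-like, so it should work the same way.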

Model Training and Evaluation

The model training process involves the following steps:

  1. Data Preparation:
     • Convert PDFs to text using the PyPDF2 library.
     • Encode job categories using a label encoder.
     • Tokenize the text data using the BERT tokenizer.
  2. Fine-tuning BERT:
     • Use Hugging Face's Transformers library to fine-tune the BERT model on the dataset.
     • Use the AdamW optimizer and a learning rate scheduler.
  3. Training Loop:
     • Train the model for 3 epochs with gradient accumulation.
     • Evaluate the model on validation and test datasets.
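
Gradient accumulation, mentioned in the training loop above, can be illustrated without any deep learning framework. The sketch below is not the project's training code: it fits a one-parameter model `y = w * x` with plain gradient descent instead of BERT with AdamW, and the names `mse_grad` and `accumulated_step` are hypothetical. It only shows the core idea: gradients from several small micro-batches are averaged, and the weights are updated once.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def mse_grad(w: float, batch: Sequence[Sample]) -> float:
    """Gradient of the mean squared error of the model w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(
    w: float, micro_batches: Sequence[Sequence[Sample]], lr: float
) -> float:
    """Perform one weight update built from several micro-batches."""
    k = len(micro_batches)
    grad = 0.0
    for batch in micro_batches:
        # Scale each micro-batch gradient by 1/k so the sum is an average.
        grad += mse_grad(w, batch) / k
    # Single update after all micro-batches have contributed.
    return w - lr * grad
```

When the micro-batches have equal size, the accumulated step matches a single step on the combined batch, which is why the technique lets a large effective batch fit in limited memory. In a typical PyTorch loop the same pattern appears as dividing the loss by `k` before `backward()` and calling the optimizer only every `k` batches.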

Results

The model achieved the following performance metrics:

  • Validation Accuracy: 0.7882
  • Validation F1 Score: 0.79
  • Test Accuracy: 0.7828
  • Test F1 Score: 0.78
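
For reference, accuracy and F1 can be computed from true and predicted labels as sketched below. The source does not say which F1 averaging was used, so both macro and support-weighted averages are shown as possibilities. All function names are hypothetical and none of this is taken from the project's evaluation code.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Per-class F1, computed as 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Mean of the per-class F1 scores, weighted by each class's true count."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)
```

Macro averaging treats every job category equally, while weighted averaging favors the categories with more resumes; on an imbalanced dataset the two can differ noticeably.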

The Streamlit web application (app.py) allows users to upload resumes and get predictions for job categories.
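
To show a category name rather than a class index, the model's raw output scores have to be mapped back through the label encoding. The sketch below is an assumption about how that step could work, not the code in `app.py`; the helper names are hypothetical. The encoder assigns ids in sorted order, which mirrors what scikit-learn's `LabelEncoder` does.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fit_label_encoder(categories: Sequence[str]) -> Tuple[Dict[str, int], List[str]]:
    """Map each distinct category to an integer id, in sorted order."""
    classes = sorted(set(categories))
    return {name: idx for idx, name in enumerate(classes)}, classes


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(
    logits: Sequence[float], classes: Sequence[str]
) -> Tuple[str, float]:
    """Return the most likely category name and its probability."""
    if len(logits) != len(classes):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], probs[best]
```

The `classes` list must be the one produced when the training labels were encoded; otherwise the indices will point at the wrong category names.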


Contributors

  • dvtushar
