Hi there!

I'm Shubham, a Data Scientist and Generative AI enthusiast with a Master's degree from the University of Surrey, UK. Welcome to my GitHub portfolio! I'm an AI researcher specializing in NLP, computer vision, and speech processing, with a particular focus on large language models (LLMs). I build solutions in Python, TensorFlow, and PyTorch, implementing state-of-the-art algorithms and models, and my MLOps experience covers deploying and managing machine learning workflows. Explore my projects across NLP, computer vision, and speech processing, and let's collaborate on turning ideas into impactful solutions! See my full resume here: Link

Jump to Projects

🚀 Skills & Expertise:

Python Java MySQL Android

Python-Libraries & Frameworks:

Tensorflow Pytorch Keras OpenCV Flask Pandas scikit-learn NumPy

Deployment Tools:

Docker Amazon AWS Microsoft Azure Git GitHub Huggingface Gradio

Tools:

Pycharm Google Colab Android Studio Jupyter Postman Firebase Ubuntu Azure Data Studio Unity

💡 Projects:

My projects span multiple modalities and are grouped into NLP, Computer Vision, Speech, and Machine Learning.

🖼️ Computer Vision

  • Advanced Sparse-View CT Denoising: Applied MIRNet and GAN-based algorithms to correct sparse-view CT scans via image-to-image translation, trained on the University's HPC (Condor system).
  • Comparative Study: Trained Pix2Pix GANs with varied training methods and compared the resulting image generation techniques through quantitative and qualitative analysis.
  • Publication Recognition: Abstract accepted at ICMLMI (International Conference of Machine Learning in Medical Imaging), London, 2023. Link
  • Currency Prediction App: Engineered a TensorFlow-based application predicting Indian currency (85%) via pre-trained EfficientDet-Lite0 and Cloth Recognition models (76% on Cloth Patterns). Integrated TF Lite models into an Android app.
  • API Integration & Publication: Integrated ML-Kit's Object Detection, Handwritten Text Recognition APIs, and Google’s TextToSpeech API. Research on the project published in Dickensian Journal.

Vehicle Re-Identification: (TensorFlow)

  • Fine-tuned pre-trained models such as EfficientNet, ResNet-50, and MobileNet for vehicle re-identification via transfer learning.
  • Conducted extensive data pre-processing and augmentation, including digital image warping and rotations, to enhance dataset quality.
  • Ran hyperparameter tuning and designed experiments, drawing on data manipulation and problem-solving skills.
  • Supervised Scene Classification: Adapted a ResNet-34 CNN to the "Places2 simp" dataset (40,000 128x128 images, 40 categories), achieving ≥45% validation accuracy and ≥75% top-5 accuracy.
  • Training and Validation Optimization: Tuned hyperparameters to improve performance, validated with confusion matrices and top-5 scores on the test set.
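
To make the top-5 metric quoted above concrete, here is a minimal sketch in plain Python. It is not the project's evaluation code: the function name `top_k_accuracy` and the toy scores are hypothetical, chosen only to show how top-k accuracy differs from ordinary (top-1) accuracy.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first, truncated to k entries.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy example (made-up numbers): 3 samples, 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.1, 0.1, 0.5],
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, k=1)  # only the first sample is a hit
top2 = top_k_accuracy(scores, labels, k=2)  # every true label is in the top two
```

A model can score much higher on top-k than on top-1 when the correct class is usually ranked near the top but not first, which is the pattern behind the two accuracy figures reported for scene classification.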

📖 Natural Language Processing

  • Constructed a multi-class classifier prototype utilizing the GoEmotions dataset consisting of 197,847 labeled Reddit comments.
  • Conducted four experiments involving preprocessing, N-gram analysis, and sentiment analysis using bidirectional LSTM models, with test accuracies ranging from 39.58% to 59.26%. Additionally, a CNN-LSTM architecture reached 41.09% test accuracy after 20 epochs.
  • Deployed the prototype on Hugging Face, built a web app with Gradio, ran extensive API tests, and integrated a CI/CD pipeline for continuous deployment and delivery.

📄 PDF Retrieval using OpenAI LLM: (RAG)

  • Used an OpenAI large language model (LLM) and the LangChain library to build a PDF retrieval system.
  • Integrated the Pinecone vector database to store and retrieve documents, improving search efficiency and accuracy.
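
The retrieval step at the heart of a RAG system can be sketched without any external service. The example below is an illustration only: it does not use the LangChain or Pinecone APIs, and the chunk ids, the three-dimensional "embeddings", and the `retrieve` helper are all hypothetical stand-ins for what an embedding model and a vector database would provide.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, top_k=2):
    """Return the ids of the top_k stored chunks most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:top_k]]


# Hypothetical index: chunk id -> toy embedding (a real system would use
# model-generated embeddings with hundreds of dimensions).
index = {
    "page1-chunk0": [0.9, 0.1, 0.0],
    "page1-chunk1": [0.1, 0.8, 0.1],
    "page2-chunk0": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]
results = retrieve(query, index, top_k=2)
```

In a full pipeline, the retrieved chunks would be passed to the LLM as context for answering the user's question; a vector database replaces the linear scan here with an approximate nearest-neighbour search.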

💬 Q&A Chatbot based on LLaMA:

  • Fine-tuned the 7B-parameter LLaMA model for a supervised Q&A chatbot.
  • Implemented PEFT (Parameter-Efficient Fine-Tuning) with a LoRA-based approach to speed up training, improve results, and reduce computational demands.
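
A rough sense of why a LoRA-based approach reduces computational demands comes from counting trainable parameters. LoRA keeps a weight matrix W (d_out x d_in) frozen and trains only a low-rank update B·A, with B of shape d_out x r and A of shape r x d_in. The sketch below is arithmetic only, not the project's training code; the layer size and rank are illustrative assumptions.

```python
def full_params(d_out, d_in):
    """Parameters updated when fine-tuning the whole weight matrix."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative assumption: one square 4096 x 4096 projection and rank 8.
d = 4096
r = 8

full = full_params(d, d)               # 16,777,216
lora = lora_trainable_params(d, d, r)  # 65,536
ratio = lora / full                    # 0.00390625, i.e. under 0.4% of the layer
```

Because only the small factors receive gradients and optimizer state, memory use during fine-tuning drops sharply, while the frozen base weights are shared unchanged.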

Machine Learning & End-to-End Pipelines

  • ML Pipeline Development: Engineered machine learning pipelines encompassing feature engineering and hyperparameter tuning across multiple algorithms.
  • Azure Deployment: Deployed the pipeline on Microsoft Azure with a CI/CD workflow, using a Dockerized Flask web application.

👩 Sport Celebrity Classification:

  • Technologies: Web scraping for data collection; OpenCV for image processing; NumPy, Pandas, and PyWavelets for data manipulation; and Matplotlib for visualizing data and model performance.
  • Model Development & Deployment: Experimented with various models and parameters using GridSearchCV, achieving accuracies of 78% (logistic regression), 76% (SVM), and 70% (random forest). Developed a Flask web app to showcase the classification model, exhibiting skills in web scraping, image processing, data cleaning, optimization, and web development for ML applications.

〽️ Stock Market Analysis (Time Series)

  • Applied time-series techniques to the stocks of top companies, i.e. Apple, Microsoft, and Amazon.
  • Used an LSTM-based network alongside other machine learning algorithms for this regression problem, with qualitative analysis.
  • Processed the continuous dataset and applied regression-based machine learning models.
  • Carried out feature engineering, selection, and extraction on the dataset.
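
Framing a price series as a regression problem for an LSTM usually starts with a sliding window: each input is a run of past values and the target is the value that follows. The sketch below shows that framing in plain Python; the helper name `make_windows`, the window length, and the prices are hypothetical and not taken from the project.

```python
def make_windows(series, window_size):
    """Split a series into (inputs, targets): each input holds window_size
    consecutive values and the target is the value immediately after them."""
    if window_size < 1 or window_size >= len(series):
        raise ValueError("window_size must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for start in range(len(series) - window_size):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


# Made-up closing prices, window of 3 days.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
X, y = make_windows(prices, 3)
# X holds [10, 11, 12] and [11, 12, 13]; y holds the next values 13 and 14.
```

The same input/target pairs can feed either a sequence model or a classical regressor, which makes it easy to compare the two families on identical data.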

🗣️ Speech (Audio)

❤️ Heart Murmur Detection:

  • Heart Murmur Disease Detection: Detected heart murmurs from real patient audio samples using digital signal processing techniques via the Librosa library.
  • EDA and Machine Learning: Applied Exploratory Data Analysis (EDA) techniques, feature engineering, and selection methods on Mel-Spectrograms and other physiological data. Utilized various machine learning models such as SVM, Random Forest Classifier, and Naïve Bayes Classifier.

📱Android Projects

  • Currency Prediction App: Engineered a TensorFlow-based application predicting Indian currency (85%) via pre-trained EfficientDet-Lite0 and Cloth Recognition models (76% on Cloth Patterns). Integrated TF Lite models into an Android app.
  • API Integration & Publication: Integrated ML-Kit's Object Detection, Handwritten Text Recognition APIs, and Google’s TextToSpeech API. Research on the project published in Dickensian Journal.
  • Tailor App: A marketplace application for customizing and shopping for fashion wear.
  • Integrates Firebase Authentication, Firebase Database, and Storage.
  • Uses the Razorpay API as the online payment gateway for digital payments in India.
  • Task Scheduler App: An Android application in which tasks can be scheduled and handled according to their priority.
  • It uses a Room Database to store all tasks and set their due dates.
  • In essence, it is a to-do app that saves user input to the Room Database.
  • The project focuses on the Room Database and on concepts such as handling listener events across activities and fragments, enum classes, and the ViewModel class for transferring data across activities. It supports data manipulation and basic CRUD (Create, Read, Update, Delete) operations.

Data Analytics Project

  • A Power BI dashboard with analytics of current trends and an accompanying attendance report, created for hybrid work culture analysis.

📚 Certificates & Awards:

  • Master’s dissertation project nominated for the Electronic Engineering Industrial Advisory Board MSc Project Prize by the University of Surrey (results awaited).
  • "Excellence in AI Collaboration" Award: Recognized within SetSquare Surrey's Entrepreneurship programme (IKEEP and ITeK) for exceptional industry collaboration, effective completion of duties within the cohort, and excellent communication.
  • Runner-up in the GenAI hackathon for a comprehensive solution addressing the use of Generative AI in assessments.

📫 Get in Touch:

Linkedin Badge Gmail Badge

🔭 Always exploring new technologies and excited to collaborate on innovative projects!

Shubham's Projects

car-price-prediction

An ML model that uses a decision tree to predict used-car prices from a selection of features.

emotionclassification

GoEmotions emotion classification with hyperparameter tuning, deployed on Hugging Face with a Gradio UI.

journal-app

An application that saves memorable events. It uses Firebase Authentication, Firestore Database, and Firestore Storage to save the title, thoughts, and an image for each event. It is also backed by IBM Cloud's Natural Language Understanding API, which returns the parts of speech, sentiment, emotion, and entities for the paragraph in the thoughts section.

smile-detection-

Smile Detection is an Android application powered by Firebase ML Kit that detects faces and estimates the person's smiling probability.

tailorapp

Tailor App to customize and shop for fashion wear.

task-scheduler-app

Task Scheduler App is an Android application in which tasks can be scheduled and handled according to their priority. It uses a Room Database to store all tasks and set their due dates.

video-streamer

Video Streamer is an application for streaming videos in the app.
