crypto_research's Introduction

🚀 Welcome to my GitHub page

---

Hello! 🖐️ I am a Senior AI & Data Science Engineer with 5 years of experience. I excel at designing and implementing data-driven models tailored to client needs.

🔍 Areas of Passion:

  • Advanced NLP techniques including:
    • LLMs 🤖
    • QLoRA 🦄
    • LoRA 🦜
    • RAG (retrieval-augmented generation) 🕸️
  • 🤿 Currently, I'm immersed in exploring augmented data generation techniques for NLP tasks.
  • 💬 Got queries about NLP, AI, or machine learning? Don't hesitate to ask! 🧠

🧠 Expertise Areas

  • Large Language Models (LLMs) 🤖
  • MLE | MLOps 🚀
  • Data Analysis 📊
  • Natural Language Processing 📝
  • Dashboard Development 📈
  • Business Intelligence 📉
  • Data Management & Transformation ⚙️
  • Machine Learning 🦾
  • Deep Learning 🧬
  • Data Visualization 🖼

Whether you're looking to tell a compelling story with your data, develop a real-time dashboard with KPIs for monitoring your company's health, or explore natural language processing solutions 🗣, I can assist you in your next venture.

Over the years, I've successfully managed projects worth over €1M across 12+ countries 🌍 for major clients including the European Parliament, Kering, Atos, Renault Nissan Mitsubishi, Damart, and more. I wear multiple hats as a 🧑‍🔬 Data Scientist, 📊 Data Analyst, and 🧑‍💼 Project Manager with both functional and managerial expertise.

🏆 Certifications

🏆 Hackathons

🏆 1st Prize "Hack to Act" Kering

Location: Paris (October 2020)
Prize: €10,000

  • Objective: Development of a prediction/recommendation platform based on AI.
  • Description:
    • Prediction of the environmental impact of Kering's various activities throughout the supply chain.
    • Accurate evaluation of environmental impacts: resource depletion, biodiversity, greenhouse gases.
    • Decision support for designers, material researchers, and consumers in their choices to reduce the impact of luxury.
    • Automatic creation of predictive models from user-provided data or directly from Kering data and integrated native models.
    • Recommendation techniques: collaborative filtering, content-based filtering, and a hybrid of the two.
  • Results: The team won 1st place for the best platform for predicting the environmental footprint of Kering's products.
  • References:

🏆 1st Prize Accenture Hackathon

Location: Paris (February 2019)

🧑‍🎨 My Recent Roles

📊 Senior Data Scientist | LLM expert - TotalEnergies, Paris

Objective: SQL Chatbot for Database Management

  • Led the development of an advanced SQL chatbot to enhance database querying and data visualization using NLP and LLMs.
  • Architected the SQL chatbot leveraging LangChain and OpenAI's GPT-4, enabling intuitive data visualizations and command translations.
  • Enhanced model efficiency & performance using LLMs.
  • Designed a full-stack solution hosted on Azure SQL Database, integrating Azure Bot Services and Azure Language Understanding (LUIS) for a dynamic user interface.
  • Optimized model performance with hyperparameter tuning.
  • Adapted the model to cater to different building types.
  • Result: Enhanced model efficiency & performance using advanced NLP techniques & LLMs.
  • Technical Stack: Python, Azure, PostgreSQL, LangChain, Streamlit, GitLab, Azure DevOps
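
A chatbot that turns questions into SQL typically needs a check on the generated statement before it reaches the database. The sketch below shows one naive way to do that: accept only a single SELECT statement. It is an illustration under stated assumptions, not the project's implementation: the function names, the table, and the sample query are hypothetical, in-memory SQLite stands in for the real database, and a string check like this does not replace read-only database permissions.

```python
import sqlite3


def is_read_only_query(sql):
    """Accept a single SELECT statement and reject anything else."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement (conservative check)
        return False
    return statement.lower().startswith("select")


def run_generated_sql(conn, sql):
    """Execute model-generated SQL only if it passes the read-only check."""
    if not is_read_only_query(sql):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# In-memory SQLite stands in for the real database (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (name TEXT, output_mw REAL)")
conn.executemany("INSERT INTO sites VALUES (?, ?)", [("A", 12.5), ("B", 30.0)])

# Hypothetical SQL that a language model might return for "total output".
rows = run_generated_sql(conn, "SELECT SUM(output_mw) FROM sites;")
print(rows)
```

The check is deliberately strict: a semicolon inside a string literal is also rejected, which errs on the side of refusing a valid query rather than running an unsafe one.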

🧑‍💼 IFACI – Expert LLM

Objective: Generative AI Assistant for the Auditing Profession

  • Role: Generative AI Assistant for the Auditing Profession
  • Developed a generative AI base and an assistant for field agents, enhancing natural language understanding capabilities using spaCy.
  • Utilized a combination of Azure, LanceDB, RAG, Vector Store, HNSW, and Hybrid search technologies to optimize performance.
  • Result: Improved efficiency and accuracy in the auditing process through the implementation of advanced AI techniques.
  • Technical Stack: Python, Azure, Neo4j, Azure DevOps, LanceDB, Chroma, Milvus, MLflow, HNSW, Hybrid search
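
As a rough illustration of the hybrid search mentioned above, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is one common fusion method and is an assumption here, not necessarily what the project used; the function name, the document IDs, and the constant `k=60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword search and a vector (HNSW) search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents found by both retrievers rise to the top, while documents found by only one still appear further down, which is the usual motivation for combining the two.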

🧑‍💼 GSF – Senior Data Scientist / MLE Architect

Objective: Predictive Maintenance for Cleaning Services

  • Role: Workplace Accident Prediction + Explainability
  • Spearheaded the implementation of CI/CD pipelines and developed a system for the evaluation of prediction explainability.
  • Utilized Azure Machine Learning, Azure Data Factory, Azure Pipelines, Azure DevOps, and integrated with Snowflake and Control-M for workflow management.
  • Result: Improved workplace safety through accurate accident prediction and enhanced model explainability.
  • Technical Stack: Python, Azure, MLflow, Databricks, Pandas, Terraform, TensorFlow, Scikit-learn, PyTest, Docker, Azure Machine Learning, Azure Data Factory, Azure Pipelines, Azure DevOps, Snowflake, Control-M

📧 Expert NLP LLM / Senior Data Scientist - THUASNE

Objective: Email Order System (1K orders/day)

  • Created an email order management system and a multimodal model for information extraction employing BERT, Azure, ChatGPT, NLP, LLMs, and Melusine.
  • Enhanced the system with Azure Document AI for advanced document processing.
  • Improved order processing efficiency & anomaly detection.
  • Used explainability tools like LIMETextExplainer, ELI5NLP, SHAP, and AnchorsNLP.
  • Optimized system performance with hyperparameter tuning.
  • Adapted the system to cater to different orthopedic domains.
  • Result: Enhanced system efficiency & performance using advanced NLP techniques, LLMs, and the Melusine tool.
  • Technical Stack: BERT, Azure, OpenAI, NLP, LLMs, Melusine, Azure Document AI, Scikit-learn, Docker

📨 Senior Data Scientist - ADELAIDE

Objective: Automatic Email Processing (10K emails/day)

  • Developed an explainability module for email classification and automatic responses using open-source tools such as Melusine, LIMETextExplainer, ELI5NLP, SHAP, AnchorsNLP, and integrations with Hugging Face and RASA.
  • Managed version control and continuous integration using Git and CI/CD practices.
  • Result: Streamlined email processing and improved response accuracy through the implementation of advanced NLP techniques and explainability tools.
  • Technical Stack: Melusine, LIMETextExplainer, CNN, ELI5NLP, SHAP, AnchorsNLP, Hugging Face, RASA, Git, CI/CD

🕵️‍♂️ Expert LLM - ACOSS/URSSAF/CNAF/CNAM

Objective: Documentary AI for Social Fraud Prevention

  • Developed a demonstrator for multimodal processing of large data volumes using Transformers, LayoutLM, OCR, NLP, and Topic Modeling to detect fraud.
  • Result: Enhanced fraud detection capabilities through the implementation of advanced AI techniques for multimodal data processing.
  • Technical Stack: Transformers, OpenCV, PyTorch, CNN, LayoutLM, OCR, NLP, Topic Modeling

🕵️‍♀️ Senior Data Scientist - Quantmetry, Paris

Objective: Documentary AI Fraud Demonstrator

  • Designed a demonstrator for document processing (insurance invoices).
  • Detected fraudulent patterns & document falsification.
  • Extracted key invoice fields & verified their consistency.
  • Identified potentially suspicious overbilling cases.
  • Result: Significant improvement in fraud detection using AI, outperforming traditional OCR techniques.
  • Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker
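
One simple form of the consistency check on extracted invoice fields is to verify that the line items add up to the stated total. The sketch below assumes a hypothetical `Invoice` structure and a tolerance parameter; the field names and example amounts are illustrative and not taken from the project.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Invoice:
    """Hypothetical container for fields extracted from one invoice."""
    line_amounts: list  # amounts of the individual line items, as Decimal
    stated_total: Decimal  # total printed on the document


def total_is_consistent(invoice, tolerance=Decimal("0.01")):
    """Return True when the line items sum to the stated total within tolerance."""
    computed = sum(invoice.line_amounts, Decimal("0"))
    return abs(computed - invoice.stated_total) <= tolerance


# Made-up invoices: the second states a total higher than its line items.
ok = Invoice([Decimal("120.00"), Decimal("35.50")], Decimal("155.50"))
suspicious = Invoice([Decimal("120.00"), Decimal("35.50")], Decimal("185.50"))
print(total_is_consistent(ok), total_is_consistent(suspicious))
```

`Decimal` is used instead of floats so that currency amounts compare exactly; an invoice that fails the check would only be a candidate for review, not proof of overbilling.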

🚗 Senior Data Scientist - Stellantis, Paris

Objective: Part Forecasting (PFO) – Technical Lead / Technical Expert

  • Provided 18-month forecasts to suppliers, mitigating semiconductor crisis impact.
  • Developed PFO architecture as a Streamlit web app hosted in Azure.
  • Integrated data from Oracle Exadata Database.
  • Used Azure Data Factory for file transfer & processing tasks.
  • Containerized the PFO app using Docker & deployed to Azure Container Registry.
  • Result: Enhanced inventory management & supplier collaboration, improving part prediction accuracy.
  • Technical Stack: Streamlit, MLflow, Airflow, Terraform, PyTest, Databricks, Azure, Oracle Exadata Database, Azure Data Factory, Docker, Azure Container Registry

💸 Project Manager / Technical Expert - ATOS, Grenoble

Objective: Travel & Expense Dashboard Atos (€10M+/year)

  • Developed a KPI dashboard to monitor Atos' expenses in real time, with geolocation and the carbon footprint of travel.
  • Recovered over 10% more VAT and billable expense reports (annual gain of over €1 million).
  • Saved more than 3,214 hours of work per year.
  • Result: Automation of weekly reports, significant cost savings, and improved efficiency through real-time expense monitoring and analysis.
  • Technical Stack: Pandas, Matplotlib, Numpy, Scikit-learn, Jupyter Notebook, Power BI

🗣️ Lead Data Scientist - ATOS, Grenoble

Objective: R&D – Expressive TTS System

  • Collected and adapted a large corpus of interactive behaviors in English (LJ Speech) and French (M-AILABS).
  • Developed and trained an expressive TTS system based on the Tacotron2 model by NVIDIA.
  • Implemented a methodology for evaluating the learning quality of the prototype based on the distribution of lengths (number of spectrogram frames) of the predicted clips compared to the originals.
  • Prepared a scientific paper: "Linking Utterances via Punctuations for Improved End-to-End Speech Synthesis".
  • Captured the variability of speaking styles and emotional states and adapted their synthesis to the user profile, so that the text-to-speech (TTS) system predicts speech better suited to each user.
  • Result: Improved robustness and accuracy of TTS mel-spectrogram generation, and user-dependent control and generation of styles, verbal behaviors, and prosody.
  • Technical Stack: PyTorch, TensorFlow, Python, LSTM, Transformers, Attention Mechanism
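
The length-based evaluation described above can be sketched as follows: compare the number of spectrogram frames in each predicted clip with its original and summarize the ratios. This is a minimal sketch; the function name, the choice of summary statistics, and the sample frame counts are assumptions, not the project's actual code or data.

```python
from statistics import mean, median, pstdev


def length_ratio_stats(original_frames, predicted_frames):
    """Summarize predicted/original frame-count ratios for paired clips.

    A ratio near 1.0 means the predicted clip is about as long as the
    original; large deviations can indicate skipped or repeated words.
    """
    if len(original_frames) != len(predicted_frames):
        raise ValueError("frame-count lists must be paired one to one")
    if not original_frames:
        raise ValueError("at least one clip is required")
    ratios = [p / o for o, p in zip(original_frames, predicted_frames)]
    return {
        "mean": mean(ratios),
        "median": median(ratios),
        "std": pstdev(ratios),
        "max_abs_dev": max(abs(r - 1.0) for r in ratios),
    }


# Made-up frame counts for three clips (illustrative only).
stats = length_ratio_stats([400, 500, 800], [400, 550, 720])
print(stats)
```

Tracking these statistics across training checkpoints gives a cheap signal of alignment quality without listening to every clip.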

🚗 Data Scientist - Renault Nissan Mitsubishi, Paris

Objective: Automotive Industry – Home to Car Next Generation Alliance

  • Built prototypes and developed the first generation of voice assistants for the Renault Nissan Mitsubishi alliance.
  • Integrated the Google Assistant with Nissan cars to receive information from the car and control it remotely from your phone or from a Google Home.
  • Connected to the authentication servers of the RNM Alliance and complied with cybersecurity specifications.
  • Deployed Alexa and Google Actions project environments fully configured and ready to use.
  • Documented user journey to configure the service.
  • Result: Launched these features with the Nissan Juke at the 2019 Frankfurt Motor Show.
  • Technical Stack: Python, Azure, TensorFlow, Keras, Dialogflow, LUIS, Reddit, Alexa Skill, Bot Framework

🏨 Data Scientist - ATOS (European Parliament), Grenoble

Objective: Service – SAMBOT, an intelligent conversational agent for room reservation (3,000+ users)

  • Developed a multilingual chatbot for room reservation in natural language (text and voice).
  • Implemented a recommendation system based on user habits, locations, and room occupancy.
  • Paired with Outlook calendars & email systems (Skype).
  • Documented functional, technical, and user journey aspects.
  • Result: Completed room reservations in record time, taking user habits into account.
  • Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker

👨‍💻 Languages

Python, PySpark, R, C++

☁️ Cloud Services

Microsoft Azure, AWS, Google Cloud

🐳 Containers & Orchestration

Docker, Kubernetes

🤖 Machine Learning

TensorFlow, PyTorch, Scikit-learn

🌐 NLP

NLTK, spaCy, Hugging Face, Lambda, ChatGPT, Llama

📊 Data Science

Jupyter Notebook, LlamaIndex, NumPy, Pandas, Matplotlib, TensorFlow, PyTorch, Scikit-learn, OpenCV, Power BI, Tableau, Databricks, Qlik, SAS, Microsoft Azure, Google Cloud

🌐 Web Development

Flask, Django

🔧 Infrastructure

Terraform, GitLab, Jenkins, Kafka, Linux

🗃️ Databases

MySQL, PostgreSQL, MongoDB, Microsoft SQL Server, Oracle

🛠️ Engineering

Git, MLflow, Terraform, Ansible, Puppet, Chef

📈 Big Data

Apache Kafka, Apache Spark, Apache Hadoop, Apache Cassandra, Apache Hive, Apache Storm, Apache NiFi

📋 Functional/Managerial Skills 📋

  • Project Management 📅
    • Preparation 📝
    • Planning 🗓️
    • Management 📊
    • Evaluation 📈
    • Monitoring and Control of:
      • Resources 💰
      • Calendar 📆
      • Costs 💸
      • Scope 🎯
      • Risk 🚨
      • Quality 🌟
      • Requirements 📑
      • Value 💎
      • Satisfaction 😊
  • Tools: TFS, MS Project, Gantt, PERT

GitHub repositories stats

Mathematicator

Let's connect and collaborate! 🚀
