This repository includes the slides and some of the notebooks that are used in my Evaluation workshops.
Some of the notebooks require an OpenAI API key.
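For the notebooks that need a key, one way to supply it without pasting it into a cell is sketched below. This is an assumption about setup, not something the notebooks themselves prescribe: `resolve_api_key` is a hypothetical helper name, and the only fixed detail is `OPENAI_API_KEY`, the environment variable the official OpenAI Python client reads by default.

```python
import os
from getpass import getpass


def resolve_api_key(env=None, prompt=getpass):
    """Return an OpenAI API key from the environment, or ask for it.

    `resolve_api_key` is a hypothetical helper, not part of any library.
    `env` and `prompt` are injectable so the function can be exercised
    without touching the real environment or blocking on input.
    """
    env = os.environ if env is None else env
    # Prefer a key that is already exported in the environment.
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fall back to a hidden prompt so the key never appears in the notebook.
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key provided")
    return key


# Typical use at the top of a notebook (left commented out because it may prompt):
# os.environ["OPENAI_API_KEY"] = resolve_api_key()
```

Using a hidden prompt as the fallback keeps the key out of saved notebook outputs, which matters if the notebook is shared afterwards.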
These notebooks are intended to illustrate key points of the talk, so please don't use them in production. If you want to dig deeper or run into issues, go to the source for each of these projects.
Prompting a Chatbot: Colab notebook
Testing Properties of a System: Guidance AI
Langtest tutorials from John Snow Labs: Colab Notebooks
LLM Evaluation Harness from EleutherAI: GitHub or Colab notebook
Ragas showing Model as an evaluator: GitHub or Colab notebook
Evaluate LLMs and RAG, a practical example using Langchain and Hugging Face: GitHub
Argilla for Annotation: Spaces (login: admin, password: 12345678)
Generative AI Summit, Austin (Oct 2023) - Slides
ODSC West, San Francisco (Nov 2023) - Slides
Josh Tobin's Evaluation talk on YouTube
Mahesh Deshwal's LLM Evaluation Google Doc