Giter VIP home page Giter VIP logo

ai-powered-search's Introduction

AI-Powered Search

Code examples for the book AI-Powered Search by Trey Grainger, Doug Turnbull, and Max Irwin. Published by Manning Publications.

Book Overview

AI-Powered Search teaches you the latest machine learning techniques to build search engines that continuously learn from your users and your content to drive more domain-aware and intelligent search.

Search engine technology is rapidly evolving, with Artificial Intelligence (AI) driving much of that innovation. Crowdsourced relevance and the integration of large language models (LLMs) like GPT and other foundation models are massively accelerating the capabilities and expectations of search technology.

AI-Powered Search will teach you modern, data-science-driven search techniques like:

  • Semantic search using dense vector embeddings from foundation models
  • Retrieval Augmented Generation
  • Question answering and summarization combining search and LLMs
  • Fine-tuning transformer-based LLMs
  • Personalized search based on user signals and vector embeddings
  • Collecting user behavioral signals and building signals boosting models
  • Semantic knowledge graphs for domain-specific learning
  • Implementing machine-learned ranking models (learning to rank)
  • Building click models to automate machine-learned ranking
  • Generative search, hybrid search, and the search frontier

Today’s search engines are expected to be smart, understanding the nuances of natural language queries, as well as each user’s preferences and context. This book empowers you to build search engines that take advantage of user interactions and the hidden semantic relationships in your content to automatically deliver better, more relevant search experiences.

How to run

For simplicity of setup, all code is shipped in Jupyter Notebooks and packaged in Docker containers. This means that installing Docker and then pulling (or building) and running the book's Docker containers is the only necessary setup.

Appendix A of the book provides full step-by-step instructions for running the code examples. To get up and running quickly, however, pull the book's source code and run:

cd docker
docker-compose up

Once the containers are built and running (may take a while, especially on the first build), visit: http://localhost:8888 to pull up all the Jupyter notebooks and run the code examples live.

Supported Technologies

AI-Powered Search teaches many modern search techniques leveraging machine learning approaches. While we utilize specific technologies to demonstrate concepts, most techniques are applicable to many modern search engines and vector databases.

Throughout the book, all code examples are in Python, with PySpark (the Python interface to Apache Spark) being utilized heavily for data processing tasks. The default search engine leveraged by the book's examples is Apache Solr, but most examples are abstracted away from the particular search engine, and swappable implementation will be soon available for the most popular search engines and vector databases.

[ Note: if you work for a search engine / vector database company, project, or hosting provider and want to work with us on getting your engine supported, please reach out to [email protected] ]

Questions and help

Your purchase of AI-Powered Search includes online access to Manning's LiveBook forum. This allows you to provide comments and ask questions about any parts of the book. Additionally, feel free to submit pull requests, Github issues, or comments on the project's official Github repo at https://github.com/treygrainger/ai-powered-search.

License

All code in this repository is open source under the Apache License, Version 2.0 (ASL 2.0), unless otherwise specified.

Note that when executing the code, it may pull additional dependencies that follow alternate licenses, so please be sure to inspect those licenses before using them in your projects to ensure they are suitable. The code may also pull in datasets subject to various licenses, some of which may be derived from AI models and some of which may be derived from web crawls of data subject to fair use under the copyright laws in the country of publication (the USA). Any such datasets are published "as-is", for the sole purpose of demonstrating the concepts in the book, and these datasets and their associated licenses may be subject to change over time.

Grab a copy of the book

If you don't yet have a copy, please support the authors and the publisher by purchasing a copy of AI-Powered Search. It will walk you step by step through the concepts and techniques shown in the code examples in this repository, providing needed context and insights to help you better understand the techniques.

ai-powered-search's People

Contributors

treygrainger avatar dcrouch26 avatar alexott avatar softwaredoug avatar dependabot[bot] avatar binarymax avatar toxicafunk avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.