Giter VIP home page Giter VIP logo

llmbb's Introduction

Machine Learning - Large Language Models

The current cutting edge machine learning models are Large Language Models otherwise known as LLMs. LLMs made a commercial and public breakthrough with the release of OpenAI's GPT-3, which has opened the envolope for a new era of machine learning and the creation of several different LLMs from many different companies including Google's BERT which infamously became known for providing false information during its live debut.

These models are trained on large datasets and are capable of generating human-like text. They are used in a wide variety of applications including chatbots, language translation, and text generation.

How are Large Language Models Trained?

LLMs are trained using a technique called unsupervised learning. This means that the model is trained on a large dataset of text without any human supervision or labeled data. The model learns to generate human-like text by predicting the next word in a sentence based on the words that came before it. This is done using a technique called autoregressive language modeling. The model is trained to generate text that is similar to the text in the training dataset, and it learns to do this by adjusting its internal parameters to minimize the difference between the generated text and the real text. This process is repeated millions of times on a large dataset of text, and the model gradually learns to generate human-like text. The training process is computationally intensive and requires a large amount of data and computational resources. For example, OpenAI's GPT-3 was trained on 175 billion parameters and required thousands of GPUs to train.

An important note here is the initial training process- otherwise known as the pre-training process- is incredibly computationaly expensive and time consuming. This is why many companies have started to offer pre-trained models that can be fine-tuned on a smaller dataset for a specific task. This is known as transfer learning and it allows users to leverage the overall model's performance and while also being able to better fine tune the model on specific topics or tasks.

Deploying Large Language Models Locally

  1. LM Studio - LM Studio is a platform that allows you to deploy large language models locally. As of today, LM Studio is considered the state of the art solution for local deployments with the ability to not only deploy models on local hardware but also the ability to download models within the UI.
  2. Model Selection - Once installing LM Studio the next task is selecting a model to download and deploy. What you model you select depends on your task, your computer's computional resources, and the size of the model. Something to note here is that often models provided several different version each of which being a different size and providing varying levels of performance.
  3. Model Deployment - Once you have selected a model you can deploy it locally. This process has been made incredibly simple via tools like LM Studio, and once deployed can be tasked to do things such as generate text based on a prompt or act as an ai chatbot.

Resources

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.