LLaMA Text Generation API Readme

Welcome to the LLaMA Text Generation API! This API is implemented in Python with Flask and uses a pre-trained LLaMA model to generate text from user input.

Setting Up Virtual Environment

  1. Install Virtual Environment:

    Ensure that Python 3 and pip are installed and then run:

    pip install virtualenv
  2. Create Virtual Environment:

    Navigate to the project directory and run:

    virtualenv venv
  3. Activate Virtual Environment:

    • Windows:
      .\venv\Scripts\activate
    • Linux/Mac:
      source venv/bin/activate

Setup

Clone the repo

git clone https://github.com/Lightning-AI/lit-llama
cd lit-llama

Install dependencies

pip install -r requirements.txt

You are all set! 🎉

Choosing a GPU

  • LLaMA 7B and 13B Models: Run adequately on an A5000 with 24 GB of VRAM.
  • LLaMA 30B Model: Requires a GPU with more memory, such as an A40 with 48 GB of VRAM.
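
As a rough check on these recommendations, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes the nominal parameter counts (7, 13, and 30 billion) and two example precisions, fp16 and int8; neither precision is stated in this README. It ignores activations and the attention cache, so real usage is higher.

```python
# Weights-only estimate. The parameter counts are nominal model sizes and the
# precisions are assumptions, not values taken from this project.

def weight_memory_gb(params_billion, bytes_per_param):
    """Return the memory needed to hold the weights, in GB (10^9 bytes)."""
    return params_billion * bytes_per_param


for name, size in [("7B", 7), ("13B", 13), ("30B", 30)]:
    fp16 = weight_memory_gb(size, 2)  # 16-bit floats: 2 bytes per weight
    int8 = weight_memory_gb(size, 1)  # 8-bit quantization: 1 byte per weight
    print(f"LLaMA {name}: ~{fp16:.0f} GB in fp16, ~{int8:.0f} GB in int8")
```

By this estimate, 13B in fp16 needs about 26 GB and 30B about 60 GB, which suggests that fitting them on 24 GB and 48 GB cards relies on quantized weights. That is an inference from the arithmetic, not something this README states.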

Running API with Torch Command

From the project directory, run:

python app.py

The API listens on all network interfaces (0.0.0.0) on port 5000. From the same machine, the endpoint is reachable at http://localhost:5000/complete.

Running API using Gunicorn

  1. Install Gunicorn: Ensure the virtual environment is activated and then run:

    pip install gunicorn
  2. Run API: Use gunicorn to serve the Flask app:

    gunicorn -w 4 -b 0.0.0.0:5000 app:app
  3. Set Up Gunicorn as a Service:

    • Create a gunicorn systemd service file:
      sudo nano /etc/systemd/system/llama-api.service
    • Add the following content and adjust paths accordingly:
      [Unit]
      Description=Gunicorn instance to serve LLaMA API
      After=network.target
      
      [Service]
      User=your_user
      Group=www-data
      WorkingDirectory=/path/to/your/project
      Environment="PATH=/path/to/your/project/venv/bin"
      ExecStart=/path/to/your/project/venv/bin/gunicorn --workers 4 --bind 0.0.0.0:5000 app:app
      
      [Install]
      WantedBy=multi-user.target
      
    • Start and enable the gunicorn service:
      sudo systemctl start llama-api
      sudo systemctl enable llama-api
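
The command-line options above can also be kept in a Gunicorn config file, which is a plain Python module. The file below is a hypothetical gunicorn.conf.py, not part of the original project; the timeout value is an assumption chosen because text generation can take longer than Gunicorn's 30-second default.

```python
# gunicorn.conf.py (hypothetical file name and location: the project directory)
# Gunicorn reads these module-level names as settings.

bind = "0.0.0.0:5000"  # same address and port as the commands above
workers = 4            # matches the commands above; each worker is a separate
                       # process, so lower this if every worker loads its own
                       # copy of the model and GPU memory is tight
timeout = 120          # assumed value, in seconds; Gunicorn's default is 30
```

With this file in place, the app could be started with `gunicorn -c gunicorn.conf.py app:app`, and the ExecStart line in the service file adjusted to match.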

API Usage with cURL

Example cURL request:

curl -X POST http://0.0.0.0:5000/complete \
-H "Content-Type: application/json" \
-d '{"text": "Once upon a time,", "top_p": 0.9, "top_k": 50, "temperature": 0.8, "length": 30}'

Example response:

{
   "completion":{
      "generation_time":"0.8679995536804199s",
      "text":["Once upon a time, the kingdom was ruled by a wise and just king..."]
   }
}
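
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the API is running on the local machine, and the helper names `build_request` and `complete` are illustrative rather than part of the project.

```python
import json
import urllib.request

API_URL = "http://localhost:5000/complete"  # assumes a local server


def build_request(text, top_p=0.9, top_k=50, temperature=0.8, length=30):
    """Build a POST request with the same JSON body as the cURL example."""
    payload = {
        "text": text,
        "top_p": top_p,
        "top_k": top_k,
        "temperature": temperature,
        "length": length,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(text, **params):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_request(text, **params)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `complete("Once upon a time,")["completion"]["text"]` would return the generated text in the shape shown in the example response.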

Request/Response Objects

  • Request:

    • text: The input text (string).
    • top_p: Cumulative probability threshold for nucleus sampling (float).
    • top_k: The number of most probable tokens to consider at each step (integer).
    • temperature: Controls the randomness of the sampling process (float).
    • length: The number of new tokens to generate (integer).
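  • Response (fields are nested under "completion"):

    • text: The generated text based on the input (a list containing one string in the example response).
    • generation_time: Time taken to generate the text (string, formatted as seconds with a trailing "s").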
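
The field descriptions above can be turned into simple checks on either side of the API. The sketch below is illustrative only: the function names are hypothetical, all five request fields are treated as required, and the numeric bounds are assumptions, since this README gives types but no ranges.

```python
REQUIRED_FIELDS = ("text", "top_p", "top_k", "temperature", "length")


def validate_request(payload):
    """Return a cleaned copy of the request fields or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if not isinstance(payload["text"], str) or not payload["text"]:
        raise ValueError("text must be a non-empty string")
    cleaned = {
        "text": payload["text"],
        "top_p": float(payload["top_p"]),
        "top_k": int(payload["top_k"]),
        "temperature": float(payload["temperature"]),
        "length": int(payload["length"]),
    }
    # Assumed bounds; the README specifies types only.
    if not 0.0 < cleaned["top_p"] <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if cleaned["top_k"] < 1:
        raise ValueError("top_k must be at least 1")
    if cleaned["temperature"] <= 0.0:
        raise ValueError("temperature must be positive")
    if cleaned["length"] < 1:
        raise ValueError("length must be at least 1")
    return cleaned


def parse_generation_time(value):
    """Convert a string such as "0.87s" into seconds as a float."""
    return float(value.rstrip("s"))
```

For example, `parse_generation_time("0.8679995536804199s")` turns the value from the example response into a float that can be logged or compared.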

Using Postman

  1. Set Up Postman: Download and install Postman from Postman's official site.

  2. Send Request:

    • Set the request type to POST.
    • Enter the request URL: http://0.0.0.0:5000/complete.
    • Navigate to the "Body" tab, select "raw" and "JSON (application/json)".
    • Enter the JSON payload:
      {
         "text": "Once upon a time,",
         "top_p": 0.9,
         "top_k": 50,
         "temperature": 0.8,
         "length": 30
      }
    • Click "Send" and view the API's response in the section below.

That concludes this guide. Adapt it as needed for any additional requirements of your API.
