Giter VIP home page Giter VIP logo

llm_inference_tuto's Introduction

server llm compatible openai

Pre-requis

sudo apt-get update ; 
sudo apt-get upgrade -y ; 
sudo apt-get install curl wget software-properties-common python3 pip -y ; 

installer cuda

issue de la page de telechargement de Nvidia : https://developer.nvidia.com/cuda-downloads

wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda-repo-debian11-12-2-local_12.2.2-535.104.05-1_amd64.deb;
sudo dpkg -i cuda-repo-debian11-12-2-local_12.2.2-535.104.05-1_amd64.deb;
sudo cp /var/cuda-repo-debian11-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/;
sudo add-apt-repository contrib ; 
sudo apt-get update ; 
sudo apt-get -y install cuda ;

installer le serveur de LLM

CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.2/bin/nvcc " FORCE_CMAKE=1 pip install --break-system-packages  --upgrade --force-reinstall llama-cpp-python[server] --no-cache-dir ;

start serveur with model

export GGML_CUDA_NO_PINNED=1 && python3 -m llama_cpp.server --model /mnt/c/Users/veka/AppData/Roaming/faraday/models/openorca-platypus2-13b.Q3_K_L.gguf --host=0.0.0.0 --port=1234 --use_mlock=false --n_ctx 12000 --n_gpu_layers=43 
export GGML_CUDA_NO_PINNED=1 && python3 -m llama_cpp.server --model "/mnt/g/Ressources AI/models-llm/codellama-13b-instruct.Q3_K_L.gguf" --host=0.0.0.0 --port=1234 --use_mlock=false --n_ctx 12000 --n_gpu_layers=43 

Swagger

Then just navigate to http://localhost:8000/docs to start playing around with it using the Swagger UI.

API

http://localhost:1234/v1

llm_inference_tuto's People

Contributors

veka-server avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.