
Comments (11)

ShishirPatil commented on July 16, 2024

@fire we have the mpt-ggml and the llama-ggml models up on Huggingface!
gorilla-llm/gorilla-7b-hf-v1-ggml
gorilla-llm/gorilla-mpt-7b-hf-v0-ggml

from gorilla.

fire commented on July 16, 2024

I am excited about a possible integration using ggml and mpt.

https://github.com/ggerganov/ggml/tree/master/examples/mpt

How much Gorilla-specific code would need to be ported from Python to C++?

How much of the functionality depends on finetuning the LLM?


ShishirPatil commented on July 16, 2024

Hey @fire, for a first cut we don't need any Gorilla-specific code or any finetuning. It would just be inference, and there is no change to the architecture of either llama or MPT, so the port should be pretty straightforward. The model weights are here: https://huggingface.co/gorilla-llm


fire commented on July 16, 2024

The links are returning 404 Not Found.


pranramesh commented on July 16, 2024

@ShishirPatil I believe the model here (https://huggingface.co/gorilla-llm/gorilla-7b-hf-v1-ggml) is a quantized version of the delta-weights model (if it is the model quantized with the llama script @fire posted above). I tried running inference and the results were poor, likely because the delta wasn't merged with the llama base weights first.

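To make the point above concrete: gorilla-7b-hf-delta-v1 stores only the difference from the llama base weights, so quantizing it directly yields an unusable model. Below is a minimal sketch of what "merging the delta" means, using plain Python floats in place of real tensor state dicts; the parameter names are made up for illustration, and in practice the merge is done on the HF checkpoints (e.g. with a delta-apply script) before converting to ggml.

```python
# Sketch: reconstructing finetuned weights from base + delta.
# Real checkpoints are dicts of tensors; plain floats stand in here.

def apply_delta(base_state, delta_state):
    """Finetuned weights are base weights plus the published delta."""
    assert base_state.keys() == delta_state.keys()
    return {name: base_state[name] + delta_state[name] for name in base_state}

# Toy example with hypothetical parameter names:
base = {"layer0.weight": 0.50, "layer0.bias": -0.10}
delta = {"layer0.weight": 0.02, "layer0.bias": 0.05}

merged = apply_delta(base, delta)
# merged["layer0.weight"] ~= 0.52, merged["layer0.bias"] ~= -0.05
```

Only the merged model should be fed to the ggml conversion and quantization steps; quantizing the raw delta quantizes a diff, not a model.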

fire commented on July 16, 2024

As of today which model should I be using? (weights)


fire commented on July 16, 2024

Can someone help me quantize? I'm currently using mobile internet.

# get the repo and build it
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j

# get the model from HuggingFace
# be sure to have git-lfs installed
git clone https://huggingface.co/gorilla-llm/gorilla-mpt-7b-hf-v0

# convert model to FP16
python3 ../examples/mpt/convert-h5-to-ggml.py ./gorilla-mpt-7b-hf-v0 1

# run inference using FP16 precision
./bin/mpt -m ./gorilla-mpt-7b-hf-v0/ggml-model-f16.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64

# quantize the model to 5-bits using Q5_0 quantization
./bin/mpt-quantize ./gorilla-mpt-7b-hf-v0/ggml-model-f16.bin ./gorilla-mpt-7b-hf-v0/ggml-model-q5_0.bin q5_0

# run inference using the Q5_0 quantized model
./bin/mpt -m ./gorilla-mpt-7b-hf-v0/ggml-model-q5_0.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64
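For intuition about what the quantize step above does: ggml compresses weights in small blocks, storing one floating-point scale per block plus low-bit integers. Here is a simplified sketch of symmetric 5-bit block quantization, in the spirit of Q5_0 but not ggml's actual bit-packed on-disk format (real ggml blocks hold 32 weights; a shorter list is used here for brevity):

```python
# Simplified 5-bit block quantization sketch (not ggml's real Q5_0 layout).

def quantize_block(weights):
    """Map floats to 5-bit signed ints in [-16, 15] plus one scale."""
    amax = max(abs(w) for w in weights)
    scale = amax / 15.0 if amax > 0 else 1.0
    q = [max(-16, min(15, round(w / scale))) for w in weights]
    return scale, q

def dequantize_block(scale, q):
    """Recover approximate floats from the stored integers."""
    return [scale * v for v in q]

weights = [0.5, -1.0, 0.25, 0.0]
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)
# each restored value is within one quantization step (scale) of the original
```

The upshot is that each weight costs roughly 5 bits plus a shared scale instead of 16 bits, which is why the q5_0 file is about a third the size of the f16 file, at a small accuracy cost.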


ShishirPatil commented on July 16, 2024

@fire good question re: models. gorilla-7b-hf-delta-v1 and gorilla-mpt-7b-hf-v0 are good models to get started with. The first is a diff against the llama base weights, and the second is MPT-based.

re: quantize. How do you want to access the quantized model? 👀


fire commented on July 16, 2024

I was expecting it to be published next to the others at https://huggingface.co/gorilla-llm, tagged ggml and q5, I think.

I've also written up instructions for llama:

gorilla-llm/gorilla-7b-hf-delta-v1

# get the repo and build it
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake ..
make -j

# get the model from HuggingFace
# be sure to have git-lfs installed
git clone https://huggingface.co/gorilla-llm/gorilla-7b-hf-delta-v1

# convert model to FP16
python3 ../convert.py ./gorilla-7b-hf-delta-v1

# run inference using FP16 precision
./bin/main -m ./gorilla-7b-hf-delta-v1/ggml-model-f16.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64

# quantize the model to 5-bits using Q5_0 quantization
./bin/quantize ./gorilla-7b-hf-delta-v1/ggml-model-f16.bin ./gorilla-7b-hf-delta-v1/ggml-model-q5_0.bin q5_0

# run inference using the Q5_0 quantized model
./bin/main -m ./gorilla-7b-hf-delta-v1/ggml-model-q5_0.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64

The llama model evaluates poorly. No idea why.


ShishirPatil commented on July 16, 2024

Yikes, I think they were private! Made them public. Let me know if it works! Also, feel free to raise a PR for updates to the README or anything else you want to put into the HF models repo!


CHIRU98 commented on July 16, 2024

Hi @ShishirPatil, "gorilla-llm/gorilla-7b-hf-v1-ggml" still seems to be private. Can you check once more? It's still returning a 401 Client Error.

