Comments (11)
@fire we have the mpt-ggml and the llama-ggml models up on Hugging Face!
gorilla-llm/gorilla-7b-hf-v1-ggml
gorilla-llm/gorilla-mpt-7b-hf-v0-ggml
from gorilla.
I am excited about a possible integration using ggml and mpt.
https://github.com/ggerganov/ggml/tree/master/examples/mpt
How much Gorilla-specific code needs to be ported from Python to C++?
How much of the functionality comes from fine-tuning the LLM?
from gorilla.
Hey @fire, for the first cut we don't need any Gorilla-specific code or any fine-tuning. It would just be inference, and there is no change in the architecture of either llama or MPT, so the port should be pretty straightforward. The model weights are here: https://huggingface.co/gorilla-llm
from gorilla.
The links are returning 404 Not Found.
from gorilla.
@ShishirPatil I believe the model here (https://huggingface.co/gorilla-llm/gorilla-7b-hf-v1-ggml) is a quantized version of the delta weights model (if this is the model quantized with the llama script posted above by @fire). I tried running inference and the output was poor, probably because the delta weights weren't merged with llama first.
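To illustrate why an unmerged delta checkpoint would give poor output: if the delta weights are a plain element-wise difference from the llama base (an assumption here, not something confirmed in this thread), then the usable model is base + delta, and the delta on its own is not a meaningful set of weights. A minimal sketch with toy Python lists standing in for real tensors; `apply_delta` and the tensor names are hypothetical and this is not the Gorilla project's own script:

```python
from typing import Dict, List

# A "checkpoint" here is just tensor name -> flat list of floats (toy stand-in).
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return base + delta, tensor by tensor (element-wise addition)."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same tensor names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for tensor {name!r}")
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Hypothetical toy values, not real model weights.
base = {"layer0.weight": [1.0, 2.0, 3.0]}
delta = {"layer0.weight": [0.5, -1.0, 0.25]}
merged = apply_delta(base, delta)
print(merged)
```

Under that assumption, conversion and quantization would need to run on the merged weights rather than on the delta checkpoint.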
from gorilla.
As of today, which model weights should I be using?
from gorilla.
Can someone help me quantize? I'm currently using mobile internet.
# get the repo and build it
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j
# get the model from HuggingFace
# be sure to have git-lfs installed
git clone https://huggingface.co/gorilla-llm/gorilla-mpt-7b-hf-v0
# convert model to FP16
python3 ../examples/mpt/convert-h5-to-ggml.py ./gorilla-mpt-7b-hf-v0 1
# run inference using FP16 precision
./bin/mpt -m ./gorilla-mpt-7b-hf-v0/ggml-model-f16.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64
# quantize the model to 5-bits using Q5_0 quantization
./bin/mpt-quantize ./gorilla-mpt-7b-hf-v0/ggml-model-f16.bin ./gorilla-mpt-7b-hf-v0/ggml-model-q5_0.bin q5_0
# run inference using the Q5_0 quantized model
./bin/mpt -m ./gorilla-mpt-7b-hf-v0/ggml-model-q5_0.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64
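For anyone on a limited connection, a rough size estimate may help decide which file to download. The sketch below assumes a round 7e9 parameters for a "7B" model and assumes the ggml Q5_0 layout packs 32 weights into a 22-byte block (a 2-byte scale, 4 bytes of high bits, 16 bytes of low nibbles). Both numbers are assumptions worth checking against the ggml source, and `model_size_gb` is a hypothetical helper. Real files will differ somewhat because not every tensor is quantized.

```python
def model_size_gb(n_params: float, bytes_per_block: int, weights_per_block: int) -> float:
    """Approximate size in GB (1e9 bytes) for a given block layout."""
    return n_params * bytes_per_block / weights_per_block / 1e9


N_PARAMS = 7e9  # assumption: round parameter count for a "7B" model

# FP16: 2 bytes per weight, no blocking.
fp16_gb = model_size_gb(N_PARAMS, bytes_per_block=2, weights_per_block=1)

# Q5_0 (assumed layout): 22 bytes per block of 32 weights, i.e. 5.5 bits per weight.
q5_0_gb = model_size_gb(N_PARAMS, bytes_per_block=22, weights_per_block=32)

print(f"FP16: {fp16_gb:.1f} GB, Q5_0: {q5_0_gb:.1f} GB")
```

With these assumptions the quantized file comes out at roughly a third of the FP16 size.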
from gorilla.
@fire good question re: models. gorilla-7b-hf-delta-v1 and gorilla-mpt-7b-hf-v0 are good models to get started with. The first is a diff of the model with llama base, and the second is MPT based.
re: quantize. How do you want to access the quantized model? 👀
from gorilla.
I was expecting it to be put next to the other models at https://huggingface.co/gorilla-llm, but with a ggml tag and a q5 tag, I think.
I've also written up instructions for the llama-based model:
gorilla-llm/gorilla-7b-hf-delta-v1
# get the repo and build it
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake ..
make -j
# get the model from HuggingFace
# be sure to have git-lfs installed
git clone https://huggingface.co/gorilla-llm/gorilla-7b-hf-delta-v1
# convert model to FP16
python3 ../convert.py ./gorilla-7b-hf-delta-v1
# run inference using FP16 precision
./bin/main -m ./gorilla-7b-hf-delta-v1/ggml-model-f16.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64
# quantize the model to 5-bits using Q5_0 quantization
./bin/quantize ./gorilla-7b-hf-delta-v1/ggml-model-f16.bin ./gorilla-7b-hf-delta-v1/ggml-model-q5_0.bin q5_0
# run inference using the Q5_0 quantized model
./bin/main -m ./gorilla-7b-hf-delta-v1/ggml-model-q5_0.bin -p "I would like to translate 'I feel very good today.' from English to Chinese." -t 8 -n 64
Llama evaluates poorly. No idea why.
from gorilla.
Yikes, I think they were private! Made it public. Let me know if it works! Also, feel free to raise a PR for updates to README or anything you want to put into the HF models repo!
from gorilla.
Hi @ShishirPatil, "gorilla-llm/gorilla-7b-hf-v1-ggml" is still private. Can you check once more? I'm still getting a 401 Client Error.
from gorilla.