Comments (2)

BBC-Esq commented on June 13, 2024

Here is yet another badass model, @minhthuc2502. I'd love to help create a converter but am not an expert. It's the 1.6b version of Zephyr:

https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b

It kicks ass for its size. The only other small model with a context size of over 4,000 is gemma, which, at least in my testing, royally sucks (referring to Gemma 2b, newest version 1.1 included).

Currently, the only reasonable option for building a chat application with ctranslate2 on a model smaller than 7b is gemma. I say "reasonable" because the phi converter is currently broken due to changes in phi2, and, at any rate, phi2 only has a context of 2048.

Zephyr 3b and Zephyr 1.6b are the best in their class, way better than gemma 2b. Another viable option is creating a converter for Qwen, which actually has a 0.5B model.

Here are tests for gemma and others on a basic RAG question. gemma 2b only got half the question right no matter how many beams I used. HOWEVER, even the zephyr 1.6B model gave a 100% correct answer at beam size of 1.

In short, gemma 2b is fast, but sucks, while zephyr is only slightly less fast, but IS ABSOLUTELY AWESOME.

NOTE: The models in the legend with "ct2" in their name are obviously ctranslate2 models. The other models were tested using transformers along with bitsandbytes (using 4-bit), just FYI.
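For anyone wanting to reproduce the non-ct2 runs, here is a minimal sketch of how a model is typically loaded in 4-bit with transformers and bitsandbytes. It is not the exact script used for the tests above: the `load_4bit` helper, the float16 compute dtype, and `device_map="auto"` (which needs `accelerate`) are assumptions, and bitsandbytes 4-bit loading generally requires a CUDA GPU.

```python
MODEL_ID = "stabilityai/stablelm-2-zephyr-1_6b"  # model linked earlier in this thread


def load_4bit(model_id: str = MODEL_ID):
    """Load a causal LM with 4-bit bitsandbytes quantization (hypothetical helper)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,  # assumed; not stated in the thread
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # assumed; requires the accelerate package
    )
```

Calling `load_4bit()` downloads the weights on first use, so it is best run once and the returned model reused.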

Lastly, llama.cpp already supports zephyr, qwen and others, but I'd rather not switch due to the additional dependency. Let me know, @minhthuc2502, if you'll reconsider making this a higher priority. I know you're busy... thanks dude.

image

from ctranslate2.

BBC-Esq commented on June 13, 2024

To maybe save you a few minutes, I've gathered the following information for whoever picks this up:

  1. The config.json states that the architecture is "StableLmForCausalLM".

  2. I think this is the matching Transformers documentation: https://huggingface.co/docs/transformers/v4.40.0/en/model_doc/stablelm

  3. Additional info: https://stability.wandb.io/stability-llm/stable-lm/reports/StableLM-3B-4E1T--VmlldzoyMjU4?accessToken=u3zujipenkx5g7rtcj9qojjgxpconyjktjkli2po09nffrffdhhchq045vp0wyfo

Based on this snippet, hopefully it wouldn't be too complicated to create a converter for it...

image
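As a quick way to confirm point 1 for any checkpoint, the architecture name can be read straight out of config.json. The sketch below is only illustrative: `declared_architectures` is a hypothetical helper, and `sample_config` is a trimmed stand-in rather than the model's real config file (only the architecture string comes from this thread).

```python
import json


def declared_architectures(config_text: str) -> list[str]:
    """Return the `architectures` list from the text of a Hugging Face config.json."""
    config = json.loads(config_text)
    # Some configs omit the key entirely, so fall back to an empty list.
    return list(config.get("architectures", []))


# Trimmed stand-in for config.json; a real file has many more fields.
sample_config = '{"architectures": ["StableLmForCausalLM"]}'

print(declared_architectures(sample_config))
```

A converter would need to be registered for whatever name this returns, which is why the architecture string is the first thing to check.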

