
moellama's Introduction

Description

An experiment that converts TinyLlama/TinyLlama-1.1B-Chat-v0.6 into a Mixture of Experts (MoE) model. For now it is set up so that it runs on a GeForce RTX 2080 Ti.
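For readers unfamiliar with the term, the following is a generic sketch of top-k expert routing, the idea behind an MoE feed-forward layer. It is not taken from this repository: the names `softmax` and `moe_forward` are hypothetical, and the experts are plain Python callables standing in for feed-forward networks whose output has the same length as their input.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_logits, experts, top_k=1):
    """Combine the outputs of the top_k experts chosen by the router.

    x: input vector (list of floats)
    router_logits: one score per expert
    experts: list of callables, each mapping a vector to a vector
             of the same length (assumption of this sketch)
    """
    # Rank experts by router score and keep the top_k best.
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the scores of the chosen experts into mixing weights.
    weights = softmax([router_logits[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out
```

With a single expert the router has nothing to choose between, so the layer behaves like the original dense feed-forward block.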

Usage

The scripts use azure-ml for logging, so you will need to modify that part for your own environment.

tar xvf o2chcorpus.tar.xz 

python convert_to_moellama.py TinyLlama/TinyLlama-1.1B-Chat-v0.6 moellama_orig --num-experts 1
python moe_train_part1.py
python moe_train_part2.py
python slice_n_splice_experts.py splice moellama_orig/ --expert moe-part1-f1 moe-merged
python slice_n_splice_experts.py splice moe-merged --expert moe-part2-f1 moe-merged
python moe_train_calib.py
python test.py --model moe-finish-f1
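The steps above have to run in order, and each one depends on the output of the previous one. As a convenience, they could be wrapped in a small driver that stops at the first failure. This is a minimal sketch, not part of the repository: `PIPELINE` and `run_pipeline` are hypothetical names, and the commands are copied verbatim from the list above.

```python
import shlex
import subprocess

# Commands copied verbatim from the steps above, in order.
PIPELINE = [
    "tar xvf o2chcorpus.tar.xz",
    "python convert_to_moellama.py TinyLlama/TinyLlama-1.1B-Chat-v0.6 moellama_orig --num-experts 1",
    "python moe_train_part1.py",
    "python moe_train_part2.py",
    "python slice_n_splice_experts.py splice moellama_orig/ --expert moe-part1-f1 moe-merged",
    "python slice_n_splice_experts.py splice moe-merged --expert moe-part2-f1 moe-merged",
    "python moe_train_calib.py",
    "python test.py --model moe-finish-f1",
]


def run_pipeline(commands=PIPELINE, dry_run=True):
    """Run each command in order and stop at the first failure.

    With dry_run=True (the default) nothing is executed; the parsed
    argument lists are returned so they can be inspected first.
    """
    parsed = [shlex.split(cmd) for cmd in commands]
    if not dry_run:
        for argv in parsed:
            print("+", " ".join(argv))
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, check=True)
    return parsed


if __name__ == "__main__":
    # Print the plan only; call run_pipeline(dry_run=False) to execute it.
    for argv in run_pipeline(dry_run=True):
        print(" ".join(argv))
```

Defaulting to a dry run keeps the long training steps from starting by accident.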

Dataset

Benchmarks

Not yet run.

References

@online{DatabricksBlog2023DollyV2,
  author  = {Mike Conover and Matt Hayes and Ankit Mathur and Jianwei Xie and Jun Wan and Sam Shah and Ali Ghodsi and Patrick Wendell and Matei Zaharia and Reynold Xin},
  title   = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
  year    = {2023},
  url     = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm},
  urldate = {2023-06-30}
}

@inproceedings{open2chdlc2019,
  title     = {おーぷん2ちゃんねる対話コーパスを用いた用例ベース対話システム},
  author    = {稲葉 通将},
  booktitle = {第87回言語・音声理解と対話処理研究会(第10回対話システムシンポジウム), 人工知能学会研究会資料 SIG-SLUD-B902-33},
  pages     = {129--132},
  year      = {2019}
}

Contributors

ebisuke
