Giter VIP home page Giter VIP logo

rwkv_pytorch's Introduction

RWKV_Pytorch

这是一个用纯Pytorch原生实现的RWKV大语言模型的推理框架,官方的原生实现过于复杂且无法拓展生态,让我们加入灵活的Pytorch阵营,一起开源起来吧!

This is an inference framework for the RWKV large language model implemented purely in native PyTorch. The official native implementation is overly complex and lacks extensibility. Let's join the flexible PyTorch ecosystem and open-source it together!


特性

  • 原生pytorch实现!
  • 支持batch推理!
  • 支持并行推理!充分发挥RWKV优势!
  • 代码整洁,容易阅读和二次开发!
  • 支持导出并推理onnx格式模型!

Features

  • Native PyTorch implementation!
  • Supports batch inference!
  • Support parallel inference! Fully leverage the advantages of RWKV!
  • Clean codebase, easy to read and extend!
  • Supports exporting and inference with ONNX format models!

使用方法

  1. 克隆仓库 git clone -b dev https://github.com/yuunnn-w/RWKV_Pytorch.git
  2. 执行 cd RWKV_Pytorch 进入仓库目录,执行 pip install -r requirements.txt 安装依赖。
  3. 下载 RWKV6 模型,官方仓库地址:BlinkDL/rwkv-6-world,将模型权重放置在weight文件夹中。
  4. 修改 main.py 文件的 MODEL_NAME 参数。
  5. 执行 python main.py,即可看到batch推理效果。

流水并行(pipeline parallel)使用方法

  1. 克隆仓库 git clone -b pipeline https://github.com/yuunnn-w/RWKV_Pytorch.git
  2. 执行 cd RWKV_Pytorch 进入仓库目录,执行 pip install -r requirements.txt 安装依赖。
  3. 下载 RWKV6 模型,官方仓库地址:BlinkDL/rwkv-6-world,将模型权重放置在weight文件夹中。
  4. 修改 train/params.json 文件的 MODEL_NAME 参数。
  5. 执行 torchrun --nproc-per-node 3 train/train-parallel.py开始训练。

Usage

  1. Clone the repository: git clone https://github.com/yuunnn-w/RWKV_Pytorch.git
  2. Navigate to the repository directory: cd RWKV_Pytorch, then install the dependencies: pip install -r requirements.txt.
  3. Download the RWKV6 model from the official repository: BlinkDL/rwkv-6-world, and place the model weights in the weight directory.
  4. Modify the MODEL_NAME parameter in the main.py file.
  5. Run python main.py to see the batch inference results.

导出onnx方法

  1. 修改 onnx_export.py 文件参数为你想导出的模型。
  2. 执行 python onnx_export.py 即可导出到./onnx路径。
  3. (可选)执行 mkdir ONNX_Simplified 创建一个用于存放简化算子模型的目录。
  4. (可选)执行 python simplify_large_onnx.py -m onnx/{model name}.onnx -o ONNX_Simplified/{model name}.onnx 来简化模型,简化后的模型将存放在ONNX_Simplified目录。
  5. (可选)修改 onnx_infer.py 文件内的模型路径参数,执行 python onnx_infer.py 即可推理onnx格式模型。

ONNX Export Method

  1. Modify the parameters in the onnx_export.py file to specify the model you want to export.
  2. Run python onnx_export.py to export the model to the ./onnx directory.
  3. (Optional) Create a directory for storing simplified operator models by running mkdir ONNX_Simplified.
  4. (Optional) Simplify the model by running python simplify_large_onnx.py -m onnx/{model name}.onnx -o ONNX_Simplified/{model name}.onnx. The simplified model will be stored in the ONNX_Simplified directory.
  5. (Optional) Modify the model path parameter in the onnx_infer.py file, then run python onnx_infer.py to perform inference on the ONNX format model.

本地部署体验

  1. 修改 openai_api.py 文件中的模型配置参数。
  2. 执行 python openai_api.py 即可启动后端。
  3. 用任意符合 OpenAI API 规范的客户端,填入 http://127.0.0.1:8848 作为 API_URL 参数,即可体验。

Local Deployment Experience

  1. Modify the model configuration parameters in the openai_api.py file.
  2. Execute python openai_api.py to start the backend.
  3. Use any client that conforms to the OpenAI API specifications, and fill in http://127.0.0.1:8848 as the API_URL parameter to experience it.

已知的问题:

  • 已知op17版本才支持LayerNorm算子,op18版本才支持GroupNorm算子,目前torch的preview版本支持op18,但是无法导出,current版本只支持op17,能够正常导出含LayerNorm算子的模型。你可以参照main.py 使用opset参数指定

Known Issues:

  • LayerNorm operators are supported in op17 version, while GroupNorm operators are supported in op18 version. The current torch preview version supports op18 but cannot be exported. The current version only supports op17 and can export models containing LayerNorm operators. You can use parameter similar in main.py to support lower op_set versions.

注意,本框架目前仅支持RWKV v6模型,具体版本号为x060

Please note that this framework currently only supports RWKV v6 models, specifically version x060.


预计未来基于本项目适配香橙派推出的AI Pro开发板,实现在昇腾的生态上推理国产大语言模型RWKV!!!

In the future, based on this project, adaptation for the AI Pro development board launched by Xunlong Orange Pi is planned to enable inference of the domestic large language model RWKV on the Ascend ecosystem!!!


另外,经过测试,v6 1.6B导出并优化后的onnx模型含有如下算子:

Additionally, after testing, the ONNX model exported and optimized from v6 1.6B contains the following operators:

  • Operator Type: Gather, Count: 145
  • Operator Type: Squeeze, Count: 121
  • Operator Type: ReduceMean, Count: 148
  • Operator Type: Sub, Count: 122
  • Operator Type: Mul, Count: 484
  • Operator Type: Add, Count: 675
  • Operator Type: Sqrt, Count: 74
  • Operator Type: Div, Count: 74
  • Operator Type: Shape, Count: 240
  • Operator Type: Expand, Count: 240
  • Operator Type: Range, Count: 72
  • Operator Type: Reshape, Count: 384
  • Operator Type: Equal, Count: 72
  • Operator Type: Where, Count: 72
  • Operator Type: Unsqueeze, Count: 192
  • Operator Type: Concat, Count: 192
  • Operator Type: ScatterND, Count: 72
  • Operator Type: MatMul, Count: 337
  • Operator Type: Tanh, Count: 48
  • Operator Type: Split, Count: 24
  • Operator Type: Exp, Count: 48
  • Operator Type: Neg, Count: 24
  • Operator Type: Sigmoid, Count: 48
  • Operator Type: Slice, Count: 24
  • Operator Type: Flatten, Count: 24
  • Operator Type: Relu, Count: 24

优化模型用到的仓库:onnxsim_large_model

贡献者 (Contributors)

yuunnn-w
Yuunnn_w
WuTianyi321
WuTianyi
uniartisan
Zhiyuan Li
jiamingkong
Null

技术交流群 (Technical exchange group)

QQ交流群


感谢各位大佬做出的贡献!欢迎各路大神为本项目提PR和Issue!你们的贡献对本项目十分有价值!!!

We warmly invite everyone to contribute to the project by submitting PRs and raising Issues! Your input and contributions are highly valued and play a vital role in improving the project for the entire community. Let's collaborate and make this project even better together!

rwkv_pytorch's People

Contributors

yuunnn-w avatar wutianyi321 avatar uniartisan avatar jiamingkong avatar aknifejackzhmolong avatar

Stargazers

辻回十六夜 avatar 晴岚Horizon avatar aurae avatar Kaicheng Yang avatar ChengSixiang avatar chen gang avatar 000Szppppz avatar Fredo Guan avatar  avatar Jun Xu avatar  avatar  avatar YuChuXi avatar  avatar Churnie HXCN avatar  avatar  avatar lithium avatar  avatar  avatar Eliwii_Keeya avatar  avatar  avatar  avatar yinguo avatar  avatar Zhe Lin avatar neromous avatar zhang cheng avatar yulin li avatar Somebody avatar FanqingM avatar 傻木 avatar  avatar  avatar  avatar  avatar Haopeng Li avatar  avatar  avatar blameitonme avatar  avatar  avatar Faych Chen avatar 雨尽荼蘼 avatar superplus avatar averyyan2010 avatar Shida Wang avatar  avatar Lambda Shi  avatar Felipe Menegazzi avatar Songlin Yang avatar  avatar Shellexy avatar luoqiqi avatar ifling avatar  avatar  avatar Jostar_Lin avatar OpenMOSE avatar Alic-Li avatar  avatar 庵中十三居士 avatar Huadong Xiong avatar  avatar  avatar  avatar  avatar TimRambo avatar  avatar king avatar wdc avatar Hugo_Liu avatar  avatar  avatar  avatar 研究社交 avatar Yang Cao avatar  avatar Leon Zou avatar  avatar Jerry avatar lxrmido avatar Bin avatar 关注桃几OvO喵 avatar Fyrik Ren avatar kewsky avatar 布客飞龙 avatar C. MX avatar  avatar  avatar  avatar  avatar zc_zhu avatar Jerry Yin avatar josc146 avatar 睡觉型学渣 avatar  avatar  avatar  avatar

Watchers

Eric Field avatar Kostas Georgiou avatar lhy avatar  avatar

rwkv_pytorch's Issues

请问一下代码中的并行forward 是什么意思呢

   def forward(self, x: torch.Tensor, state: torch.Tensor, i: int) -> torch.Tensor:
        """
        模型的前向传播。
        Args:
            x (torch.Tensor): 输入张量,形状为[Batch, N_embd]。
            state (torch.Tensor): 隐藏状态张量,形状为[Batch, State Size, N_embd]。
            i (int): 时间索引。
        Returns:
            torch.Tensor: 前向传播结果张量,形状与输入的x相同。
        """
        if self.onnx_opset >= 17:
            x = x + self.time_mixing(self.ln1(x), state, i)
            x = x + self.channel_mixing(self.ln2(x), state, i)
        else:
            x = x + self.time_mixing(self.manual_layer_norm(x, self.ln1_weight, self.ln1_bias, 1e-5), state, i)
            x = x + self.channel_mixing(self.manual_layer_norm(x, self.ln2_weight, self.ln2_bias, 1e-5), state, i)
        return x
        
    def forward_parallel(self, x: torch.Tensor, state: torch.Tensor, i: int) -> torch.Tensor:
        """
        模型的并行前向传播。
        Args:
            x (torch.Tensor): 输入张量,形状为[Batch, L, N_embd]。
            state (torch.Tensor): 隐藏状态张量,形状为[Batch, State Size, N_embd]。
            i (int): 时间索引。
        Returns:
            torch.Tensor: 前向传播结果张量,形状与输入的x相同。
        """
        if self.onnx_opset >= 17:
            x = x + self.time_mixing_parallel(self.ln1(x), state, i)
            x = x + self.channel_mixing_parallel(self.ln2(x), state, i)
        else:
            x = x + self.time_mixing_parallel(self.manual_layer_norm(x, self.ln1_weight, self.ln1_bias, 1e-5), state, i)
            x = x + self.channel_mixing_parallel(self.manual_layer_norm(x, self.ln2_weight, self.ln2_bias, 1e-5), state, i)
        return x

我发现这里面的输入一个是[Batch, N_embd],另外一个是[Batch, L, N_embd],请问这里面的 L是什么意思呢

Is there a way to export using torch.jit.script ?

Thanks for this great repository!

I was wondering if there a way to export using torchscript? I tried a simple approach with torch.jit.script(model), but I get:

RuntimeError: 
Module 'RWKV_Block' has no attribute 'att_group_norm' :
  File "/data/workspaces/jp/LLMs/RWKV_Pytorch/src/model.py", line 229
        # 展平x并应用组归一化和门控
        if self.onnx_opset >= 18:
            x = self.att_group_norm(x.flatten(start_dim=1)) * g
                ~~~~~~~~~~~~~~~~~~~ <--- HERE
        else:
            x = x.flatten(start_dim=1) 
'RWKV_Block.time_mixing' is being compiled since it was called from 'RWKV_Block.forward'
  File "/data/workspaces/jp/LLMs/RWKV_Pytorch/src/model.py", line 319
        """
        if self.onnx_opset >= 17:
            x = x + self.time_mixing(self.ln1(x), state, i)
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            x = x + self.channel_mixing(self.ln2(x), state, i)
        else:

关于在香橙派上部署的一些问题

我还真的试了一下在香橙派ai pro 16G上推理,有以下问题:

  1. 香橙派不支持bf16,只能用fp16和fp32
  2. fp16会nan, 要每隔6层把x/2, 然后attention用fp32

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.