Comments (1)
We convert the PiSSA adapter into a regular LoRA adapter, so r is doubled. This is expected behavior (a feature, not a bug):
https://github.com/huggingface/peft/tree/main/examples/pissa_finetuning#convert-pissa-to-lora
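To see why the rank doubles, here is a minimal sketch in plain Python. It does not call PEFT. It assumes the usual LoRA convention that the weight update is B @ A, with B of shape (d_out, r) and A of shape (r, d_in), and it ignores the lora_alpha scaling. The helper names (`pissa_delta_as_lora`, `matmul`, `rand_matrix`) and the toy sizes are hypothetical, chosen only for illustration. PiSSA trains an adapter that starts from a non-zero initialization, so the change relative to the original base weights is B_trained @ A_trained - B_init @ A_init. Writing that difference as a single product requires stacking both pairs, which gives an inner dimension of 2r.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1], stored as a list of rows."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def pissa_delta_as_lora(B_init, A_init, B_trained, A_trained):
    """Express B_trained @ A_trained - B_init @ A_init as one LoRA pair.

    Returns (B_lora, A_lora), where B_lora has shape (d_out, 2r)
    and A_lora has shape (2r, d_in).
    """
    # Concatenate columns: [B_trained | B_init]
    B_lora = [bt + bi for bt, bi in zip(B_trained, B_init)]
    # Stack rows: [A_trained ; -A_init]
    A_lora = A_trained + [[-a for a in row] for row in A_init]
    return B_lora, A_lora

rng = random.Random(0)
d_out, d_in, r = 6, 5, 2  # toy sizes, assumptions for illustration only

B_init, A_init = rand_matrix(d_out, r, rng), rand_matrix(r, d_in, rng)
B_trained, A_trained = rand_matrix(d_out, r, rng), rand_matrix(r, d_in, rng)

B_lora, A_lora = pissa_delta_as_lora(B_init, A_init, B_trained, A_trained)

# Check that the stacked pair reproduces the update (up to float rounding).
BA_trained = matmul(B_trained, A_trained)
BA_init = matmul(B_init, A_init)
delta = [[t - i for t, i in zip(rt, ri)] for rt, ri in zip(BA_trained, BA_init)]
recon = matmul(B_lora, A_lora)
max_err = max(abs(x - y) for rx, ry in zip(delta, recon) for x, y in zip(rx, ry))

converted_rank = len(A_lora)  # inner dimension of the converted adapter
print(converted_rank, max_err < 1e-12)
```

The converted adapter can then be applied on top of the unmodified base model, at the cost of storing twice as many adapter columns and rows as the PiSSA adapter that was trained.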
from llama-factory.
Related Issues (20)
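- When running inference with a model whose vocabulary was extended during fine-tuning but which was not exported, adding new_special_tokens: "宋浩" (a new token I introduced) to the config file in the inference folder makes the model produce no output in chat
- P2P problem when launching the API.
- Training loss is always 0 HOT 5
- Question about learning rate settings for Gemma2-9B SFT fine-tuning
- LLaMA-Factory pretrain does not support DeepSpeed zero-stage3 HOT 1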
- How to fine tune 405B HOT 1
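- Can Meta's new model Meta-Llama-3.1-8B-Instruct be supported? HOT 2
- Question about the meaning of the Qwen2-7B-instruct training parameters HOT 1
- Default parameter suggestion: should the learning rate be lowered for the sft stage? HOT 1
- Fine-tuning glm-4-9b-chat runs normally, but inference raises an error; Transformers is already at the minimum required version HOT 7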
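- NPU Qwen2 PPO raises an error: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. HOT 1
- Before any training, the loaded glm-4-9b model gives repetitive answers when tested HOT 1
- With the latest code, inference runs out of GPU memory after LoRA training of Qwen2-72B-Instruct
- Running a 70b model with 2bit reports an NPU out-of-memory error HOT 2
- I am doing full fine-tuning of an 8B llama-architecture model with 128k context on Tencent Cloud. I am quite sure the JSON parsing error does not come from the dataset or the model config, yet I still get json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1). HOT 1
- Fine-tuning glm4-9b: error when running llamafactory-cli train examples/train_lora/glm4_9b_lora_predict.yaml HOT 3
- What sequence length were the hardware requirements in the README tested on? HOT 1
- What causes AttributeError: module 'torch' has no attribute 'float8_e4m3fn'? HOT 3
- Suggestion: add the internLM2.5 series models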
- vllm inference "Bfloat16 is only supported on GPUs with compute capability of at least 8.0", is it possible to set dtype? HOT 3