Comments (2)
I have a similar problem: when LoRA fine-tuning Qwen, the dataset is not large, but the GPU always reports an OOM error and the run fails.
from llama-factory.
per_device_train_batch_size is already set to 1, so why does it still OOM? I have 24 GB of memory and am fine-tuning qwen2-7B-instruct. How does that add up to an OOM?
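For what it's worth, a rough back-of-the-envelope estimate may help show where the memory goes. The sketch below is not taken from LLaMA-Factory; the function name and every default are assumptions for illustration: about 7e9 parameters (a round figure read off the "7B" in the model name), 2 bytes per frozen weight (bf16/fp16), roughly 1% of parameters trainable through LoRA, and 16 bytes per trainable parameter (fp32 weight, fp32 gradient, and two fp32 Adam states). Activations, the CUDA context, and allocator fragmentation are not modeled.

```python
GIB = 1024 ** 3  # bytes in one GiB


def estimate_lora_fixed_memory(
    n_params: float = 7e9,           # assumed round parameter count for a "7B" model
    bytes_per_weight: int = 2,       # assumed bf16/fp16 frozen base weights
    lora_fraction: float = 0.01,     # assumed share of parameters trained via LoRA
    bytes_per_trainable: int = 16,   # assumed fp32 weight + grad + 2 Adam states
    gpu_gib: float = 24.0,           # card size mentioned in the comment above
) -> dict:
    """Estimate memory used before any activations are stored (hypothetical helper)."""
    base_weights_gib = n_params * bytes_per_weight / GIB
    trainable_gib = n_params * lora_fraction * bytes_per_trainable / GIB
    fixed_gib = base_weights_gib + trainable_gib
    return {
        "base_weights_gib": base_weights_gib,
        "trainable_gib": trainable_gib,
        "fixed_gib": fixed_gib,
        # What is left for activations, CUDA context, and fragmentation.
        "headroom_gib": gpu_gib - fixed_gib,
    }


if __name__ == "__main__":
    for name, value in estimate_lora_fixed_memory().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the frozen weights alone take about 13 GiB, so well under half of a 24 GB card is left for everything else. Activation memory grows with sequence length even at a batch size of 1, so a few very long samples can plausibly exhaust that headroom even when the dataset is small. If that is the cause here, shortening the maximum sequence length, enabling gradient checkpointing, or loading the base model in a quantized form would be the usual things to try.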