Comments (4)
We haven't run any experiments with Q-LoRA yet.
from o-lora.
Are there any plans to support it? I find O-LoRA very practical; I expect it will become a standard technique, just like QLoRA.
from o-lora.
We can certainly look into it later ~☺
from o-lora.
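For readers curious what such support would involve: QLoRA's recipe is to load the frozen base model in 4-bit and train LoRA adapters in full precision on top of it. Below is a minimal sketch of that loading step using the standard transformers/bitsandbytes API; the model name and settings are illustrative assumptions, not taken from the O-LoRA repo.

# Sketch only: the QLoRA-style 4-bit loading that O-LoRA adapters could sit on.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 data type from the QLoRA paper
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the actual matmuls
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # illustrative choice of base model
    quantization_config=bnb_config,
)
# O-LoRA's adapters (and its orthogonality constraint between old and new
# low-rank subspaces) would then be attached to this quantized base, the
# same way QLoRA attaches vanilla LoRA adapters.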
Following the modified lora module in this repo's peft, I tried to port the same changes to the lora module in a newer version of peft, and ran into the following error:
Traceback (most recent call last):
File "/data/miniconda3/envs/axolotl/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/data/miniconda3/envs/axolotl/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/data/repos/axolotl/src/axolotl/cli/train.py", line 38, in
fire.Fire(do_cli)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/data/repos/axolotl/src/axolotl/cli/train.py", line 34, in do_cli
train(cfg=parsed_cfg, cli_args=parsed_cli_args, dataset_meta=dataset_meta)
File "/data/repos/axolotl/src/axolotl/train.py", line 124, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 1591, in train
return inner_training_loop(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py", line 1729, in _inner_training_loop
model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare
result = self._prepare_deepspeed(*args)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/accelerator.py", line 1662, in _prepare_deepspeed
engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/init.py", line 171, in initialize
engine = DeepSpeedEngine(args=args,
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 304, in init
self._configure_optimizer(optimizer, model_parameters)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1212, in _configure_optimizer
self.optimizer = self._configure_zero_optimizer(basic_optimizer)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1473, in _configure_zero_optimizer
optimizer = DeepSpeedZeroOptimizer(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 484, in init
self.initialize_gradient_partitioning_data_structures()
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 709, in initialize_gradient_partitioning_data_structures
self.first_param_index_in_partition[i][partition_id] = self.get_first_param_index(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 684, in get_first_param_index
if partition_id in self.param_to_partition_ids[group_id][param_id]:
KeyError: 0
[an identical traceback is printed a second time by another worker process; omitted]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 70298) of binary: /data/miniconda3/envs/axolotl/bin/python
Traceback (most recent call last):
File "/data/miniconda3/envs/axolotl/bin/accelerate", line 8, in
sys.exit(main())
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 977, in launch_command
multi_gpu_launcher(args)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/accelerate/commands/launch.py", line 646, in multi_gpu_launcher
distrib_run.run(args)
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/miniconda3/envs/axolotl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
After making the change below, the error disappears. Could this modification have any side effects?
# self.lora_A[adapter_name] = nn.Linear(self.in_features, r_sum, bias=False) # modified
# self.lora_B[adapter_name] = nn.Linear(r_sum, self.out_features, bias=False) # modified
self.lora_A[adapter_name] = nn.Linear(self.in_features, r_sum if r_sum > 0 else r, bias=False) # modified
self.lora_B[adapter_name] = nn.Linear(r_sum if r_sum > 0 else r, self.out_features, bias=False) # modified
from o-lora.
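For context on why the guard makes the error go away (an inference from the trace above, not a confirmed diagnosis): when r_sum is 0, i.e. no previous tasks' adapters have been accumulated yet, the unguarded lines build an nn.Linear with zero output (or input) features, so the adapter weight is an empty tensor, and DeepSpeed ZeRO stage 1/2 appears unable to partition a parameter with zero elements. A minimal illustration, with 768 and 8 as stand-in values for in_features and r:

# Illustration only: zero-element parameters are what the unguarded code creates.
import torch.nn as nn

unguarded = nn.Linear(768, 0, bias=False)  # shape of lora_A when r_sum == 0
print(unguarded.weight.shape)              # torch.Size([0, 768])
print(unguarded.weight.numel())            # 0 elements to partition

guarded = nn.Linear(768, 8, bias=False)    # the fallback with r = 8
print(guarded.weight.shape)                # torch.Size([8, 768])

Note that the fallback changes shapes only in the r_sum == 0 case, where the original code allocated an empty adapter; whether that empty adapter was intended, and whether replacing it with a rank-r one affects O-LoRA's bookkeeping for the first task, is exactly the open question above.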
Related Issues (20)
- ModuleNotFoundError: No module named 'datasets'
- Discussion of long-sequence tasks
- About the datasets
- Error when loading the dataset
- Loss suddenly drops to 0 on the Yahoo dataset
- What does UIE in the code stand for?
- Missing the parameter `r_sum` in class Linear8bitLt in lora.py
- Hello author ~ why is there a 0 in the shape of the LoRA matrices?
- Error at the loss computation when running the code
- Thanks! Great Projects! Could you please provide the example to finetune llama-3-instruct-8B with O-LoRA?
- Are there scripts for order 4, 5, and 6?
- Runtime error
- The llama2 results are lower than the llama1 results in the paper
- Reproducing the llama2 results
- About the standard benchmark
- How to resolve the task-id problem
- About loss_mask
- About SeqLoRA and IncLoRA
- About updating the lora_B matrix
- About MTL