thudm / visualglm-6b
Chinese and English multimodal conversational language model | 多模态中英双语对话语言模型
License: Apache License 2.0
Input image: https://picdl.sunbangyan.cn/2023/05/18/sue5dn.jpg
Input text: 描述这张图片。(Describe this image.)
Output: 20,36,48,57,69,76,87,95,104,113,122,131,140,149,158,167,176,185,194,203,212,221,230,239,248,257,266,275,284,293,302,311,320,329,338,347,356,365,374,383,392,401,410,419,428,437,446,455,464,473,482,491,500,509,518,527,536,545,554,563,572,581,590,599,608,617,626,635,644,653,662,671,680,689,698,707,716,725,734,743,752,761,770,779,788,797,806,815,824,833,842,851,860,869,878.
I have two conflicting problems, and I found their corresponding solutions: they ask me to upgrade/downgrade transformers to either 4.26.1 or 4.27.1. This is a problem because whichever one I choose, the other traceback comes up.
For this traceback, people say I should go for 4.27.1:
Traceback (most recent call last):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\gradio\routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\gradio\blocks.py", line 1075, in process_api
result = await self.call_function(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\gradio\blocks.py", line 898, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\gradio\utils.py", line 549, in async_iteration
return next(iterator)
File "web_demo_hf.py", line 63, in predict
for response, history in model.stream_chat(tokenizer, image_path, input, history, max_length=max_length, top_p=top_p,
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\autograd\grad_mode.py", line 43, in generator_context
response = gen.send(None)
File "C:\Users\xxx/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1439, in stream_chat
for outputs in self.stream_generate(**inputs, **gen_kwargs):
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\autograd\grad_mode.py", line 43, in generator_context
response = gen.send(None)
File "C:\Users\xxx/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1291, in stream_generate
outputs = self(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\xxx/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1469, in forward return super().forward(
File "C:\Users\xxx/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1095, in forward transformer_outputs = self.transformer(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\xxx/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 871, in forward
logger.warning_once("Specify both input_ids and inputs_embeds at the same time, will use inputs_embeds")
AttributeError: 'Logger' object has no attribute 'warning_once'
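If staying on transformers 4.26.1, one reported stopgap is to give the plain Logger class the warning_once method that newer transformers define, since the logger used by modeling_chatglm.py here is a standard logging.Logger. A hedged, untested sketch (run before loading the model):
import logging

# assumption: stopgap for transformers < 4.27, where warning_once is missing
if not hasattr(logging.Logger, "warning_once"):
    logging.Logger.warning_once = logging.Logger.warning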
And for this one, people say I should go for 4.26.1:
Traceback (most recent call last):
File "web_demo_hf.py", line 5, in <module>
tokenizer = AutoTokenizer.from_pretrained("./vglm-6b", trust_remote_code=True)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 663, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\dynamic_module_utils.py", line 399, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module.replace(".py", ""))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\dynamic_module_utils.py", line 177, in get_class_in_module
module = importlib.import_module(module_path)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.'
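The trailing dot in 'transformers_modules.' suggests the dynamic module name was built from an empty path component. Commonly reported workarounds are upgrading transformers to 4.27.1 and/or loading from an absolute path without a trailing slash (after clearing ~/.cache/huggingface/modules). A hedged sketch:
import os
from transformers import AutoTokenizer

# assumption: "./vglm-6b" is the local checkpoint directory from the report above
model_dir = os.path.abspath("./vglm-6b")
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)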
When pressing Enter to use text-only input, it triggers: "addmm_impl_cpu_" not implemented for 'Half'.
After inputting an image, it triggers: "slow_conv2d_cpu" not implemented for 'Half'.
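Both messages come from running a .half() model on CPU, where fp16 kernels are not implemented. A minimal sketch of the usual fix, keeping the model in float32 whenever no GPU is available:
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
# half precision only on GPU; the CPU lacks the Half kernels named in the errors
model = model.half().cuda() if torch.cuda.is_available() else model.float()
model = model.eval()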
For example, if I prepare some images containing Chinese text and fine-tune the model on them, will the resulting model have Chinese OCR capability?
When launching a web app with web_demo.py and uploading different images to ask questions, GPU memory usage only grows and never shrinks, which may eventually trigger a GPU OOM.
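Part of the apparent growth is PyTorch's caching allocator holding on to freed blocks. A hedged sketch of a per-request cleanup hook (the function name is illustrative, not the demo's actual API):
import gc
import torch

def after_each_request():
    gc.collect()              # drop lingering Python references first
    torch.cuda.empty_cache()  # return cached blocks to the driver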
If an embedding space for storing long-term conversational memory were added to the model architecture, could that solve the context-length problem?
[ERROR] Unable to pre-compile async_io
Will the training data and training details be made public?
Will there be a paper or technical report?
Hi, following the README workflow I tried two different environments, and both report the error below. What is the problem, and how should I handle it?
Error:
Traceback (most recent call last):
File "", line 1, in
File "D:\soft\python\lib\site-packages\torch\nn\modules\module.py", line 905, in cuda
return self._apply(lambda t: t.cuda(device))
File "D:\soft\python\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "D:\soft\python\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "D:\soft\python\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
param_applied = fn(param)
File "D:\soft\python\lib\site-packages\torch\nn\modules\module.py", line 905, in
return self.apply(lambda t: t.cuda(device))
File "D:\soft\python\lib\site-packages\torch\cuda_init.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
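This assertion means the installed torch wheel is a CPU-only build. A quick check, with the usual fix in a comment (the CUDA 11.7 index URL is one example; pick the one matching your driver):
import torch

print(torch.__version__, torch.cuda.is_available())
# If this prints False, reinstall a CUDA-enabled wheel, e.g.:
#   pip install torch --index-url https://download.pytorch.org/whl/cu117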
The InstructBLIP paper notes that, even without training or fine-tuning on video, splitting a video into frames and directly concatenating them as input to the Q-Former already gives some understanding ability on VideoQA test sets. Has VisualGLM run similar experiments?
I downloaded the latest version of VisualGLM-6B and used the following commands to set up the development environment:
conda create -n glm python=3.9
conda activate glm
git clone https://github.com/THUDM/VisualGLM-6B.git
cd VisualGLM-6B
pip install -i https://mirrors.aliyun.com/pypi/simple/ -r requirements.txt
# edit finetune/finetune_visualglm.sh to set NUM_GPUS_PER_WORKER=2 which is the number of GPU in my server
unzip fewshot-data.zip
bash finetune/finetune_visualglm.sh
It reported the following errors:
Traceback (most recent call last):
File "/media/zjkj/2t/yantao/VisualGLM-6B/finetune_visualglm.py", line 188, in <module>
training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 130, in training_main
iteration, skipped = train(model, optimizer,
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 274, in train
lm_loss, skipped_iter, metrics = train_step(train_data_iterator,
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 348, in train_step
forward_ret = forward_step(data_iterator, model, args, timers, **kwargs)
File "/media/zjkj/2t/yantao/VisualGLM-6B/finetune_visualglm.py", line 84, in forward_step
logits = model(input_ids=tokens, image=image, pre_image=pre_image)[0]
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1724, in forward
loss = self.module(*inputs, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/official/chatglm_model.py", line 192, in forward
return super().forward(input_ids=input_ids, attention_mask=attention_mask, position_ids=position_ids, past_key_values=past_key_values, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/base_model.py", line 144, in forward
return self.transformer(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/transformer.py", line 451, in forward
hidden_states = self.hooks['word_embedding_forward'](input_ids, output_cross_layer=output_cross_layer, **kw_args)
File "/media/zjkj/2t/yantao/VisualGLM-6B/model/visualglm.py", line 20, in word_embedding_forward
image_emb = self.model(**kw_args)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/VisualGLM-6B/model/blip2.py", line 65, in forward
enc = self.vit(image)[0]
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/VisualGLM-6B/model/blip2.py", line 29, in forward
return super().forward(input_ids=input_ids, position_ids=None, attention_mask=attention_mask, image=image)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/base_model.py", line 144, in forward
return self.transformer(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/transformer.py", line 451, in forward
hidden_states = self.hooks['word_embedding_forward'](input_ids, output_cross_layer=output_cross_layer, **kw_args)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/sat/model/official/vit_model.py", line 55, in word_embedding_forward
embeddings = self.proj(images)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/media/zjkj/2t/yantao/software/anaconda3/envs/vglm/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: FIND was unable to find an engine to execute this computation
Please note that my PyTorch version is 2.0. Does VisualGLM-6B have a problem with PyTorch 2.0?
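A hedged diagnostic: "FIND was unable to find an engine" generally means cuDNN could not select a convolution kernel for this dtype/device combination, which points at the cuDNN/driver stack rather than VisualGLM itself. A bare fp16 convolution isolates that:
import torch

x = torch.randn(1, 3, 224, 224, dtype=torch.half, device="cuda")
conv = torch.nn.Conv2d(3, 16, 3).half().cuda()
print(conv(x).shape)  # fails the same way if the cuDNN setup is at fault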
[2023-05-19 14:50:31,777] [INFO] [RANK 0] > successfully loaded /home/tony/.sat_models/visualglm-6b/1/mp_rank_00_model_states.pt
Welcome to the VisualGLM-6B model. Enter an image URL or local path to load an image, then keep typing to chat; "clear" restarts, "stop" terminates the program.
Enter an image path or URL (press Enter for text-only chat): https://img.caixin.com/2023-05-13/168394947268597_480_320.jpg
cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
$ nvidia-smi
Fri May 19 14:52:34 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 L... On | 00000000:01:00.0 Off | N/A |
| N/A 42C P8 7W / 150W| 1MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
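A hedged first check for CUDNN_STATUS_NOT_INITIALIZED: it often indicates a mismatch between the torch build and the installed driver/cuDNN stack (the driver above reports CUDA 12.1). This prints what torch was built against:
import torch

print(torch.__version__, torch.version.cuda, torch.backends.cudnn.version())
print(torch.cuda.is_available())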
I'm trying to run the API mode. I copied the model data from Hugging Face and added the following to api.py:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("data", trust_remote_code=True)
model = AutoModel.from_pretrained("data", trust_remote_code=True).half().cuda()
model = model.eval()
app = FastAPI()
(All HF model files are in the local ./data/ directory.)
After starting the server, I sent a request with curl:
curl -X POST -H "Content-Type: application/json" -d @temp.json http://127.0.0.1:8080
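For reference, temp.json can be built like this; a hedged sketch whose field names follow this repo's api.py convention (text, base64-encoded image, history) and may differ across versions:
import base64
import json

payload = {
    "text": "描述这张图片。",
    "image": base64.b64encode(open("fewshot-data/2p.png", "rb").read()).decode(),
    "history": [],
}
with open("temp.json", "w") as f:
    json.dump(payload, f)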
Here's the error I got when submitting a sample request:
INFO: 127.0.0.1:35234 - "POST / HTTP/1.1" 500 Internal Server Error
Internal Server Error
root@291f83eb6f53:/VisualGLM-6B# ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 276, in __call__
await super().__call__(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
raise exc
File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
raise e
File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 163, in run_endpoint_function
return await dependant.call(**values)
File "/VisualGLM-6B/api.py", line 36, in visual_glm
answer, history, _ = chat(None, model, tokenizer, input_text, history=history, image=input_image, \
File "/VisualGLM-6B/model/chat.py", line 141, in chat
output = filling_sequence(
File "/usr/local/lib/python3.10/dist-packages/sat/generation/autoregressive_sampling.py", line 108, in filling_sequence
logits, *output_per_layers = model(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
TypeError: ChatGLMForConditionalGenerationWithImage.forward() got an unexpected keyword argument 'mems'
What did I do wrong? How can I get the API up and running?
Thanks
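A hedged reading of the 'mems' error: api.py drives the SwissArmyTransformer checkpoint through sat's filling_sequence, which passes sat-specific kwargs such as mems that the Hugging Face ChatGLMForConditionalGenerationWithImage.forward does not accept, so the two checkpoint formats got mixed. With the HF files in ./data, the HF model's own chat interface can be called directly (sketch; the image path is illustrative):
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("data", trust_remote_code=True)
model = AutoModel.from_pretrained("data", trust_remote_code=True).half().cuda().eval()
response, history = model.chat(tokenizer, "fewshot-data/2p.png", "描述这张图片。", history=[])
print(response)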
(visualGLM) root@iZbp1ewp3ew1qt4u8bdh0iZ:~/ai/VisualGLM-6B# bash finetune/finetune_visualglm.sh
finetune/finetune_visualglm.sh: line 5: $'\r': command not found
finetune/finetune_visualglm.sh: line 14: $'\r': command not found
finetune/finetune_visualglm.sh: line 19: $'\r': command not found
finetune/finetune_visualglm.sh: line 22: $'\r': command not found
finetune/finetune_visualglm.sh: line 23: $'\r': command not found
finetune/finetune_visualglm.sh: line 50: $'\r': command not found
finetune/finetune_visualglm.sh: line 51: $'\r': command not found
finetune/finetune_visualglm.sh: line 52: $'\r': command not found
--use_lorat \20 \ 8 \\s \ \dataset.json hostfile_single
[2023-05-23 17:22:18,395] [WARNING] [runner.py:191:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2023-05-23 17:22:18,412] [INFO] [runner.py:541:main] cmd = /usr/bin/python3 -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMF19 --m --use_lorat 20 e 8 ns l /dataset.json--enable_each_rank_log=None finetune_visualglm.py
[2023-05-23 17:22:21,237] [INFO] [launch.py:222:main] 0 NCCL_IB_DISABLE=0
[2023-05-23 17:22:21,237] [INFO] [launch.py:222:main] 0 NCCL_DEBUG=info
[2023-05-23 17:22:21,237] [INFO] [launch.py:222:main] 0 NCCL_NET_GDR_LEVEL=2
[2023-05-23 17:22:21,237] [INFO] [launch.py:229:main] WORLD INFO DICT: {'localhost': [0]}
[2023-05-23 17:22:21,237] [INFO] [launch.py:235:main] nnodes=1, num_local_procs=1, node_rank=0
[2023-05-23 17:22:21,237] [INFO] [launch.py:246:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2023-05-23 17:22:21,237] [INFO] [launch.py:247:main] dist_world_size=1
[2023-05-23 17:22:21,237] [INFO] [launch.py:249:main] Setting CUDA_VISIBLE_DEVICES=0
usage: finetune_visualglm.py [-h] [--num-layers NUM_LAYERS] [--hidden-size HIDDEN_SIZE] [--num-attention-heads NUM_ATTENTION_HEADS]
[--vocab-size VOCAB_SIZE] [--max-sequence-length MAX_SEQUENCE_LENGTH] [--layernorm-order {post,pre,sandwich}]
[--inner-hidden-size INNER_HIDDEN_SIZE] [--hidden-size-per-attention-head HIDDEN_SIZE_PER_ATTENTION_HEAD]
[--model-parallel-size MODEL_PARALLEL_SIZE] [--skip-init] [--use-gpu-initialization]
[--layernorm-epsilon LAYERNORM_EPSILON] [--hidden-dropout HIDDEN_DROPOUT] [--attention-dropout ATTENTION_DROPOUT]
[--make-vocab-size-divisible-by MAKE_VOCAB_SIZE_DIVISIBLE_BY] [--experiment-name EXPERIMENT_NAME]
[--train-iters TRAIN_ITERS] [--batch-size BATCH_SIZE] [--lr LR] [--mode {pretrain,finetune,inference}] [--seed SEED]
[--zero-stage {0,1,2}] [--checkpoint-activations] [--checkpoint-num-layers CHECKPOINT_NUM_LAYERS] [--fp16] [--bf16]
[--gradient-accumulation-steps GRADIENT_ACCUMULATION_STEPS] [--epochs EPOCHS] [--log-interval LOG_INTERVAL]
[--summary-dir SUMMARY_DIR] [--save-args] [--lr-decay-iters LR_DECAY_ITERS]
[--lr-decay-style {constant,linear,cosine,exponential}] [--lr-decay-ratio LR_DECAY_RATIO] [--warmup WARMUP]
[--weight-decay WEIGHT_DECAY] [--save SAVE] [--load LOAD] [--save-interval SAVE_INTERVAL] [--no-save-rng]
[--no-load-rng] [--resume-dataloader] [--distributed-backend DISTRIBUTED_BACKEND] [--local_rank LOCAL_RANK]
[--exit-interval EXIT_INTERVAL] [--eval-batch-size EVAL_BATCH_SIZE] [--eval-iters EVAL_ITERS]
[--eval-interval EVAL_INTERVAL] [--strict-eval] [--train-data TRAIN_DATA [TRAIN_DATA ...]]
[--train-data-weights TRAIN_DATA_WEIGHTS [TRAIN_DATA_WEIGHTS ...]] [--iterable-dataset] [--valid-data [VALID_DATA ...]]
[--test-data [TEST_DATA ...]] [--split SPLIT] [--num-workers NUM_WORKERS] [--block-size BLOCK_SIZE]
[--tokenizer-type TOKENIZER_TYPE] [--temperature TEMPERATURE] [--top_p TOP_P] [--top_k TOP_K] [--num-beams NUM_BEAMS]
[--length-penalty LENGTH_PENALTY] [--no-repeat-ngram-size NO_REPEAT_NGRAM_SIZE] [--min-tgt-length MIN_TGT_LENGTH]
[--out-seq-length OUT_SEQ_LENGTH] [--input-source INPUT_SOURCE] [--output-path OUTPUT_PATH] [--with-id]
[--max-inference-batch-size MAX_INFERENCE_BATCH_SIZE] [--device DEVICE] [--deepspeed]
[--deepspeed_config DEEPSPEED_CONFIG] [--deepscale] [--deepscale_config DEEPSCALE_CONFIG] [--deepspeed_mpi]
--use_lorasualglm.py: error: unrecognized arguments:
[2023-05-23 17:22:26,242] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 128438
[2023-05-23 17:22:26,243] [ERROR] [launch.py:434:sigkill_handler] ['/usr/bin/python3', '-u', 'finetune_visualglm.py', '--local_rank=0', '\r', '--experiment-name', 'finetune-visualglm-6b\r', '\r', '--model-parallel-size', '1\r', '\r', '--mode', 'finetune', '\r', '--train-iters', '300', '\r', '--resume-dataloader', '\r', '--max_source_length', '64', '\r', '--max_target_length', '256', '\r', '--lora_rank', '10\r', '--pre_seq_len', '4\r', '\r', '--train-data', './fewshot-data/dataset.json\r', '\r', '--valid-data', './fewshot-data/dataset.json\r', '\r', '--distributed-backend', 'nccl', '\r', '--lr-decay-style', 'cosine', '\r', '--warmup', '.02', '\r', '--checkpoint-activations', '\r', '--save-interval', '300', '\r', '--eval-interval', '10000', '\r', '--save', './checkpoints', '\r', '--split', '1', '\r', '--eval-iters', '10', '\r', '--eval-batch-size', '8', '\r', '--zero-stage', '1', '\r', '--lr', '0.0001', '\r', '--batch-size', '20', '\r', '--skip-init', '\r', '--fp16', '\r', '--use_lora\r', '\r\r\r'] exits with return code = 2
finetune/finetune_visualglm.sh: line 56: $'\r': command not found
: invalid optione_visualglm.sh: line 57: set: +
set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]
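The $'\r' messages mean the shell script carries Windows (CRLF) line endings, which is also why the launcher's argument list above is littered with '\r'. A hedged fix sketch in Python (dos2unix does the same):
# rewrite the script with LF line endings, equivalent to dos2unix
path = "finetune/finetune_visualglm.sh"
with open(path, "rb") as f:
    data = f.read()
with open(path, "wb") as f:
    f.write(data.replace(b"\r\n", b"\n").replace(b"\r", b"\n"))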
$ python web_demo_hf.py
Traceback (most recent call last):
File "web_demo_hf.py", line 6, in
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
File "/home/good/anaconda3/envs/visualglm/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 459, in from_pretrained
model_class = get_class_from_dynamic_module(
File "/home/good/anaconda3/envs/visualglm/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 425, in get_class_from_dynamic_module
final_module = get_cached_module_file(
File "/home/good/anaconda3/envs/visualglm/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 305, in get_cached_module_file
get_cached_module_file(
File "/home/good/anaconda3/envs/visualglm/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 267, in get_cached_module_file
modules_needed = check_imports(resolved_module_file)
File "/home/good/anaconda3/envs/visualglm/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 150, in check_imports
raise ImportError(
ImportError: This modeling file requires the following packages that were not found in your environment: sat. Run pip install sat
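Note, offered as a hedged pointer: the missing sat module is provided by the SwissArmyTransformer package, so the message's literal suggestion (pip install sat) installs the wrong distribution.
# pip install SwissArmyTransformer
import sat  # resolves once SwissArmyTransformer is installed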
Inference with web_demo_hf.py is very slow, much slower than with web_demo.py.
So I set up the same code in Jupyter and ran the Hugging Face model directly: without the web page the inference speed is fine, but through the web page it is very slow. I really can't figure out why; I'm not very familiar with gradio.
Traceback (most recent call last):
File "finetune_visualglm.py", line 170, in
args = get_args(args_list)
File "/root/miniconda3/lib/python3.8/site-packages/sat/arguments.py", line 417, in get_args
initialize_distributed(args)
File "/root/miniconda3/lib/python3.8/site-packages/sat/arguments.py", line 500, in initialize_distributed
deepspeed.init_distributed(
TypeError: init_distributed() got an unexpected keyword argument 'world_size'
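A hedged check: sat's initialize_distributed passes world_size to deepspeed.init_distributed, which some deepspeed releases do not accept; aligning the deepspeed version with what sat expects (commonly by upgrading deepspeed) is the usual fix. This prints whether the installed signature takes it:
import inspect
import deepspeed

print(deepspeed.__version__)
print("world_size" in inspect.signature(deepspeed.init_distributed).parameters)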
❯ python web_demo.py
[2023-05-21 21:29:01,122] [INFO] DeepSpeed/CUDA is not installed, fallback to Pytorch checkpointing.
[2023-05-21 21:29:01,599] [WARNING] Failed to load cpm_kernels:Unknown platform: darwin
[2023-05-21 21:29:01,601] [INFO] building VisualGLMModel model ...
59203
[2023-05-21 21:29:01,625] [INFO] [RANK 0] > initializing model parallel with size 1
[2023-05-21 21:29:01,627] [INFO] [RANK 0] You are using model-only mode.
For torch.distributed users or loading model parallel models, set environment variables RANK, WORLD_SIZE and LOCAL_RANK.
/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
[2023-05-21 21:29:13,787] [INFO] [RANK 0] > number of parameters on model parallel rank 0: 7810582016
[2023-05-21 21:29:14,203] [INFO] [RANK 0] Torch not compiled with CUDA enabled
[2023-05-21 21:29:14,203] [INFO] [RANK 0] global rank 0 is loading checkpoint /Users/z/.sat_models/visualglm-6b/1/mp_rank_00_model_states.pt
[2023-05-21 21:29:28,809] [INFO] [RANK 0] > successfully loaded /Users/z/.sat_models/visualglm-6b/1/mp_rank_00_model_states.pt
Traceback (most recent call last):
File "/Users/z/git/VisualGLM-6B/web_demo.py", line 128, in <module>
main(args)
File "/Users/z/git/VisualGLM-6B/web_demo.py", line 81, in main
model, tokenizer = get_infer_setting(gpu_device=0, quant=args.quant)
File "/Users/z/git/VisualGLM-6B/model/infer_util.py", line 27, in get_infer_setting
model = model.cuda()
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in cuda
return self._apply(lambda t: t.cuda(device))
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/Users/z/git/VisualGLM-6B/.direnv/python-3.10.11/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
I tried running web_demo_hf.py on Colab; it got as far as "Loading checkpoint shards: 0% 0/5 [00:00<?, ?it/s]" and then exited (^C).
Watching system memory, there was a spike that presumably exceeded the default 12.7 GB. I'd like to know the minimum hardware requirements for running this model, in a concrete form like the one below, so I can find a suitable machine to deploy on. Thanks.
vCPU:
RAM:
GPU RAM:
On Linux, running python cli_demo.py downloads the model straight into /root/.sat_models. I'd like to change the download path; can it be specified?
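A hedged sketch: SwissArmyTransformer resolves its download directory from the SAT_HOME environment variable, falling back to ~/.sat_models, so setting it before the model loads should redirect the download (the target path is illustrative):
import os

os.environ["SAT_HOME"] = "/data/sat_models"  # set before the model is loaded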
Hello, I've recently been using VisualGLM to train a reward model. While modifying and reading the code I found this line in modeling_chatglm.py: torch_image = torch_image.to(self.dtype).to(self.device). What exactly does self.dtype refer to? I couldn't find its definition in the code.
After uploading an image, running it reports an error; the error message is as follows:
Traceback (most recent call last):
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/gradio/routes.py", line 412, in run_predict
output = await app.get_blocks().process_api(
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/gradio/blocks.py", line 1299, in process_api
result = await self.call_function(
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/gradio/blocks.py", line 1035, in call_function
prediction = await anyio.to_thread.run_sync(
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/gradio/utils.py", line 491, in async_iteration
return next(iterator)
File "/mnt/amj/VisualGLM-6B/web_demo_hf.py", line 63, in predict
for response, history in model.stream_chat(tokenizer, image_path, input, history, max_length=max_length, top_p=top_p,
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm-6b/modeling_chatglm.py", line 1439, in stream_chat
for outputs in self.stream_generate(**inputs, **gen_kwargs):
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm-6b/modeling_chatglm.py", line 1291, in stream_generate
outputs = self(
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm-6b/modeling_chatglm.py", line 1462, in forward
image_embeds = self.image_encoder(images)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm-6b/visual.py", line 69, in forward
enc = self.vit(image)[0]
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm-6b/visual.py", line 28, in forward
return super().forward(input_ids=input_ids, position_ids=None, attention_mask=attention_mask, image=image)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/sat/model/base_model.py", line 144, in forward
return self.transformer(*args, **kwargs)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/sat/model/transformer.py", line 451, in forward
hidden_states = self.hooks['word_embedding_forward'](input_ids, output_cross_layer=output_cross_layer, **kw_args)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/sat/model/official/vit_model.py", line 55, in word_embedding_forward
embeddings = self.proj(images)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/mnt/amj/conda/envs/lora/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: GET was unable to find an engine to execute this computation
After downloading the model files and running web_demo_hf.py, the model-loading step throws all kinds of errors.
When tuning with the default command, the following error appeared:
[2023-05-22 16:51:07,239] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False
Using /root/.cache/torch_extensions/py39_cu117 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py39_cu117/fused_adam...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py39_cu117/fused_adam/build.ninja...
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/includes -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/TH -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/THC -isystem /root/anaconda3/envs/torch20/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -lineinfo --use_fast_math -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_70,code=compute_70 -std=c++17 -c /root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
FAILED: multi_tensor_adam.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/includes -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/TH -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/THC -isystem /root/anaconda3/envs/torch20/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -lineinfo --use_fast_math -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_70,code=compute_70 -std=c++17 -c /root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
nvcc fatal : Value 'c++17' is not defined for option 'std'
[2/3] c++ -MMD -MF fused_adam_frontend.o.d -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/includes -I/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/TH -isystem /root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/include/THC -isystem /root/anaconda3/envs/torch20/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++14 -g -Wno-reorder -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -c /root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/csrc/adam/fused_adam_frontend.cpp -o fused_adam_frontend.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
subprocess.run(
File "/root/anaconda3/envs/torch20/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/nfs_data/VisualGLM-6B-main/finetune_visualglm.py", line 188, in
training_main(args, model_cls=model, forward_step_function=forward_step, create_dataset_function=create_dataset_function, collate_fn=data_collator)
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 98, in training_main
model, optimizer = setup_model_untrainable_params_and_optimizer(args, model)
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/sat/training/deepspeed_training.py", line 161, in setup_model_untrainable_params_and_optimizer
model, optimizer, _, _ = deepspeed.initialize(
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/init.py", line 165, in initialize
engine = DeepSpeedEngine(args=args,
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 308, in init
self._configure_optimizer(optimizer, model_parameters)
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1162, in _configure_optimizer
basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1224, in _configure_basic_optimizer
optimizer = FusedAdam(
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/adam/fused_adam.py", line 71, in init
fused_adam_cuda = FusedAdamBuilder().load()
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/op_builder/builder.py", line 445, in load
return self.jit_load(verbose)
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/deepspeed/ops/op_builder/builder.py", line 480, in jit_load
op_module = load(name=self.name,
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/root/anaconda3/envs/torch20/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused_adam'
VM-3-158-ubuntu:1785083:1800122 [0] NCCL INFO [Service thread] Connection closed by localRank 0
VM-3-158-ubuntu:1785083:1785083 [0] NCCL INFO comm 0x8abbc410 rank 0 nranks 1 cudaDev 0 busId 80 - Abort COMPLETE
VM-3-158-ubuntu:1785083:1800126 [0] NCCL INFO [Service thread] Connection closed by localRank 0
VM-3-158-ubuntu:1785083:1785083 [0] NCCL INFO comm 0x8abc35b0 rank 0 nranks 1 cudaDev 0 busId 80 - Abort COMPLETE
[2023-05-22 16:51:50,540] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 1785083
[2023-05-22 16:51:50,540] [ERROR] [launch.py:434:sigkill_handler] ['/root/anaconda3/envs/torch20/bin/python', '-u', 'finetune_visualglm.py', '--local_rank=0', '--experiment-name', 'finetune-visualglm-6b', '--model-parallel-size', '1', '--mode', 'finetune', '--train-iters', '300', '--resume-dataloader', '--max_source_length', '64', '--max_target_length', '256', '--lora_rank', '10', '--pre_seq_len', '4', '--train-data', './fewshot-data/dataset.json', '--valid-data', './fewshot-data/dataset.json', '--distributed-backend', 'nccl', '--lr-decay-style', 'cosine', '--warmup', '.02', '--checkpoint-activations', '--save-interval', '300', '--eval-interval', '10000', '--save', './checkpoints', '--split', '1', '--eval-iters', '10', '--eval-batch-size', '8', '--zero-stage', '1', '--lr', '0.0001', '--batch-size', '20', '--skip-init', '--fp16', '--use_lora'] exits with return code = 1
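A hedged reading of the root cause above: "nvcc fatal: Value 'c++17' is not defined for option 'std'" means the nvcc on PATH (/usr/bin/nvcc here) predates C++17 support, i.e. an old CUDA toolkit; pointing PATH/CUDA_HOME at a CUDA 11+ toolkit usually lets the fused_adam JIT build succeed. Quick check:
import subprocess

print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)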
Will the training code for VisualGLM-6B be released?
Hello, I'd like to use VisualGLM-6B to train a reward model. The input data is currently pure text. I adapted the code myself following deepspeed_chat, but the computation keeps failing; the log is as follows:
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/rlhf_tw_test/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x130344 and 4096x1)
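A hedged reading of the shape error: mat2 is a [4096, 1] reward head (hidden_size 4096), while mat1's trailing dimension is far larger, suggesting the head is being fed something other than the transformer's final hidden states (e.g. logits or a flattened tensor). A minimal sketch of the expected wiring, with illustrative shapes:
import torch

hidden = torch.randn(2, 512, 4096)      # [batch, seq, hidden_size]
value_head = torch.nn.Linear(4096, 1)   # reward head expects hidden_size inputs
rewards = value_head(hidden[:, -1, :])  # [batch, 1], score of the last position
print(rewards.shape)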
It's not just GPU memory that's insufficient; system RAM is too, so I can't load the model first and then quantize it.
Is there a tutorial for quantizing to 8-bit or 4-bit? Thanks.
As the title says: does this model only support conversations with images? If I just want to chat, do I need to run the ChatGLM model separately? The code seems to require an image field in the input; setting it to empty or omitting it produces an error.
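For pure text chat, one commonly suggested route is running ChatGLM-6B itself, which has no image field; a minimal sketch using its documented chat API:
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda().eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)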
NCCL_DEBUG=info NCCL_IB_DISABLE=0 NCCL_NET_GDR_LEVEL=2 deepspeed --master_port 16666 --hostfile hostfile_single finetune_visualglm.py --experiment-name finetune-visualglm-6b --model-parallel-size 1 --mode finetune --train-iters 300 --resume-dataloader --max_source_length 64 --max_target_length 256 --lora_rank 10 --pre_seq_len 4 --train-data ./fewshot-data/dataset.json --valid-data ./fewshot-data/dataset.json --distributed-backend nccl --lr-decay-style cosine --warmup .02 --checkpoint-activations --save-interval 300 --eval-interval 10000 --save ./checkpoints --split 1 --eval-iters 10 --eval-batch-size 8 --zero-stage 1 --lr 0.0001 --batch-size 20 --skip-init --fp16 --use_lora
finetune/finetune_visualglm.sh: line 56: deepspeed: command not found
I have already tried upgrading deepspeed, but it still errors.
Current deepspeed version: 0.9.2
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\gradio\routes.py", line 401, in run_predict
output = await app.get_blocks().process_api(
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\gradio\blocks.py", line 1302, in process_api
result = await self.call_function(
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\gradio\blocks.py", line 1039, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\anyio_backends_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\anyio_backends_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\ProgramData\anaconda3\envs\chatglm\lib\site-packages\gradio\utils.py", line 491, in async_iteration
return next(iterator)
File "D:\Code\VisualGLM-6B-main\web_demo_hf.py", line 56, in predict
chatbot.append((parse_text(input), ""))
AttributeError: 'NoneType' object has no attribute 'append'
Could someone please help with this? It feels like CPU deployment is still broken.
As the title says: I'm using a virtual environment with the required dependencies installed. I downloaded the model from Hugging Face, placed it in the local folder /data/models/THUDM/visualglm-6b, and modified cli_demo.py:
def main():
...
# load model
model, model_args = VisualGLMModel.from_pretrained(
"/data/models/THUDM/visualglm-6b",
args=argparse.Namespace(
fp16=True,
skip_init=True,
use_gpu_initialization=True if (torch.cuda.is_available() and args.quant is None) else False,
device='cuda' if (torch.cuda.is_available() and args.quant is None) else 'cpu',
local_files_only=1,
))
...
tokenizer = AutoTokenizer.from_pretrained("/data/models/THUDM/visualglm-6b", local_files_only=1, trust_remote_code=True)
...
Running cli_demo.py reports the following error:
Traceback (most recent call last):
File "/data/VisualGLM-6B/cli_demo.py", line 100, in <module>
main()
File "/data/VisualGLM-6B/cli_demo.py", line 25, in main
model, model_args = VisualGLMModel.from_pretrained(
File "/data/VisualGLM-6B/.venv/lib/python3.9/site-packages/sat/model/base_model.py", line 212, in from_pretrained
args = update_args_with_file(args, path=os.path.join(model_path, 'model_config.json'))
File "/data/VisualGLM-6B/.venv/lib/python3.9/site-packages/sat/arguments.py", line 423, in update_args_with_file
with open(path, 'r', encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/models/THUDM/visualglm-6b/model_config.json'
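A hedged explanation: VisualGLMModel.from_pretrained expects the SwissArmyTransformer checkpoint layout, which includes model_config.json; the folder downloaded from Hugging Face is the transformers-format checkpoint, lacks that file, and belongs with AutoModel/AutoTokenizer instead. A sketch of the sat-side loading (argument set illustrative):
import argparse
from model import VisualGLMModel  # repo-local import, as in cli_demo.py

# sat checkpoint, fetched by name into ~/.sat_models (or SAT_HOME)
model, model_args = VisualGLMModel.from_pretrained(
    "visualglm-6b",
    args=argparse.Namespace(fp16=True, skip_init=True),
)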
Is there training code for this? I mean a train-from-scratch version, not the finetune version.
Recently I want to imitate Microsoft's LLaMA-style structure and train a multimodal language model, i.e., concatenate image token vectors and text embedding vectors into one sequence and feed it into chatglm. The fine-tuning code I've seen online all feeds the model token-encoded input_ids. From a quick read of this repo's code, this project seems to feed images and text into the model as vectors. With chatglm, how do I feed embedding vectors into the model? As I understand it, prediction also yields a token id, and the predicted id is concatenated with the input to predict the next token. With vector input, do I take the embedding vector of the predicted id and concatenate it with the input vectors?
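A hedged sketch of the idea in the question: splice the image features into the token-embedding sequence and feed the result as embeddings rather than ids; at decoding time the newly predicted token id is embedded and appended, matching the questioner's guess. Shapes and names are illustrative:
import torch

def splice(image_emb, input_ids, word_embeddings, image_pos):
    # image_emb: [B, N_img, H]; input_ids: [B, T]; image_pos: insertion index
    txt = word_embeddings(input_ids)  # [B, T, H]
    return torch.cat([txt[:, :image_pos], image_emb, txt[:, image_pos:]], dim=1)

emb = torch.nn.Embedding(130528, 4096)  # illustrative vocab/hidden sizes
seq = splice(torch.randn(1, 32, 4096), torch.randint(0, 130528, (1, 16)), emb, 4)
print(seq.shape)  # torch.Size([1, 48, 4096])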
My last issue was ambiguous; sorry about that. Basically, I get this traceback with several transformers versions that satisfy transformers>=4.27.1. So, to avoid triggering this error, which transformers version did you actually use? Or are there other package-version conflicts? Or is something else originally triggering this error?
Traceback (most recent call last):
File "web_demo_hf.py", line 5, in <module>
tokenizer = AutoTokenizer.from_pretrained("./vglm-6b", trust_remote_code=True)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 663, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\dynamic_module_utils.py", line 399, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module.replace(".py", ""))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\dynamic_module_utils.py", line 177, in get_class_in_module
module = importlib.import_module(module_path)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.'
Specify both input_ids and inputs_embeds at the same time, will use inputs_embeds
Input length of input_ids is 2232, but 'max_length' is set to 2048. This can lead to unexpected behavior. You should consider increasing 'max_new_tokens'.
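A hedged sketch of the suggested fix: raise the generation budget so the 2232 input tokens plus new tokens fit; values are illustrative, and with ChatGLM-style chat APIs the budget is usually passed as max_length:
gen_kwargs = dict(max_length=4096, do_sample=True, top_p=0.7, temperature=0.95)
# response, history = model.chat(tokenizer, image_path, query, history=[], **gen_kwargs)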
How can I fine-tune it in another language, maybe Vietnamese? Thanks guys, awesome project.
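Fine-tuning for another language is mostly a data question: the few-shot format consumed by finetune_visualglm.sh (img / prompt / label entries in dataset.json) can carry Vietnamese text directly. A hedged sketch with made-up content:
import json

examples = [{
    "img": "fewshot-data/example.jpg",          # hypothetical image path
    "prompt": "Mô tả bức ảnh này.",             # "Describe this image."
    "label": "Một chú mèo đang nằm trên ghế.",  # target answer
}]
with open("fewshot-data/dataset.json", "w", encoding="utf-8") as f:
    json.dump(examples, f, ensure_ascii=False, indent=2)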
ERROR: Command errored out with exit status 1:
command: 'c:\program files\python\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-iozsj5rb\deepspeed_cd9c08b77eaf40568b910542cdc41a19\setup.py'"'"'; __file__='"'"'C:\Users\Administrator\AppData\Local\Temp\pip-install-iozsj5rb\deepspeed_cd9c08b77eaf40568b910542cdc41a19\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Administrator\AppData\Local\Temp\pip-pip-egg-info-sdfey28f'
cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-iozsj5rb\deepspeed_cd9c08b77eaf40568b910542cdc41a19\
Complete output (14 lines):
test.c
LINK : fatal error LNK1181: cannot open input file 'aio.lib'
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\Administrator\AppData\Local\Temp\pip-install-iozsj5rb\deepspeed_cd9c08b77eaf40568b910542cdc41a19\setup.py", line 162, in
abort(f"Unable to pre-compile {op_name}")
File "C:\Users\Administrator\AppData\Local\Temp\pip-install-iozsj5rb\deepspeed_cd9c08b77eaf40568b910542cdc41a19\setup.py", line 51, in abort
assert False, msg
AssertionError: Unable to pre-compile async_io
DS_BUILD_OPS=1
Which version of ChatGLM does VisualGLM build on? Is it the earliest released v0.1.0 version?
I see that ChatGLM only recently released the v1.1 checkpoint.
Also, is there any plan to release the full training data at some point?
This is my code:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
It failed to run and returned "ValueError: Unknown arg use_final_layernorm."
What's the problem?
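"Unknown arg use_final_layernorm" is typically reported when the installed SwissArmyTransformer predates what the checkpoint's config expects; upgrading it (pip install -U SwissArmyTransformer) is the commonly suggested fix. A quick version check, as a sketch:
from importlib.metadata import version

print(version("SwissArmyTransformer"))  # the package that provides the sat module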