Comments (3)
This is probably because a single module already exceeds 16 GB. We have only tested on 24 GB cards.
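A quick way to reason about this before loading is to compare the size of each block that must stay on one device against the per-GPU budget. The following is a minimal sketch only: the helper name `blocks_over_budget`, the block names, and the sizes are hypothetical placeholders, not values measured from CogVLM2.

```python
from typing import Dict, List


def blocks_over_budget(block_sizes_gib: Dict[str, float], budget_gib: float) -> List[str]:
    """Return the names of blocks too large to fit on one device with the given budget."""
    return [name for name, size in block_sizes_gib.items() if size > budget_gib]


# Hypothetical sizes for illustration only; real values must be measured on the model.
example_sizes = {"block_a": 18.0, "block_b": 0.9}
oversized = blocks_over_budget(example_sizes, budget_gib=16.0)
print(oversized)
```

If the returned list is non-empty, no device map with that per-GPU budget can place those blocks on a GPU, so a larger budget (or a larger card) is needed.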
from cogvlm2.
Hello, I tried it and got a different error. Here is my code:
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer
from accelerate import init_empty_weights, load_checkpoint_and_dispatch, infer_auto_device_map

MODEL_PATH = "/mnt/data/spdi-code/paddlechat/cogvlm2-llama3-chat-19B"
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
# bfloat16 on GPUs with compute capability >= 8, float16 otherwise.
TORCH_TYPE = torch.bfloat16 if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8 else torch.float16

tokenizer = AutoTokenizer.from_pretrained(
    MODEL_PATH,
    trust_remote_code=True
)

# Build the model skeleton without allocating real weights.
with init_empty_weights():
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        trust_remote_code=True,
    )

num_gpus = torch.cuda.device_count()
max_memory_per_gpu = "16GiB"
if num_gpus > 2:
    # String budgets passed to accelerate need a unit suffix such as "GiB".
    max_memory_per_gpu = f"{round(42 / num_gpus)}GiB"

# Spread the layers across all GPUs, keeping each decoder layer on one device.
device_map = infer_auto_device_map(
    model=model,
    max_memory={i: max_memory_per_gpu for i in range(num_gpus)},
    no_split_module_classes=["CogVLMDecoderLayer"]
)
model = load_checkpoint_and_dispatch(model, MODEL_PATH, device_map=device_map, dtype=TORCH_TYPE)
model = model.eval()

text_only_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {} ASSISTANT:"
query = text_only_template.format('您好')
history = []

input_by_model = model.build_conversation_input_ids(
    tokenizer,
    query=query,
    history=history,
    template_version='chat'
)
inputs = {
    'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
    'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
    'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
    'image': None
}
gen_kwargs = {
    "max_new_tokens": 2048,
    "pad_token_id": 128002,
}

with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Keep only the newly generated tokens.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    response = tokenizer.decode(outputs[0])
    # The separator was lost when the page was extracted; "<|end_of_text|>" is assumed here.
    response = response.split("<|end_of_text|>")[0]
    print("\nCogVLM2:", response)
history.append((query, response))
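The per-GPU budget logic in the code above can be isolated and checked without loading the model. This is a minimal sketch: the helper name `build_max_memory` is hypothetical, while the 16 GiB default and the 42 GiB total are taken from the code above. It assumes the budgets are passed to accelerate as strings with a unit suffix such as `GiB`.

```python
from typing import Dict


def build_max_memory(num_gpus: int, total_gib: int = 42, default_gib: int = 16) -> Dict[int, str]:
    """Return a {gpu_index: budget} mapping suitable for the max_memory argument."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    per_gpu = default_gib
    if num_gpus > 2:
        # With more than two GPUs, split the total budget evenly across them.
        per_gpu = round(total_gib / num_gpus)
    return {i: f"{per_gpu}GiB" for i in range(num_gpus)}


print(build_max_memory(2))
print(build_max_memory(3))
```

Printing the mapping before calling `infer_auto_device_map` makes it easy to spot a budget that is missing its unit or is smaller than the largest unsplittable module.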
Do you get the same problem with the latest code? Please use our splitting approach directly.