Comments (3)
There are xformers builds that support the V100. Have you tried switching to one of those?
from cogvlm2.
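I ran into the same problem earlier when testing cogvlm1: after replacing the xformers call with self.attention(q, k, v), the results were completely wrong.
The Attention block in cogvlm2 should be identical to the one in cogvlm1.
You can refer to the "Equivalent pytorch code" part of the xformers documentation: https://facebookresearch.github.io/xformers/components/ops.html#:~:text=Equivalent%20pytorch%20code
Make the replacement below. With it, my test results on cogvlm1 were normal: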
import torch
from torch import nn


class Attention(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.num_heads = config.num_heads
        head_dim = config.hidden_size // config.num_heads
        self.scale = head_dim ** -0.5
        self.query_key_value = nn.Linear(config.hidden_size, config.hidden_size * 3)
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.output_dropout = torch.nn.Dropout(config.dropout_prob)
        self.dropout = torch.nn.Dropout(config.dropout_prob)

    def forward(self, x: "tensor(B, L, D)") -> "tensor(B, L, D)":
        B, L, _ = x.shape
        qkv = self.query_key_value(x)
        qkv = qkv.reshape(B, L, 3, self.num_heads, -1).permute(2, 0, 1, 3, 4)  # 3, B, L, H, D
        query, key, value = qkv[0], qkv[1], qkv[2]

        # Original xformers call and the first attempted replacement, both disabled:
        # out = xops.memory_efficient_attention(
        #     q, k, v, scale=self.scale,
        # )
        # out = self.attention(query, key, value)

        # Plain-PyTorch equivalent, following the xformers documentation.
        scale = 1.0 / query.shape[-1] ** 0.5  # same value as self.scale
        query = query * scale
        # (B, L, H, D) -> (B, H, L, D) so the matmuls run per head
        query = query.transpose(1, 2)
        key = key.transpose(1, 2)
        value = value.transpose(1, 2)
        attn = query @ key.transpose(-2, -1)  # (B, H, L, L)
        attn = attn.softmax(-1)
        attn = self.dropout(attn)
        attn = attn @ value  # (B, H, L, D)
        out = attn.transpose(1, 2)  # back to (B, L, H, D)

        output = self.dense(out.contiguous().view(B, L, -1))
        output = self.output_dropout(output)
        return output
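If you want to check the replacement yourself before plugging it into the model, here is a minimal sketch. It assumes PyTorch 2.0 or newer, so that torch.nn.functional.scaled_dot_product_attention is available as a reference. The function name manual_attention and the tensor sizes are made up for the check and are not part of cogvlm. Dropout is left out so the comparison is deterministic.

```python
import torch
import torch.nn.functional as F


def manual_attention(q, k, v):
    """Plain-PyTorch attention for inputs laid out as (B, L, H, D)."""
    scale = 1.0 / q.shape[-1] ** 0.5
    # Move heads ahead of the sequence axis: (B, L, H, D) -> (B, H, L, D).
    q = (q * scale).transpose(1, 2)
    k = k.transpose(1, 2)
    v = v.transpose(1, 2)
    attn = (q @ k.transpose(-2, -1)).softmax(-1)  # (B, H, L, L)
    out = attn @ v                                # (B, H, L, D)
    return out.transpose(1, 2)                    # back to (B, L, H, D)


torch.manual_seed(0)
B, L, H, D = 2, 5, 4, 8  # hypothetical small sizes, chosen only for this check
q, k, v = (torch.randn(B, L, H, D) for _ in range(3))

ours = manual_attention(q, k, v)

# Reference: PyTorch's built-in attention, which expects (B, H, L, D).
ref = F.scaled_dot_product_attention(
    q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
).transpose(1, 2)

max_diff = (ours - ref).abs().max().item()
print(ours.shape, max_diff)
```

The two outputs should agree up to floating-point rounding. Note the transposes on both sides: the tensors coming out of the qkv split are (B, L, H, D), while the matmul-based path works per head on (B, H, L, D).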
@Easily2 Thanks, the problem is solved.