mmgcn's People
Forkers
psqqq dingning123 china-just-1202 vandana-rajan lyiwei-0715 jocelyn1981 fmr2324 yousha806 shrikar17 jackleiaaaaaa dangkh cjj2923 xiaoheng-zhang99 lm2233 1547386368 andyguo666 showde1 hbym ccnu-jreamy airhorizons
mmgcn's Issues
Cannot reproduce MELD result
I can only get a weighted F1-score of 0.5760. Here are the classification report and the confusion matrix from the stdout log:
              precision    recall  f1-score   support
           0     0.7017    0.8503    0.7689      1256
           1     0.4492    0.4875    0.4676       281
           2     0.0000    0.0000    0.0000        50
           3     0.4138    0.1154    0.1805       208
           4     0.5358    0.5398    0.5378       402
           5     0.0000    0.0000    0.0000        68
           6     0.4594    0.4261    0.4421       345
    accuracy                         0.6103      2610
   macro avg     0.3657    0.3456    0.3424      2610
weighted avg     0.5623    0.6103    0.5760      2610
[[1068 57 0 19 78 0 34]
[ 74 137 0 0 35 0 35]
[ 28 5 0 0 8 0 9]
[ 121 16 0 24 13 0 34]
[ 109 31 0 4 217 0 41]
[ 33 10 0 2 3 0 20]
[ 89 49 0 9 51 0 147]]
The command is:
python train.py --base-model LSTM --graph-model --nodal-attention --dropout 0.4 --lr 0.001 --batch-size 16 --l2 0.0 --graph_type=MMGCN --epochs=60 --graph_construct=direct --multi_modal --mm_fusion_mthd=concat_subsequently --modals=avl --Dataset=MELD --Deep_GCN_nlayers 4 --use_speaker
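As a sanity check on the numbers above, the summary metrics can be recomputed directly from the confusion matrix. The following is a minimal sketch in plain Python, assuming rows are true classes and columns are predicted classes (the row sums match the reported support column, which supports that reading). The variable names are illustrative and not taken from the repository.

```python
# Confusion matrix copied from the log above (rows: true class, columns: predicted class).
cm = [
    [1068, 57, 0, 19, 78, 0, 34],
    [74, 137, 0, 0, 35, 0, 35],
    [28, 5, 0, 0, 8, 0, 9],
    [121, 16, 0, 24, 13, 0, 34],
    [109, 31, 0, 4, 217, 0, 41],
    [33, 10, 0, 2, 3, 0, 20],
    [89, 49, 0, 9, 51, 0, 147],
]

n = len(cm)
support = [sum(row) for row in cm]                               # true samples per class
predicted = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predictions per class
total = sum(support)


def f1(k):
    """Per-class F1 = 2*TP / (predicted + support); 0.0 when that denominator is 0."""
    denom = predicted[k] + support[k]
    return 2 * cm[k][k] / denom if denom else 0.0


accuracy = sum(cm[k][k] for k in range(n)) / total
macro_f1 = sum(f1(k) for k in range(n)) / n
weighted_f1 = sum(f1(k) * support[k] for k in range(n)) / total
never_predicted = [k for k in range(n) if predicted[k] == 0]

print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
print("classes never predicted:", never_predicted)
```

The recomputed values agree with the report, and the check also makes explicit that classes 2 and 5 are never predicted in this run, which is what pulls the macro average far below the weighted one.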
Meaning of the processed data
I'm currently reviewing the processed dataset you've provided in your code. In MELD_features_raw1.pkl, I understand that the second part contains speaker information. Each identifier corresponds to a segment of speech, and for each segment, there are as many vectors as there are sentences. However, I noticed that each vector has a dimension of 9. Does the value 9 have any significance? Thank you so much.
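One way to probe what the 9 dimensions encode is to test whether every vector is one-hot. The sketch below uses a hypothetical stand-in for a single dialogue's speaker entry, not the real contents of MELD_features_raw1.pkl, and `is_one_hot` is an illustrative helper rather than a function from the repository. If every real vector passes the check, the dimension would most plausibly be an index over speaker slots, but that remains an assumption for the authors to confirm.

```python
def is_one_hot(vec, tol=1e-6):
    """True if exactly one entry is ~1 and every other entry is ~0."""
    ones = sum(1 for v in vec if abs(v - 1.0) <= tol)
    zeros = sum(1 for v in vec if abs(v) <= tol)
    return ones == 1 and ones + zeros == len(vec)


# Hypothetical stand-in for one dialogue: one 9-dim vector per utterance.
dialogue_speakers = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
]

all_one_hot = all(is_one_hot(v) for v in dialogue_speakers)
# Position of the single 1 in each vector, i.e. a candidate speaker index.
speaker_ids = [v.index(max(v)) for v in dialogue_speakers]
print(all_one_hot, speaker_ids)
```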
adj matrix problem
When computing the adj matrix with
adj = D.mm(adj).mm(D)
a large number of the values, sometimes all of them, come out as NaN. What is causing this?
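A common cause of NaN in this kind of symmetric normalization is a node whose degree (row sum of adj) is zero or negative: raising it to the power -0.5 gives inf or NaN, which then spreads through the two matrix products. Whether that is what happens here is an assumption, since the issue does not show how D is built. The sketch below reproduces the effect on a toy graph and shows one guarded variant; `normalize_adj` is an illustrative name, not a function from the repository.

```python
import torch


def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Compute D^{-1/2} A D^{-1/2}, using 0 instead of inf/NaN for non-positive degrees."""
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.zeros_like(deg)
    mask = deg > 0                          # skip isolated nodes and negative row sums
    d_inv_sqrt[mask] = deg[mask].pow(-0.5)
    D = torch.diag(d_inv_sqrt)
    return D.mm(adj).mm(D)


# Toy graph: nodes 0 and 1 are connected, node 2 is isolated (degree 0).
adj = torch.tensor([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])

# Unguarded version: the zero degree becomes inf, and inf * 0 yields NaN.
naive_D = torch.diag(adj.sum(dim=1).pow(-0.5))
naive = naive_D.mm(adj).mm(naive_D)

safe = normalize_adj(adj)
print(torch.isnan(naive).any().item(), torch.isnan(safe).any().item())
```

If the edge weights come from a similarity measure that can be negative, checking `adj.sum(dim=1)` for non-positive entries before normalizing is a quick way to confirm or rule out this explanation.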
The requirements of the experiment environment
Hello, could you share the pip requirements?
IEMOCAP audio feature dimension is 1582 but should be 100
For the audio features, you say the acoustic raw features are extracted using the OpenSmile toolkit with the IS10 configuration, which should give 100 dimensions. This configuration was also used in the paper "COGMEN: COntextualized GNN based Multimodal Emotion recognitioN".
Your code runs well, but when I print the audio feature shape, I get 1582 dimensions instead of 100:
torch.Size([50, 1582])
torch.Size([44, 1582])
torch.Size([40, 1582])
torch.Size([27, 1582])
torch.Size([38, 1582])
torch.Size([26, 1582])
torch.Size([47, 1582])
torch.Size([60, 1582])
May I ask how you obtained the acoustic features?
Is the modality feature extractor available?
Thank you for your great work!
May I also ask whether you could provide the feature extraction scripts that produce the .pkl files in your repo? Thank you in advance.
The tensor U in the forward method of the LSTM model has shape torch.Size([94, 32, 100])
I printed U.shape and got this:
torch.Size([94, 32, 100])
I expected the last dimension to be 2024 rather than 100. Could you please explain this?
Class MMGCN2
Why does the MMGCN2 class raise a dimension error?
MELD Speakers Mapping
This work is nice! I wonder what the mapping is for the speaker indices in the MELD features file.
How do you extract vision features?
Hi, thanks for sharing the code.
I have some questions:
- How did you obtain a DenseNet model trained on the FER+ dataset? Did you fine-tune it yourself? If so, could you share the extraction model?
- How do you extract vision features from video data? In the paper, you use DenseNet to extract vision features, so I am wondering how this works for video datasets. Did you use only a single sample to get the feature, or time-series frame data?
- Is there any plan to share the code for extracting all (text, vision, audio) features?
Thank you