yaoing / dan

Official implementation of DAN

License: MIT License

Python 100.00%

dan's Introduction

Hey 👋, I'm Zhengyao Wen!

I'm a graduate student in artificial intelligence. I also enjoy front-end development, including Vue.js and Flutter apps. Many open-source projects on GitHub have helped me a lot, for which I'm very grateful, and I will keep contributing my code to the open-source community.

🧐 More About Me:

👁️ I've done a little computer vision work before
👄 Currently, I am working on a natural language processing task

dan's Issues

Some problems with t-SNE

Hello, I have read your article; thank you for sharing such an excellent method. I tried to reproduce the t-SNE visualization from the article, but my result looks quite different. I am not sure whether the problem lies in the parameter settings or elsewhere. If it is convenient, could you share your code for the t-SNE part? A reply would be highly appreciated.

feature vector

Hello there. How can I extract the feature vector from the DAN model?

My aim is to concatenate this feature vector with a vector from a speech model for fusion.

Thanks a lot.
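
One way to approach the fusion described above: the evaluation script further down this page unpacks the model output as `out, feat, heads = model(imgs)`, which suggests that the second element is a feature tensor. The sketch below works under that assumption. `StandInDAN` is a hypothetical placeholder for the real network, and the 512 and 128 dimensions are chosen for illustration only.

```python
import torch
import torch.nn as nn


class StandInDAN(nn.Module):
    """Hypothetical placeholder that mimics an (out, feat, heads) return value."""

    def __init__(self, in_dim=3 * 8 * 8, feat_dim=512, num_class=7):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, feat_dim))
        self.fc = nn.Linear(feat_dim, num_class)

    def forward(self, x):
        feat = self.backbone(x)
        out = self.fc(feat)
        heads = feat.unsqueeze(1)  # placeholder for the attention-head outputs
        return out, feat, heads


def fuse_features(face_model, imgs, speech_vec):
    """Concatenate the face feature with a speech feature along the feature axis."""
    face_model.eval()
    with torch.no_grad():
        _, feat, _ = face_model(imgs)
    return torch.cat([feat, speech_vec], dim=1)


imgs = torch.randn(2, 3, 8, 8)       # tiny random "images" for the demo
speech_vec = torch.randn(2, 128)     # assumed speech embedding size
fused = fuse_features(StandInDAN(), imgs, speech_vec)
print(fused.shape)
```

With the real model, only `fuse_features` would be needed; the batch sizes of the two inputs must match.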

Trained cluster centers

Hi,
Is there any plan to also release the learned cluster centers of the FCN? One would think they are a part of the model. I understand they are not required for inference but may be handy for fine tuning if someone wants to use the affinity loss for fine tuning.
Thanks.

About pre-trained model of MSCeleb

Thanks for your work, which inspired me a lot! I have a few questions.

Could you tell me whether you trained the MSCeleb ResNet-18 pre-trained weights yourself?
When I loaded the MSCeleb pre-trained weights directly into ResNet-18 and trained on the RAF-DB dataset following your training strategy, I reached 88.75% accuracy, which seems quite high. In your paper, however, the accuracy of the baseline ResNet-18 is only 86.25%, which confuses me.

I'm looking forward to your reply. Thank you!

How to get 82.75%

Hello, may I ask how the Avg. Accuracy here was obtained? Was it changed to 82.75 later?

Which variables need to be replaced for Grad-CAM++?

Why Batchnorm at the final network output?

Hi,
Thanks for this excellent repository. Very easy to follow. A question about the implementation though:
I don't often see a batchnorm layer before the softmax loss in classifier networks. Is there a specific reason you have it? What happens if you train without the last batchnorm? I ran a quick check on a very small private dataset with 4 classes: without the BN, the cross-entropy loss after the first training batch was ~313. With the BN, the value was ~1.94, which is more typical of cross-entropy loss at the beginning of training.
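
The two loss values above can be put in context with a small calculation. For K classes, a classifier that outputs uniform probabilities has a cross-entropy of ln K, which is about 1.386 for K = 4, while logits of very large magnitude can push the loss into the hundreds when the wrong class dominates. The sketch below uses made-up logits, not values from the DAN network, to illustrate this scale effect only; it does not show that the final batchnorm is the cause in the repository.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Uniform logits: the loss equals ln(4).
uniform_loss = cross_entropy([0.0, 0.0, 0.0, 0.0], target=0)
# Logits of roughly unit scale: the loss stays close to ln(4).
small_logits_loss = cross_entropy([0.5, -0.3, 0.1, -0.2], target=1)
# One very large logit on the wrong class: the loss is in the hundreds.
large_logits_loss = cross_entropy([310.0, -3.0, 5.0, 1.0], target=1)

print(round(uniform_loss, 3))
print(round(small_logits_loss, 3))
print(round(large_logits_loss, 3))
```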

Baidu drive

Hello, thanks for your excellent work. Could you provide a Baidu drive link for downloading the pretrained model?

Question in demo.py

First, I think your work is great. I have a question that I hope you can help with.
When I run demo.py on an image, it detects one face and prints one emotion label.
How can I make it print a label for every face in the image (e.g., 9 faces, so 9 emotion labels)?
I tried adding a for loop but did not get any results.
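
A generic sketch of the per-face loop asked about above. It assumes that the detector can return a list of boxes rather than a single one; how demo.py actually handles detection is not shown on this page. `detect_faces`, `crop` and `classify_face` are hypothetical callables, and the toy stand-ins exist only so that the sketch runs without a detector or a trained model.

```python
def label_all_faces(image, detect_faces, crop, classify_face):
    """Return one (box, label) pair per detected face.

    detect_faces, crop and classify_face are hypothetical callables standing in
    for the detector, the cropping step and the emotion classifier.
    """
    results = []
    for box in detect_faces(image):
        face = crop(image, box)
        results.append((box, classify_face(face)))
    return results


# Toy stand-ins: a 6x6 "image" of numbers and two fixed boxes (x0, y0, x1, y1).
image = [[x + 10 * y for x in range(6)] for y in range(6)]
boxes = [(0, 0, 3, 3), (3, 3, 6, 6)]


def toy_detect(img):
    return boxes


def toy_crop(img, box):
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in img[y0:y1]]


def toy_classify(face):
    mean = sum(sum(row) for row in face) / (len(face) * len(face[0]))
    return "happy" if mean > 20 else "neutral"


results = label_all_faces(image, toy_detect, toy_crop, toy_classify)
for box, label in results:
    print(box, label)
```

The point of the structure is that cropping and classification happen inside the loop, once per box, instead of once per image.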

The dacl.py file is missing from the networks folder. Could you upload it?

How is the AffectNet dataset preprocessed?

May I ask the author how to process the AffectNet dataset into paths like datasets/AffectNet/train_set/images/245999, as given in your CSV file?

Grad Cam visualization

First, thanks for your great work! I have a question that I hope you can help with.
After training the model, I ran the run_grad_cam.py script and got very poor results. The attention module does not seem to attend to the important regions.

Some questions about the dataset

Thank you so much for your wonderful work! I'm now trying to reproduce the results, but I'm having trouble splitting the dataset. Which version of the AffectNet dataset are you using, the 120G one or the 4G one? Some data is missing when I partition it.

Recommendation for program robustness

Hello, thanks for your work. For robustness, I suggest adding the following code to the definition of the PartitionLoss() function:

        eps = sys.float_info.epsilon
        loss = torch.log(1+num_head/(var + eps))
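
A scalar sketch of the guard proposed above. It assumes the unguarded expression is log(1 + num_head / var); the repository code itself is not shown here. It uses `math` instead of `torch` so that it runs standalone. In plain Python a zero variance would raise ZeroDivisionError, while in PyTorch the same division yields inf.

```python
import math
import sys


def partition_loss(var, num_head, eps=sys.float_info.epsilon):
    """log(1 + num_head / (var + eps)); eps keeps the division defined at var == 0."""
    return math.log(1 + num_head / (var + eps))


zero_var_loss = partition_loss(0.0, num_head=4)    # finite although var is zero
normal_loss = partition_loss(0.5, num_head=4)      # practically log(1 + 4 / 0.5)

print(zero_var_loss)
print(normal_loss)
```

Because eps is tiny, it leaves the loss essentially unchanged whenever the variance is not close to zero.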

Reproducing the result

Hi,

I ran your script rafdb.py with the exact settings, but I could only reach an accuracy of 88.98.

Does your method depend on randomness?
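
Training results generally vary with weight initialization, data shuffling, augmentation and nondeterministic GPU kernels, so small gaps between runs are common. Below is a sketch of a seeding helper that narrows this variation. Whether rafdb.py already does something similar is not shown on this page, and the helper name `set_seed` is an assumption. NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed):
    """Seed the common random number generators; a sketch, not the repository's code."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade some speed for repeatable cuDNN convolution algorithms.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


set_seed(42)
print(random.random())
```

Even with fixed seeds, results can still differ across hardware, driver and library versions.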

About Grad-CAM

Hello, running run_grad_cam.py fails with AttributeError: 'tuple' object has no attribute 'cpu'. How can this be resolved? Looking forward to your reply, thank you!
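
The error message suggests that the CAM library calls `.cpu()` on whatever the model returns, while the evaluation script on this page unpacks three values from `model(imgs)`. One possible workaround, sketched under the assumption that the first element of the tuple holds the class logits, is a thin wrapper that returns only that element. `TupleModel` is a hypothetical stand-in for the real network.

```python
import torch
import torch.nn as nn


class LogitsOnly(nn.Module):
    """Wrap a model that returns a tuple and expose only its first element."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        return self.model(x)[0]


class TupleModel(nn.Module):
    """Hypothetical stand-in that returns (out, feat, heads)."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 7)

    def forward(self, x):
        out = self.fc(x)
        return out, x, x.unsqueeze(1)


wrapped = LogitsOnly(TupleModel())
logits = wrapped(torch.randn(2, 4))
print(type(logits).__name__, tuple(logits.shape))
```

With such a wrapper, the layers to visualize would be reached through `wrapped.model`, since the wrapper adds one level of nesting.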

Inference

Failed to reproduce the reported number

Hi, really nice work.

But when I tried to reproduce the 89.7 result on the RAF dataset, I failed and only got 89.0.

Do you have any suggestions?

Thanks for your kind help,

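When running rafdb.py, num_classes in the Affinity Loss defaults to 8, but RAF-DB has only 7 classes. How should the 8 classes be interpreted?

Hello, author! When running rafdb.py, the default num_classes in the Affinity Loss is 8, but the RAF-DB dataset has only 7 classes. How should these 8 classes be interpreted? Is the 8th class treated as a background class?
Thank you very much for your answer.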

Wrong average accuracy report.

Hi,

I found incorrect code for the validation average accuracy in your rafdb.py, #231 and #239.

Mathematically, your calculation is not correct. I wrote a new one, and with it your average accuracy is only 83.76.

#!/usr/bin/env python
# coding: utf-8

import os

from PIL import Image

import torch
from torchvision import transforms

from networks.dan import DAN
import torch.utils.data as data
import numpy as np
import pandas as pd


device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")


model = DAN(num_head=4, num_class=7, pretrained=False)
checkpoint = torch.load('rafdb_epoch21_acc0.897_bacc0.8532.pth', map_location=device)
model.load_state_dict(checkpoint['model_state_dict'],strict=True)
model.to(device)
model.eval()


class RafDataSet(data.Dataset):
    def __init__(self, raf_path, phase, transform = None):
        self.phase = phase
        self.transform = transform
        self.raf_path = raf_path

        df = pd.read_csv(os.path.join(self.raf_path, 'EmoLabel/list_patition_label.txt'), sep=' ', header=None,names=['name','label'])

        if phase == 'train':
            self.data = df[df['name'].str.startswith('train')]
        else:
            self.data = df[df['name'].str.startswith('test')]

        file_names = self.data.loc[:, 'name'].values
        self.label = self.data.loc[:, 'label'].values - 1 # 0:Surprise, 1:Fear, 2:Disgust, 3:Happiness, 4:Sadness, 5:Anger, 6:Neutral

        _, self.sample_counts = np.unique(self.label, return_counts=True)
        # print(f' distribution of {phase} samples: {self.sample_counts}')

        self.file_paths = []
        for f in file_names:
            f = f.split(".")[0]
            f = f +"_aligned.jpg"
            path = os.path.join(self.raf_path, 'Image/aligned', f)
            self.file_paths.append(path)

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        path = self.file_paths[idx]
        image = Image.open(path).convert('RGB')
        label = self.label[idx]

        if self.transform is not None:
            image = self.transform(image)
        
        return image, label


data_transforms_val = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])])   

val_dataset = RafDataSet('datasets/', phase = 'test', transform = data_transforms_val)  


val_loader = torch.utils.data.DataLoader(val_dataset,
                                           batch_size = 64,
                                           num_workers = 1,
                                           shuffle = False,  
                                           pin_memory = True)


y_true = []
y_pred = []
with torch.no_grad():

    model.eval()
    for (imgs, targets) in val_loader:
        imgs = imgs.to(device)
        targets = targets.to(device)

        out,feat,heads = model(imgs)

        _, predicts = torch.max(out, 1)
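        # Note: predictions are collected in y_true and targets in y_pred.
        # accuracy_score is unaffected by the order, but balanced_accuracy_score
        # averages recall over the classes of its first argument.
        y_true.append(predicts.cpu().numpy())
        y_pred.append(targets.cpu().numpy())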


y_true = np.concatenate(y_true)
y_pred = np.concatenate(y_pred)

from sklearn.metrics import confusion_matrix, accuracy_score, ConfusionMatrixDisplay, balanced_accuracy_score

print('Acc', accuracy_score(y_true, y_pred))
print('Mean Acc', balanced_accuracy_score(y_true, y_pred))

Acc 0.8970013037809648
Mean Acc 0.8376120557760152
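
For context on the two numbers above: balanced accuracy, as computed by `balanced_accuracy_score`, is the mean of the per-class recall, where recall is measured over the true labels. Unlike plain accuracy, it changes when the two arguments are swapped, so the order in which labels and predictions are passed matters. A dependency-free sketch with made-up labels (not RAF-DB data):

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean over classes of (correct predictions for the class / samples of the class)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


labels = [0, 0, 0, 0, 0, 0, 1, 1]   # made-up, imbalanced ground truth
preds = [0, 0, 0, 0, 1, 1, 1, 1]    # made-up predictions

# Plain accuracy is the same in both argument orders.
print(accuracy(labels, preds), accuracy(preds, labels))
# Balanced accuracy differs between the two orders.
print(balanced_accuracy(labels, preds), balanced_accuracy(preds, labels))
```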

Question in demo.py and run_grad_cam.py

Question 1:
python run_grad_cam.py
Traceback (most recent call last):
  File "run_grad_cam.py", line 94, in <module>
    eigen_smooth=False)
  File "/home/a123/anaconda3/envs/fer/lib/python3.7/site-packages/pytorch_grad_cam/base_cam.py", line 130, in __call__
    target_category, eigen_smooth)
  File "/home/a123/anaconda3/envs/fer/lib/python3.7/site-packages/pytorch_grad_cam/base_cam.py", line 66, in forward
    target_category = np.argmax(output.cpu().data.numpy(), axis=-1)
AttributeError: 'tuple' object has no attribute 'cpu'

Do I need to replace several variables manually? If so, which ones?
Question 2:
When I test an image, the emotion label does not match the category. What is the problem?

A mistake in rafdb.py

Hi,

In rafdb.py, line 235
y_pred.append(targets.cpu().numpy()) should be
y_pred.append(predicts.cpu().numpy())

About Grad-CAM++

Impressive work. However, could you provide the code for visualizing the attention map with Grad-CAM? We ran into some problems while reproducing it.

SFEW

Thank you for your work.
Could you please share your training code for SFEW 2.0?

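The download link for the MS-Celeb pre-trained model is invalid

Hello, your paper has been very helpful to me. The download link you provided for the model pre-trained on MS-Celeb no longer works. Could you please update it? Thank you very much 🙏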

Question about converting the original AffectNet dataset

Hello, how do I convert the original manually annotated AffectNet dataset, which I obtained from the official website, to the CSV format you provided?

Some questions about the result of your paper

Hello, I have read your paper and have some questions about the experimental results. In Table 3 you report a baseline result of 86.25%. Is that with ResNet-18 alone? Why can't I reach such a high result even with ResNet-50?

Is this method also applicable to the FER2013 dataset? Why did the author not verify it on the most common FER2013 dataset?

What is the license for this repo?

Trained on AffectNet and got best acc: 0.6094. What did I do wrong?

Here are my settings:
bs : 256
dataset_root_path : /root/data/dataset/AffectNet_NEW/
lr : 0.0001
num_class : 8
num_head : 4
num_workers : 24
rand_seed : 63079
save_path : checkpoint/train_DAN_ORIGN/63079
test_list : ['process_data/affectNet/AffectNet_Validation.list']
total_epoch : 40
train_list : ['process_data/affectNet/AffectNet_Train.list']
[Epoch 40] Training accuracy: 0.6057. Loss: 1.850. LR 0.000000
[Epoch 40] Validation accuracy:0.6036. Loss:1.867
best_acc:0.6094

The only difference in my code is that I do not use datasets.ImageFolder to read the data.
In your implementation details, I found the statement "On RAF-DB and AffectNet datasets, we use the official aligned images samples directly." I trained on the AffectNet data in the "Manually_Annotated" directory.
I noticed that your affectnet.csv differs from mine in the image file names, and that your affectnet.csv is preprocessed.
I tested affecnet8_epoch5_acc0.6209.pth on my validation list, but the accuracy was 0.6039.
So, which dataset are you using, and how do you process AffectNet?
What did I do wrong?