raymin0223 / patch-mix_contrastive_learning
Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)
Hi, @raymin0223,
May I ask you two questions about this class:
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchMixConLoss(nn.Module):
    def __init__(self, temperature=0.06):
        super().__init__()
        self.temperature = temperature

    def forward(self, projection1, projection2, labels_a, labels_b, lam, index, args):
        batch_size = projection1.shape[0]
        # proj: (bt, embed_dim)
        projection1, projection2 = F.normalize(projection1), F.normalize(projection2)
        # similarity logits: (bt, bt)
        anchor_dot_contrast = torch.div(torch.matmul(projection2, projection1.T), self.temperature)

        mask_a = torch.eye(batch_size).cuda()                # diagonal: each anchor's own (original) sample
        mask_b = torch.zeros(batch_size, batch_size).cuda()  # zero mask
        mask_b[torch.arange(batch_size).unsqueeze(1), index.view(-1, 1)] = 1  # column of the mixed-in sample
        mask = lam * mask_a + (1 - lam) * mask_b             # soft positive weights, summing to 1 per row

        logits_max, _ = torch.max(anchor_dot_contrast, dim=1, keepdim=True)  # (bt, 1)
        logits = anchor_dot_contrast - logits_max.detach()   # for numerical stability, (bt, bt)
        exp_logits = torch.exp(logits)                       # (bt, bt)

        if args.negative_pair == 'diff_label':
            # keep as negatives only samples with a different label
            # (the positions in mask_a / mask_b always survive)
            labels_a = labels_a.contiguous().view(-1, 1)
            logits_mask = torch.ne(labels_a, labels_a.T).cuda() + (mask_a.bool() + mask_b.bool())
            exp_logits *= logits_mask.float()

        log_prob = logits - torch.log(exp_logits.sum(1, keepdim=True))
        mean_log_prob_pos = (mask * log_prob).sum(1) / mask.sum(1)

        loss = -mean_log_prob_pos
        loss = loss.view(1, batch_size)
        loss = loss.mean()
        return loss
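For reference, here is how I call it in isolation (a minimal sketch under my own assumptions: a CUDA device is required because the class hard-codes .cuda(), and the argparse namespace stands in for the repository's args):

import argparse
import torch

args = argparse.Namespace(negative_pair='all')   # 'all' is the default in the scripts
proj1 = torch.randn(4, 128).cuda()               # projections of the original views
proj2 = torch.randn(4, 128).cuda()               # projections of the patch-mixed views
labels_a = torch.tensor([0, 1, 1, 0]).cuda()     # labels of the original samples
index = torch.randperm(4).cuda()                 # permutation used by patch mixing
labels_b = labels_a[index]                       # labels of the mixed-in samples
lam = 0.7                                        # fraction of patches kept

criterion = PatchMixConLoss(temperature=0.06)
loss = criterion(proj1, proj2, labels_a, labels_b, lam, index, args)
print(loss.item())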
In this class, labels_b is never used. Moreover, throughout the project args.negative_pair defaults to 'all' rather than 'diff_label'. Why are labels_b and the args.negative_pair == 'diff_label' branch left unused?
As I understand it, the i-th row of the mask matrix corresponds to the i-th mixed sample: the diagonal entry is the fraction of patches kept from the original sample, and the entry in the mixed-in sample's column is the fraction of patches taken from that other sample. But I really don't understand the code after mask is built. Could you explain the theory, or recommend some related articles, so that I can understand the following code?
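To make my understanding concrete, here is the toy mask I have in mind for batch_size = 3 (my own values, not from the repository):

import torch

batch_size, lam = 3, 0.7
index = torch.tensor([2, 0, 1])   # toy permutation from patch mixing

mask_a = torch.eye(batch_size)    # own-sample positives
mask_b = torch.zeros(batch_size, batch_size)
mask_b[torch.arange(batch_size).unsqueeze(1), index.view(-1, 1)] = 1
mask = lam * mask_a + (1 - lam) * mask_b
print(mask)
# tensor([[0.7000, 0.0000, 0.3000],
#         [0.3000, 0.7000, 0.0000],
#         [0.0000, 0.3000, 0.7000]])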
Thank you very much for your work!
I attempted to run the .sh file (using bash ./scripts/icbhi_patchmix_cl.sh), but it reported an error:
Traceback (most recent call last):
File "main.py", line 545, in
main()
File "main.py", line 487, in main
train_loader, val_loader, args = set_loader(args)
File "main.py", line 214, in set_loader
train_dataset = ICBHIDataset(train_flag=True, transform=train_transform, args=args, print_flag=True)
File "/work/msy/ICBHI_patch_mix_AST/patch-mix_contrastive_learning-main/util/icbhi_dataset.py", line 66, in init
self.file_to_device[f.strip().split('.')[0]] = self.device_to_id[device]
KeyError: 'format'
May I ask what modifications I should make? The dataset is already in the specified location in the data folder.
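For anyone hitting the same error: my guess is that the file listing also picks up ICBHI's filename_format.txt description file, whose last underscore-separated token is literally 'format'. My current workaround is a guard like the following (my own sketch, assuming the device token is the last underscore-separated field of the filename):

def build_file_to_device(filenames, device_to_id):
    # hypothetical helper mirroring the failing loop in ICBHIDataset.__init__:
    # skip any filename whose parsed device token is not a known recording device
    file_to_device = {}
    for f in filenames:
        device = f.strip().split('_')[-1].split('.')[0]
        if device not in device_to_id:
            continue   # e.g. filename_format.txt -> token 'format'
        file_to_device[f.strip().split('.')[0]] = device_to_id[device]
    return file_to_device

# toy check: the stray description file is skipped instead of raising KeyError
print(build_file_to_device(['101_1b1_Al_sc_Meditron.txt', 'filename_format.txt'],
                           {'Meditron': 0}))
# {'101_1b1_Al_sc_Meditron': 0}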
I tried running the icbhi_patchmix_cl.sh file, but I found that the result is only around 60.2 (Best S_p 77.14, S_e 43.25, Score 60.19). I am not sure what the problem is; I hope someone can help me. Thank you very much!
Hi, @raymin0223. Thank you for the open-source code!
When I reproduced your code, I found that the validation curve kept increasing, and the training curve only started to increase around epoch 30. Additionally, I followed your code and the score I obtained was only 58.46. Are these situations normal?
Thank you very much!
Hi, @raymin0223. Thank you for providing great research and open-source code!
If I want to train and test on my own dataset, which parts of the code should I modify? Also, do I need to make any modifications to my own dataset?
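For context, this is the kind of minimal replacement for ICBHIDataset I have in mind (entirely my own sketch; I am assuming the training loop only needs (spectrogram, label) pairs shaped like the AST input):

import torch
from torch.utils.data import Dataset

class MyRespDataset(Dataset):
    # hypothetical stand-in for ICBHIDataset: yields (fbank, label) pairs
    def __init__(self, fbanks, labels, transform=None):
        self.fbanks, self.labels, self.transform = fbanks, labels, transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        x = self.fbanks[i]          # (1, time_frames, freq_bins) fbank tensor
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[i]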
Thanks! ^_^
Hi, @raymin0223. Thank you for the open-source code!
I noticed that your paper includes the phrase "We also applied the standard normalization on the spectrograms with the mean and standard deviation of –4.27 and 4.57, respectively." However, I cannot obtain those values from the code you provided: calculating the mean and standard deviation with your code, I get -9.0943 and 3.5168, respectively.
Here are my calculation steps. Did you calculate the mean and standard deviation this way?
Step 1: Comment out the normalization in the generate_fbank function in the icbhi_util.py file.
Step 2: Calculate the mean and standard deviation in the set_loader function in the main.py file, roughly as sketched below.
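Concretely, my step-2 computation is roughly the following (my own measurement script, not the repository's code; train_loader is the loader returned by set_loader, and I assume it yields un-normalized fbank batches as (images, labels)):

import torch

total, total_sq, count = 0.0, 0.0, 0
for images, _ in train_loader:                  # images: (batch, 1, time, freq) fbanks
    total += images.double().sum().item()       # running sum over every bin
    total_sq += (images.double() ** 2).sum().item()
    count += images.numel()

mean = total / count
std = (total_sq / count - mean ** 2) ** 0.5     # population std over all bins
print(f'mean={mean:.4f}, std={std:.4f}')        # prints -9.0943 / 3.5168 for me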
Thank you very much!
Hi @raymin0223,
I have found that when I run the code on multiple GPUs, the following error occurs:
Traceback (most recent call last):
File "main.py", line 563, in
main()
File "main.py", line 531, in main
loss, acc = train(train_loader, model, classifier, projector, criterion, optimizer, epoch, args, scaler)
File "main.py", line 384, in train
mix_images, labels_a, labels_b, lam, index = model(images, y=labels, patch_mix=True, time_domain=args.time_domain)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 172, in forward
return self.gather(outputs, self.output_device)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 184, in gather
return gather(outputs, output_device, dim=self.dim)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 86, in gather
res = gather_map(outputs)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 81, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 81, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
TypeError: 'float' object is not iterable
I think it is caused by the parameter lam, so I have rewritten the code as follows: when mixing patches, I wrap the lam parameter in a tensor, and when calculating the contrastive loss I take lam.mean().
Will this affect the calculation of the contrastive loss? If so, what should I do to solve the problem?
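Concretely, my patch looks like this (a minimal sketch of my own change, not the repository's code):

import torch

def pack_lam(lam: float, batch: torch.Tensor) -> torch.Tensor:
    # DataParallel.gather can only concatenate tensors that carry a batch
    # dimension; returning the python float lam from forward() is what
    # triggers "TypeError: 'float' object is not iterable" on multi-GPU,
    # so wrap it as one value per sample
    return torch.full((batch.size(0),), lam, device=batch.device)

x = torch.randn(8, 3)                # stand-in for one replica's batch
lam_t = pack_lam(0.7, x)             # what my forward() now returns
lam = lam_t.mean()                   # back to a scalar before the loss
assert torch.isclose(lam, torch.tensor(0.7))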