raymin0223 / patch-mix_contrastive_learning
Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification (INTERSPEECH 2023)
Hi, @raymin0223,
May I ask you two questions about this class:
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchMixConLoss(nn.Module):
    def __init__(self, temperature=0.06):
        super().__init__()
        self.temperature = temperature

    def forward(self, projection1, projection2, labels_a, labels_b, lam, index, args):
        batch_size = projection1.shape[0]
        # proj: (bt, embed_dim)
        projection1, projection2 = F.normalize(projection1), F.normalize(projection2)
        # similarity logits: (bt, bt)
        anchor_dot_contrast = torch.div(torch.matmul(projection2, projection1.T), self.temperature)

        mask_a = torch.eye(batch_size).cuda()                # diagonal: each anchor's own (original) sample
        mask_b = torch.zeros(batch_size, batch_size).cuda()  # zero mask
        mask_b[torch.arange(batch_size).unsqueeze(1), index.view(-1, 1)] = 1  # column of the mixed-in sample
        mask = lam * mask_a + (1 - lam) * mask_b             # soft positive weights, summing to 1 per row

        logits_max, _ = torch.max(anchor_dot_contrast, dim=1, keepdim=True)  # (bt, 1)
        logits = anchor_dot_contrast - logits_max.detach()   # for numerical stability, (bt, bt)
        exp_logits = torch.exp(logits)                       # (bt, bt)

        if args.negative_pair == 'diff_label':
            # keep as negatives only samples with a different label
            # (the positions in mask_a / mask_b always survive)
            labels_a = labels_a.contiguous().view(-1, 1)
            logits_mask = torch.ne(labels_a, labels_a.T).cuda() + (mask_a.bool() + mask_b.bool())
            exp_logits *= logits_mask.float()

        log_prob = logits - torch.log(exp_logits.sum(1, keepdim=True))
        mean_log_prob_pos = (mask * log_prob).sum(1) / mask.sum(1)

        loss = -mean_log_prob_pos
        loss = loss.view(1, batch_size)
        loss = loss.mean()
        return loss
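For reference, here is how I call it in isolation (a minimal sketch under my own assumptions: a CUDA device is required because the class hard-codes .cuda(), and the argparse namespace stands in for the repository's args):

import argparse
import torch

args = argparse.Namespace(negative_pair='all')   # 'all' is the default in the scripts
proj1 = torch.randn(4, 128).cuda()               # projections of the original views
proj2 = torch.randn(4, 128).cuda()               # projections of the patch-mixed views
labels_a = torch.tensor([0, 1, 1, 0]).cuda()     # labels of the original samples
index = torch.randperm(4).cuda()                 # permutation used by patch mixing
labels_b = labels_a[index]                       # labels of the mixed-in samples
lam = 0.7                                        # fraction of patches kept

criterion = PatchMixConLoss(temperature=0.06)
loss = criterion(proj1, proj2, labels_a, labels_b, lam, index, args)
print(loss.item())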
In this class, labels_b is never used. Moreover, throughout the project args.negative_pair defaults to 'all' rather than 'diff_label'. Why are labels_b and the args.negative_pair == 'diff_label' branch left unused?
As I understand it, the i-th row of the mask matrix corresponds to the i-th mixed sample: the diagonal entry is the fraction of patches kept from the original sample, and the entry in the mixed-in sample's column is the fraction of patches taken from that other sample. But I really don't understand the code after mask is built. Could you explain the theory, or recommend some related articles, so that I can understand the following code?
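To make my understanding concrete, here is the toy mask I have in mind for batch_size = 3 (my own values, not from the repository):

import torch

batch_size, lam = 3, 0.7
index = torch.tensor([2, 0, 1])   # toy permutation from patch mixing

mask_a = torch.eye(batch_size)    # own-sample positives
mask_b = torch.zeros(batch_size, batch_size)
mask_b[torch.arange(batch_size).unsqueeze(1), index.view(-1, 1)] = 1
mask = lam * mask_a + (1 - lam) * mask_b
print(mask)
# tensor([[0.7000, 0.0000, 0.3000],
#         [0.3000, 0.7000, 0.0000],
#         [0.0000, 0.3000, 0.7000]])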
Thank you very much for your work!
I attempted to run the .sh file (using bash ./scripts/icbhi_patchmix_cl.sh), but it reported an error:
Traceback (most recent call last):
File "main.py", line 545, in
main()
File "main.py", line 487, in main
train_loader, val_loader, args = set_loader(args)
File "main.py", line 214, in set_loader
train_dataset = ICBHIDataset(train_flag=True, transform=train_transform, args=args, print_flag=True)
File "/work/msy/ICBHI_patch_mix_AST/patch-mix_contrastive_learning-main/util/icbhi_dataset.py", line 66, in init
self.file_to_device[f.strip().split('.')[0]] = self.device_to_id[device]
KeyError: 'format'
May I ask what modifications I should make? The dataset is already in the specified location in the data folder.
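For anyone hitting the same error: my guess is that the file listing also picks up ICBHI's filename_format.txt description file, whose last underscore-separated token is literally 'format'. My current workaround is a guard like the following (my own sketch, assuming the device token is the last underscore-separated field of the filename):

def build_file_to_device(filenames, device_to_id):
    # hypothetical helper mirroring the failing loop in ICBHIDataset.__init__:
    # skip any filename whose parsed device token is not a known recording device
    file_to_device = {}
    for f in filenames:
        device = f.strip().split('_')[-1].split('.')[0]
        if device not in device_to_id:
            continue   # e.g. filename_format.txt -> token 'format'
        file_to_device[f.strip().split('.')[0]] = device_to_id[device]
    return file_to_device

# toy check: the stray description file is skipped instead of raising KeyError
print(build_file_to_device(['101_1b1_Al_sc_Meditron.txt', 'filename_format.txt'],
                           {'Meditron': 0}))
# {'101_1b1_Al_sc_Meditron': 0}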
I tried running the icbhi_patchmix_cl.sh file, but I found that the result is only around 60.2 (Best S_p 77.14, S_e 43.25, Score 60.19). I am not sure what the problem is; I hope someone can help me. Thank you very much!
Hi, @raymin0223. Thank you for the open-source code!
When I reproduced your code, I found that the validation curve kept increasing, and the training curve only started to increase around epoch 30. Additionally, I followed your code and the score I obtained was only 58.46. Are these situations normal?
Thank you very much!
Hi, @raymin0223. Thank you for providing great research and open-source code!
If I want to train and test on my own dataset, which parts of the code should I modify? Also, do I need to make any modifications to my own dataset?
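For context, this is the kind of minimal replacement for ICBHIDataset I have in mind (entirely my own sketch; I am assuming the training loop only needs (spectrogram, label) pairs shaped like the AST input):

import torch
from torch.utils.data import Dataset

class MyRespDataset(Dataset):
    # hypothetical stand-in for ICBHIDataset: yields (fbank, label) pairs
    def __init__(self, fbanks, labels, transform=None):
        self.fbanks, self.labels, self.transform = fbanks, labels, transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        x = self.fbanks[i]          # (1, time_frames, freq_bins) fbank tensor
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[i]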
Thanks! ^_^
Hi, @raymin0223. Thank you for the open-source code!
I noticed that your paper includes the phrase "We also applied the standard normalization on the spectrograms with the mean and standard deviation of –4.27 and 4.57, respectively." However, I cannot obtain those values from the code you provided: calculating the mean and standard deviation with your code, I get -9.0943 and 3.5168, respectively.
Here are my calculation steps. Did you calculate the mean and standard deviation this way?
Step 1: Comment out the normalization in the generate_fbank function in the icbhi_util.py file.
Step 2: Calculate the mean and standard deviation in the set_loader function in the main.py file, roughly as sketched below.
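Concretely, my step-2 computation is roughly the following (my own measurement script, not the repository's code; train_loader is the loader returned by set_loader, and I assume it yields un-normalized fbank batches as (images, labels)):

import torch

total, total_sq, count = 0.0, 0.0, 0
for images, _ in train_loader:                  # images: (batch, 1, time, freq) fbanks
    total += images.double().sum().item()       # running sum over every bin
    total_sq += (images.double() ** 2).sum().item()
    count += images.numel()

mean = total / count
std = (total_sq / count - mean ** 2) ** 0.5     # population std over all bins
print(f'mean={mean:.4f}, std={std:.4f}')        # prints -9.0943 / 3.5168 for me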
Thank you very much!
Hi @raymin0223,
I have found that when I run the code on multiple GPUs, the following error occurs:
Traceback (most recent call last):
File "main.py", line 563, in
main()
File "main.py", line 531, in main
loss, acc = train(train_loader, model, classifier, projector, criterion, optimizer, epoch, args, scaler)
File "main.py", line 384, in train
mix_images, labels_a, labels_b, lam, index = model(images, y=labels, patch_mix=True, time_domain=args.time_domain)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 172, in forward
return self.gather(outputs, self.output_device)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 184, in gather
return gather(outputs, output_device, dim=self.dim)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 86, in gather
res = gather_map(outputs)
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 81, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/home/ygh/anaconda3/envs/pytorch20/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 81, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
TypeError: 'float' object is not iterable
I think it is caused by the parameter lam, so I have rewritten the code as follows: when mixing patches, I wrap the lam parameter in a tensor, and when calculating the contrastive loss I take lam.mean().
Will this affect the calculation of the contrastive loss? If so, what should I do to solve the problem?
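Concretely, my patch looks like this (a minimal sketch of my own change, not the repository's code):

import torch

def pack_lam(lam: float, batch: torch.Tensor) -> torch.Tensor:
    # DataParallel.gather can only concatenate tensors that carry a batch
    # dimension; returning the python float lam from forward() is what
    # triggers "TypeError: 'float' object is not iterable" on multi-GPU,
    # so wrap it as one value per sample
    return torch.full((batch.size(0),), lam, device=batch.device)

x = torch.randn(8, 3)                # stand-in for one replica's batch
lam_t = pack_lam(0.7, x)             # what my forward() now returns
lam = lam_t.mean()                   # back to a scalar before the loss
assert torch.isclose(lam, torch.tensor(0.7))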