Comments (5)
Oh! The class indices in D2 run from 11 to 19 if debiased loss is True. So the class distance index is correct and my previous understanding was wrong. But I am still confused about why we should compute class distances within the same dataset if debiased loss is True.
from otdd.
Hi @toooooodo. When debiased_loss=True, we also need to compute label-to-label distances within each of the two datasets. To avoid carrying around 3 different tensors, we stack all of them together in a block-wise matrix of size (k + k')**2, assuming the datasets have k and k' classes respectively. The diagonal blocks of this matrix are the within-domain label distances, and the off-diagonal blocks (the matrix is symmetric, so the two off-diagonal blocks are the same up to a transpose) are the usual across-domain label distances that you would get if you ran OTDD with debiased_loss=False. I hope that clarifies it!
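As a rough illustration of the stacking described above, here is a minimal NumPy sketch. The function name `stack_label_distances` and the argument names `D11`, `D22`, `D12` are made up for this example and are not part of the otdd API; the actual library code may lay things out differently.

```python
import numpy as np

def stack_label_distances(D11, D22, D12):
    """Stack label distances into one symmetric (k + k') x (k + k') block matrix.

    D11: (k, k)   within-dataset label distances for D1 (hypothetical input)
    D22: (k', k') within-dataset label distances for D2 (hypothetical input)
    D12: (k, k')  across-dataset label distances
    """
    k, kp = D12.shape
    assert D11.shape == (k, k) and D22.shape == (kp, kp)
    # Diagonal blocks: within-domain; off-diagonal blocks: across-domain.
    return np.block([[D11, D12],
                     [D12.T, D22]])

# Toy example with k = 3 and k' = 2 classes.
k, kp = 3, 2
rng = np.random.default_rng(0)
A = rng.random((k, k))
B = rng.random((kp, kp))
D11 = (A + A.T) / 2          # symmetric within-D1 distances
D22 = (B + B.T) / 2          # symmetric within-D2 distances
D12 = rng.random((k, kp))    # across-dataset distances
W = stack_label_distances(D11, D22, D12)

# With debiased_loss=False only the across-domain block would be needed.
across = W[:k, k:]
```

With this layout, classes of D2 are addressed by offsetting their index by k, which is consistent with the shifted class indices noticed in the first comment.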
Thanks for your immediate reply!
I understand that we have 3 tensors (label-to-label distances in D1, label-to-label distances in D2, and label-to-label distances across D1 and D2) and that we stack all of them together into a symmetric matrix of size (k + k')**2. But why should we compute label-to-label distances within the two datasets when debiased_loss=True? I don't quite understand this parameter.
Could you please clarify the effect of this parameter and the reason for computing distances within each dataset?
Ah, got it. So your question is about how the debiased parameter works in general. When debiased_loss=True we compute an unbiased version of the Sinkhorn divergence: d_debiased(a,b) = d(a,b) - 0.5(d(a,a) + d(b,b)). You can check out this paper for details: http://proceedings.mlr.press/v89/feydy19a/feydy19a.pdf, but basically this is done to guarantee that d_debiased(a,a) = 0, which in turn leads to unbiased gradients (note that d(a,a) = 0 does not hold in general for the vanilla Sinkhorn loss).
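To make the formula concrete, here is a small self-contained NumPy sketch. It is not the code otdd runs: `sinkhorn_cost` and `debiased` are hypothetical helpers written for this example, and for simplicity d is taken to be the transport cost of the entropic plan, whereas the paper's definition also includes the entropic term. It only illustrates that the vanilla entropic cost of a point cloud to itself is strictly positive, while the debiased version vanishes by construction.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=1.0, n_iters=200):
    """Transport cost <P, C> of the entropy-regularized plan between two
    uniformly weighted point clouds (illustrative, not numerically hardened)."""
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared Euclidean costs
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    K = np.exp(-C / eps)                                # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):                            # Sinkhorn scaling iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                     # entropic transport plan
    return float((P * C).sum())

def debiased(x, y, **kw):
    """d_debiased(a, b) = d(a, b) - 0.5 * (d(a, a) + d(b, b))."""
    return sinkhorn_cost(x, y, **kw) - 0.5 * (
        sinkhorn_cost(x, x, **kw) + sinkhorn_cost(y, y, **kw)
    )

x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = x + 2.0

vanilla_self = sinkhorn_cost(x, x)   # strictly positive: the entropic plan is blurred
debiased_self = debiased(x, x)       # zero by construction
```

The two self-terms d(a,a) and d(b,b) are exactly where the within-dataset label distances come in: each term is an OT problem between a dataset and itself, so it needs the label-to-label distances inside that dataset.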
Thanks! I'll check out this paper.