Comments (7)
A good question, since Meena (Google) has also claimed that top-K sampling is enough to obtain diversity. Our view is that the diversity obtained from top-K sampling differs from the diversity obtained from a latent space. A neural network (NN) is, after all, a mapping function, and a function can map each input to only one output.
As an intuitive explanation, suppose A, B, and C are all correct responses to the same context. The NN may end up producing a response D that averages out A, B, and C. However, D may contain none of the information in A, B, or C and may have a totally different meaning. If we then apply top-K sampling, we may sample three responses E, F, and G around D, which does not guarantee that A, B, and C are recovered.
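To make the top-K side of this argument concrete, here is a minimal sketch of top-K sampling over a single predicted distribution. The function name `top_k_sample` and the scores in `logits` are made up for illustration and are not taken from the PLATO or Meena code.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one index from the k highest-scoring entries of `logits`."""
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Unnormalised softmax weights over those k entries
    # (the max is subtracted for numerical stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # random.Random.choices normalises the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]


# Hypothetical scores: the model's single output concentrates on indices 3, 4, 5.
logits = [0.1, 0.2, 0.0, 3.0, 2.8, 2.5, 0.3]
rng = random.Random(0)
samples = {top_k_sample(logits, k=3, rng=rng) for _ in range(200)}
print(sorted(samples))
```

Every draw comes from the k best-scoring entries of the same distribution, which is the sense in which E, F, and G stay "around D": resampling varies the output, but it cannot reach candidates the single prediction scored poorly.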
from research.
As for the second question, the BOW loss and the generative loss actually "push" the latent variable to "leak" as much information about the response as possible. Collapsing the latent distribution to a single pattern would therefore work against the training objective. Indeed, we have never observed that phenomenon.
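A rough sketch of why a bag-of-words objective rewards an informative latent variable, assuming the common formulation in which every response token is scored under one order-independent distribution predicted from z and the context. The names `bow_loss` and `log_softmax` and the toy numbers are illustrative, not the project's implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def bow_loss(bow_logits, response_token_ids):
    """Negative log-likelihood of the response tokens under ONE distribution,
    ignoring word order (assumed BOW formulation)."""
    log_probs = log_softmax(bow_logits)
    return -sum(log_probs[t] for t in response_token_ids)


response = [1, 2]                      # toy response made of token ids 1 and 2
uninformative = [0.0, 0.0, 0.0, 0.0]   # z tells the predictor nothing
informative = [0.0, 2.0, 2.0, 0.0]     # z hints at which tokens appear

print(bow_loss(uninformative, response))
print(bow_loss(informative, response))
```

In this toy setting the loss is lower when the prediction made from z puts more mass on the tokens that actually occur in the response, so a latent variable that carries no response information is penalised.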
It seems that the only loss directly tied to the latent variable is the non-autoregressive BoW loss?
The main contribution of the latent variable z is to improve the generative model p(r|c,z); the BOW loss is regarded as an auxiliary loss. Theoretically, we do not need the BOW loss for learning. In practice, however, the BOW loss is important for accelerating the convergence of the recognition network p(z|c,r), so that the generative model p(r|c,z) receives the correct input z.
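@WorldEditors Hello, how is the latent variable determined? Do different datasets use different latent variables? Do the values of the latent variable also need to be specified in advance, or are they just vectors?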
We only need to specify the number of latent classes (K) manually. The values of the latent vectors are optimized during training. How to choose K remains an open question, and we would like to see more future work on this problem.
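As a minimal sketch of what "only K is specified" can look like in code, assume the latent variable is a categorical index into a table of K vectors that starts random and would be updated by the optimizer. The sizes, the initialisation scale, the `posterior` values, and the helper names are all hypothetical.

```python
import random


def init_latent_table(num_classes, dim, rng):
    """One vector per latent class. Only num_classes (K) is chosen by hand;
    the entries start random and would be learned during training."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_classes)]


def sample_latent(posterior, rng):
    """Draw a class index z from a categorical distribution such as p(z|c,r)."""
    return rng.choices(range(len(posterior)), weights=posterior, k=1)[0]


K, DIM = 4, 8                        # hypothetical sizes
rng = random.Random(0)
table = init_latent_table(K, DIM, rng)

posterior = [0.1, 0.6, 0.2, 0.1]     # made-up output of the recognition network
z = sample_latent(posterior, rng)
z_vector = table[z]                  # what the generative model p(r|c,z) would receive
print(z, len(z_vector))
```

Nothing about the meaning of each class is written down in advance here: the table rows are ordinary parameters, and which responses each row comes to represent would be determined by training on the dataset.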
Thanks a lot!