talschuster / crosslingualcontextualemb
Cross-Lingual Alignment of Contextual Word Embeddings
License: MIT License
This is not a code issue, but I have a question about your paper.
Footnote 3 in Section 3.3 (Supervised Anchored Alignment) says: "In practice, we may have multiple target words for a single source word, and the extension is straight-forward."
My understanding is that D is a dictionary of (source, target) word pairs. For a single source word i that has multiple target words (e.g., 'bear' in the source language, English, can be translated to multiple target words), D would contain more than one entry of the form (i, D(i)). So the dictionary D would look like:
....
(source, target)
(bear, target word for 'animal bear')
(bear, target word for 'to have')
....
....
Say
a = anchor vector for the source word 'bear'
b = anchor vector for the target word 'animal bear'
c = anchor vector for the target word 'to have'
then we learn the mapping from both pairs (a, b) and (a, c).
Is my understanding correct?
What I find hard to understand intuitively is how this method can lead to the results shown in Figure 2, where distinct point clouds are aligned with their corresponding distinct words in the target language.
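If that reading is right, the "straightforward extension" is simply to repeat the source anchor once per translation and solve the usual orthogonal Procrustes problem over all pairs. Below is a minimal NumPy sketch of that idea with toy vectors; it is my own illustration of the technique, not the authors' code, and all names and dimensions are made up.

```python
import numpy as np

def procrustes_align(src_anchors, tgt_anchors):
    """Solve the orthogonal Procrustes problem: find an orthogonal W
    minimizing ||src @ W.T - tgt||_F over paired anchor rows."""
    # Cross-covariance of the paired anchor vectors.
    M = tgt_anchors.T @ src_anchors
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt  # orthogonal mapping W

# Toy anchors: the source word "bear" (vector a) is paired with two
# target anchors b and c, so its row appears twice on the source side.
rng = np.random.default_rng(0)
a = rng.normal(size=4)   # anchor for source "bear"
b = rng.normal(size=4)   # anchor for target "animal bear"
c = rng.normal(size=4)   # anchor for target "to have"

src = np.stack([a, a])   # source anchor repeated per translation
tgt = np.stack([b, c])   # one row per target word
W = procrustes_align(src, tgt)
```

Since W stays orthogonal, repeating a source anchor just weights that word more in the least-squares fit; it cannot map the single anchor a onto both b and c exactly, which may be why individual contexts (rather than the anchor alone) still separate in Figure 2.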
Thank you for the nice work. I read your paper and the embedding visualizations look very nice, but I can't find the code for generating them in this repo.
I'm also trying to visualize contextualized embeddings (for BERT). How did you produce those figures? Thank you so much for your help.
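In case it helps others with the same question: a common way to get such plots is to collect one contextual vector per token occurrence and project them to 2-D (PCA or t-SNE) before scattering. A hedged NumPy-only sketch, where the random `vectors` stand in for real ELMo/BERT token embeddings:

```python
import numpy as np

def pca_2d(vectors):
    """Project token vectors to 2-D with plain PCA (centered SVD)."""
    X = vectors - vectors.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T  # coordinates along the top two components

# Hypothetical setup: one contextual embedding per occurrence of a word
# (e.g. from ELMo or BERT); here random data stands in for them.
rng = np.random.default_rng(1)
vectors = rng.normal(size=(50, 128))
points = pca_2d(vectors)  # (50, 2), ready for a scatter plot

# With matplotlib one would then plot roughly:
#   plt.scatter(points[:, 0], points[:, 1])
```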
Is it possible to train an anchored LM for few-shot parsing?
I've trained ELMo weights for several other languages (e.g., Finnish, Chinese; I'm considering contributing them to your repo), and now I want to align them (including each LSTM layer) to the English space, as you did.
However, it seems you did not release the code for generating such an alignment matrix (like those below).
P.S.: as far as I can tell, you only released the code for generating the anchors, which has nothing to do with computing the alignment matrix.
If I've misunderstood the approach, please give me a hint and correct me.
Could I have your reply on this issue?
Thanks a lot.
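For reference, once an alignment matrix W exists (learned, e.g., with MUSE or anchored Procrustes), applying it to every LSTM layer's outputs is just a matrix product per layer. A hypothetical NumPy sketch; the shapes and function name are illustrative and not from this repo:

```python
import numpy as np

def align_layers(layer_embeddings, W):
    """Apply a (d, d) alignment matrix to each layer's token embeddings,
    mapping them into the shared (English) space."""
    return [emb @ W.T for emb in layer_embeddings]

# Hypothetical shapes: two LSTM layers, 7 tokens, 1024-dim ELMo vectors.
rng = np.random.default_rng(2)
layers = [rng.normal(size=(7, 1024)) for _ in range(2)]
W = np.eye(1024)  # stand-in for a learned alignment matrix
aligned = align_layers(layers, W)
```

With the identity stand-in the embeddings pass through unchanged; in practice W would come from solving the alignment on anchor pairs for each layer separately.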
Hi, when running demo.py, models/option262.json is not present by default. Are we supposed to download it separately?
Hi!
Could you please also share the alignment matrices for the unsupervised part of your work?
Thanks
Hi,
Thank you for this interesting work. I am currently extending this approach on top of mBERT and would need to generate the English mapping from scratch. Did you learn the W matrix for English by aligning English to itself using MUSE? Wouldn't that be redundant?
Thanks,
This is not an issue, but a question. Could you tell me more about how the anchors were generated for your experiments? The paper says anchors were computed from the evaluation set (which amounts to 5% of the total CoNLL data), but I'd like to know more details, such as exactly how many sentences were used for each language.
Thanks!
Hi,
I've read your article and the README here, and I'm a little confused about two statements:
We provide the alignment of the first LSTM output of ELMo to English
and also:
Note that our alignments were done on the second layer of ELMo
My understanding is that the "first LSTM output" is not the second layer. Am I misreading the quotes? Which layer do you align?
Thanks!
@TalSchuster I'm wondering how you generated the pretrained models.
I have the following questions:
Did you use https://github.com/allenai/bilm-tf with the default arguments?
How long did you train each model?
And how many GPUs, and of what kind, did you use?