lixin4ever / e2e-tbsa
[AAAI 2019] A Unified Model for Opinion Target Extraction and Target Sentiment Prediction
Home Page: https://arxiv.org/abs/1811.05082
No such file or directory: '/projdata9/info_fil/lixin/Research/OTE/embeddings/glove_840B_300d.txt'
Are all the results reported in the paper micro-F1?
Hi,
Your model returns two results, one for OTE and the other for TS. However, in your paper (https://arxiv.org/pdf/1811.05082.pdf) you reported a single overall result without specifying the task (e.g., for the laptop dataset, 61.27, 54.89 and 57.90 for precision, recall and F1-score, respectively). My question is: do these results come from the OTE or the TS output of your code?
Thank you
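In the dataset, punctuation marks seem to be attached directly to the preceding word, with no space in between, yet they are tagged as separate tokens. Does this have any effect?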
Hi @lixin4ever, I am getting the following error.
Can you please tell me what I should do in this case?
Hi,
Thank you for sharing your code with us. Could you please tell me what OTE and TS mean in the running results?
Exceed: test performance: ote: f1: 0.6512, ts: precision: 0.6156, recall: 0.4958, micro-f1: 0.5492
Does OTE mean the result of aspect target extraction, and TS the result of aspect sentiment classification?
So in the example above, the model got an F1-score of 65.12% for the OTE task and an F1-score of 54.92% for aspect sentiment classification?
Thanks in advance
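As a quick sanity check on the log line above: assuming the reported micro-F1 is the usual harmonic mean of the micro precision and recall printed next to it, the three TS numbers are mutually consistent. The helper name f1_score is hypothetical and is not taken from the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# TS precision and recall copied from the log line in this thread.
ts_f1 = f1_score(0.6156, 0.4958)
print(round(ts_f1, 4))  # 0.5492, matching the logged micro-f1
```

This only checks the arithmetic of the TS line; it says nothing about which of the two scores the paper reports.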
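I ran the program.
The output txt file contains four columns of results: ote_tag, ote_tag_gold, ts_tag, ts_tag_gold
Could you briefly explain these outputs, and which one should be taken as the final result?
Thanks!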
Hi, lixin, thanks for your great work. I have a question about the dataset. For example, in laptop14_test.text:
In fact I still use manyLegacy programs (Appleworks, FileMaker Pro, Quicken, Photoshop etc)!####In=O fact=O I=O still=O use=O manyASPECT0=O Appleworks=T-NEU ,=O FileMaker=T-NEU Pro=T-NEU ,=O Quicken=T-NEU ,=O Photoshop=T-NEU etc=O !=O
In laptop14_train.text:
With the macbook pro it comes with freesecuritysoftware to protect it from viruses and other intrusive things from downloads and internet surfing or emails.####With=O the=O macbook=O pro=O it=O comes=O with=O freeASPECT0=O to=O protect=O it=O from=O viruses=O and=O other=O intrusive=O things=O from=O downloads=O and=O internet=O surfing=O or=O emails=O .=O
The Apple applications (ex.iPhoto) are fun, easy, and really cool to use (unlike the competition)!####The=O Apple=T-POS applications=T-POS exASPECT1=O are=O fun=O ,=O easy=O ,=O and=O really=O cool=O to=O use=O unlike=O the=O competition=O !=O .=O
Why do 'manyLegacy', 'freesecuritysoftware' and 'ex.iPhoto' become 'manyASPECT0', 'freeASPECT0' and 'exASPECT1'? This happens many times in the Laptop and Restaurant datasets.
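To see how often this happens, the lines can be scanned for leftover placeholder tokens. The following is a minimal sketch, assuming every line has the form "sentence####word=tag word=tag ..." as in the examples above; the function names parse_line and find_placeholders are hypothetical, and the sample line is an abbreviated version of the training example.

```python
def parse_line(line):
    """Split one dataset line into the raw sentence and (word, tag) pairs."""
    sentence, tag_string = line.strip().split('####')
    pairs = []
    for item in tag_string.split():
        # Split on the LAST '=' so a token that is itself '=' still parses.
        word, tag = item.rsplit('=', 1)
        pairs.append((word, tag))
    return sentence, pairs


def find_placeholders(pairs):
    """Return tagged tokens that still contain an ASPECT<n> placeholder."""
    return [word for word, _ in pairs if 'ASPECT' in word]


# Abbreviated from the laptop14_train example quoted in this issue.
sample = "it comes with freesecuritysoftware####it=O comes=O with=O freeASPECT0=O"
sentence, pairs = parse_line(sample)
print(find_placeholders(pairs))  # ['freeASPECT0']
```

Running find_placeholders over every line of a file would list the affected tokens; it does not explain why the preprocessing left them there.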
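Hello, I downloaded the Twitter dataset from the paper Mitchell et al. published at EMNLP 2013, and found that it contains 3288 aspects labeled Person or Organization, whereas your paper reports 3199 aspects. I looked at your dataset: you do not use BIO tags but use T in place of them. So I would like to ask whether the tagging scheme is part of the reason your aspect count is lower than in the original paper, or whether there are other reasons.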
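One way a tagging scheme alone could lower the count, shown as a hedged illustration only: with a single T tag, two targets that sit next to each other can no longer be told apart, whereas B marks each new start under BIO. Whether this accounts for the 3288 vs. 3199 difference is not established in this thread. The function names and the toy tag sequence are hypothetical, and sentiment suffixes are left out for simplicity.

```python
def count_targets_bio(tags):
    """Under BIO, every B tag starts a new target."""
    return sum(1 for tag in tags if tag == 'B')


def count_targets_t(tags):
    """Under a T/O scheme, each maximal run of T tags is one target."""
    count, previous = 0, 'O'
    for tag in tags:
        if tag == 'T' and previous != 'T':
            count += 1
        previous = tag
    return count


# Toy sequence: two adjacent targets ("B I" then "B"), a gap, then one more.
bio_tags = ['B', 'I', 'B', 'O', 'B']
t_tags = ['O' if tag == 'O' else 'T' for tag in bio_tags]

print(count_targets_bio(bio_tags))  # 3
print(count_targets_t(t_tags))      # 2, the adjacent targets merged
```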
Hi @lixin4ever, I am getting the following error.
Can you please tell me what I should do in this case?
Hi @lixin4ever, Thanks for this amazing repo.
Can this be used to extract aspects and their sentiment from new raw text (within the laptop and restaurant domains)?
If so, please let me know how.
Thanks !
I get the following error when computing predictions on the laptop14 dataset with a model trained on the rest_total dataset. Why does loading the model parameters depend on the size of the training corpus vocabulary?
  File "_dynet.pyx", line 1461, in _dynet.ParameterCollection.populate
  File "_dynet.pyx", line 1516, in _dynet.ParameterCollection.populate_from_textfile
RuntimeError: Dimensions of lookup parameter /_0/_0 lookup up from file ({300,6465}) do not match parameters to be populated ({300,4738})
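The dimensions in the message suggest that the lookup table holds one 300-dimensional vector per vocabulary word, so a model built from a different vocabulary gets a table of a different shape. A minimal sketch of that mismatch in plain Python (not DyNet), assuming 6465 and 4738 are the vocabulary sizes seen at training and prediction time; all names here are hypothetical.

```python
EMBEDDING_DIM = 300


def lookup_table_shape(vocab):
    """Shape of a lookup table holding one EMBEDDING_DIM-sized vector per word."""
    return (EMBEDDING_DIM, len(vocab))


def check_loadable(saved_shape, current_shape):
    """Raise if saved parameters cannot be copied into the current model."""
    if saved_shape != current_shape:
        raise ValueError(
            "saved lookup table %s does not match current lookup table %s"
            % (saved_shape, current_shape)
        )


# Placeholder vocabularies with the two sizes from the error message.
training_vocab = ["w%d" % i for i in range(6465)]
prediction_vocab = ["w%d" % i for i in range(4738)]

saved = lookup_table_shape(training_vocab)
check_loadable(saved, lookup_table_shape(training_vocab))  # same vocabulary: passes
try:
    check_loadable(saved, lookup_table_shape(prediction_vocab))
except ValueError as err:
    print(err)  # the shapes differ, as in the reported RuntimeError
```

If this reading is right, rebuilding the model with the vocabulary used at training time before loading the parameters would keep the shapes equal; the thread does not confirm how the repository constructs its vocabulary.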
The default value of the tagging_scheme argument is 'BIO', which fails an assertion at line 212 of model.py.
'yelp_rest1': '/projdata9/info_fil/lixin/Research/yelp/yelp_vec_200_2_win5_sent.txt',
'yelp_rest2': '/projdata9/info_fil/lixin/Research/yelp/yelp_vec_200_2_new.txt',
'amazon_laptop': '/projdata9/info_fil/lixin/Resources/amazon_full/vectors/amazon_laptop_vec_200_5.txt'
Hello, I would like to ask about an error I get when running process_data: FileNotFoundError: [Errno 2] No such file or directory: './raw_data/laptop14_train.xml'.
The relevant code is:
def extract_text(dataset_name):
    """
    extract textual information from the xml file
    :param dataset_name: dataset name
    """
    delset = string.punctuation
    fpath = './raw_data/%s.xml' % dataset_name
    print("Process %s..." % fpath)
Aren't the datasets all in txt format? Should the path './raw_data/%s.xml' be changed to point to the txt files under data?
In Epoch 1 / 50 (current lr: 0.0010):
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    final_res_string, model_path = run(dataset=[train, val, test], model=model, params=args)
  File "main.py", line 51, in run
    loss, pred_ote_labels, pred_ts_labels = model.forward(x=train_set[i], is_train=True)
  File "/data/E2E-TBSA/model.py", line 302, in forward
    stm_lm_hs = [self.stm_lm(h) for h in ote_hs]
  File "/data/E2E-TBSA/model.py", line 302, in <listcomp>
    stm_lm_hs = [self.stm_lm(h) for h in ote_hs]
  File "/data/E2E-TBSA/model.py", line 139, in __call__
    Wx = self._W * x
  File "_dynet.pyx", line 1859, in _dynet.Expression.__mul__
NotImplementedError
Thanks
import numpy as np


def match_ot(gold_ote_sequence, pred_ote_sequence):
    # count predicted opinion targets that also appear in the gold sequence
    n_hit = 0
    for t in pred_ote_sequence:
        if t in gold_ote_sequence:
            n_hit += 1
    return n_hit


def match_ts(gold_ts_sequence, pred_ts_sequence):
    # positive, negative and neutral
    tag2tagid = {'POS': 0, 'NEG': 1, 'NEU': 2}
    hit_count, gold_count, pred_count = np.zeros(3), np.zeros(3), np.zeros(3)
    for t in gold_ts_sequence:
        #print(t)
        ts_tag = t[2]
        tid = tag2tagid[ts_tag]
        gold_count[tid] += 1
    for t in pred_ts_sequence:
        ts_tag = t[2]
        tid = tag2tagid[ts_tag]
        if t in gold_ts_sequence:
            hit_count[tid] += 1
        pred_count[tid] += 1
    return hit_count, gold_count, pred_count
Does using "if ... in" here fail to take into account whether the tag positions correspond?
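Whether position is taken into account depends on what each element of the sequences is. A minimal sketch, assuming each element is a (begin index, end index, sentiment) tuple for one sentence, which the t[2] lookup suggests but the snippet does not confirm. Under that assumption Python's "in" compares whole tuples, so the position is part of the match. The function name count_hits and the sample spans are hypothetical.

```python
def count_hits(gold_spans, pred_spans):
    """Count predicted spans that exactly equal some gold span."""
    return sum(1 for span in pred_spans if span in gold_spans)


# Hypothetical spans for one sentence: (begin index, end index, sentiment).
gold = [(0, 1, 'POS'), (5, 5, 'NEG')]

print(count_hits(gold, [(0, 1, 'POS')]))  # same position and sentiment
print(count_hits(gold, [(1, 2, 'POS')]))  # same sentiment, shifted position
print(count_hits(gold, [(5, 5, 'POS')]))  # same position, wrong sentiment
```

Under that assumption the first call counts one hit and the other two count none. The check only holds when gold and predicted spans come from the same sentence; if the elements were bare tags rather than tuples with indices, the membership test would indeed ignore position.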