
e2e-tbsa's People

Contributors

lixin4ever


e2e-tbsa's Issues

cannot find this dataset

No such file or directory: '/projdata9/info_fil/lixin/Research/OTE/embeddings/glove_840B_300d.txt'
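The path above is hard-coded to the author's machine. You need to download glove.840B.300d from the Stanford NLP site yourself and point the embedding path in the code at your local copy. For reference, a minimal loader sketch assuming the standard GloVe text format of one token followed by 300 floats per line (the function name and path here are illustrative, not from the repo):

import numpy as np

def load_glove(path, dim=300):
    """Return a dict mapping word -> np.ndarray of shape (dim,)."""
    vectors = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            # glove.840B.300d contains a few multi-word tokens, so take the
            # last `dim` fields as the vector and rejoin the rest as the word.
            word = ' '.join(parts[:-dim])
            vectors[word] = np.asarray(parts[-dim:], dtype=np.float32)
    return vectors

# glove = load_glove('./embeddings/glove_840B_300d.txt')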

Reported results vs. the original paper

Hi,

Your model returns two sets of results, one for OTE and the other for TS. However, in your paper (https://arxiv.org/pdf/1811.05082.pdf) you report a single set of results without specifying the task (e.g., for the laptop dataset, 61.27, 54.89, and 57.90 for precision, recall, and F1-score, respectively). My question is: are these the OTE or the TS results from your code?

Thank you

Prediction values

Hi,

Thank you for sharing your code with us. Could you please tell me what OTE and TS mean in the running results?

Exceed: test performance: ote: f1: 0.6512, ts: precision: 0.6156, recall: 0.4958, micro-f1: 0.5492

Does OTE mean the results of aspect target extraction, and TS the results of aspect sentiment classification?
So in the example above, did the model achieve an F1-score of 65.12% on the OTE task and a micro-F1 of 54.92% on aspect sentiment classification?

Thanks in advance
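As a quick arithmetic check on the log line above, the TS micro-F1 is the harmonic mean of the printed precision and recall:

# micro-F1 = 2PR / (P + R), using the precision and recall from the log.
p, r = 0.6156, 0.4958
print(round(2 * p * r / (p + r), 4))  # 0.5492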

A question about the output

I ran the program, and the output txt file contains four columns of results: ote_tag, ote_tag_gold, ts_tag, ts_tag_gold.
Could you briefly explain these outputs, and which one should be taken as the final result?
Thanks!
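A minimal sketch for inspecting such an output file, assuming one token per row with the four whitespace-separated columns named above (the file name and delimiter are assumptions):

# Tally token-level agreement between predicted and gold TS tags.
hits, total = 0, 0
with open('output.txt', encoding='utf-8') as f:
    for line in f:
        cols = line.split()
        if len(cols) < 4:
            continue  # skip blank or malformed rows
        ote_tag, ote_gold, ts_tag, ts_gold = cols[:4]
        hits += int(ts_tag == ts_gold)
        total += 1
print('token-level TS accuracy: %.4f' % (hits / max(total, 1)))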

Why do words like xxxASPECTxxx appear in sentences?

Hi lixin, thanks for your great work. I have a question about the dataset. For example, in laptop14_test.txt:

In fact I still use manyLegacy programs (Appleworks, FileMaker Pro, Quicken, Photoshop etc)!####In=O fact=O I=O still=O use=O manyASPECT0=O Appleworks=T-NEU ,=O FileMaker=T-NEU Pro=T-NEU ,=O Quicken=T-NEU ,=O Photoshop=T-NEU etc=O !=O

In laptop14_train.txt:

With the macbook pro it comes with freesecuritysoftware to protect it from viruses and other intrusive things from downloads and internet surfing or emails.####With=O the=O macbook=O pro=O it=O comes=O with=O freeASPECT0=O to=O protect=O it=O from=O viruses=O and=O other=O intrusive=O things=O from=O downloads=O and=O internet=O surfing=O or=O emails=O .=O

The Apple applications (ex.iPhoto) are fun, easy, and really cool to use (unlike the competition)!####The=O Apple=T-POS applications=T-POS exASPECT1=O are=O fun=O ,=O easy=O ,=O and=O really=O cool=O to=O use=O unlike=O the=O competition=O !=O .=O

Why do 'manyLegacy', 'freesecuritysoftware', and 'ex.iPhoto' become 'manyASPECT0', 'freeASPECT0', and 'exASPECT1'? This happens many times in the Laptop and Restaurant datasets.
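One plausible (unconfirmed) explanation: the raw SemEval text is missing some spaces (e.g. 'manyLegacy', 'freesecuritysoftware'), so if the preprocessing substitutes each annotated aspect substring with a placeholder such as ASPECT0 via a plain string replace, the placeholder fuses with the neighboring characters. A minimal illustration of that failure mode; the aspect string and placeholder name are hypothetical:

import re

text = "In fact I still use manyLegacy programs"
aspect = "Legacy programs"  # hypothetical annotated span

# A plain substring replace fuses the placeholder with 'many' because the
# space before 'Legacy' is missing in the raw text:
print(text.replace(aspect, "ASPECT0"))
# -> In fact I still use manyASPECT0

# A boundary-aware substitution refuses to match inside a token:
print(re.sub(r"\b%s\b" % re.escape(aspect), "ASPECT0", text))
# -> unchanged, since there is no word boundary between 'many' and 'Legacy'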

A question about the Twitter dataset

Hello, I downloaded the Twitter dataset from the Mitchell et al. paper published at EMNLP 2013, and found that it contains 3288 aspects labeled Person or Organization, whereas your paper reports a total of 3199 aspects. Looking at your data, you do not use BIO tags; you use a single T tag instead. So I would like to ask: is the smaller aspect count in your paper partly due to this difference in tagging, or is there some other reason?
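One thing worth noting about the scheme difference: collapsing BIO into a single T tag merges two adjacent entities into one span, because a B tag opens a new entity even immediately after another entity, while consecutive T tags read as one span. This alone can lower the aspect count. A small illustration with hypothetical tag sequences:

def count_spans_bio(tags):
    # every B- tag opens a new span
    return sum(t.startswith('B') for t in tags)

def count_spans_t(tags):
    # a span is a maximal run of consecutive T tags
    return sum(t == 'T' and (i == 0 or tags[i - 1] != 'T')
               for i, t in enumerate(tags))

bio = ['B-PER', 'I-PER', 'B-PER']  # two adjacent entities
t   = ['T', 'T', 'T']              # the same tokens after collapsing
print(count_spans_bio(bio), count_spans_t(t))  # 2 1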

Trained model raises an error on a modified test set

Hello, I trained a model on the laptop14 dataset and wanted to see how it performs on my own data. I annotated a few sentences following the format of the test set, but testing with the trained model raises the error below:
(screenshot of the error omitted)
I don't think this is a problem with my data: I also deleted a few lines from the original test set and reran, and it failed the same way. Does the number of test sentences need to be configured somewhere?
It then occurred to me to replace rather than delete, so I substituted the last 6 sentences of the original test set with my own annotated data; running that raises the following error:
(screenshot of the error omitted)
What could be causing this?

error while using a trained model on different datasets

I am facing the following error while computing predictions on the laptop14 dataset with a model trained on the rest_total dataset. Why does loading the model parameters depend on the size of the training-corpus vocabulary?

File "_dynet.pyx", line 1461, in _dynet.ParameterCollection.populate File "_dynet.pyx", line 1516, in _dynet.ParameterCollection.populate_from_textfile RuntimeError: Dimensions of lookup parameter /_0/_0 lookup up from file ({300,6465}) do not match parameters to be populated ({300,4738})

I cannot find these datasets; please give me some info about them

'yelp_rest1': '/projdata9/info_fil/lixin/Research/yelp/yelp_vec_200_2_win5_sent.txt',
'yelp_rest2': '/projdata9/info_fil/lixin/Research/yelp/yelp_vec_200_2_new.txt',
'amazon_laptop': '/projdata9/info_fil/lixin/Resources/amazon_full/vectors/amazon_laptop_vec_200_5.txt'
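These appear to be domain word vectors the author trained on Yelp and Amazon corpora; they are not shipped with the repo. If you cannot obtain them, one option is to train a substitute yourself. A minimal gensim sketch, where the corpus file and the hyperparameters (dimension 200, window 5, min count 2, guessed from the filenames) are assumptions:

from gensim.models import Word2Vec

# Assumes one tokenized sentence per line in a local review corpus.
sentences = [line.split() for line in open('yelp_reviews.txt', encoding='utf-8')]

model = Word2Vec(sentences, vector_size=200, window=5, min_count=2, workers=4)
model.wv.save_word2vec_format('yelp_vec_200_2_win5_sent.txt', binary=False)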

No such file or directory: './raw_data/laptop14_train.xml'

Hello, when running the process_data script I get FileNotFoundError: [Errno 2] No such file or directory: './raw_data/laptop14_train.xml'.
The relevant code is:
import string

def extract_text(dataset_name):
    """
    extract textual information from the xml file
    :param dataset_name: dataset name
    """
    delset = string.punctuation  # punctuation characters to strip later
    fpath = './raw_data/%s.xml' % dataset_name
    print("Process %s..." % fpath)
Aren't the datasets all in txt format? Should the path './raw_data/%s.xml' be changed to point at the txt files under data/?
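For what it's worth, the files this function looks for appear to be the raw SemEval-2014 XML distributions, which are obtained separately, rather than the processed txt files under data/. Assuming the standard SemEval ABSA layout, a sketch of what extract_text presumably parses:

import xml.etree.ElementTree as ET

# Standard SemEval-2014 ABSA layout (assumed):
# <sentences><sentence><text>...</text>
#   <aspectTerms><aspectTerm term="..." polarity="..." from="..." to="..."/></aspectTerms>
# </sentence></sentences>
tree = ET.parse('./raw_data/laptop14_train.xml')
for sent in tree.getroot().iter('sentence'):
    text = sent.find('text').text
    aspects = [(a.get('term'), a.get('polarity')) for a in sent.iter('aspectTerm')]
    print(text, aspects)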

Running main.py raises NotImplementedError

In Epoch 1 / 50 (current lr: 0.0010):
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    final_res_string, model_path = run(dataset=[train, val, test], model=model, params=args)
  File "main.py", line 51, in run
    loss, pred_ote_labels, pred_ts_labels = model.forward(x=train_set[i], is_train=True)
  File "/data/E2E-TBSA/model.py", line 302, in forward
    stm_lm_hs = [self.stm_lm(h) for h in ote_hs]
  File "/data/E2E-TBSA/model.py", line 302, in <listcomp>
    stm_lm_hs = [self.stm_lm(h) for h in ote_hs]
  File "/data/E2E-TBSA/model.py", line 139, in __call__
    Wx = self._W * x
  File "_dynet.pyx", line 1859, in _dynet.Expression.__mul__
NotImplementedError

Thanks
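A common cause of NotImplementedError in Expression.__mul__ is a DyNet version mismatch: older DyNet builds do not implicitly convert a Parameters object (self._W here) to an Expression, and operands of unexpected types are rejected. Whether that is the cause for this repo is unconfirmed, but a sketch of the explicit-conversion workaround:

import dynet as dy

pc = dy.ParameterCollection()
W = pc.add_parameters((50, 300))
dy.renew_cg()
x = dy.inputVector([0.0] * 300)

# Explicitly convert the Parameters object before multiplying; on recent
# DyNet versions `W * x` also works via implicit conversion.
Wx = dy.parameter(W) * x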

A question about the tp / hit computation

import numpy as np


def match_ot(gold_ote_sequence, pred_ote_sequence):
    """Count predicted opinion-target spans that exactly match a gold span."""
    n_hit = 0
    for t in pred_ote_sequence:
        if t in gold_ote_sequence:
            n_hit += 1
    return n_hit


def match_ts(gold_ts_sequence, pred_ts_sequence):
    """Count hits, gold spans, and predicted spans per sentiment class."""
    # positive, negative and neutral
    tag2tagid = {'POS': 0, 'NEG': 1, 'NEU': 2}
    hit_count, gold_count, pred_count = np.zeros(3), np.zeros(3), np.zeros(3)
    for t in gold_ts_sequence:
        ts_tag = t[2]  # sentiment tag, the third field of the span triple
        tid = tag2tagid[ts_tag]
        gold_count[tid] += 1
    for t in pred_ts_sequence:
        ts_tag = t[2]
        tid = tag2tagid[ts_tag]
        if t in gold_ts_sequence:
            hit_count[tid] += 1
        pred_count[tid] += 1
    return hit_count, gold_count, pred_count

Doesn't the use of 'if t in ...' here fail to account for the positions of the tags?
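Note that, as the indexing t[2] suggests, each element of these sequences is a span tuple ((begin, end) for OTE, (begin, end, sentiment) for TS), so the membership test compares positions as well as tags. A quick illustration:

gold = [(3, 4, 'POS')]

print((3, 4, 'POS') in gold)  # True: same span, same sentiment -> a hit
print((7, 8, 'POS') in gold)  # False: same sentiment, wrong span -> no hit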
