For example, how should the multi-label data from the AI Challenger 2018 sentiment analysis task be processed and fed into the model?

[Question] Do labels have to be of type str? Can this be used for sentiment analysis? (kashgari, CLOSED)

brikerman commented on May 18, 2024

[Question] Do labels have to be of type str? Can this be used for sentiment analysis?

from kashgari.

Comments (15)

BrikerMan commented on May 18, 2024

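Multi-label classification models have not been added yet. If you can provide a link to the detailed dataset, I will consider adding the relevant task.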

Owenscu commented on May 18, 2024

@BrikerMan Here is the dataset link: https://pan.baidu.com/s/1spbM8QuyjRmgCSa9i-P7Mw
Extraction code: 679y
It is only valid for 7 days.

BrikerMan commented on May 18, 2024

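This problem is not multi-label along a single dimension, so I think it needs dedicated modeling. What I can support here is simple multi-label, multi-class classification of the form [(1, 2), (3, 4), (5,)].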

BrikerMan commented on May 18, 2024

Would this kind of multi-class classification meet your needs? Here y is made up of multiple labels.

alexwwang commented on May 18, 2024

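A multi-label classification problem can be solved by training a model with sigmoid as the output activation function and binary cross-entropy as the loss function. The output is an n-dimensional vector of independent per-label probabilities, one for each of the n possible labels; thresholding it gives a multi-hot vector (not a one-hot vector, since several entries can be 1 at once).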
That's it.
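
As a rough illustration of the sigmoid plus binary cross-entropy idea, here is a minimal NumPy sketch. The logits and targets are made-up values, and the helper names `sigmoid` and `binary_cross_entropy` are defined here for illustration; this is not Kashgari code.

```python
import numpy as np

def sigmoid(logits: np.ndarray) -> np.ndarray:
    """Element-wise logistic function; each output is an independent probability."""
    return 1.0 / (1.0 + np.exp(-logits))

def binary_cross_entropy(y_true: np.ndarray, y_prob: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy over all samples and all labels."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(losses.mean())

# Two samples, three candidate labels (made-up numbers).
logits = np.array([[2.0, -1.0, 0.5],
                   [-3.0, 4.0, -0.2]])
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]], dtype=float)

probs = sigmoid(logits)                   # rows do not need to sum to 1
loss = binary_cross_entropy(y_true, probs)
predicted = (probs >= 0.5).astype(int)    # threshold each label independently
```

Unlike softmax, the per-label probabilities are not normalized against each other, which is what lets a sample carry several labels at once.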

BrikerMan commented on May 18, 2024

@alexwwang @Owenscu Please check out #29. Does it look good to you? I am planning to add multi_label classification, but it seems we would have to change all the classification models to support this feature.
The dataset is from http://tcci.ccf.org.cn/conference/2018/taskdata.php (task 1).

alexwwang commented on May 18, 2024

Is it possible to wrap a given NN model in a multi-label task model? I mean simply replacing the last output layer to fit different kinds of classification tasks, and letting users decide which one to use.

Meanwhile, I am considering allowing users to set hyper_parameters when initializing a model class in the model zoo.

I think these two aspects could be combined, so that both flexibility and conciseness are kept.

BrikerMan commented on May 18, 2024

I think your solution is better than mine. Please make the changes and submit a pull request~ @alexwwang

alexwwang commented on May 18, 2024

OK, but I'm afraid it will take some time. I'm still fighting a shape bug.

BrikerMan commented on May 18, 2024

Take your time, no need to rush.

alexwwang commented on May 18, 2024

#34 This commit fulfills this need. By passing sigmoid and binary_crossentropy as hyper_parameters to the init function of a classification model, you can get a model that supports multi-label vector output.
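
The override idea can be sketched roughly as follows. The dictionary layout, the key names (`output_layer`, `compile`), and the helper `merge_hyper_parameters` are all hypothetical and chosen only for illustration; they are not taken from Kashgari's source or from #34.

```python
from copy import deepcopy

# Hypothetical defaults for a single-label classifier; the key names are
# illustrative assumptions, not Kashgari's actual hyper-parameter schema.
DEFAULT_HYPER_PARAMETERS = {
    "output_layer": {"activation": "softmax"},
    "compile": {"loss": "categorical_crossentropy", "optimizer": "adam"},
}

def merge_hyper_parameters(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` merged in, one nesting level deep."""
    merged = deepcopy(defaults)
    for section, values in overrides.items():
        merged.setdefault(section, {}).update(values)
    return merged

# Switch only the output activation and the loss; everything else keeps its default.
multi_label_params = merge_hyper_parameters(
    DEFAULT_HYPER_PARAMETERS,
    {
        "output_layer": {"activation": "sigmoid"},
        "compile": {"loss": "binary_crossentropy"},
    },
)
```

The point of the design is that the body of the network stays untouched: only the final activation and the loss differ between the single-label and multi-label cases.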

BrikerMan commented on May 18, 2024

@alexwwang I think multi-label is not finished yet. We still need to change the code here

Kashgari/kashgari/tasks/classification/base_model.py

Line 110 in 692e2c2

padded_y = to_categorical(tokenized_y,

to process multi_label.
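
Why one-hot encoding alone falls short for multi-label targets can be sketched with plain NumPy. Here `one_hot` stands in for what a one-hot encoder such as `to_categorical` produces from a single class index per sample, and `multi_hot` is a hypothetical helper written for this illustration; neither is the code from the referenced file.

```python
import numpy as np

def one_hot(class_indices, num_classes):
    """One class index per sample -> one row containing a single 1."""
    return np.eye(num_classes, dtype=int)[class_indices]

def multi_hot(label_index_sets, num_classes):
    """Several class indices per sample -> one row with a 1 at every listed index."""
    encoded = np.zeros((len(label_index_sets), num_classes), dtype=int)
    for row, indices in enumerate(label_index_sets):
        encoded[row, list(indices)] = 1
    return encoded

# Single-label targets: exactly one class per sample.
single = one_hot([0, 2], num_classes=4)

# Multi-label targets: a variable number of classes per sample.
multi = multi_hot([(0, 1), (2, 3), (1,)], num_classes=4)
```

Each row of `single` has exactly one 1, while rows of `multi` can have several, which is the shape of target that a sigmoid output layer expects.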

alexwwang commented on May 18, 2024

@BrikerMan Yes, the data padding work remains. The most direct approach, I think, may be to use sklearn.preprocessing.MultiLabelBinarizer. This tool can also handle multi-class classification as a special case of multi-label classification. Alternatively, we could add another switch for the multi-label y vector and leave the current code unchanged.
Either way, there seems to be no shortcut in data preprocessing and prediction processing if we want to keep the predicted confidence/probability of each label/class.
What do you think?
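
A minimal sketch of that approach with `sklearn.preprocessing.MultiLabelBinarizer` might look like this. The label names and the probability values are made up, and `decode` is a hypothetical helper for this illustration; it is not the implementation in #29.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative labels only; these are not taken from the dataset in the thread.
y = [["food", "price"], ["service"], ["food", "service"]]

binarizer = MultiLabelBinarizer()
y_encoded = binarizer.fit_transform(y)   # shape (n_samples, n_labels), entries 0 or 1
labels = list(binarizer.classes_)        # classes are stored in sorted order

# Multi-class data is the special case where every sample has exactly one label.
y_single = MultiLabelBinarizer().fit_transform([["neg"], ["pos"]])

# Made-up sigmoid outputs standing in for model predictions.
probs = np.array([[0.91, 0.40, 0.12],
                  [0.08, 0.55, 0.76]])

def decode(probs, classes, threshold=0.5):
    """Keep each label whose probability clears the threshold, along with that probability."""
    return [
        {label: float(p) for label, p in zip(classes, row) if p >= threshold}
        for row in probs
    ]

decoded = decode(probs, labels)
```

Decoding by hand like this, rather than calling `inverse_transform` on thresholded output, is one way to keep the per-label confidence that the comment above asks for.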

BrikerMan commented on May 18, 2024

Please check out #29. I have used sklearn.preprocessing.MultiLabelBinarizer and rewritten the data processing and predict functions.

BrikerMan commented on May 18, 2024

Kashgari now supports multi-label classification. Change each y from a string to a list of strings, then set multi_label=True when initializing the model class.
