Multi-label classification is one of the standard tasks in text analytics. The objective is to perform eXtreme multi-label classification (XMLC) on two datasets (https://www.kaggle.com/hsrobo/titlebased-semantic-subject-indexing): EconBiz (from ZBW - Leibniz Information Centre for Economics, July 2017) and PubMed (from the 5th BioASQ challenge on large-scale semantic subject indexing of biomedical articles). In an XMLC setting, k labels from a large pool of n labels are assigned to each data object. The classification task is extreme in two senses. First, the number of labels n is very large, in the hundreds or thousands. Second, only very few labels k are assigned to each object, i.e. k << n. False positives are therefore likely.
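The effect of k << n on evaluation can be illustrated with a small sketch. It is not taken from the notebook: the label pool size, the label indices, and the helper names `hamming_loss` and `sample_f1` are all made up for illustration, and only NumPy is assumed. It shows why Hamming loss and F1 (both listed among the repository topics) can tell very different stories when the label matrix is sparse.

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots that are predicted wrongly."""
    return float(np.mean(y_true != y_pred))

def sample_f1(y_true, y_pred):
    """F1 computed per sample, then averaged over samples."""
    tp = np.sum(y_true & y_pred, axis=1)
    denom = y_true.sum(axis=1) + y_pred.sum(axis=1)
    # A sample with no true and no predicted labels counts as a perfect match.
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 1.0)
    return float(np.mean(f1))

# Hypothetical pool of n = 1000 labels; each document has only k = 2 or 3.
n_labels = 1000
y_true = np.zeros((2, n_labels), dtype=bool)
y_true[0, [3, 17, 256]] = True
y_true[1, [42, 999]] = True

y_pred = np.zeros_like(y_true)
y_pred[0, [3, 17, 500, 501]] = True  # 2 hits, 2 false positives, 1 miss
y_pred[1, [42, 999]] = True          # exact match

h = hamming_loss(y_true, y_pred)
f = sample_f1(y_true, y_pred)
print(f"Hamming loss: {h:.4f}")       # 3 wrong slots out of 2000
print(f"Sample-averaged F1: {f:.4f}")
```

Because almost every slot in the label matrix is a true negative, the Hamming loss stays close to zero even though the first document has two false positives and one missed label. The sample-averaged F1 ignores true negatives and so reflects those errors much more clearly.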

Jupyter Notebook 100.00%

data-preprocessing evaluation f1-score feature-extraction hamming-loss load-data model-building

divya171997 / text-analytics-extreme-multi-label-classification-xmlc
