divya171997 / text-analytics-extreme-multi-label-classification-xmlc
Multi-label classification is one of the standard tasks in text analytics. The objective is to perform eXtreme multi-label classification (XMLC) on two datasets (https://www.kaggle.com/hsrobo/titlebased-semantic-subject-indexing): EconBiz (ZBW - Leibniz Information Centre for Economics, from July 2017) and PubMed (5th BioASQ challenge on large-scale semantic subject indexing of biomedical articles). In an XMLC setting, k labels from a large pool of n labels are to be assigned to each data object. The classification task is extreme in two senses. First, the number of labels n is very large, in the hundreds or thousands. Second, only very few labels k are assigned to each object, i.e. k << n. Thus, false positives are likely.
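As a rough illustration of the setting described above, the sketch below encodes one document's labels as a multi-hot vector over the full label pool and shows how a few spurious predictions lower precision when k << n. The label names, the pool size of 1000, and the predicted labels are made-up placeholders, not values from EconBiz or PubMed, and the helper functions are hypothetical rather than part of this repository.

```python
def multi_hot(labels, vocabulary):
    """Encode a set of labels as a 0/1 vector over the full label pool."""
    assigned = set(labels)
    return [1 if label in assigned else 0 for label in vocabulary]


def precision_recall(true_vec, pred_vec):
    """Per-document precision and recall from two multi-hot vectors."""
    tp = sum(1 for t, p in zip(true_vec, pred_vec) if t and p)
    fp = sum(1 for t, p in zip(true_vec, pred_vec) if not t and p)
    fn = sum(1 for t, p in zip(true_vec, pred_vec) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Assumed label pool of n = 1000 labels (placeholder names).
vocabulary = [f"label_{i}" for i in range(1000)]

# One document with k = 3 true labels, so k << n.
true_labels = ["label_3", "label_7", "label_42"]

# A hypothetical prediction: two correct labels plus three spurious ones.
predicted_labels = ["label_3", "label_42", "label_100", "label_200", "label_300"]

y_true = multi_hot(true_labels, vocabulary)
y_pred = multi_hot(predicted_labels, vocabulary)

# Fraction of the label pool that is actually assigned to this document.
density = sum(y_true) / len(vocabulary)
precision, recall = precision_recall(y_true, y_pred)

print(f"label density k/n = {density}")
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Because almost every entry of the true vector is zero, any label predicted outside the small true set counts as a false positive, which is why precision-oriented, per-document metrics are a common choice in this setting.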