Giter VIP home page Giter VIP logo

smpcup2017_elp's Introduction

SMPCUP2017

Team: ELP

Organization:University of International Relations

Members:Lu Junru; Chen Le; Meng Kongming; Wang Fengyi; Xiangjun; Zhou Kaimin; Dong Zhenyuan; Shan jiawei; Lian Lingchen

Introduction:

this repository is established to review codes and documents for our team ELP in SMPCUP2017, which is a user profiling contest based on massive data provided by CSDN, including text, relations and interactions of users.

In SMPCUP2017, every team is requested to work on three specific tasks:

  • TASK1: Given a number of user documents (blogs or posts), generate 3 most appropriate keywords for each document. The generated keywords must appear in the document.
  • TASK2: Given the user's document information (blog or post) and behavior data (browsing, comment, collection, forwarding, point-of-thumb, step-by-step, private, etc.) for each user, mark the three most relevant interests for each user. The label space is given by CSDN.
  • TASK3: Given a number of users in a period of time (at least 1 year) document information (blog or post) and behavioral data (browsing, commentary, collection, forwarding, dating, attention, private messages, etc.), predict each user in the future over a period of time (half a year to 1 year) growth value. User growth is based on the user's overall performance scoring income, but will not publish the specific score criteria. The growth value will be normalized to the [0, 1] interval, where the value is 0 for user churn.

More detailed imformation could be browsed on the page: https://biendata.com/competition/smpcup2017/

Baseline Models:

Final Models:

  • TASK1: S-TFIDF, a promoted model based on TFIDF and Textrank.
  • TASK2: S-TFIDF/DocumentEmbedding-SVC-Stacking(SDSS), a stacking model that using S-TFIDF and DocumentEmbedding as first layer and using SVC as second layer.
  • TASK3: PAR/GDR-NuSVR-Stacking(PGNS), a stacking model that using PassiveActiveRegressor and GrandientBoostingRegressor as first layer and using NuSVR as second layer.

Performance:

TASK1 TRAIN VALID TEST
TFIDF 0.56 0.52 None
S-TFIDF 0.61 0.56 0.56
TASK2 TRAIN VALID TEST
W-BAG None 0.40 0.373
SDSS None 0.39 0.378
TASK3 TRAIN VALID TEST
BPXG 0.54 0.59 None
PGNS 0.765 0.73 0.75

Environment:

  • Task1: python 2.7
  • Task2: python 3.0
  • Task3: python 2.7

More question:

[email protected]

Please give credit to the original author when you use it elsewhere

smpcup2017_elp's People

Contributors

lujunru avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.