Giter VIP home page Giter VIP logo

nlpcc-2016-wordsim's Introduction

NLPCC-2016-WordSim

Code for NLPCC2016 Chinese Word Similarity Task

This is our solution for NLPCC2016 shared task: Chinese Word Similarity Measurement.

This task provides a dataset of Chinese word similarity to evaluate and compare different semantic measures of lexical similarity, including 500 word pairs and their similarity scores.

We proposes a novel framework for measuring the Chinese word similarity by combining word embedding and Tongyiic Cilin. We also utilize retrieval techniques to extend the contexts of word pairs and calculate the similarity scores to weakly supervise the selection of a better result.

The official results show that our solution of the team "DLUT_NLPer" achieves the 2nd place in the Chinese Lexical Similarity Computation (CLSC) shared task with 0.457 Spearman rank correlation coeffcient.

After the submission, we boost the embedding model by merging an English model into the Chinese one and learning the co-occurrence sequence via LSTM networks. Our final result is 0.541(Spearman rank correlation coeffcient), which outperform the state-of-the-art performance to the best of our knowledge.

Team members: Jiahuan Pei, Cong Zhang

Please feel free to send us emails ([email protected]) if you have any problems.

[1] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
[2] Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural computation, 12(10), 2451-2471.
[3] Graves, Alex. Supervised sequence labelling with recurrent neural networks. Vol. 385. Springer, 2012.
[4] Bastien, Frédéric, Lamblin, Pascal, Pascanu, Razvan, Bergstra, James, Goodfellow, Ian, Bergeron, Arnaud, Bouchard, Nicolas, and Bengio, Yoshua. Theano: new features and speed improvements. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2012.
[5] Bergstra, James, Breuleux, Olivier, Bastien, Frédéric, Lamblin, Pascal, Pascanu, Razvan, Desjardins, Guillaume, Turian, Joseph, Warde-Farley, David, and Bengio, Yoshua. Theano: a CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), June 2010.

nlpcc-2016-wordsim's People

Contributors

zeco-01 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.