
medium-claps-rnn's Introduction


HU Kaggle Competition – Medium Claps Prediction

For the Advanced Data Analytics for Management Support course at the Chair of Information Systems, Humboldt-Universität zu Berlin, students took part in an in-class data science competition on Kaggle. The goal was to predict the number of claps an article had received on the publishing platform Medium.com, using roughly 279.5k rows of labelled data. Features were limited to basic details about each article: publish date, author, publisher, word count and, most importantly, the article and teaser text. The regression task was tackled with a recurrent neural network (specifically a GRU), which processes text as a sequence and can therefore take the context in which words appear into account.
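To illustrate why a GRU suits this task, the following is a minimal pure-Python sketch of a GRU forward pass with a linear regression head. It is not the notebook's model: the framework, layer sizes and training setup used in 'medium_net.ipynb' are not described here, and every name below (`init_gru`, `gru_step`, `predict_claps`) is hypothetical. The weights are random and untrained, so the sketch only shows how the hidden state carries information from earlier tokens forward.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_gru(input_dim, hidden_dim, seed=0):
    """Create small random weights (hypothetical, untrained)."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = rand_matrix(hidden_dim, input_dim)
        params["U" + gate] = rand_matrix(hidden_dim, hidden_dim)
        params["b" + gate] = [0.0] * hidden_dim
    # Linear head mapping the final hidden state to one number (the clap count).
    params["w_out"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
    params["b_out"] = 0.0
    return params


def _gate(params, name, x, h, activation):
    """Compute activation(W x + U h + b) for one gate."""
    pre = zip(matvec(params["W" + name], x),
              matvec(params["U" + name], h),
              params["b" + name])
    return [activation(a + b + c) for a, b, c in pre]


def gru_step(params, x, h):
    """Advance the hidden state h by one token embedding x."""
    z = _gate(params, "z", x, h, sigmoid)        # how much to update
    r = _gate(params, "r", x, h, sigmoid)        # how much history to expose
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = _gate(params, "h", x, reset_h, math.tanh)
    # Convention used here: z close to 1 means "take the new candidate".
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def predict_claps(params, sequence):
    """Run the GRU over a sequence of embeddings and regress on the last state."""
    h = [0.0] * len(params["bz"])
    for x in sequence:
        h = gru_step(params, x, h)
    return sum(w * hi for w, hi in zip(params["w_out"], h)) + params["b_out"]
```

Because each step depends on the previous hidden state, feeding the same token embeddings in a different order generally gives a different prediction, which is the sequential sensitivity that a bag-of-words model lacks. In practice one would use a deep learning framework and train the weights against the competition metric.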

To prepare the text as input for the RNN, natural language processing methods were used to normalise the heterogeneous structures and idiosyncrasies of the articles, such as recurring punctuation errors, HTML tags, and words or characters from other languages and alphabets. In testing and in Kaggle submissions, prediction error was measured by mean squared error (MSE).
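The two ideas in this paragraph, normalising raw article text and scoring predictions by MSE, can be sketched with the Python standard library alone. This is an assumed, simplified pipeline rather than the notebook's actual preprocessing: the function names `clean_text` and `mean_squared_error` and the specific cleaning rules are illustrative, and the sketch does not attempt to filter out other languages or alphabets.

```python
import html
import re
import unicodedata

TAG_RE = re.compile(r"<[^>]+>")                # anything that looks like an HTML tag
REPEAT_PUNCT_RE = re.compile(r"([!?.,;:])\1+")  # the same punctuation mark repeated
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(raw):
    """Return a lowercased, tag-free, whitespace-normalised version of raw."""
    text = TAG_RE.sub(" ", raw)                 # drop tags, keep their inner text
    text = html.unescape(text)                  # "&amp;" -> "&", "&nbsp;" -> U+00A0
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    text = REPEAT_PUNCT_RE.sub(r"\1", text)     # "!!!" -> "!" (also "..." -> ".")
    text = WHITESPACE_RE.sub(" ", text).strip()
    return text.lower()


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted clap counts."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, `clean_text("<p>Hello&nbsp;World!!!</p>")` returns `"hello world!"`. Because MSE squares each error, a few badly mispredicted viral articles can dominate the score.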

The final code is in the file 'medium_net.ipynb'. All code was written in Python in a Jupyter Notebook running on Google Colab, which offers free, base-level cloud computing resources (including GPU access).

My efforts earned me third place out of 20 in the class.

Kaggle Competition

medium-claps-rnn's People

Contributors

alextruesdale
