Metisa - Data Science Challenge

Background:

In this challenge, you will build a recommender that suggests websites a user should visit based on their browsing history.

You will be working with the 'Anonymous Microsoft Web Data Data Set' from the UCI Machine Learning Repository.

The data is in an ASCII-based sparse-data format called "DST". Each line of the data file starts with a letter that indicates the line's type. The three line types of interest are Attribute, Case and Vote. Each Attribute is a website, each Case is a user, and each Vote records an Attribute that the user visited. For more details on the structure of the data set, please read the data description file.
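To make the format concrete, here is a minimal parsing sketch in Python. It assumes that Attribute lines look like `A,<id>,<ignored>,"<title>","<url>"`, Case lines like `C,"<id>",<id>`, and Vote lines like `V,<attribute id>,<ignored>`, and that each Vote belongs to the most recent Case line. Please check these assumptions against the data description file; the function name `parse_dst` and the sample lines are illustrative only.

```python
import csv

def parse_dst(lines):
    """Parse DST-formatted lines into (attributes, visits).

    attributes: dict mapping attribute ID -> (title, url)
    visits:     dict mapping case (user) ID -> set of visited attribute IDs
    Assumed layout (verify against the data description file):
      A,<id>,<ignored>,"<title>","<url>"
      C,"<id>",<id>
      V,<attribute id>,<ignored>
    """
    attributes = {}
    visits = {}
    current_case = None
    for row in csv.reader(lines):
        if not row:
            continue
        kind = row[0]
        if kind == "A":
            attributes[int(row[1])] = (row[3], row[4])
        elif kind == "C":
            current_case = int(row[2])
            visits.setdefault(current_case, set())
        elif kind == "V" and current_case is not None:
            # A Vote is attached to the Case line that precedes it.
            visits[current_case].add(int(row[1]))
        # Any other line type (header, comments) is ignored.
    return attributes, visits

# Illustrative sample, not an excerpt of the real file.
sample = [
    'A,1277,1,"NetShow for PowerPoint","/stream"',
    'A,1253,1,"MS Word Development","/worddev"',
    'C,"10164",10164',
    'V,1277,1',
    'V,1253,1',
    'C,"10165",10165',
    'V,1277,1',
]
attributes, visits = parse_dst(sample)
```

For a real file, passing an open file handle (for example `parse_dst(open(path, newline=""))`) should work the same way, since `csv.reader` accepts any iterable of lines.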

Task:

  1. Assume that only the training data is available. We want to recommend websites that users should visit, given their user ID (case ID number). Please construct a recommender system, train it on the training data set, and then generate recommendations for a user given the user ID.
  2. Please also write the procedure to test your recommender with the test data set. Explain the metrics that you use for the testing.
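As one possible starting point for the two tasks above (not a reference solution), the sketch below trains a simple item co-occurrence recommender and tests it with a leave-one-out hit rate. It assumes the data has already been loaded into a dict mapping user ID to a set of visited site IDs. The function names, the toy data, and the choice of metric are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from itertools import permutations

def fit_cooccurrence(visits):
    """Count how often each ordered pair of sites is visited by the same user."""
    co = defaultdict(Counter)
    for sites in visits.values():
        for a, b in permutations(sites, 2):
            co[a][b] += 1
    return co

def recommend(co, seen, k=3):
    """Rank unseen sites by their summed co-occurrence with the seen sites."""
    scores = Counter()
    for site in seen:
        for other, count in co.get(site, {}).items():
            if other not in seen:
                scores[other] += count
    return [site for site, _ in scores.most_common(k)]

def hit_rate_at_k(co, test_visits, k=3):
    """Hide each site of each test user in turn and check whether the
    recommender recovers it in its top k from the remaining sites."""
    hits = trials = 0
    for sites in test_visits.values():
        if len(sites) < 2:
            continue  # nothing left to recommend from
        for held_out in sites:
            trials += 1
            if held_out in recommend(co, sites - {held_out}, k):
                hits += 1
    return hits / trials if trials else 0.0

# Toy data: user ID -> set of visited site IDs.
train = {1: {10, 20}, 2: {10, 20, 30}, 3: {10, 20}, 4: {20, 30}}
test = {100: {10, 20}, 101: {20, 30}, 102: {30}}

co = fit_cooccurrence(train)
top_for_user_1 = recommend(co, train[1], k=2)  # recommend by user ID
score = hit_rate_at_k(co, test, k=1)
```

Hit rate at k is only one option. Precision and recall at k, or a ranking metric such as mean reciprocal rank, would fit the same held-out setup, and the choice should be justified in your answer.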

Please also answer the following questions:

  1. What are the pros and cons of the recommendation algorithm you have used?
  2. How did you evaluate your recommender's performance? Why?
  3. Are you happy with your recommender's results? What could be a suitable baseline to compare your recommender's performance to?
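For the baseline question, one common candidate is a non-personalised popularity recommender, which suggests the most visited sites that the user has not seen yet. The sketch below is an illustration under the same assumed data layout (user ID mapped to a set of site IDs); the function names and toy data are hypothetical.

```python
from collections import Counter

def fit_popularity(visits):
    """Count how many users visited each site."""
    counts = Counter()
    for sites in visits.values():
        counts.update(sites)
    return counts

def recommend_popular(counts, seen, k=3):
    """Return the k most visited sites that the user has not seen."""
    ranked = [site for site, _ in counts.most_common() if site not in seen]
    return ranked[:k]

# Toy data: user ID -> set of visited site IDs.
train = {1: {10, 20}, 2: {10, 20, 30}, 3: {10, 20}, 4: {20, 30}}

counts = fit_popularity(train)
baseline = recommend_popular(counts, seen={20}, k=2)
```

Scoring this baseline with the same metric as the main recommender shows how much the personalised model actually adds.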

Remarks:

  1. The challenge does not have 'the one' solution or answer. There are many ways to approach the task. The same holds true for the accompanying questions. Please motivate all the choices you have made.
  2. We have stated the task with many implicit and explicit requirements. If you cannot comply with any of these requirements, please state this and work around it.
  3. We also value your input on how this challenge can be improved.
  4. Very important: we want to see how you think. Please write down all your thoughts, however preliminary. We much prefer that you discuss an issue without offering a solution, rather than not mentioning it.

Thank you very much for participating in the challenge. We are looking forward to discussing your solution. Feel free to reach out to Justin ([email protected]) if you have any questions!
