In this challenge, you will build a recommender that suggests websites a user should visit based on their browsing history.
You will be working with the 'Anonymous Microsoft Web Data Data Set' here.
The data is in an ASCII-based sparse-data format called "DST". Each line of the data file starts with a letter which tells the line's type. The three line types of interest are Attribute, Case and Vote. Each Attribute is a website, each Case is a user and each Vote is an Attribute that the user visited. For more details, please read the data description file for the structure of the data set.
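As a starting point, the three line types could be read with a small parser like the one below. This is only a sketch: the field positions (attribute ID, title and URL path on `A` lines, case ID on `C` lines, attribute ID on `V` lines) and the sample rows are assumptions about the layout, and the function name `parse_dst` is made up for illustration. Please check everything against the data description file before relying on it.

```python
import csv
from collections import defaultdict


def parse_dst(lines):
    """Parse DST-style lines into (attributes, visits).

    Assumed layout (verify against the data description file):
      A,<attribute id>,<ignored>,"<title>","<url path>"
      C,"<case id>",<case id>
      V,<attribute id>,<ignored>   (belongs to the most recent C line)
    """
    attributes = {}            # attribute ID -> (title, URL path)
    visits = defaultdict(set)  # case ID -> set of visited attribute IDs
    current_case = None

    # csv.reader handles the quoted fields, including commas inside titles.
    for row in csv.reader(lines):
        if not row:
            continue
        kind = row[0]
        if kind == "A":
            attributes[int(row[1])] = (row[3], row[4])
        elif kind == "C":
            current_case = int(row[2])
            visits[current_case]  # register the case even if no votes follow
        elif kind == "V" and current_case is not None:
            visits[current_case].add(int(row[1]))
        # Any other line type is skipped.
    return attributes, dict(visits)


# Illustrative rows only, not taken from the real file.
sample = [
    'A,1287,1,"International AutoRoute","/autoroute"',
    'A,1288,1,"library","/library"',
    'C,"10001",10001',
    'V,1287,1',
    'V,1288,1',
    'C,"10002",10002',
    'V,1288,1',
]
attributes, visits = parse_dst(sample)
```

The result is a plain dictionary from case ID to the set of visited attribute IDs, which is a convenient input for most recommendation approaches.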
- Assume that only the training data is available at this point. We want to recommend websites to users given their user ID (case ID number). Please construct a recommender system, train it on the training data set, and then generate recommendations for a given user ID.
- Please also write the procedure to test your recommender with the test data set. Explain the metrics that you use for the testing.
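To make the testing step concrete, here is one possible shape for it, not a prescribed solution. The sketch ranks websites by popularity in the training data and scores the ranking with a leave-one-out hit rate at k on the test data: one visited site per test user is hidden, and a hit is counted if it appears among the top k recommendations. The function names, the choice of metric, the rule for picking the hidden site, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def popularity_ranking(train_visits):
    """Rank attribute IDs by how many training users visited them."""
    counts = Counter(site for sites in train_visits.values() for site in sites)
    # Break ties by attribute ID so the ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [site for site, _ in ordered]


def recommend_popular(ranking, seen, k):
    """Return the k most popular sites the user has not visited yet."""
    return [site for site in ranking if site not in seen][:k]


def hit_rate_at_k(ranking, test_visits, k):
    """Leave-one-out hit rate: hide one visited site per test user and
    check whether it shows up in the top-k recommendations."""
    hits = 0
    trials = 0
    for sites in test_visits.values():
        if len(sites) < 2:
            continue  # need one site as history and one to hide
        hidden = max(sites)  # arbitrary but deterministic; sampling is an alternative
        seen = sites - {hidden}
        hits += hidden in recommend_popular(ranking, seen, k)
        trials += 1
    return hits / trials if trials else 0.0


# Toy data for illustration only: user ID -> set of visited attribute IDs.
train = {1: {10, 20}, 2: {10, 30}, 3: {10, 20}}
test = {100: {10, 20}, 101: {10, 30}, 102: {20}}

ranking = popularity_ranking(train)
score_at_1 = hit_rate_at_k(ranking, test, k=1)
score_at_2 = hit_rate_at_k(ranking, test, k=2)
```

A popularity ranking like this ignores the individual user almost entirely, so it mainly serves as a reference point that a personalized recommender should beat under the same metric.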
Please also answer the following questions:
- What are the pros and cons of the recommendation algorithm you have used?
- How did you evaluate your recommender's performance? Why?
- Are you happy with your recommender's results? What could be a suitable baseline to compare your recommender's performance to?
- The challenge does not have 'the one' solution or answer. There are many ways to approach the task. The same holds true for the accompanying questions. Please justify all the choices you have made.
- We have stated the task with many implicit and explicit requirements. If you cannot comply with any of these requirements, please say so and work around it.
- We also value your input on how this challenge can be improved.
- Very important: we want to see how you think. Please write down all your thoughts, however preliminary. We much prefer that you discuss an issue without offering a solution, rather than not mentioning it.
Thank you very much for participating in the challenge. We are looking forward to discussing your solution. Feel free to reach out to Justin ([email protected]) if you have any questions!