Giter VIP home page Giter VIP logo

autocf's Introduction

Automated Self-Supervised Learning for Recommendation

This repository contains pyTorch implementation by @akaxlh for AutoCF model proposed in the following paper:

Lianghao Xia, Chao Huang, Chunzhen Huang, Kangyi Lin, Tao Yu and Ben Kao. Automated Self-Supervised Learning for Recommendation. Paper in arXiv. In WWW'23, Austin, US, April 30 - May 4, 2023.

Introduction

To improve the representation quality over limited labeled data, contrastive learning has attracted attention in recommendation and benefited graph-based CF model recently. However, the success of most contrastive methods heavily relies on manually generating effective contrastive views for heuristic-based data augmentation. This does not generalize across different datasets and downstream recommendation tasks, which is difficult to be adaptive for data augmentation and robust to noise perturbation. As shown in the figure below, state-of-the-art SSL methods (e.g. NCL, SimGCL) present severe performance drop in comparison to our AutoCF, facing high-ratio noises and long-tail data distributions.

To fill the crucial gap, this work proposes a unified Automated Collaborative Filtering (AutoCF) to automatically perform data augmentation for recommendation. Specifically, we focus on the generative self-supervised learning framework with a learnable augmentation paradigm that benefits the automated distillation of important self-supervised signals. To enhance the representation discrimination ability, our masked graph autoencoder is designed to aggregate global information during the augmentation via reconstructing the masked subgraph structures. The overall framework of AutoCF is given below.

Citation

@inproceedings{autocf2023,
  author    = {Xia, Lianghao and
               Huang, Chao and
               Huang, Chunzhen and
               Lin, Kangyi and
               Yu, Tao and
               Kao, Ben},
  title     = {Automated Self-Supervised Learning for Recommendation},
  booktitle = {The Web Conference (WWW)},
  year      = {2023},
}

Environment

The implementation for AutoCF is under the following development environment:

  • python=3.10.4
  • torch=1.11.0
  • numpy=1.22.3
  • scipy=1.7.3

Datasets

We utilize three datasets for evaluating AutoCF: Yelp, Gowalla, and Amazon. Note that compared to the data used in our previous works, in this work we utilize a more sparse version of the three datasets, to increase the difficulty of recommendation task. Our evaluation follows the common implicit feedback paradigm. The datasets are divided into training set, validation set and test set by 70:5:25.

Dataset # Users # Items # Interactions Interaction Density
Yelp $42,712$ $26,822$ $182,357$ $1.6\times 10^{-4}$
Gowalla $25,557$ $19,747$ $294,983$ $5.9\times 10^{-4}$
Amazon $76,469$ $83,761$ $966,680$ $1.5\times 10^{-4}$

Usage

Please unzip the datasets first. Also you need to create the History/ and the Models/ directories. Switch the working directory to methods/AutoCF/. The command lines to train AutoCF using our pre-trained teachers on the three datasets are as below. The un-specified hyperparameters in the commands are set as default.

  • Yelp
python Main.py --data yelp --reg 1e-4 --seed 500
  • Gowalla
python Main.py --data gowalla --reg 1e-6
  • Amazon
python Main.py --data amazon --reg 1e-5 --seed 500

Important Arguments

  • reg: It is the weight for weight-decay regularization. We tune this hyperparameter from the set {1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8}.
  • seedNum: This hyperparameter denotes the number of seeds in subgraph masking. Recommended values are {200, 500}.

autocf's People

Contributors

akaxlh avatar hkuds avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.