Amazon Reviews 2023

[🌐 Website] · [🤗 Huggingface Datasets] · [📑 Paper] · [🔬 McAuley Lab]


This repository contains:

Recommendation Benchmarks

Based on the released Amazon Reviews 2023 dataset, we provide scripts to preprocess raw data into standard train/validation/test splits to encourage benchmarking recommendation models.

More details here -> [datasets & processing scripts]
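As a rough illustration of what a train/validation/test split can look like, here is a minimal sketch of a per-user "leave-last-out" split by timestamp. This is one common convention and an assumption on our part, not necessarily the strategy used by the repository's scripts. The function name `leave_last_out_split` and the `(user, item, timestamp)` tuple layout are hypothetical.

```python
from collections import defaultdict


def leave_last_out_split(interactions):
    """Split (user, item, timestamp) tuples into train/valid/test per user.

    Assumed convention: the most recent interaction of each user goes to
    test, the second most recent to validation, and the rest to train.
    Users with fewer than three interactions stay in train only.
    """
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))

    train, valid, test = [], [], []
    for user, events in by_user.items():
        events.sort()  # chronological order (ties broken by item id)
        if len(events) < 3:
            train.extend((user, item, ts) for ts, item in events)
            continue
        *head, second_last, last = events
        train.extend((user, item, ts) for ts, item in head)
        valid.append((user, second_last[1], second_last[0]))
        test.append((user, last[1], last[0]))
    return train, valid, test


# Toy data, made up for illustration only.
toy = [
    ("u1", "A", 3), ("u1", "B", 1), ("u1", "C", 2), ("u1", "D", 4),
    ("u2", "A", 5), ("u2", "E", 6),
]
train, valid, test = leave_last_out_split(toy)
```

Splitting chronologically per user keeps future interactions out of the training set, which is why this style of split is often preferred over a random one for sequential recommendation.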

BLaIR

BLaIR, short for "Bridging Language and Items for Retrieval and Recommendation", is a series of language models pre-trained on the Amazon Reviews 2023 dataset.

BLaIR is grounded on pairs of (item metadata, language context), enabling the models to:

  • derive strong item text representations, for both recommendation and retrieval;
  • predict the most relevant item given simple / complex language context.

More details here -> [checkpoints & code]
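To make "predict the most relevant item given a language context" concrete, here is a minimal sketch of embedding-based retrieval: the context and every item are represented as vectors, and items are ranked by cosine similarity to the context. The vectors below are made-up toy values; in practice they would come from a text encoder such as a BLaIR checkpoint, whose loading code is not shown here. The names `cosine` and `retrieve` are hypothetical helpers, not part of any released API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, item_vecs, k=2):
    """Return the ids of the k items most similar to the query vector."""
    ranked = sorted(
        item_vecs.items(),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:k]]


# Toy embeddings (assumed values, for illustration only).
item_vecs = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.0, 1.0, 0.0],
    "item_c": [0.7, 0.7, 0.0],
}
query_vec = [0.9, 0.1, 0.0]  # stands in for an encoded language context
top = retrieve(query_vec, item_vecs, k=2)
```

With these toy vectors, `item_a` ranks first and `item_c` second, because the query points mostly along the first dimension.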

Amazon-C4

Amazon-C4, which is short for "Complex Contexts Created by ChatGPT", is a new dataset for the complex product search task.

Amazon-C4 is designed to assess a model's ability to comprehend complex language contexts and retrieve relevant items.

More details here -> [datasets & code]
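A retrieval benchmark of this kind is typically scored by checking where the ground-truth item lands in the model's ranked list. The sketch below computes Recall@K and NDCG@K for a single query. These metrics are an assumption chosen for illustration, and the function names are hypothetical; refer to the linked code for the evaluation protocol actually used.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(set(relevant))


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k; assumes `ranked` has no duplicate ids."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


# Toy ranking for one query (made-up ids): the ground-truth item is "i1".
ranked = ["i3", "i1", "i7"]
relevant = ["i1"]
r1 = recall_at_k(ranked, relevant, k=1)
r3 = recall_at_k(ranked, relevant, k=3)
n3 = ndcg_at_k(ranked, relevant, k=3)
```

Here the ground-truth item sits at position 2, so Recall@1 is 0 and Recall@3 is 1, while NDCG@3 is discounted to 1 / log2(3) because the hit is not at the top.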

Contact

If you encounter a bug or have any suggestions or questions, please let us know by filing an issue or emailing Yupeng Hou (@hyp1231) at [email protected].

Acknowledgement

If you find the Amazon Reviews 2023 dataset, the BLaIR checkpoints, the Amazon-C4 dataset, or our scripts/code helpful, please cite the following paper.

@article{hou2024bridging,
  title={Bridging Language and Items for Retrieval and Recommendation},
  author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
  journal={arXiv preprint arXiv:2403.03952},
  year={2024}
}

The recommendation experiments in the BLaIR paper are implemented using the open-source recommendation library RecBole.

The pre-training scripts draw heavily on the Hugging Face language-modeling examples and SimCSE.
