Manipulating Large Language Models to Increase Product Visibility

This repository contains accompanying code for the paper titled Manipulating Large Language Models to Increase Product Visibility.

Introduction

Large language models (LLMs) are increasingly used to search product catalogs and provide users with personalized recommendations tailored to their specific queries. In this work, we investigate whether LLM recommendations can be manipulated to enhance a product's visibility. We demonstrate that adding a strategic text sequence (STS) to a target product's information page can significantly increase its likelihood of being listed as the LLM's top recommendation. We develop a framework that optimizes the STS to raise the target product's rank in the LLM's recommendations while remaining robust to variations in the order of the products in the LLM's input.

We use a catalog of fictitious coffee machines and analyze the effect of the STS on two target products: one that seldom appears in the LLM’s recommendations and another that usually ranks second. We observe that the strategic text sequence significantly enhances the visibility of both products by increasing their chances of appearing as the top recommendation.
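For intuition, the sketch below illustrates the kind of order-robust objective described above, assuming a GCG-style token-substitution attack over the STS tokens. The function, the "name"/"description" fields, and the prompt wording are illustrative assumptions and do not reflect the repository's actual implementation.

    import random
    import torch
    import torch.nn.functional as F

    def target_rank_loss(model, tokenizer, products, target_idx, sts):
        # Negative log-likelihood of the target product being named first,
        # computed on a randomly shuffled catalog so that the optimized STS
        # tolerates variations in product order. Field names and prompt text
        # are assumptions for illustration only.
        order = list(range(len(products)))
        random.shuffle(order)
        lines = []
        for i in order:
            desc = products[i]["description"]
            if i == target_idx:
                desc = desc + " " + sts  # insert the STS into the target product
            lines.append(products[i]["name"] + ": " + desc)
        prompt = ("Recommend a coffee machine from the catalog below.\n"
                  + "\n".join(lines) + "\nRecommendation:")
        target = " " + products[target_idx]["name"]
        device = model.device
        prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
        target_ids = tokenizer(target, add_special_tokens=False,
                               return_tensors="pt").input_ids.to(device)
        input_ids = torch.cat([prompt_ids, target_ids], dim=1)
        logits = model(input_ids).logits
        # Logits at positions P-1 .. P+T-2 predict the T target tokens.
        shift_logits = logits[:, prompt_ids.shape[1] - 1:-1, :]
        loss = F.cross_entropy(shift_logits.reshape(-1, shift_logits.size(-1)),
                               target_ids.reshape(-1))
        return loss  # lower loss => target more likely to be listed first

A GCG-style optimizer would repeatedly evaluate such a loss under candidate token substitutions in the STS and keep the substitutions that lower it; rank_opt.py implements the actual optimization described in the paper.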

This Repository

Generating STS: The file rank_opt.py contains the main script for generating the strategic text sequences. It uses the list of products in data/coffee_machines.jsonl as the catalog and optimizes the STS to maximize the probability that the target product is ranked first. Following is an example command for running this script:

python rank_opt.py --results_dir [path/to/save/results] --target_product_idx [num] --num_iter [num] --test_iter [num] --random_order --mode [self or transfer]

Options:

  1. --results_dir: To specify the location to save the outputs of the script, such as the STS of the target product.

  2. --target_product_idx: To specify the index of the target product in the list of products in data/coffee_machines.jsonl.

  3. --num_iter: Number of iterations of the optimization algorithm.

  4. --test_iter: Interval (in iterations) at which to test the STS.

  5. --random_order: To optimize the STS to tolerate variations in the product order.

  6. --mode: Mode in which to generate the STS:

    a. self: Optimize and test STS on the same LLM (applicable to open-access LLMs like Llama)

    b. transfer: Optimize the STS to transfer to a different LLM (applicable to API-access models like GPT-3.5), e.g., optimize using Llama and Vicuna, and test on GPT-3.5.

rank_opt.py generates the STS for the target product and plots the target loss and the rank of the target product in the results directory. See the bash scripts self.sh and transfer.sh for example usage of the above options.
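For example, a concrete invocation might look like the following (the argument values and the results path are illustrative placeholders, not recommended settings):

    python rank_opt.py --results_dir results/target_prod --target_product_idx 5 --num_iter 2000 --test_iter 100 --random_order --mode self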

The file data/coffee_machines.jsonl contains a catalog of ten fictitious coffee machines listed in increasing order of price.
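Each line of the JSONL file describes one product. As a purely illustrative example (the field names and values below are hypothetical and may not match the actual file):

    {"name": "BrewMaster 100", "description": "An entry-level drip coffee machine with a 12-cup carafe.", "price": "$59"}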

Evaluating STS: evaluate.py evaluates the STS generated by rank_opt.py. We obtain product recommendations from an LLM with and without the STS in the target product's description in the catalog, and compare the target product's rank in the LLM's recommendations in the two scenarios. We repeat this experiment several times to quantify the advantage obtained from using the STS. Following is an example command for running the evaluation script:

python evaluate.py --model_path [LLM for STS evaluation] --prod_idx [num] --sts_dir [path/to/STS] --num_iter [num] --prod_ord [random or fixed]

Options:

  1. --model_path: Path to the LLM to use for STS evaluation.

  2. --prod_idx: Target product index.

  3. --sts_dir: Path to STS to evaluate. Same as --results_dir for rank_opt.py.

  4. --num_iter: To specify the number of evaluations.

  5. --prod_ord: To specify the product order (random or fixed) in the LLM's input.
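For example (the argument values are illustrative placeholders, and the model identifier is an assumption rather than a prescribed choice):

    python evaluate.py --model_path meta-llama/Llama-2-7b-chat-hf --prod_idx 5 --sts_dir results/target_prod --num_iter 100 --prod_ord random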

Plotting Results: plot_dist.py plots the distribution of the target product's rank before and after STS insertion. It also plots the advantage obtained by using the STS (the percentage of evaluations in which the target product ranks higher with the STS than without it).
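As one plausible reading of this advantage metric (a sketch with made-up numbers, not the repository's implementation; see plot_dist.py for the exact definition):

    import numpy as np

    # Hypothetical ranks of the target product across repeated evaluations.
    ranks_without_sts = np.array([4, 3, 5, 4, 2])
    ranks_with_sts = np.array([1, 1, 2, 1, 1])

    # Percentage of runs in which the STS improves the target product's rank.
    advantage = np.mean(ranks_with_sts < ranks_without_sts) * 100
    print(f"Advantage: {advantage:.1f}%")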

See scripts eval_self.sh and eval_transfer.sh for usage of evaluate.py and plot_dist.py.

System Requirements: The strategic text sequences were optimized using NVIDIA A100 GPUs with 80GB memory. All the above scripts need to be run in a Conda environment created as per the instructions below.

Installation

Follow the instructions below to set up the environment for the experiments.

  1. Install Anaconda (from https://www.anaconda.com).
  2. Create Conda Environment with Python:
    conda create -n [env] python=3.10
    
  3. Activate environment:
    conda activate [env]
    
  4. Install PyTorch with CUDA from: https://pytorch.org/
    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
    
  5. Install transformers from Huggingface:
    conda install -c huggingface transformers
    
  6. Install accelerate:
    conda install -c conda-forge accelerate
    
  7. Install scikit-learn (required for training safety classifiers):
    conda install -c anaconda scikit-learn
    
  8. Install seaborn:
    conda install -c anaconda seaborn
    
