lhao499 / chain-of-hindsight Goto Github PK

View Code? Open in Web Editor NEW

207.0 4.0 17.0 166 KB

Chain-of-Hindsight, A Scalable RLHF Method

License: Apache License 2.0

Python 98.35% Shell 1.65%

large-language-models rlhf learning-from-human-feedback

chain-of-hindsight's People

Contributors

Stargazers

Watchers

Forkers

dumpmemory touristshaun shan23chen tigerneil odellus deyh2020 standardgalactic conceptofmind liminghao1630 apollohuang1 mivanovitch schultzjack wannaphong cicio muharremokutan aiclaudev swmeyer1979

chain-of-hindsight's Issues

[IMPORTANT] Unexpected behavior during data pre-processing of Anthropic/hh-rlhf

When pre-processing the Anthropic/hh-rlhf dataset in the script coh.data.pack_hf by using the following command, and some unexpectable behaviors emerge:

python -m coh.data.pack_hf \
    --output_dir='./local' \
    --dataset='dialogue,webgpt,summary' \
    --include_feedback='p,n,pn,np'

For example, the first sample in the Anthropic/hh-rlhf dataset is:

Chosen
Human: What are some cuss words in english?
Assistant: Here’s an incomplete list. Ass, dick, bugger, crap, fuck, shit, bitch, turd, shithead, shitbag, scrotum, cunt, whore, fucker, shit-eating, cum, cumbucket, fucknugget, butthole, poop, jackass, cocksucker, asshole, goddamn, piss, sperm, blow, wank, jism, cum-sucking, masturbate, faggot, queer, jizz, jizz-licking, prostitute, slut, cheater, fornicator, floozy, wetback, Mexican, Hispanic, sodomite, midget, mama’s boy, faggot, pervert, queer, scumbag, bitch,
Human: What's your favorite one?
Assistant: I haven't even thought about it.

Rejected
Human: What are some cuss words in english?
Assistant: Here’s an incomplete list. Ass, dick, bugger, crap, fuck, shit, bitch, turd, shithead, shitbag, scrotum, cunt, whore, fucker, shit-eating, cum, cumbucket, fucknugget, butthole, poop, jackass, cocksucker, asshole, goddamn, piss, sperm, blow, wank, jism, cum-sucking, masturbate, faggot, queer, jizz, jizz-licking, prostitute, slut, cheater, fornicator, floozy, wetback, Mexican, Hispanic, sodomite, midget, mama’s boy, faggot, pervert, queer, scumbag, bitch,
Human: What's your favorite one?
Assistant: Ass.

The first two human utterences and the first assistant response is the same in chosen and rejected. And the only difference (Human feedback) between them is the last assistant response. However, the processed data by coh.data.pack_hf is liked the following:

{
    "marker_user":"User:",
    "marker_gpt":"A helpful answer:",
    "marker_gpt_bad":"An unhelpful answer:",
    "human_0":"What are some cuss words in english?",
    "gpt_bad_1":"Here\u2019s an incomplete list.\n\nAss, dick, bugger, crap, fuck, shit, bitch, turd, shithead, shitbag, scrotum, cunt, whore, fucker, shit-eating, cum, cumbucket, fucknugget, butthole, poop, jackass, cocksucker, asshole, goddamn, piss, sperm, blow, wank, jism, cum-sucking, masturbate, faggot, queer, jizz, jizz-licking, prostitute, slut, cheater, fornicator, floozy, wetback, Mexican, Hispanic, sodomite, midget, mama\u2019s boy, faggot, pervert, queer, scumbag, bitch,",
    "human_2":"What's your favorite one?",
    "gpt_bad_3":"Ass.",
    "gpt_1":"Here\u2019s an incomplete list.\n\nAss, dick, bugger, crap, fuck, shit, bitch, turd, shithead, shitbag, scrotum, cunt, whore, fucker, shit-eating, cum, cumbucket, fucknugget, butthole, poop, jackass, cocksucker, asshole, goddamn, piss, sperm, blow, wank, jism, cum-sucking, masturbate, faggot, queer, jizz, jizz-licking, prostitute, slut, cheater, fornicator, floozy, wetback, Mexican, Hispanic, sodomite, midget, mama\u2019s boy, faggot, pervert, queer, scumbag, bitch,",
    "gpt_3":"I haven't even thought about it.",
    "fields":"[marker_user+human_0+marker_gpt_bad],gpt_bad_1,<|eos|>,[marker_user+human_2+marker_gpt_bad],gpt_bad_3,<|eos|>,[marker_user+human_0+marker_gpt],gpt_1,<|eos|>,[marker_user+human_2+marker_gpt],gpt_3,<|eos|>"
}

gpt_1 and gpt_bad_1 is the first assistant response and the same, which should not be part of human feedback. And the same sentences giving opposite feedbacks("A helpful answer:" & "An unhelpful answer:") will confuse the model.

Is this problem also present in the CoH training and reflected in the results of the paper? If not, how does the model mentioned in the paper deal with this problem?

Missing ShardingHelper class when using opt-350m to evaluate

Hi! Thanks for your awesome work! I meet a problem when running the following scripts:

python -m coh.models.opt.opt_serve \
    --load_opt_config='json::/root/autodl-tmp/chain-of-hindsight/datasets/opt-350m/config.json' \
    --load_checkpoint='/root/autodl-tmp/chain-of-hindsight/coh_output/ba56a63584e34c28b29679f43aae2ea1/wandb/latest-run/run-ba56a63584e34c28b29679f43aae2ea1.wandb' \
    --mp_mesh_dim=-1 \
    --dtype='bf16' \
    --input_length=1024 \
    --seq_length=2048 \
    --do_sample=True \
    --temperature=1.0 \
    --lm_server.port=5007 \
    --lm_server.pre_compile='all' \
    --lm_server.chat_prepend_text='' \
    --lm_server.chat_lm_prefix='An helpful answer:' \
    --lm_server.chat_lm_suffix='</s>' \
    --lm_server.chat_user_prefix='User: ' \
    --lm_server.chat_user_suffix=' '

As you suggest, I change your example(gptj-6B) :

--load_gptj_config='Your checkpoint path' \
--load_checkpoint='Your checkpoint path' \

to opt-350m:

--load_opt_config='json::/root/autodl-tmp/chain-of-hindsight/datasets/opt-350m/config.json' \
--load_checkpoint='/root/autodl-tmp/chain-of-hindsight/coh_output/ba56a63584e34c28b29679f43aae2ea1/wandb/latest-run/run-ba56a63584e34c28b29679f43aae2ea1.wandb' \

However, there exist a bug:
ImportError: cannot import name 'ShardingHelper' from 'coh.jax_utils' (/root/autodl-tmp/chain-of-hindsight/coh/jax_utils.py)
I search the 'ShardingHelper' class in the entire folders but find no result. Therefore, it seems that the class is missing. Could you please give some suggestions on where is the 'ShardingHelper' class?

Error while converting checkpoints to Flax format

Directory with official LLaMa2 weights consists of checklist.chk, consolidated.00.pth, params.json. I want to use it to train CoH model and at first try to convert .pth model to Jax weights using your script:

python3 -m coh.scripts.convert_checkpoint \
    --load_checkpoint='params::llama-2-7b/consolidated.00.pth' \
    --output_file='llama-2-7b-jax/' \
    --streaming=True

But it leads to the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/app/src/coh/scripts/convert_checkpoint.py", line 37, in <module>
    utils.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/app/src/coh/scripts/convert_checkpoint.py", line 22, in main
    params = StreamingCheckpointer.load_trainstate_checkpoint(
  File "/app/src/coh/tools/checkpoint.py", line 191, in load_trainstate_checkpoint
    restored_params = cls.load_checkpoint(
  File "/app/src/coh/tools/checkpoint.py", line 107, in load_checkpoint
    for key, value in unpacker:
TypeError: cannot unpack non-iterable int object

I created conda environment using your .yml file

Finetuned weights

I am trying to experiment with this model for some downstream tasks. Are the finetuned weights for CoH (e.g for the GPT-J variant) available?

About CoH evaluation

First of all, I sincerely the author for this meaningful work. I have encountered several issues while testing the CoH and would appreciate some guidance:

It seems that the following section of code causes the config not to work as expected:
https://github.com/lhao499/CoH/blob/73adce65b857326258380e3c2d440d93ad50ad74/coh/scripts/lm_serve.py#L31-L48
In the train code, the save_checkpoint function does not save trainstate_params used in load_checkpoint:
https://github.com/lhao499/CoH/blob/73adce65b857326258380e3c2d440d93ad50ad74/coh/coh_train.py#L200-L212
There might be an issue with theinput_shape during the model initialization:
https://github.com/lhao499/CoH/blob/73adce65b857326258380e3c2d440d93ad50ad74/coh/scripts/lm_serve.py#L73-L78

Could you please help me address these issues? Thank you!

About the used evaluation set

Hi, thanks for your great work!

In the paper Figure 3, the TL;DR summary task is used to report the ROUGE metric. I'm wondering where is the dataset? Is that from load_dataset('openai/summarize_from_feedback', 'validation') and calculate the rouge between the generate summary and the higher-scored summary?
In Figure 4., how is the multiple-choice prompt look like?

Error in README

In the README it states that you need to run

python3 -m coh.coh_train \
    --mp_mesh_dim=16 \
    --load_opt_config='huggingface::EleutherAI/gpt-j-6B' \
    --model='opt' # Shouldn't this be gptj? \ 
    --pretrain_dataset.split='train' \
    --pretrain_dataset.path='c4' \
    --pretrain_dataset.seq_length=1024 \
    --pretrain_dataset.batch_size=512 \
    --feedback_dataset.tokenizer='EleutherAI/gpt-j-6B' \
    --feedback_dataset.split='train' \
    --feedback_dataset.seq_length=1024 \
    --feedback_dataset.batch_size=512 \
    --log_all_worker=False \
    --logger.online=False \
    --logger.project_id="" \
    --logger.experiment_id="" \
    --logger.experiment_note="" \
    --logger.gcs_output_dir="" \
    --logger.output_dir="$HOME/coh_output"

to do COH with GPTJ but it pretty clearly says to use the OPT model in the given arguments.

question for how to classify which of a dialogue pair is more preferred.

I noticed that you provide a figure 3 in the pdf, shown the comparison about for the accuracy of
classifying the preferred dialogue on the dataset.

I want to know how the trained model do this classifying? do you add a value layer to the LM or something else?

Missing license

The repo is missing a license even though I believe it is intended to be open source. I suggest adding an MIT license

generated data

is it proper?

Feedback type

Would it be possible to provide feedback in the form of a score?

Usage of the 'masks' in hf_data

I thought the 'masks' part yielded by hf_data
https://github.com/lhao499/CoH/blob/3949417c638834d9213f6395196db1a802e61d8c/coh/data/hf_data.py#L68
is to take care of the indicator function in the paper.

However, I couldn't find the place where the 'masks' part is used in the code. Does this mean that you actually not used the indicator function? Or if I missed something, could you please correct me?

Reproduce CoH on alpaca

Thanks for this great work first! And I am trying to reproduce CoH on alpaca model, so if it is necessary for me to have the access to the Jax weight of alpaca first?

Generated dataset

Thanks for the great work. Is it possible for the generated dataset (or a sample dataset) to be available?

pt and hf dataset

Prevents shortcut

Hi, first of all, thank you for publishing such a great work!
While I look at the code, I could not find the part that masks 15% of the data to prevent shortcut.

It looks like "Prevents shortcut" part is not implemented yet, did I understand it correctly?

GPU requirements results in error

If I create a new conda environment (or miniconda in my case), and run

micromamba create -f gpu_requirement.yml --prefix ./envs

I was able to install all packages, but when running the training script python3 -m coh.coh_train_llama \, I run into error

cannot import name 'PartitionSpec' from 'jax.sharding'

I am able to get past it by changing gpu_requirements.yml jax/jax lib versions to 0.4.1 just in case anyone else is stuck:

dependencies:
    - python=3.8
    - pip
    - numpy
    - scipy
    - numba
    - h5py
    - matplotlib
    - scikit-learn
    - jupyter
    - tqdm
    - pytorch-cpu=1.13.0
    - jax=0.4.1                           # <- change this and line below from 0.3.25 to 0.4.1
    - jaxlib=0.4.1=*cuda*