dpo cli command error about trl HOT 4 OPEN

Galaxy-Husky commented on June 28, 2024

dpo cli command error

from trl.

Comments (4)

younesbelkada commented on June 28, 2024

Hi @Galaxy-Husky
Thanks for the issue. I think this is the wrong dataset format. Can you try, for example, with this one: https://huggingface.co/datasets/trl-internal-testing/Anthropic-hh-rlhf-processed ?

Galaxy-Husky commented on June 28, 2024

> Hi @Galaxy-Husky Thanks for the issue. I think this is the wrong dataset format. Can you try, for example, with this one: https://huggingface.co/datasets/trl-internal-testing/Anthropic-hh-rlhf-processed ?

Thank you for your prompt reply!
It works with version 0.8.1, but it appears to be much smaller than the original dataset, Anthropic/hh-rlhf.
Is there a dataset of the same size as the original one?

Yuancheng-Xu commented on June 28, 2024

@Galaxy-Husky Here? https://huggingface.co/datasets/trl-internal-testing/hh-rlhf-trl-style

Galaxy-Husky commented on June 28, 2024

> @Galaxy-Husky Here? https://huggingface.co/datasets/trl-internal-testing/hh-rlhf-trl-style

Thank you for your reply! But version 0.8.1 only accepts https://huggingface.co/datasets/trl-internal-testing/Anthropic-hh-rlhf-processed, whose format differs from that of trl-internal-testing/hh-rlhf-trl-style.
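
For anyone landing here with the same question, one possible workaround is to convert the full Anthropic/hh-rlhf data yourself. The sketch below is only an illustration under assumptions the thread does not confirm: that each original row holds `chosen` and `rejected` transcripts made of `\n\nHuman:` / `\n\nAssistant:` turns, and that the target format is three plain-string columns named `prompt`, `chosen` and `rejected`. The helper names `split_prompt_and_response` and `to_preference_row` are hypothetical, not part of trl.

```python
# Marker that opens the final assistant turn in an hh-rlhf style transcript
# (assumed transcript layout, not confirmed in this thread).
ASSISTANT_MARKER = "\n\nAssistant:"


def split_prompt_and_response(transcript: str) -> tuple[str, str]:
    """Split a transcript at its last assistant turn.

    Returns (prompt, response), where the prompt ends with the marker.
    """
    idx = transcript.rfind(ASSISTANT_MARKER)
    if idx == -1:
        raise ValueError("transcript contains no assistant turn")
    cut = idx + len(ASSISTANT_MARKER)
    return transcript[:cut], transcript[cut:]


def to_preference_row(example: dict) -> dict:
    """Build a prompt/chosen/rejected row (assumed target column names)."""
    prompt, chosen = split_prompt_and_response(example["chosen"])
    # The rejected transcript is assumed to share the same prompt prefix,
    # so only its final response is kept.
    _, rejected = split_prompt_and_response(example["rejected"])
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


# Tiny hand-written row for illustration; not taken from the real dataset.
example = {
    "chosen": "\n\nHuman: Hi\n\nAssistant: Hello!",
    "rejected": "\n\nHuman: Hi\n\nAssistant: Go away.",
}
row = to_preference_row(example)
```

If the assumptions hold, a function like this could be passed to `Dataset.map` from the `datasets` library after loading Anthropic/hh-rlhf. It is worth comparing a few converted rows against the processed dataset linked above before training.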
