
Comments (5)

natolambert commented on June 2, 2024

Thanks! Feel free to ping me for examples. May not respond immediately because we're all busy but would like to help.

from datasets.

albertvillanova commented on June 2, 2024

Hi @natolambert, could you please give some examples of JSON files to benchmark?

Please note that this JSON file (https://huggingface.co/datasets/allenai/reward-bench-results/blob/main/eval-set-scores/Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback.json) is not in "records" orient; instead it has the following structure:

{
  "chat_template": "tulu",
  "id": [30, 34, 35, ...],
  "model": "Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback",
  "model_type": "Seq. Classifier",
  "results": [1, 1, 1, ...],
  "scores_chosen": [4.421875, 1.8916015625, 3.8515625, ...],
  "scores_rejected": [-2.416015625, -1.47265625, -0.9912109375, ...],
  "subset": ["alpacaeval-easy", "alpacaeval-easy", "alpacaeval-easy", ...],
  "text_chosen": ["<s>[INST] How do I detail a...", ...],
  "text_rejected": ["<s>[INST] How do I detail a...", ...]
}

Note that "records" orient should be a list (not a dict) with each row as one item of the list:

[
  {"chat_template": "tulu", "id": 30,... },
  {"chat_template": "tulu", "id": 34,... },
  ...
]
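
For illustration, here is a minimal sketch of turning a column-oriented dict like the one above into a "records" list using only the standard library. The helper name `columns_to_records` and the toy data are hypothetical, and the sketch assumes that scalar fields (such as "chat_template") should simply be repeated on every row:

```python
import json


def columns_to_records(columns):
    """Turn a column-oriented dict into a list of row dicts ("records" orient).

    List values are treated as columns; scalar values are repeated on every
    row. Assumes all list columns have the same length.
    """
    lengths = {len(v) for v in columns.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(f"columns have different lengths: {sorted(lengths)}")
    n_rows = lengths.pop() if lengths else 1
    return [
        {
            key: (value[i] if isinstance(value, list) else value)
            for key, value in columns.items()
        }
        for i in range(n_rows)
    ]


# Toy data in the same shape as the file above (shortened, not the real values).
columns = {
    "chat_template": "tulu",
    "id": [30, 34, 35],
    "results": [1, 1, 1],
    "subset": ["alpacaeval-easy", "alpacaeval-easy", "alpacaeval-easy"],
}
records = columns_to_records(columns)
print(json.dumps(records[:2], indent=2))
```

Writing `records` out with `json.dump` gives a file whose first character is `[`, which is what a records-orient reader expects.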


albertvillanova commented on June 2, 2024

Thanks again for your feedback, @natolambert.

However, strictly speaking, the last file is not in JSON format but in a JSON-Lines-like format (and it is not proper JSON Lines either, because each object contains multiple newline characters). Not even pandas can read that file format.

Anyway, I would expect datasets and pandas to have the same performance for JSON Lines files, as both use pyarrow under the hood...

A proper JSON file in records orient should be a list (a JSON array): the first character should be [.

Anyway, I am generating a JSON file from your JSON-Lines file to test performance.
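
One possible way to do such a conversion, sketched with the standard library only (this is not necessarily how the file was actually converted): `json.JSONDecoder.raw_decode` parses one value at a time from a string of back-to-back, pretty-printed objects. The helper name `iter_concatenated_json` and the two-object sample are hypothetical.

```python
import json


def iter_concatenated_json(text):
    """Yield each top-level JSON value from a string of back-to-back values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Shortened sample in the same layout as the multi-line objects discussed here.
sample = """{
    "id": [0, 0],
    "scores": 3.076171875
}
{
    "id": [0, 1],
    "scores": 3.87890625
}
"""
records = list(iter_concatenated_json(sample))
as_json_array = json.dumps(records)  # a proper JSON array, starts with "["
print(len(records), as_json_array[0])
```

This reads the whole file into memory, which should be acceptable for files of a few tens of megabytes; a streaming parser would be a better fit for much larger inputs.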


natolambert commented on June 2, 2024

We use a mix (which is a mess). Here's an example with the records orient:
https://huggingface.co/datasets/allenai/reward-bench-results/blob/main/best-of-n/alpaca_eval/tulu-13b/OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5.json

There are more in that folder, ~40 MB maybe?


natolambert commented on June 2, 2024

@albertvillanova here's a snippet so you don't need to click

{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        0
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 3.076171875
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        1
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 3.87890625
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        2
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 3.287109375
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        3
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 1.6337890625
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        4
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 5.27734375
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        5
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 3.0625
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        6
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 2.29296875
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        7
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 6.77734375
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        8
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 3.853515625
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        9
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 4.86328125
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        10
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 2.890625
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        11
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 4.70703125
}
{
    "config": "top_p=0.9;temp=1.0",
    "dataset_details": "helpful_base",
    "id": [
        0,
        12
    ],
    "model": "allenai/tulu-2-dpo-13b",
    "scores": 4.45703125
}

