swj0419 / detect-pretrain-code

This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, and Luke Zettlemoyer.

Home Page: https://swj0419.github.io/detect-pretrain.github.io/

License: Apache License 2.0

Python 100.00%

detect-pretrain-code's Issues

In case someone wants a paraphrased version of WikiMIA

First, I want to thank the authors for constructing and providing WikiMIA as a benchmark. They studied a paraphrased setting in the paper, but the corresponding data splits do not seem to have been released.

We have paraphrased WikiMIA and released the result here: 🤗zjysteven/WikiMIA_paraphrased_perturbed. The data is organized in the same way as the original WikiMIA. Feel free to check it out if you want to run experiments on the paraphrased version.
That Hugging Face dataset repo also contains perturbed versions of WikiMIA, which are necessary to run the "Neighbor" method.
For more details, see our code repo https://github.com/zjysteven/mink-plus-plus, where we propose an enhanced method, Min-K%++, that outperforms Min-K% and all other reference-free methods on both the WikiMIA and MIMIR benchmarks for pretraining data detection.

About baseline methods

I noticed that the lowercase baseline is not consistent with the reference-model baseline. Could you tell me why you don't use subtraction for it, as the reference-model baseline does?

pred["ppl/Ref_ppl (calibrate PPL to the reference model)"] = p1_likelihood-p_ref_likelihood

# Ratio of log ppl of lower-case and normal-case
pred["ppl/lowercase_ppl"] = -(np.log(p_lower) / np.log(p1)).item()

What's the length setting of Table 1?

Hi there, thanks for publishing such an amazing work!

I'm trying to reproduce the results reported in the paper, and I'm wondering how the results in Table 1 were calculated.
Are they the macro-averaged scores over the different lengths, or were the experiments conducted with the default length=64?

Thanks very much for your reply~

What is the reference model? Can the reference model and the target model be the same?

What is the reference model? Can the reference model and the target model be the same? @swj0419

Request requirements

Please add an env.yaml or requirements.txt. This is interesting work and I would like to follow up on it. Thanks.

Accuracy formula

Hi, in eval.py, line 26, you write:

acc = np.max(1-(fpr+(1-tpr))/2)

Can you please explain how you got this formula? I tested it on an example and it seemed wrong.
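
One possible reading of the formula, offered as context rather than as the authors' explanation: at a fixed threshold, (fpr + (1 - tpr)) / 2 is the average of the false positive rate and the false negative rate, so 1 minus that quantity is the balanced accuracy, and np.max selects the best threshold along the ROC curve. Balanced accuracy equals plain accuracy only when members and non-members are equally numerous, which may explain a mismatch on a hand-made example. A minimal pure-Python sketch under that reading follows; the helper names and toy data are hypothetical and are not taken from eval.py.

```python
def candidate_thresholds(scores):
    # Every distinct score, plus +inf so "predict nothing as member" is covered.
    return sorted(set(scores)) + [float("inf")]


def rates_at(scores, labels, threshold):
    """Return (fpr, tpr) when predicting 'member' for score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fp / neg, tp / pos


def best_balanced_accuracy(scores, labels):
    """Maximum over thresholds of 1 - (fpr + (1 - tpr)) / 2."""
    best = 0.0
    for threshold in candidate_thresholds(scores):
        fpr, tpr = rates_at(scores, labels, threshold)
        best = max(best, 1 - (fpr + (1 - tpr)) / 2)
    return best


def best_plain_accuracy(scores, labels):
    """Maximum over thresholds of (correct predictions) / (all examples)."""
    best = 0.0
    for threshold in candidate_thresholds(scores):
        correct = sum(
            1 for s, y in zip(scores, labels) if (s >= threshold) == (y == 1)
        )
        best = max(best, correct / len(labels))
    return best


# Hypothetical imbalanced toy data: 2 members (label 1), 4 non-members (label 0).
toy_scores = [0.9, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0, 0]

print("balanced:", best_balanced_accuracy(toy_scores, toy_labels))
print("plain:   ", best_plain_accuracy(toy_scores, toy_labels))
```

On imbalanced data like the toy set above the two numbers differ, while on a balanced set they coincide at every threshold.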

Question about logits from OpenAI models

Hi Weijia,

Thank you for this interesting work! Just a very technical question: how do you get the bottom-k outlier logits from OpenAI models (i.e., davinci series)? It seems the GPT-3 perplexity calculation in your code only handles top-5 tokens from OpenAI API?

Typos in Figure 1

Hi, great work!

I find there may be some typos in Figure 1.
As illustrated in Algorithm 1, I think it should be -logp and < ϵ
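
For readers following this point: as I understand the method, the Min-K% score averages the log-probabilities of the k% least likely tokens, and a text is flagged as seen when that average exceeds a threshold ϵ. If the score is negated (that is, built from -log p), the comparison flips to "less than" with a correspondingly negated threshold, which may be the inconsistency raised here. A minimal sketch under those assumptions; the function name, the value of k, and the toy log-probabilities are hypothetical.

```python
def min_k_score(token_log_probs, k=0.2):
    """Mean log-probability of the k fraction of least likely tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Keep at least one token even for very short inputs.
    n = max(1, int(len(token_log_probs) * k))
    lowest = sorted(token_log_probs)[:n]
    return sum(lowest) / n


# Hypothetical per-token log-probabilities for one text.
toy_log_probs = [-0.1, -0.2, -5.0, -0.3, -4.0, -0.1, -0.2, -0.4, -0.3, -0.1]
toy_epsilon = -6.0  # hypothetical threshold on the log-probability scale

score = min_k_score(toy_log_probs, k=0.2)

# Two equivalent ways to write the same decision:
flag_with_log_p = score > toy_epsilon            # log p convention, "> eps"
flag_with_neg_log_p = -score < -toy_epsilon      # -log p convention, "< eps'"
print(score, flag_with_log_p, flag_with_neg_log_p)
```

Both conventions give the same decision as long as the sign of the score and the direction of the comparison are changed together.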

datasets package

Which version of the datasets package are you using? I get the following error:

ValueError: Unknown split "WikiMIA_length64". Should be one of ['train'].

Regarding Thresholding

First of all, really cool work!
In Section 4.3, the last line under implementation details is: "As we report the AUC score as our metric, we don’t need to determine the threshold ϵ."
Could you please explain this a bit? It might be trivial, but I'm a bit confused by it.
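
A note that may help here: the AUC can be computed without ever choosing a threshold, because it equals the probability that a randomly chosen member receives a higher score than a randomly chosen non-member, with ties counted as one half. It therefore summarizes the detector over all possible thresholds at once. A minimal pure-Python sketch of that pairwise definition; the function name and toy data are hypothetical.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (member, non-member) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one member and one non-member")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical detector scores: label 1 = seen in pretraining, 0 = unseen.
toy_scores = [0.9, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0, 0]

auc = auc_from_scores(toy_scores, toy_labels)

# Any strictly increasing rescaling of the scores leaves the AUC unchanged,
# which is another way to see that no particular threshold is involved.
rescaled = [10 * s + 3 for s in toy_scores]
auc_rescaled = auc_from_scores(rescaled, toy_labels)
print(auc, auc_rescaled)
```

A threshold is only needed when a single accuracy or true/false positive rate has to be reported.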

Add requirements.txt

Here is a requirements.txt generated with pipreqs . --mode no-pin --force and supplemented from my local run:

datasets
ipdb
matplotlib
numpy
openai
scikit_learn
torch
tqdm
transformers
einops
accelerate
tiktoken

PS: run pip install pipreqs to get pipreqs.

Data Collection Code

Hi! Could you please share the code for collecting Wikipedia data after a cutoff date? Thank you so much!

Llama2 contamination

I have been feeding a few sequences through Llama2 (see https://colab.research.google.com/drive/1zCXJ4bGvihcDvWvrAUkTMYkmoLuvkcU3#scrollTo=50WdKemvxRaa) and got a little confused, because it seems like llama2 has also seen the 2023 articles.

memorized 0
------------gt continuation:
of the year from December 1, 2021, to November 30, 2022. The ceremony was hosted by Doyoung and Miyeon. BTS was the most awarded act with 5 awards,
------------llama default settings decoding:
,  previous. January 2, 2021, to November 30, 2022.
 ceremony was hosted by Kim Kyng, Chyeon of
TS was the most nominated artist of 4 wins
######################
memorized 1
------------gt continuation:
icane of the 2014 Pacific hurricane season, Ana formed from a disturbance that formed in the Central Pacific in mid-October. It rapidly consolidated, and a tropical depression developed by October 13. A
------------llama default settings decoding:
icane of the 2014 Pacific hurricane season, Ana developed on a tropicalance in had in the eastern Pacific on early-Seober. The moved intensolidated into and was tropical depression formed on October 18.
######################
memorized 0
------------gt continuation:
Amaro, Burji, Ale, and Basketo special woredas of the Southern Nations, Nationalities, and Peoples' Region (SNNP) of Ethiopia, on whether the included areas should leave SNNP and
------------llama default settings decoding:
Hh, andji, andma and Guasketo W woredas of the Southern Nations, Nationalities, and Peoples Region Region.SNNPR) and Ethiopia. to whether to region zones should be theNNP
######################
memorized 1
------------gt continuation:
in December 2013. The defendants included Mohamed Badie, the group's top leader, whose sentence was confirmed on June 21, 2014, along with 181 of the brotherhood'
------------llama default settings decoding:
in the 2013.
 attackants are amed Badie, the group’s general leader, and death was confirmed by  19. 2014. by with 148 others his defendhood
######################
memorized 1
------------gt continuation:
(LFP) manages. The final took place on 19 April 2014 at the Stade de France in Saint-Denis and was contested between Lyon and Paris Saint-Germain. PSG won 2
------------llama default settings decoding:
(LFP) consages. The final was place on 3 March April 2014 at the Stade de France in Saint-Denis, was contested by Paris and Paris Saint-Germain.
G won the
######################
memorized 1
------------gt continuation:
cioeconomic disruption in the region, mainly in Guinea, Liberia and Sierra Leone. The first cases were recorded in Guinea in December 2013; later, the disease spread to neighbouring Liberia and Sierra Leone,
------------llama default settings decoding:
cioeconomic disruption in the affected. and in Guinea, Liberia, Sierra Leone. The out cases were reported in Guinea in March 2013, the cases cases virus spread to Sierraing Liberia and Sierra Leone
######################
memorized 1
------------gt continuation:
4.In the 2014 list, the net worth of the wealthiest individual, Gina Rinehart, was A$14.02 billion. Fourteen women and 186 men made the 201
------------llama default settings decoding:
4.
  2014 edition, there wealth worth of the richiest  in Gina Rinehart, was $$29.0 billion billion,
 individuals individuals made 160 men made the list20
######################
memorized 1
------------gt continuation:
2014 over a disagreement with doctors regarding his treatment.Ashya had a medulloblastoma, which was successfully removed through surgery on 24 July 2014. He received further neuros
------------llama default settings decoding:
2014, a disagreement with thectors about the treatment. Theyshya was been brainulloblastoma, a is a treated in surgery. 22 August 2014.
 was chem treatment
######################
memorized 1
------------gt continuation:
2013. The nominations were announced on 8 January 2014 by actor Luke Evans and actress Helen McCrory. Presented by the British Academy of Film and Television Arts, accolades were handed out
------------llama default settings decoding:
2013.
 ceremonyations were announced on 8 January 2014. actress Stephen Evans and actress Oliv McCrory.
ent by the British Academy of Film and Television Arts ( theolades are given out for
######################
memorized 1
------------gt continuation:
it on joining the UK Independence Party (UKIP), from the Conservatives. He resigned his seat.Reckless retained the seat, and polled 42.1% of the vote as the UKIP candidate. The Conservative
------------llama default settings decoding:
a after  the UK Independence Party (UKIP). after the Conservative.
 hadigned from seat on
ckless had the seat for with theled 10.6% of the vote, against UKIP candidate.
 Conserv
######################
memorized 1
------------gt continuation:
. Each state, the Federal District and various Insular Islands competed for the title. Sancler Frantz of Ilha dos Lobos crowned her successor, Julia Gama of Rio Grande do Sul at the end. Gama represented
------------llama default settings decoding:
at The of and the Federal District and the Brazilular District of for the national. The Paolo Frantz of Rioh Grande Lobos,ed her successor, Gab Gama of Rio Grande do Sul at the end of
ama will Brazil at

Wrong token probabilities calculation?

Hi,
I think you are calculating the token probabilities incorrectly. But please correct me if I'm wrong.
This is how you calculate the probabilities as far as I know:

def calculatePerplexity(sentence, model, tokenizer, gpu):
    """
    exp(loss)
    """
    input_ids = torch.tensor(tokenizer.encode(sentence)).unsqueeze(0)
    input_ids = input_ids.to(gpu)
    with torch.no_grad():
        outputs = model(input_ids, labels=input_ids)
    loss, logits = outputs[:2]
    
    '''
    extract logits:
    '''
    # Apply softmax to the logits to get probabilities
    probabilities = torch.nn.functional.log_softmax(logits, dim=-1)
    # probabilities = torch.nn.functional.softmax(logits, dim=-1)
    all_prob = []
    input_ids_processed = input_ids[0][1:]
    for i, token_id in enumerate(input_ids_processed):
        probability = probabilities[0, i, token_id].item()
        all_prob.append(probability)
    return torch.exp(loss).item(), all_prob, loss.item()

You only collect the token probabilities starting from the token at index 1 (the second token):
input_ids_processed = input_ids[0][1:]
But then to get the probabilities you iterate through the "probabilities" variable starting at index 0 and not 1:

for i, token_id in enumerate(input_ids_processed):
    probability = probabilities[0, i, token_id].item()  # I think that "i" should start at 1 here

This means that you are getting the probabilities of the wrong tokens. You can check this by adding +1 to the index and seeing that it doesn't cause an error (if the original indexing were correct, this should result in an "IndexError"):

for i, token_id in enumerate(input_ids_processed):
    probability = probabilities[0, i+1, token_id].item()

If this is the case your entire premise that the MIN-K% is outperforming existing methods might be wrong or off...
Again, I might be wrong so please correct me if I'm misunderstanding something
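
One point that may help settle this: for causal language models in Hugging Face transformers, the logits at position i are the model's prediction for the token at position i + 1, and the built-in loss shifts logits and labels in the same way. Under that convention, pairing probabilities[0, i] with input_ids[0][i + 1] is the intended alignment. Adding +1 raises no IndexError simply because the logits have one more position than there are predicted tokens (the last position predicts a token beyond the input). The toy sketch below illustrates this with a hand-written stand-in for a model; every name in it is hypothetical, and it avoids torch so that it runs anywhere.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - log_z for v in row]


def toy_causal_logits(input_ids, vocab_size):
    """Hypothetical stand-in for a causal LM.

    The row at position i strongly predicts (input_ids[i] + 1) % vocab_size
    as the NEXT token, mirroring the next-token convention.
    """
    logits = []
    for tok in input_ids:
        row = [0.0] * vocab_size
        row[(tok + 1) % vocab_size] = 5.0
        logits.append(row)
    return logits


def token_log_probs(input_ids, logits, offset=0):
    """Log-prob of each token from index 1 on, read from row i + offset."""
    log_probs = [log_softmax(row) for row in logits]
    targets = input_ids[1:]
    return [log_probs[i + offset][tok] for i, tok in enumerate(targets)]


# A sequence the toy model predicts perfectly: each token is previous + 1.
toy_input_ids = [3, 4, 5, 6]
toy_logits = toy_causal_logits(toy_input_ids, vocab_size=8)

aligned = token_log_probs(toy_input_ids, toy_logits, offset=0)  # row i -> token i+1
shifted = token_log_probs(toy_input_ids, toy_logits, offset=1)  # the proposed "+1"

print("offset 0:", aligned)
print("offset 1:", shifted)
```

With offset 0 the perfectly predictable tokens receive high log-probabilities; with offset 1 the lookup still succeeds but reads each token's probability from the row that predicts the following token, so the values are low.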

New paper???

Oh My! Idol Weijia's new preprint: I have already read it like 10M times and you should too.

Deprecated model

Text-DaVinci-003 seems to be deprecated, and OpenAI's alternative is not supported by the choices available in the Huggingface-based reference models. Do you have any suggestions for further work?
