
automatic_prompt_engineer's People

Contributors

andreimuresanu, keirp, michaelrzhang, yongchaoz



automatic_prompt_engineer's Issues

"echo" when using new model

When I use gpt-3.5-turbo-instruct, the error says:
Setting 'echo' and 'logprobs' at the same time is not supported for this model.

When I set 'echo' to false, the downstream code logic breaks and a new error appears.
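One way to work around the conflict is to strip the `echo` flag from the request config before calling the completion API. This is a minimal sketch of such a helper (the function name and config shape are assumptions, not part of the repository); note that any downstream code expecting the echoed prompt in the response would still need adjusting, which is likely the "logic problem" described above.

```python
def sanitize_completion_config(config):
    """Drop the 'echo' flag when 'logprobs' is also set, since newer
    completion models (e.g. gpt-3.5-turbo-instruct) reject the
    combination. Returns a new dict; the original is left untouched."""
    if config.get("echo") and config.get("logprobs") is not None:
        config = {k: v for k, v in config.items() if k != "echo"}
    return config

# Example: the conflicting 'echo' key is removed, other keys survive.
cfg = {"model": "gpt-3.5-turbo-instruct", "logprobs": 1,
       "echo": True, "max_tokens": 0}
cfg = sanitize_completion_config(cfg)
```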

Question: Can I generate prompts without OUTPUT?

This project fits very well in one of my use cases where I need to generate prompts for DALLE from a set of image classes.
Example: classes = ["backpack", "mobile-phone"]

However, I don't have any outputs. Do you think that's something I can achieve?

Thanks.

Question about the searching process in Algorithm 1 Line 2-9.

Hi Keiran, thanks for the really great work! I have a question about the implementation of run_instruction_induction.py. In the paper, Algorithm 1 (lines 2-9) describes a process of iteratively keeping the top k% of prompts and re-evaluating them on random training subsets. Maybe I didn't read carefully enough, but is this process implemented in the code? Also, how is convergence judged, and what value of k is used? Looking forward to your reply!
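For reference, the filtering loop being asked about can be sketched roughly like this. This is a reading of Algorithm 1, not the repository's actual implementation; `score_fn` stands in for the (expensive) log-likelihood evaluation, and the convergence criterion here (stop when one prompt remains or the round budget is spent) is an assumption.

```python
import random

def filter_prompts(prompts, score_fn, data, k=0.25, subset_size=20, rounds=3):
    """Sketch of Algorithm 1, lines 2-9: repeatedly score the surviving
    prompts on a fresh random subset of the training data and keep the
    top k fraction."""
    survivors = list(prompts)
    for _ in range(rounds):
        if len(survivors) <= 1:
            break  # nothing left to filter
        subset = random.sample(data, min(subset_size, len(data)))
        scored = sorted(survivors, key=lambda p: score_fn(p, subset),
                        reverse=True)
        survivors = scored[:max(1, int(len(scored) * k))]
    return survivors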

how to access `curie:ft-uoft-ml-group-2022-09-26-01-07-30`?

Hi! Thanks for the open-source and readable code! However, when I run python experiments/run_truthful_qa.py, it raises a "does not exist" error. Should I use "curie" instead of "curie:ft-uoft-ml-group-2022-09-26-01-07-30"? Thanks in advance.

Possible source of the error:

model_names = {
    "judge": "curie:ft-uoft-ml-group-2022-09-26-01-07-30",
    "info": "curie:ft-uoft-ml-group-2022-09-27-13-35-15"
}

error

The model: `curie:ft-uoft-ml-group-2022-09-26-01-07-30` does not exist

Can I use other models for generation instead?

Hello. When I tried the 'gpt-3.5-turbo' model for generation, I got an error.

This is the error message: This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?

try:
--> 160                 response = openai.Completion.create(
    161                     **config, prompt=prompt)

Can I use other models?
How do I change the endpoint?
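The error happens because chat models like gpt-3.5-turbo are served by /v1/chat/completions, while the code calls openai.Completion.create, which hits /v1/completions. A minimal routing sketch is below; the model lists are illustrative assumptions, not an official registry, and note that gpt-3.5-turbo-instruct is a completion model despite sharing the gpt-3.5-turbo prefix.

```python
# Illustrative routing table, not an official OpenAI model list.
COMPLETION_MODELS = ("gpt-3.5-turbo-instruct", "text-davinci-002")
CHAT_ONLY_PREFIXES = ("gpt-3.5-turbo", "gpt-4")

def endpoint_for(model):
    """Return which OpenAI endpoint a model name should be sent to.
    Completion models are checked first so that the instruct variant
    is not misrouted by its chat-like prefix."""
    if model in COMPLETION_MODELS:
        return "/v1/completions"
    if model.startswith(CHAT_ONLY_PREFIXES):
        return "/v1/chat/completions"
    return "/v1/completions"
```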

another domain

Can I randomly generate datasets to apply APE to other domains?

Why use F1 for evaluation on common_concept?

Hi, thanks for your great work. I have read the paper and reviewed the code, and I see that F1 is used as the metric for the common_concept dataset. However, the evaluation function utility.get_multi_answer_f1 seems buggy in this case: it receives two strings like "[Yes, Yes]" as the prediction and ground truth when computing F1, rather than word lists like "[[Yes, Yes, No], [No, Yes, No]]". I don't think it returns the correct F1, because each test sample is scored separately. Could you clarify? Thanks!

In addition, it seems you use EA as the metric for the common_concept dataset in Figure 4, which does not match the code (screenshots omitted).
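For context, a multi-answer F1 is usually computed SQuAD-style: token-level F1 between the prediction and each gold answer, taking the maximum. This sketch shows that convention; it is one plausible reading of what utility.get_multi_answer_f1 is meant to do, not a copy of the repository's code.

```python
from collections import Counter

def token_f1(prediction, ground_truth):
    """Token-overlap F1 between two strings (SQuAD-style)."""
    pred, gold = prediction.split(), ground_truth.split()
    common = Counter(pred) & Counter(gold)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

def multi_answer_f1(prediction, answers):
    """Best token F1 of the prediction against any gold answer."""
    return max(token_f1(prediction, a) for a in answers)
```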

Error when using new model - gpt-3.5-turbo-instruct-0914

First I encountered the error:
"Too many parallel completions requested. You submitted..."

This was fixed by lowering the batch size to 3 in both the generation and the evaluation:
(prompt_gen_batch_size=3, eval_batch_size=3)

After that, I'm getting a division-by-zero error from these lines:
87 prompt_log_probs[-1].append(sum(lps) / len(lps))
88 i += 1
89 return prompt_log_probs
ZeroDivisionError: division by zero

Taken from the function _compute_avg_likelihood in the likelihood file.

What can I do?
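The crash happens when a completion returns no token log-probs, so `lps` is empty and `len(lps)` is zero. A minimal guard, sketched here as a standalone helper rather than a patch to the repository's `_compute_avg_likelihood`: returning negative infinity simply ranks such a prompt last (skipping the prompt entirely is an alternative design choice).

```python
import math

def safe_avg_likelihood(lps):
    """Average the token log-probs for one prompt, guarding against
    the empty list that triggers the ZeroDivisionError. -inf ranks a
    prompt with no usable logprobs last."""
    if not lps:
        return -math.inf
    return sum(lps) / len(lps)
```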

Demo gradio error

----> 2 demo = get_demo()
      3 demo.launch(debug=True)

1 frames
/usr/local/lib/python3.10/dist-packages/gradio/component_meta.py in wrapper(*args, **kwargs)
    153             return None
    154         else:
--> 155             return fn(self, **kwargs)
    156 
    157     return wrapper

TypeError: Textbox.__init__() got an unexpected keyword argument 'disabled'
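Newer Gradio versions dropped the `disabled=` keyword on components; a read-only textbox is instead created with `interactive=False`. A small translation helper is sketched below; the helper itself is illustrative and not part of Gradio.

```python
def modernize_textbox_kwargs(kwargs):
    """Translate the legacy `disabled=` keyword, which newer Gradio
    Textbox versions reject, into `interactive=` (its inverse).
    Illustrative helper, not part of Gradio itself."""
    kwargs = dict(kwargs)
    if "disabled" in kwargs:
        kwargs["interactive"] = not kwargs.pop("disabled")
    return kwargs

# Usage sketch: gr.Textbox(**modernize_textbox_kwargs(
#     {"label": "Prompt", "disabled": True}))
```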

credit loss

Hello! We tried your demo and found it very helpful.
However, while using it we found that paid API keys burn through credits very quickly, which did not happen with free accounts. We would very much like to keep using this tool; how can we fix this?

text-davinci-002 is deprecated

I'm trying to generate prompts, but the model you are using is deprecated; you should replace it with gpt-3.5-turbo-instruct.
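One low-touch way to handle the deprecation is to map old model names to the suggested replacement wherever the config is built. The mapping below only encodes the swap this issue proposes; the helper itself is an illustrative sketch, not code from the repository.

```python
# Replacement suggested in this issue; the helper is illustrative.
DEPRECATED_MODELS = {
    "text-davinci-002": "gpt-3.5-turbo-instruct",
}

def resolve_model(name):
    """Swap a deprecated completion model for its replacement, if known."""
    return DEPRECATED_MODELS.get(name, name)
```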

Error: Unexpected end of JSON input

Hi, this looks very promising!

Unfortunately, I am facing an issue when trying to use my own dataset. The error does not occur when I load one of the preset tasks.

The console log:

Failed to load resource: the server responded with a status of 500 ()      index-8bb1e421.js:4
Uncaught (in promise) Error: Unexpected end of JSON input at index-8bb1e421.js:4:10593
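"Unexpected end of JSON input" on the frontend usually means a truncated or malformed JSON payload (consistent with the server's 500). It can help to validate a custom dataset locally before uploading it; in this sketch the expected top-level shape (a dict with an "examples" key) is an assumption about the demo's dataset format.

```python
import json

def validate_dataset_json(text):
    """Check that a custom dataset string is complete, parseable JSON
    before handing it to the demo. The expected top-level shape
    ({'examples': [...]}) is an assumption about the format."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as e:
        return False, f"invalid JSON: {e}"
    if not isinstance(data, dict) or "examples" not in data:
        return False, "missing top-level 'examples' key"
    return True, "ok"
```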

Chat model support

Hey there, I'd like to know if there will be a release with the gpt-3.5 model? If not, how easy would it be to integrate it into the code?

Question about the set of prompts pre evaluation

Hi Keiran,

thank you for your contribution to prompt research! My seminar partner and I are trying to understand your code and test the evaluation step separately with different LLMs as part of a university course. To do this, we would like to extract the generated prompts and evaluate/rank them separately with different LLMs.

Could you provide a JSON (or similar) file with the prompts per task type, in case you used more than the limit of 50 prompts per run when testing your code and saved a corresponding file? (Our resources do not allow us to do this for every task with davinci.)

Thanks in advance and have a nice day!

Louis
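Until such a file is published, prompts generated in a local run can be dumped per task with something like the sketch below, so they can be re-scored later with a different LLM. The function name and file layout are assumptions, not a format the repository defines.

```python
import json

def save_prompts(prompts, task_name, path=None):
    """Dump a list of generated prompt strings to a JSON file so they
    can be re-scored later with a different LLM. Illustrative layout."""
    path = path or f"{task_name}_prompts.json"
    with open(path, "w") as f:
        json.dump({"task": task_name, "prompts": prompts}, f, indent=2)
    return path
```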
