instructor's Introduction

💥 What's up?

Currently working as an independent consultant. I use my expertise in recommendation systems to help fast-growing startups build out their RAG applications. I am also the creator of Instructor and Flight, and an ML and data science educator.

Support

If you want to support me, you can sponsor me on GitHub. I don't want to start a Substack, but I do want to write more, so this will fund my morning coffee and tea.


  • Creator
  • Sabbatical @ South Park Commons - 2023 - Present
  • Staff Machine Learning Engineer @ Stitchfix — 2016, 2018-2023
  • Prev, Meta, ActionIQ, NYU, Meltwater - 2013-2018
  • Computational Mathematics and Statistics @ University of Waterloo

Writing

Systems

Talks and Podcasts

instructor's People

Contributors

aastroza, aiexanderdicke, anmol6, cristobalcl, daaniyaan, daveokpare, ethanleifer, fpingham, gao-hongnan, ivanleomk, jlondonobo, jxnl, lakshyaag, leobeeson, medott29, phibrandon, phodaie, rgbkrk, ryanhalliday, samiur, savarin, shanktt, skrawcz, smuotoe, stepheni12, tadash10, tavisrudd, tedfulk, toolittlecakes, zby

instructor's Issues

create func of ChatCompletion does not return completion if self.function is None

In dsl/completion.py, shouldn't create return completion?

def create(self):
    """
    Create a chat response from the OpenAI API

    Returns:
        response (OpenAISchema): The response from the OpenAI API
    """
    kwargs = self.kwargs
    completion = openai.ChatCompletion.create(**kwargs)
    if self.function:
        return self.function.from_response(completion)
    return completion  # proposed addition

support for completions endpoint

Is your feature request related to a problem? Please describe.
The recent -instruct models are instruction tuned rather than dialogue tuned and should be very useful for most use cases of this library.

class UserDetail(BaseModel):
    name: str
    age: int

user: UserDetail = openai.Completion.create(
    model="gpt-3.5-turbo-instruct",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract Jason is 25 years old"},
    ]
)

This should work.

Describe the solution you'd like
Patch should also patch the openai.Completion.create method.

The multi classification does not actually work as intended.

Describe the bug
The multi classification does not actually work as intended.

To Reproduce
I copied and pasted the example for multi prediction, and the output always predicts every label. No matter which classes are declared or which prompt is used, the result is the same: all classes are predicted.

examples/classification/multi_prediction.py

Small error in `openai_function`

Was getting error:
AttributeError: 'openai_function' object has no attribute 'schema'

Fixed by changing line 30 to:
assert message["function_call"]["name"] == self.openai_schema["name"], "Function name does not match"

Thanks for putting this up, this code is super useful.

Weird usecase where pydantic model has field that represents code but gets invalid json characters, failing model_validate_json

Is your feature request related to a problem? Please describe.
I have a weirdish use case, where one of the fields of the pydantic model represents code.
The code is often returned with a bunch of invalid json characters in it, like control characters (\u0000-\u001F).

This makes instructor fail on errors like this:
File "/opt/homebrew/lib/python3.11/site-packages/pydantic/main.py", line 530, in model_validate_json
    return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for RustCode
  Invalid JSON: control character (\u0000-\u001F) found while parsing a string at line 4 column 0 [type=json_invalid, input_value='\n{\n"generated_code": "...xample_output": "37"\n}', input_type=str]
    For further information visit https://errors.pydantic.dev/2.4/v/json_invalid

Describe the solution you'd like
Maybe one could handle cases like this with some form of "pre-validators" that could, for example, run base64 encoding on those non-JSON-compatible strings? Not sure how it would fit in exactly.
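One possible workaround, sketched with only the standard library: Python's json.loads accepts strict=False, which permits raw control characters inside JSON strings. The payload and the parse_lenient helper below are made up for illustration. The resulting dict could then be handed to a pydantic model (for example via model_validate) instead of calling model_validate_json on the raw text.

```python
import json

# Hypothetical model output: the string value contains a raw newline and tab,
# which a strict JSON parser rejects as control characters.
raw = '{"generated_code": "fn main() {\n\tprintln!(\\"hi\\");\n}", "example_output": "37"}'


def parse_lenient(text: str) -> dict:
    """Parse JSON while tolerating raw control characters inside strings."""
    return json.loads(text, strict=False)


# The default (strict) parser fails on this payload.
try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# The lenient parser keeps the newline and tab in the decoded string.
data = parse_lenient(raw)
```

This keeps the code field human-readable, whereas a base64 step would require the model to emit encoded text.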

Additional context
Instructor is nice.

Base example doesn't work?

Hi Jason, I watched your Pydantic talk and thought I'd check it out. It seems like a fantastic idea, but on openai==1.1.0 and instructor==0.3.0 the base example raises a TypeError. This does not happen when using the "unpatched" openai client and sending the request without the response_model kwarg.

user = client.chat.completions.create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'classmethod' object is not callable

thanks! and great talk

Changing Patch behavior

I think there are a few ways to add the response_model and other capabilities.

Monkey Patch Global

import instructor

instructor.patch()

resp = openai.ChatCompletion.create(..., response_model=Model)
assert isinstance(resp, Model)

Monkey Patch Context

with instructor.patch():
    resp = openai.ChatCompletion.create(..., response_model=Model)
    assert isinstance(resp, Model)

Import Custom SDK

from instructor import client  # as openai

resp = client.ChatCompletion.create(..., response_model=Model)
assert isinstance(resp, Model)

I think we need to be conscious of how other tools also patch the client.
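To make the "Monkey Patch Context" option concrete, here is a minimal sketch of a patch that is undone when the block exits, which limits interference with other tools that patch the same client. Everything here is hypothetical: fake_sdk stands in for the real SDK, and patched and with_response_model are illustrative names, not instructor's API.

```python
from contextlib import contextmanager
from types import SimpleNamespace


def _raw_create(**kwargs):
    """Stand-in for the SDK call; simply echoes its arguments."""
    return {"echo": kwargs}


# Hypothetical SDK object, used instead of the real openai module.
fake_sdk = SimpleNamespace(create=_raw_create)


class Model:
    """Toy response model that wraps the raw payload."""

    def __init__(self, payload):
        self.payload = payload


def with_response_model(original):
    """Wrap `original` so it accepts an optional response_model keyword."""
    def create(*args, response_model=None, **kwargs):
        result = original(*args, **kwargs)
        return response_model(result) if response_model else result
    return create


@contextmanager
def patched(target, attr, wrapper_factory):
    """Temporarily replace target.attr, restoring the original on exit."""
    original = getattr(target, attr)
    setattr(target, attr, wrapper_factory(original))
    try:
        yield
    finally:
        setattr(target, attr, original)


with patched(fake_sdk, "create", with_response_model):
    resp = fake_sdk.create(prompt="hi", response_model=Model)

# Outside the block the original function is back in place.
restored = fake_sdk.create is _raw_create
```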

Doc improvement: *why* would one use distillation?

I was reading through the docs and saw https://jxnl.github.io/instructor/distillation/ . The page explains the "what" and the "how", but not the "why". I assume this feature caters to some use cases, but it's not clear to me what those would be. The examples given seem like a ridiculously bad idea: replacing instantaneous, deterministic on-device calculations with slow, hallucination-prone API calls? Why would I ever want to use an LLM to perform simple math? I get that they're just examples, but it would be nice to have a paragraph explaining real use cases for this.

Decoupling the llm backend

Is your feature request related to a problem? Please describe.
I see the library is tightly coupled with OpenAI function calling. It would be good to decouple the model from the pydantic way of doing things so that any model can be used (for example, LLMs from langchain). That way we can experiment with smaller, self-hosted, or other cloud models.

Describe the solution you'd like
The ability to pass pydantic structures to any LLM and get results back. For example, something like langchain tools, where function calling is isolated from the LLM.

Describe alternatives you've considered
Custom tools in a langchain implementation for function calling.

Additional context
I'm not sure whether this is already possible. I haven't experimented yet, but judging by the repo subtitle and examples, it looks coupled.

Function does not obey Enums

Describe the bug
I set an enum for one of the function inputs.
I have a pydantic class that refers to the enum.
The output args show that the enum is not followed.

Expected behavior
I would expect that the generated args obey the enum I set for that field.
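Until the generated args reliably follow the enum, one defensive option is to check them after the fact. The sketch below uses only the standard library. Unit and coerce_enum are hypothetical names for illustration, and the error it raises could be used to trigger a retry.

```python
from enum import Enum


class Unit(str, Enum):
    """Hypothetical enum for one function input."""
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"


def coerce_enum(enum_cls, value):
    """Return the matching enum member, or raise with the allowed values listed."""
    try:
        return enum_cls(value)
    except ValueError as exc:
        allowed = [member.value for member in enum_cls]
        raise ValueError(f"{value!r} is not one of {allowed}") from exc


good = coerce_enum(Unit, "celsius")

# A value outside the enum is rejected instead of being passed along silently.
try:
    coerce_enum(Unit, "kelvin")
    rejected = False
except ValueError:
    rejected = True
```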

Bug: openai_schema removes properties named title

openai_schema removes properties/fields named title from json schema
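One plausible cause (an assumption, not confirmed against the library source) is a cleanup step that deletes every title key while walking the JSON schema. A minimal sketch of a walker that drops title only where it is schema metadata and leaves the field names inside a properties mapping alone. strip_titles is a hypothetical helper, and the sample schema is hand-written rather than generated output.

```python
def strip_titles(schema, in_properties=False):
    """Remove JSON-schema `title` metadata without touching fields named 'title'."""
    if isinstance(schema, dict):
        cleaned = {}
        for key, value in schema.items():
            # Inside a `properties` mapping the keys are field names: keep them all.
            if key == "title" and not in_properties:
                continue
            child_is_properties = key == "properties" and not in_properties
            cleaned[key] = strip_titles(value, in_properties=child_is_properties)
        return cleaned
    if isinstance(schema, list):
        return [strip_titles(item) for item in schema]
    return schema


# Hand-written schema resembling what a model with `name` and `title` fields yields.
schema = {
    "title": "Author",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "title": {"title": "Title", "type": "string"},
    },
    "required": ["name", "title"],
}

cleaned = strip_titles(schema)
```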

Example:

class Author(OpenAISchema):
  """Class representing an author. 
  This class is used to extract author's name and
  poem's title from a text"""
  name: str = Field(..., description="Name of the author")
  title: str = Field(..., description="Title of the article")
Author.openai_schema

# output:
"""
{'name': 'Author',
 'description': "Class representing an author. \nThis class is used to extract author's name and\npoem's title from a text",
 'parameters': {'type': 'object',
  'properties': {'name': {'description': 'Name of the author',
    'type': 'string'}},
  'required': ['name']}}
"""

Exclude properties with defaults from required

Suggestion:
parameters["required"] = sorted(k for k, v in parameters.get("properties", {}).items() if "default" not in v)
instead of
parameters["required"] = sorted(parameters["properties"])

That would allow us to:
data: Any = Field(None, description="Optional data attached")
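A small standalone sketch of the suggested rule, run against a hand-written parameters dict. required_fields is a hypothetical helper name, and the sample assumes the generated schema records a default key for fields that have one.

```python
def required_fields(parameters: dict) -> list[str]:
    """List property names that have no default, sorted alphabetically."""
    properties = parameters.get("properties", {})
    return sorted(name for name, spec in properties.items() if "default" not in spec)


# Hand-written example: `data` has a default, `name` does not.
parameters = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Name of the author"},
        "data": {"default": None, "description": "Optional data attached"},
    },
}

parameters["required"] = required_fields(parameters)
```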

Upcoming openai-python 1.0.0 release

Hello. Thanks for your great work on Instructor. Really appreciate that it's thoughtfully constructed for use in production.

I wanted to check what your plans are for the upcoming openai-python 1.0.0 release (openai/openai-python#631). Instructor currently has a dependency on <0.28.

Thanks!

Default parameters to pydantic model

Is your feature request related to a problem? Please describe.
I'm always frustrated when I need to send default parameters to a pydantic response_model.

Describe the solution you'd like
I want to pass, for example, a default sex to the model (rather than extracting it with ChatCompletion), because I already know Jason's sex 😄:

class UserDetail(BaseModel):
    weight: int
    sex: str
    def is_obese(self):
        if self.sex=='female' and self.weight>100:
            return True
        if self.sex=='male' and self.weight>120:
            return True
        return False


user: UserDetail = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    parameters={'sex': 'male'},
    messages=[
        {"role": "user", "content": "Extract Jason 200kg"},
    ]
)

Describe alternatives you've considered
I considered creating another pydantic class to fill in the remaining user properties, but that does not seem right: UserDetail should hold all user properties, some extracted by ChatCompletion and others supplied by me.

Maybe I have missed something; I'm not an expert with pydantic. If you can share another option, I would be grateful.
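One option that needs no library change is to merge the values you already know with the extracted ones before building the final object. The sketch below uses a plain dataclass in place of the pydantic model. The extracted dict is a made-up stand-in for what the model would return, and build_user is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class UserDetail:
    weight: int
    sex: str

    def is_obese(self) -> bool:
        if self.sex == "female":
            return self.weight > 100
        if self.sex == "male":
            return self.weight > 120
        return False


def build_user(extracted: dict, known: dict) -> UserDetail:
    """Combine extracted fields with caller-supplied ones; known values win."""
    return UserDetail(**{**extracted, **known})


# Stand-in for the fields the model extracted from "Extract Jason 200kg".
extracted = {"weight": 200}
user = build_user(extracted, {"sex": "male"})
```

With pydantic, the same idea would mean extracting into a smaller model that holds only the fields the LLM should fill, then constructing the full UserDetail from both sources.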

Bugs in `Example 2: Schema Extraction`

There are two bugs in Example 2: Schema Extraction.

  1. There's a missing comma character after functions=[UserDetails.openai_schema]
  2. Missing import, from pydantic import Field

JSONDecodeError at a specific place

Describe the bug
When using instructor, certain inputs raise a JSON decoding error.

Expected behavior
A way to fix the bug

Desktop (please complete the following information):
LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.9.2009 (Core)
Release: 7.9.2009
Codename: Core

Logic error in ChatCompletion __or__

For class ChatCompletion(BaseModel):

def __or__(self, other: Union[Message, OpenAISchema]) -> "ChatCompletion":
    if isinstance(other, Message):
        if isinstance(other, SystemMessage):
            if self.system_message:
                self.system_message.content += "\n\n" + other.content
            self.system_message = other

should be

if isinstance(other, SystemMessage):
    if self.system_message:
        self.system_message.content += "\n\n" + other.content
    else:
        self.system_message = other
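A self-contained sketch of the corrected branch, using plain dataclasses rather than the library's classes (Chat and this SystemMessage are simplified stand-ins). Without the else, the merged content is immediately overwritten by the second message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemMessage:
    content: str


@dataclass
class Chat:
    system_message: Optional[SystemMessage] = None

    def __or__(self, other: SystemMessage) -> "Chat":
        if self.system_message:
            # Append to the existing system message instead of replacing it.
            self.system_message.content += "\n\n" + other.content
        else:
            self.system_message = other
        return self


chat = Chat() | SystemMessage("Be concise.") | SystemMessage("Answer in English.")
```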

Does instructor support the Azure OpenAI API?

When I use Azure OpenAI, I often encounter errors, but occasionally it succeeds. I am not sure if the current instructor can use the Azure OpenAI API. Below is the function and frequent error message.

new_updates = openai.ChatCompletion.create(
        response_model=Report,
        deployment_id= dep.GPT_4,
        max_retries=2,
        messages=[
                {
                    "role": "system",
                    "content": SYSTEM_PROMPT_KG_SYT
                },
                {
                    "role": "user",
                    "content": f"""Extract any new events from the following:
                    # Part {i}/{num_iterations} of the input:

                    {inp}"""
                },
                {
                    "role": "user",
                    "content": f"""Here is the current state of the report:
                    {cur_state.model_dump_json(indent=2)}"""
                }
            ],
        
    )  # type: ignore

Describe the bug
openai.error.InvalidRequestError: 'content' is a required property - 'messages.3'

Screenshots
Traceback (most recent call last):
  File "C:\Users\yubo.he\Desktop\LLM_AE_Extrator\run.py", line 92, in <module>
    ade_report: Report = generate_report(text_chunks)
  File "C:\Users\yubo.he\Desktop\LLM_AE_Extrator\run.py", line 47, in generate_report
    new_updates = openai.ChatCompletion.create(
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\instructor\patch.py", line 162, in new_chatcompletion_sync
    response, error = retry_sync(
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\instructor\patch.py", line 117, in retry_sync
    response = func(*args, **kwargs)
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 155, in create
    response, _, api_key = requestor.request(
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\openai\api_requestor.py", line 299, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\openai\api_requestor.py", line 710, in _interpret_response
    self._interpret_response_line(
  File "C:\Users\yubo.he\AppData\Local\Continuum\anaconda3\envs\syngenta\lib\site-packages\openai\api_requestor.py", line 775, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: 'content' is a required property - 'messages.3'

Desktop (please complete the following information):

  • OS: Windows

Additional context
Azure OpenAI version : 2023-08-01-preview

Async might not be properly handled in latest instructor/openai versions?

I am using instructor = "^0.3.1" and openai = "^1.2.0".

I initialize my client as:

client = instructor.patch(AsyncOpenAI(
    api_key=OPENAI_API_KEY,
))

And then call it as:

async def myfunc():
    ...
    response = await client.chat.completions.create(
        model=model_name,
        messages=messages,
        response_model=response_model,  # type: ignore
        max_retries=2,
    )

This gives me an error: Error in getting response from model: 'coroutine' object has no attribute 'choices'.
I stepped through the code in a debugger, and it seems that wrap_chatcomplete wraps AsyncOpenAI().chat.completions.create as a sync function rather than an async one.
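The symptom matches a sync wrapper reading attributes off an un-awaited coroutine. A minimal sketch of a wrapper that picks the sync or async path with inspect.iscoroutinefunction follows. wrap_create and the two fake create functions are hypothetical and only mimic the shape of the problem, not instructor's actual code.

```python
import asyncio
import functools
import inspect


def wrap_create(func):
    """Return a wrapper that is async if `func` is async, and sync otherwise."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            response = await func(*args, **kwargs)  # await before touching the result
            return response["choices"]
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        response = func(*args, **kwargs)
        return response["choices"]
    return sync_wrapper


async def fake_async_create(**kwargs):
    """Stand-in for an async client's create method."""
    return {"choices": ["async-ok"]}


def fake_sync_create(**kwargs):
    """Stand-in for a sync client's create method."""
    return {"choices": ["sync-ok"]}


async_result = asyncio.run(wrap_create(fake_async_create)())
sync_result = wrap_create(fake_sync_create)()
```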

pip install instructor has dependency conflicts in Colab

Describe the bug

Running !pip install instructor in Colab creates the following dependency conflicts:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
llmx 0.0.15a0 requires cohere, which is not installed.
llmx 0.0.15a0 requires tiktoken, which is not installed.
tensorflow-probability 0.22.0 requires typing-extensions<4.6.0, but you have typing-extensions 4.8.0 which is incompatible.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new colab
  2. Run !pip install instructor

Expected behavior
Clean install without dependency conflicts.

Support parameters docstring for `@openai_function` annotation

Is your feature request related to a problem? Please describe.
I want to be able to define a good old Python function and use it both for the schema and for execution. However, if I want to add descriptions to the parameters, I currently have to use a class definition. This could be solved by supporting standard parameter parsing from docstrings.

Describe the solution you'd like
E.g., this should work:

@openai_function
def get_current_weather(
    location: str, format: Literal["celsius", "fahrenheit"] = "celsius"
) -> WeatherReturn:
    """
    Gets the current weather in a given location, use this function for any questions related to the weather

    Parameters
    ----------
    location
        The city to get the weather, e.g. San Francisco. Guess the location from user messages

    format
        A string with the full content of what the given role said
    """

    return WeatherReturn(
        location=location,
        forecast="sunny",
        temperature="25 C" if format == "celsius" else "77 F",
    )

But right now the description of the parameters goes into the function description, not into the parameters description.

How it is right now:

{
    'name': 'get_current_weather',
    'description': '\n    Gets the current weather in a given location, use this function for any questions related to the weather\n\n    Parameters\n    ----------\n    location\n        The city to get the weather, e.g. San Francisco. Guess the location from user messages\n\n    format\n        A string with the full content of what the given role said\n    ',
    'parameters': {
        'properties': {
            'location': {'type': 'string'},
            'format': {
                'default': 'celsius',
                'enum': ['celsius', 'fahrenheit'],
                'type': 'string'
            }
        },
        'required': ['format', 'location'],
        'type': 'object'
    }
}

How I expect it:

{
  'name': 'get_current_weather',
  'description': 'Gets the current weather in a given location, use this function for any questions related to the weather',
  'parameters': {
      'properties': {
          'location': {
              'description': 'The city to get the weather, e.g. San Francisco. Guess the location from user messages',
              'type': 'string'
          },
          'format': {
              'description': 'A string with the full content of what the given role said',
              'default': 'celsius',
              'enum': ['celsius', 'fahrenheit'],
              'type': 'string'
          }
      },
      'required': ['location'],
      'type': 'object'
  }
}
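A rough sketch of how the docstring could be split, using only the standard library and assuming the NumPy-style layout shown in the request. split_docstring is a hypothetical helper and handles only a single Parameters section. The sample docstring is shortened for illustration.

```python
import inspect


def split_docstring(doc: str) -> tuple[str, dict[str, str]]:
    """Split a NumPy-style docstring into (summary, {parameter: description})."""
    lines = inspect.cleandoc(doc).splitlines()
    summary: list[str] = []
    params: dict[str, str] = {}
    current = None
    in_params = False
    i = 0
    while i < len(lines):
        line = lines[i]
        following = lines[i + 1] if i + 1 < len(lines) else ""
        # A "Parameters" line underlined with dashes starts the parameter section.
        if line.strip() == "Parameters" and set(following.strip()) == {"-"}:
            in_params = True
            i += 2
            continue
        if not in_params:
            summary.append(line)
        elif line.strip() and not line.startswith(" "):
            # Unindented line: a parameter name, optionally "name : type".
            current = line.split(":")[0].strip()
            params[current] = ""
        elif line.strip() and current:
            # Indented line: description text for the current parameter.
            params[current] = (params[current] + " " + line.strip()).strip()
        i += 1
    return " ".join(s.strip() for s in summary if s.strip()), params


doc = """
    Gets the current weather in a given location.

    Parameters
    ----------
    location
        The city to get the weather for.

    format
        Temperature unit to report.
    """

summary, params = split_docstring(doc)
```

The summary would become the function description, and each entry in params would be copied into the matching property's description.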

Adding a lightweight prompt abstraction to the SchemaClass

Here's the updated proposal, where PromptConfig always carries a model (defaulting to "gpt-3.5-turbo-0613") and all other attributes are optional:

import openai
from pydantic import BaseModel
from typing import Optional

class OpenAISchema(BaseModel):
    class PromptConfig:
        model: str = "gpt-3.5-turbo-0613"
        system: Optional[str] = None
        message: Optional[str] = None
        temperature: Optional[float] = None
        max_tokens: Optional[int] = None

    @classmethod
    def from_response(cls, response):
        # Implementation based on the actual response format.
        ...

    @classmethod
    def create(cls, message=None, *args, force_function=False, **kwargs):
        messages = kwargs.get("messages", [])

        if not messages and hasattr(cls, "PromptConfig"):
            if cls.PromptConfig.system:
                messages.append({
                    "role": "system",
                    "content": cls.PromptConfig.system
                })
            if cls.PromptConfig.message:
                messages.append({
                    "role": "user",
                    "content": cls.PromptConfig.message
                })

        if message:
            messages.append({
                "role": "user",
                "content": message
            })

        if force_function:
            kwargs['function_call'] = {"name": cls.openai_schema["name"]}

        kwargs['messages'] = messages

        if hasattr(cls, "PromptConfig"):
            kwargs.setdefault('model', cls.PromptConfig.model)
            kwargs.setdefault('temperature', cls.PromptConfig.temperature)
            kwargs.setdefault('max_tokens', cls.PromptConfig.max_tokens)

        completion = openai.ChatCompletion.create(
            functions=[cls.openai_schema],
            **kwargs
        )
        return cls.from_response(completion)

class Search(OpenAISchema):
    # Implementation remains the same
    ...

class MultiSearch(OpenAISchema):
    class PromptConfig:
        system = "You are a capable algorithm designed to correctly segment search requests."
        message = "Correctly segment the following search request"
        model = "gpt-3.5-turbo-0613"
        temperature = 0.5
        max_tokens = 1000

    # Implementation remains the same
    ...

# Example of usage:
queries = MultiSearch.create(
    "Please send me the video from last week about the investment case study and also documents about your GDPR policy."
)
queries.execute()

This revision makes the PromptConfig more flexible and easier to use with the default model set and all other parameters as optional. This configuration can be overridden on a per-class basis, as shown in the MultiSearch.PromptConfig example.

Where to inject the few shot examples?

Where should I put the few-shot examples in the prompt to improve accuracy? Should I put them in the model docstring or somewhere else? Can you provide an example?

Thanks.

Compatibility with Langchain

Is your feature request related to a problem? Please describe.
Would like to resolve dependency incompatibility between langchain and openai_function_call

Describe the solution you'd like
langchain and openai_function_call to be compatible

Describe alternatives you've considered
None

Additional context

  Because no versions of openai-function-call match >0.2.0,<0.3.0
   and openai-function-call (0.2.0) depends on pydantic (>=2.0.2,<3.0.0), openai-function-call (>=0.2.0,<0.3.0) requires pydantic (>=2.0.2,<3.0.0).
  And because langchain (0.0.238) depends on pydantic (>=1,<2)
   and no versions of langchain match >0.0.238,<0.0.239, openai-function-call (>=0.2.0,<0.3.0) is incompatible with langchain (>=0.0.238,<0.0.239).
  So, because nira-ai depends on both langchain (^0.0.238) and openai-function-call (^0.2.0), version solving failed.

[Bounty] Instructor finetuning CLI needs to support validation_file and hyperparameters

Is your feature request related to a problem? Please describe.

We need to be able to pass in the hyperparameters and validation file here:
https://github.com/jxnl/instructor/blob/main/instructor/cli/jobs.py#L135

It should basically look like: https://platform.openai.com/docs/api-reference/fine-tuning/create#fine-tuning-create-hyperparameters

Describe the solution you'd like

  1. make a PR to add it into the cli
  2. update the documentation in the finetune docs page here: https://github.com/jxnl/instructor/blob/main/docs/cli/finetune.md

Default description for generated schema

When getting the .openai_schema from an OpenAISchema (BaseModel) class, if the class has a docstring, that docstring is used as the description. If there is no docstring, a description is added automatically. The current default description (used when there is no docstring) describes the extraction process rather than the object.

For example, if I define an Address as

class Address(City):
    country: str
    state: str
    city: str
    street: str

Then the .openai_schema is

{'name': 'Address',
 'description': 'Correctly extracted `Address` with all the required parameters with correct types',
 'parameters': {'properties': {'country': {'type': 'string'},
   'state': {'type': 'string'},
   'city': {'type': 'string'},
   'street': {'type': 'string'}},
  'required': ['city', 'country', 'state', 'street'],
  'type': 'object'}}

However, if I add a docstring to the type, like

class Address(City):
    """An address"""
    country: str
    state: str
    city: str
    street: str

then the .openai_schema is

{'name': 'Address',
 'description': 'An address',
 'parameters': {'properties': {'country': {'type': 'string'},
   'state': {'type': 'string'},
   'city': {'type': 'string'},
   'street': {'type': 'string'}},
  'required': ['city', 'country', 'state', 'street'],
  'type': 'object'}}

The current default string doesn't really have the same use case as the description when a docstring is present.

I think a better default description would be the empty string ("") or maybe just the class name. In most cases, I think it would be preferable that the language model is given no description of the type than one about the schema generation process.

Typer version too old

Describe the bug
Is there a reason why Typer version ^0.4.0 is used while the latest version is 0.9.0?
It might conflict with other packages that require a more recent version of Typer.

Bounty: Streaming function calls

To be considered, check out: https://replit.com/bounties/@jxnl/streaming-json-parse

I'd like to be able to parse function calls as they stream out for MultiTask when doing streaming function calls. You can use any existing Python library. It must work for nested and deep objects.

Below is some code that won't work, since there's no good way of doing this:

from typing import Generator

from pydantic import BaseModel

class Task(BaseModel):
    id: int
    title: str

# This is your existing generator that yields chunks of JSON string
def json_chunks(json_string):
    for i in range(0, len(json_string), 5):  # replace 5 with the chunk size you want
         chunk = json_string[i:i+5]
         print("yield chunk:", chunk)
         yield chunk

def tasks_from_chunks(json_chunks: Generator[str, None, None]):
     # do something to get a single task_json
     task = Task.parse_raw(**task_json)
     print("yield task", task)
     yield task
     
json_string = '{"tasks":[{"id":1,"title":"task1"},{"id":2,"title":"task2"},{"id":3,"title":"task3"}]}'

for task in tasks_from_chunks(json_chunks(json_string)):
     print(task)

Success criteria

  1. tasks are yielded as soon as they are parsed; therefore task 1 should be yielded before all JSON chunks have been yielded
  2. must contain a few examples to show it works correctly.
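One possible direction, sketched with only the standard library: buffer the chunks and use json.JSONDecoder.raw_decode to pull complete array elements out of the buffer as soon as they close. This is an illustration under assumptions, not a finished solution: items_from_chunks and counting are hypothetical helpers, the array is located by a simple search for the key, and elements are assumed to be objects or arrays (a bare number split across chunks could be decoded too early).

```python
import json
from typing import Iterable, Iterator


def json_chunks(json_string: str, size: int = 5) -> Iterator[str]:
    """Yield the JSON text in fixed-size pieces, as in the snippet above."""
    for i in range(0, len(json_string), size):
        yield json_string[i:i + size]


def items_from_chunks(chunks: Iterable[str], key: str = "tasks") -> Iterator[dict]:
    """Yield each element of the array stored under `key` once it is complete."""
    decoder = json.JSONDecoder()
    buffer = ""
    pos = None  # index in buffer where the next array element may start
    for chunk in chunks:
        buffer += chunk
        if pos is None:
            marker = buffer.find(f'"{key}"')
            start = buffer.find("[", marker) if marker != -1 else -1
            if start == -1:
                continue  # array has not started yet
            pos = start + 1
        while True:
            # Skip whitespace and commas between elements.
            while pos < len(buffer) and buffer[pos] in " \t\r\n,":
                pos += 1
            if pos >= len(buffer) or buffer[pos] == "]":
                break
            try:
                item, pos = decoder.raw_decode(buffer, pos)
            except json.JSONDecodeError:
                break  # element is incomplete; wait for more chunks
            yield item


progress = {"chunks": 0}


def counting(chunks: Iterable[str]) -> Iterator[str]:
    """Pass chunks through while counting how many have been consumed."""
    for chunk in chunks:
        progress["chunks"] += 1
        yield chunk


json_string = '{"tasks":[{"id":1,"title":"task1"},{"id":2,"title":"task2"},{"id":3,"title":"task3"}]}'
total_chunks = len(list(json_chunks(json_string)))

# Record, for each task, how many chunks had been read when it was yielded.
seen = [
    (task["id"], progress["chunks"])
    for task in items_from_chunks(counting(json_chunks(json_string)))
]
```

Each yielded dict could then be validated with the Task model (for example Task(**item)). Nested objects inside an element work because raw_decode only succeeds once the whole element has closed.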

Add LLM based citation

It would be nice to have Fact generated with semantic citations (not the regex-based ones in the cookbook). We can do this with a custom validation function that invokes an LLM call.

Bug - cannot import name 'FieldValidationInfo' from 'pydantic'

Describe the bug
I get the following error: cannot import name 'FieldValidationInfo' from 'pydantic'.

When doing:

from instructor import OpenAISchema

To Reproduce

from instructor import OpenAISchema

Expected behavior

Expected it to not crash

Desktop (please complete the following information):
Version 0.2.8
Macbook Pro - Intel
Chrome

Help: Reorganize module structure

It would be nice to have a structure where there's a directory per example, so we can have a readme.md for each example and a list of evals to run.
