guardrails-ai / guardrails
Adding guardrails to large language models.
Home Page: https://www.guardrailsai.com/docs
License: Apache License 2.0
Describe the bug
A TypeError occurs when I try to use Chat completion models
To Reproduce
Steps to reproduce the behavior:
<rail version="0.1">
<output>
<object name="patient_info">
<string name="gender" description="Patient's gender" />
<integer name="age" format="valid-range: 0 100"/>
<list name="symptoms" description="Symptoms that the patient is currently experiencing. Each symptom should be classified into separate item in the list.">
<object>
<string name="symptom" description="Symptom that a patient is experiencing" />
<string name="affected area" description="What part of the body the symptom is affecting"
format="valid-choices: {['head', 'neck', 'chest']}"
on-fail-valid-choices="reask"
/>
</object>
</list>
<list name="current_meds" description="Medications the patient is currently taking and their response">
<object>
<string name="medication" description="Name of the medication the patient is taking" />
<string name="response" description="How the patient is responding to the medication" />
</object>
</list>
</object>
</output>
<instructions>
You are a helpful assistant only capable of communicating with valid JSON, and no other text.
@json_suffix_prompt_examples
</instructions>
<prompt>
Given the following doctor's notes about a patient,
please extract a dictionary that contains the patient's information.
If the answer doesn't exist in the document, enter `null`.
{{doctors_notes}}
@xml_prefix_prompt
{output_schema}
</prompt>
</rail>
raw_llm_response, validated_output = guard(
    openai.ChatCompletion.create,
    prompt_params={"doctors_notes": doctors_notes},
    engine="chatgpt",
    max_tokens=1024,
    temperature=0.0,
)
Expected behavior
The same behaviour as with completion models
Library version:
0.1.6 and 0.1.7
Additional context
The problem seems to come from the fact that json.loads in run.py raises an error, because the response from the chat model is not a single JSON object or contains text around it.
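As an illustration of a possible workaround (the helper below is hypothetical, not part of guardrails), one could pull the first balanced JSON object out of the chat response before calling json.loads:

import json

def extract_first_json_object(text: str) -> dict:
    """Best-effort: parse the first balanced {...} block in the response.

    Note: this naive scan does not account for braces inside JSON strings.
    """
    start = text.index("{")
    depth = 0
    for i, ch in enumerate(text[start:], start=start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("no balanced JSON object found in response")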
Add support for asynchronous calls in the llm_providers module. Currently, the module only supports calling LLM APIs synchronously, which might lead to blocking behavior.
Steps:
Use openai.ChatCompletion.acreate in place of openai.ChatCompletion.create.
Use asyncio.run to run async LLM API calls from a synchronous context.
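For illustration, a minimal sketch of the async path (openai.ChatCompletion.acreate is OpenAI's async counterpart to create; the surrounding helper is hypothetical):

import asyncio
import openai

async def acall_llm(prompt: str) -> str:
    # acreate is the async version of openai.ChatCompletion.create
    response = await openai.ChatCompletion.acreate(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

# Run the async LLM call from a synchronous context
output = asyncio.run(acall_llm("Return a JSON object with a greeting."))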
While trying to convert a guardrails rail spec into a LangChain PromptTemplate, it breaks with an error:
lxml.etree.XMLSyntaxError: xmlParseEntityRef: no name, line 14, column 48
python = "^3.11"
fastapi = "^0.95.0"
python-dotenv = "^1.0.0"
uvicorn = "^0.22.0"
langchain = "^0.0.158"
guardrails-ai = {git = "https://github.com/ShreyaR/guardrails", rev = "3ab8e20"}
@guard_route.post("/")
async def generate_response_langchain(message: MessageInput):
output_parser = GuardrailsOutputParser.from_rail_string(rail_spec_template)
print("\n\n", output_parser.guard.base_prompt)
prompt = PromptTemplate(
template=output_parser.guard.base_prompt,
input_variables=output_parser.guard.prompt.variable_names,
)
model = OpenAI(
client=openai.ChatCompletion.create,
openai_api_key=settings.OPENAI_API_KEY,
)
output = model(prompt.format_prompt(doctors_notes=message.text).to_string())
print(output)
return output_parser.parse(output)
<rail version="0.1">
<output>
<object name="patient_info">
<string name="gender" description="Patient's gender" />
<integer name="age" format="valid-range: 0 100" />
<string name="symptoms" description="Symptoms that the patient is currently experiencing" />
</object>
</output>
<prompt>
Given the following doctor's notes about a patient, please extract a dictionary that contains the patient's information.
49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares. Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream
@complete_json_suffix_v2
</prompt>
</rail>
I initially opened up a discussion on Discord.
At the moment each guardrails call appends some generic examples of JSON parsing at the end of the message, optimizing for the normal completion context. This works fine for simple use cases, but being able to specify our own few-shot examples will allow us to accurately extract into more complicated rail schemas, and better control how to handle parsing edge cases.
Chat completions allow you to specify messages with system messages, assistant messages, and user messages. The system message contains a description of the task, and the alternating user/assistant messages contain examples of the task being completed, up to the last user message, which prompts an assistant message.
For example, a message chain can be constructed as such:
messages = [{"role": "system", "content": system_prompt}]
for example_prompt, example_response in examples:
messages.append({"role": "user", "content": example_prompt})
messages.append({"role": "assistant", "content": example_response})
messages.append({"role": "user", "content": prompt})
An example rail spec to specify this could look like:
<rail version="0.1">
<output>
...
</output>
<system-prompt>
You are a helpful assistant, able to express yourself purely through JSON, strictly and precisely adhering to the provided XML schemas.
@complete_json_suffix_v2
</system-prompt>
<example>
<prompt>
example prompt 1
</prompt>
<response>
example response 1
</response>
</example>
<example>
<prompt>
example prompt 2
</prompt>
<response>
example response 2
</response>
</example>
<prompt>
prompt text
</prompt>
</rail>
Describe the bug
I am trying to generate pre-screening questions for recruiters using the job description, but it gives inconsistent results. Since I am using OpenAI, I don't get the exact same error each time; the errors I see are mentioned below.
To Reproduce
rail_str = """
<rail version="0.1">
<output>
<list name="pre_screening_questions" description="Generate the list of pre-screening questions based on given job description." format="length: 2 10" on-fail-length="noop">
<object>
<integer name="qa_id" description="The question's id." format="1-indexed" />
<string name="question" description="The Pre-screening Question text." />
<string name="answer" description="The Pre-screening Answer text." />
</object>
</list>
</output>
<prompt>
Generate a dataset of pre-screening questions and brief answers to shortlist potential candidates that matches with the following job description:
{{job_description}}. Return a JSON that follows the correct schema.
@complete_json_suffix</prompt>
</rail>
"""
guard = gd.Guard.from_rail_string(rail_str)
raw_llm_response, validated_response = guard(
    openai.ChatCompletion.create,
    prompt_params={"job_description": job_description_text},
    model="gpt-3.5-turbo",
    max_tokens=3000,
    temperature=0,
)
print(validated_response)
Errors
I get different errors when I re-run the program.
Error-1:
Traceback (most recent call last):
File "/Users/ankurkhandelwal/Desktop/Python/Github/ChatPdf/pre_screen_question_rail.py", line 41, in
raw_llm_response, validated_response = guard(openai.ChatCompletion.create, prompt_params={"job_description": job_description_text},
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/guard.py", line 166, in call
guard_history = runner(prompt_params=prompt_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 90, in call
validated_output, reasks = self.step(
^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 147, in step
validated_output = self.validate(index, output_as_dict, output_schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 266, in validate
validated_output = output_schema.validate(output_as_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/schema.py", line 332, in validate
validated_response = self[field].validate(
^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/datatypes.py", line 269, in validate
schema = validator.validate_with_correction(key, value, schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/validators.py", line 204, in validate_with_correction
return self.validate(key, value, schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/validators.py", line 609, in validate
last_val = [value[-1]]
~~~~~^^^^
KeyError: -1
Error-2:
Traceback (most recent call last):
File "/Users/ankurkhandelwal/Desktop/Python/Github/ChatPdf/pre_screen_question_rail.py", line 41, in
raw_llm_response, validated_response = guard(openai.ChatCompletion.create, prompt_params={"job_description": job_description_text},
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/guard.py", line 166, in call
guard_history = runner(prompt_params=prompt_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 90, in call
validated_output, reasks = self.step(
^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 147, in step
validated_output = self.validate(index, output_as_dict, output_schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/run.py", line 266, in validate
validated_output = output_schema.validate(output_as_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/schema.py", line 332, in validate
validated_response = self[field].validate(
^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/datatypes.py", line 278, in validate
value = item_type.validate(i, item, value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/guardrails/datatypes.py", line 319, in validate
child_key, value.get(child_key, None), value
^^^^^^^^^
AttributeError: 'str' object has no attribute 'get'
Library version:
guardrails_ai==0.1.6 and 0.1.7
Additional context
TLDR: Seems that LLM output can cause trouble if it evaluates into valid nested dictionaries.
An example where gpt-3.5-turbo produces garbage output that leads into unhandled exceptions (basically the getting started demo):
"""Basic usage of guardrails."""
import logging
from rich import print
import guardrails as gd
import openai
logging.basicConfig(level=logging.DEBUG)
def main():
"""Main entry point for the script."""
rail_spec = """
<rail version="0.1">
<output>
<object name="bank_run" format="length: 2">
<string
name="explanation"
description="A paragraph about what a bank run is."
format="length: 200 240"
on-fail-length="noop"
/>
<url
name="follow_up_url"
description="A web URL where I can read more about bank runs."
required="true"
format="valid-url"
on-fail-valid-url="filter"
/>
</object>
</output>
<prompt>
Explain what a bank run is in a tweet.
@xml_prefix_prompt
{output_schema}
@json_suffix_prompt_v2_wo_none
</prompt>
</rail>
"""
guard = gd.Guard.from_rail_string(rail_spec)
print(guard.base_prompt)
# Wrap the OpenAI API call with the `guard` object
raw_llm_output, validated_output = guard(
openai.ChatCompletion.create,
model="gpt-3.5-turbo",
max_tokens=1024,
temperature=0.3,
)
# Print the validated output from the LLM
print(validated_output)
print("Raw output is..")
print(raw_llm_output)
if __name__ == "__main__":
main()
Which leads to the following exception:
Traceback (most recent call last):
raw_llm_output, validated_output = guard(
File "/home/mikko/programs/guardrails/guardrails/guard.py", line 135, in __call__
guard_history = runner(prompt_params=prompt_params)
File "/home/mikko/programs/guardrails/guardrails/run.py", line 87, in __call__
validated_output, reasks = self.step(
File "/home/mikko/programs/guardrails/guardrails/run.py", line 139, in step
validated_output = self.validate(index, output_as_dict, output_schema)
File "/home/mikko/programs/guardrails/guardrails/run.py", line 228, in validate
validated_output = output_schema.validate(output_as_dict)
File "/home/mikko/programs/guardrails/guardrails/schema.py", line 330, in validate
validated_response = self[field].validate(
File "/home/mikko/programs/guardrails/guardrails/datatypes.py", line 285, in validate
value = child_data_type.validate(
File "/home/mikko/programs/guardrails/guardrails/datatypes.py", line 71, in validate
schema = validator.validate_with_correction(key, value, schema)
File "/home/mikko/programs/guardrails/guardrails/validators.py", line 206, in validate_with_correction
return self.validate(key, value, schema)
File "/home/mikko/programs/guardrails/guardrails/validators.py", line 628, in validate
last_val = [value[-1]]
KeyError: -1
The culprit is (probably) the response from GPT, which contains nested dictionaries (it fails at following this specific prompt):
DEBUG:guardrails.validators:Validating {'value': "A bank run is a situation where a large number of customers withdraw their deposits from a bank, often due to concerns about the bank's solvency or stability. This can lead to a liquidity crisis for the bank, as it may not have enough cash on hand to meet all of the withdrawal requests. Bank runs can be contagious, as news of one bank run can cause customers of other banks to panic and withdraw their deposits as well.", 'format': 'length', 'min': 200, 'max': 240} is in length range 200 - 240...
where {"value": "A bank run..."} is the actual value of the response. Downstream, for validation, this gets evaluated into a dictionary at Runner.call, where we have output_as_dict = json.loads(output, strict=False).
Another poisonous output:
{
    "bank_run": {
        "explanation": {
            "length": {
                "min": 200,
                "max": 240
            }
        },
        "follow_up_url": {
            "valid-url": true,
            "required": true
        }
    }
}
It seems, though, that some validators do expect strings, leading directly to the above errors. For example, the ValidLength validator calls len(value), which will be 1 for the response above, leading directly to the above validation error.
Since the example also tries to validate the URL, that will also fail unexpectedly for the same reasons:
DEBUG:guardrails.validators:Validating {'valid-url': '', 'required': True} is a valid URL...
Traceback (most recent call last):
raw_llm_output, validated_output = guard(
File "/home/mikko/programs/guardrails/guardrails/guard.py", line 135, in __call__
guard_history = runner(prompt_params=prompt_params)
File "/home/mikko/programs/guardrails/guardrails/run.py", line 91, in __call__
validated_output, reasks = self.step(
File "/home/mikko/programs/guardrails/guardrails/run.py", line 143, in step
validated_output = self.validate(index, output_as_dict, output_schema)
File "/home/mikko/programs/guardrails/guardrails/run.py", line 234, in validate
validated_output = output_schema.validate(output_as_dict)
File "/home/mikko/programs/guardrails/guardrails/schema.py", line 330, in validate
validated_response = self[field].validate(
File "/home/mikko/programs/guardrails/guardrails/datatypes.py", line 285, in validate
value = child_data_type.validate(
File "/home/mikko/programs/guardrails/guardrails/datatypes.py", line 71, in validate
schema = validator.validate_with_correction(key, value, schema)
File "/home/mikko/programs/guardrails/guardrails/validators.py", line 206, in validate_with_correction
return self.validate(key, value, schema)
File "/home/mikko/programs/guardrails/guardrails/validators.py", line 721, in validate
response = requests.get(value)
File "/home/mikko/.pyenv/versions/guardrails3.10/lib/python3.10/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/home/mikko/.pyenv/versions/guardrails3.10/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/home/mikko/.pyenv/versions/guardrails3.10/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/home/mikko/.pyenv/versions/guardrails3.10/lib/python3.10/site-packages/requests/sessions.py", line 695, in send
adapter = self.get_adapter(url=request.url)
File "/home/mikko/.pyenv/versions/guardrails3.10/lib/python3.10/site-packages/requests/sessions.py", line 792, in get_ada
pter
raise InvalidSchema(f"No connection adapters were found for {url!r}")
requests.exceptions.InvalidSchema: No connection adapters were found for "{'valid-url': '', 'required': True}"
From a quick glance, it seems that we may only want the top level of the LLM output to be parsed into a dictionary. Alternatively, the validators should do more checking on their input.
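A sketch of the second option (a hypothetical helper, not current guardrails code): reject non-string input up front so the failure is explicit rather than a KeyError deep inside a validator:

def check_validator_input(validator_name: str, key: str, value) -> None:
    # Nested dicts leaking in from json.loads should fail loudly here,
    # instead of as KeyError: -1 in ValidLength or InvalidSchema in ValidUrl.
    if not isinstance(value, str):
        raise TypeError(
            f"{validator_name} expected a string for '{key}', "
            f"got {type(value).__name__}: {value!r}"
        )

# e.g. at the top of ValidLength.validate / ValidUrl.validate:
# check_validator_input("ValidLength", key, value)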
I've noticed that in our docs we say:
Supported types
Guardrails supports many data types, including: string, integer, float, boolean, list, object, url, email, and many more.
But boolean doesn't work as defined in the Guardrails spec. Instead, it seems like we have to use bool.
Should we move to boolean, or update the documentation to ensure it maps to the code?
Seems true of both the README and the GitHub repo description.
Add support for more LLM API providers in guardrails/llm_providers.py
Describe the bug
To Reproduce
Steps to reproduce the behavior:
openai_chat = ClientConnection(
    client_name="openaichat",
    client_connection=os.environ["OPENAI_API_KEY"],
    engine="gpt-3.5-turbo",
)
manifest = Manifest(
    client_pool=[openai_chat],
    cache_name="sqlite",
    cache_connection="my_manifest_cache.db",
)
chat_dict = [
    {"role": "system", "content": "You are a helpful assistant."},
]
raw_llm_output, guardrail_output = guard(
    manifest,
    prompt_params={
        "materials": "3 essence of fire, 2 black steel, 1 diamond"
    },
    messages=chat_dict,
    max_tokens=100,
    temperature=0.7,
)
Expected behavior
A clear and concise description of what you expected to happen.
Library version:
0.1.6
Additional context
Add any other context about the problem here.
Choose one of the nested options. See https://discord.com/channels/1085077079697150023/1086092225122926614/1087886515176210564 for more details
The Discord link in the README, the Twitter bio, and the repo's description is not working; it says invalid invitation link.
Currently, most of the validators in guardrails are deterministic checks.
As a framework, guardrails can also support probabilistic validations. As an example: for a string output, check if the sentiment in the output is positive or negative.
Opening this issue to brainstorm ideas about what would be good deterministic and non-deterministic validators to add.
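To make the sentiment example concrete, here is a purely illustrative sketch of a probabilistic check (a trivial lexicon stands in for a real sentiment model):

POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "terrible", "awful", "sad", "hate"}

def sentiment_is_positive(text: str, threshold: float = 0.5) -> bool:
    """Pass if at least `threshold` of the sentiment-bearing words are positive."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return True  # no sentiment signal; treat as passing
    return pos / (pos + neg) >= threshold

assert sentiment_is_positive("This library is great and I love it")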
Hello,
We are trying to integrate Guardrails with Langchain, but encountered an issue when a reask is required, where it would throw an exception of
TypeError: 'NoneType' object is not callable
Here is the most basic reproducible snippet of the code:
from langchain.llms import OpenAI
from langchain.output_parsers import GuardrailsOutputParser
from langchain.prompts import PromptTemplate
def main():
    llm = OpenAI()
    rail_spec = """
<rail version="0.1">
<output>
<object name="patient_info">
<string name="gender" description="Patient's gender" />
<integer name="age" format="valid-range: 0 100"/>
<list name="symptoms" description="Symptoms that the patient is currently experiencing. Each symptom should be classified into separate item in the list.">
<object>
<string name="symptom" description="Symptom that a patient is experiencing" />
<string name="affected area" description="What part of the body the symptom is affecting"
format="valid-choices: {['head', 'neck', 'chest']}"
on-fail-valid-choices="reask"
/>
</object>
</list>
<list name="current_meds" description="Medications the patient is currently taking and their response">
<object>
<string name="medication" description="Name of the medication the patient is taking" />
<string name="response" description="How the patient is responding to the medication" />
</object>
</list>
</object>
</output>
<prompt>
Given the following doctor's notes about a patient, please extract a dictionary that contains the patient's information.
{{doctors_notes}}
@complete_json_suffix_v2
</prompt>
</rail>
"""
    output_parser = GuardrailsOutputParser.from_rail_string(rail_spec)
    prompt = PromptTemplate(
        template=output_parser.guard.base_prompt,
        input_variables=output_parser.guard.prompt.variable_names,
    )
    doctors_notes = """
49 y/o Male with chronic macular rash to face & hair, worse in beard, eyebrows & nares.
Itchy, flaky, slightly scaly. Moderate response to OTC steroid cream
"""
    output = llm(prompt.format_prompt(doctors_notes=doctors_notes).to_string())
    print(output_parser.parse(output))


if __name__ == "__main__":
    main()
Any advice would be greatly appreciated.
Describe the bug
I encountered an XMLSyntaxError when trying to reproduce the text summarization example.
To Reproduce
Steps to reproduce the behavior:
Rail spec:
<rail version="0.1">
<script language='python'>
document = open("search.txt", "r").read()
</script>
<output>
<string
name="summary"
description="Summarize the given document faithfully."
format="similar-to-document: {document}, 0.60"
on-fail-similar-to-document="filter"
/>
</output>
<prompt>
Summarize the following document:
{{document}}
@complete_json_suffix
</prompt>
</rail>
(I only changed the document path, which is a valid path for me.)
Running the command
guard = gd.Guard.from_rail_string(rail_str)
gives the following error:
Traceback (most recent call last):
File /databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py:3378 in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File <command-3862015891448427>:27
guard = gd.Guard.from_rail_string(rail_str)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-84b67d00-0c2e-469a-8873-18f5bc500df3/lib/python3.9/site-packages/guardrails/guard.py:132 in from_rail_string
return cls(Rail.from_string(rail_string), num_reasks=num_reasks)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-84b67d00-0c2e-469a-8873-18f5bc500df3/lib/python3.9/site-packages/guardrails/rail.py:108 in from_string
return cls.from_xml(ET.fromstring(string, parser=XMLPARSER))
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-84b67d00-0c2e-469a-8873-18f5bc500df3/lib/python3.9/site-packages/guardrails/rail.py:139 in from_xml
raw_output_schema = ET.fromstring(raw_output_schema, parser=XMLPARSER)
File src/lxml/etree.pyx:3257 in lxml.etree.fromstring
File src/lxml/parser.pxi:1916 in lxml.etree._parseMemoryDocument
File src/lxml/parser.pxi:1796 in lxml.etree._parseDoc
File src/lxml/parser.pxi:1085 in lxml.etree._BaseParser._parseUnicodeDoc
File src/lxml/parser.pxi:618 in lxml.etree._ParserContext._handleParseResultDoc
File src/lxml/parser.pxi:728 in lxml.etree._handleParseResult
File src/lxml/parser.pxi:657 in lxml.etree._raiseParseError
File <string>:6
XMLSyntaxError: attributes construct error, line 6, column 196
Expected behavior
The Guard should be created with no errors.
Library version:
Installed directly from GitHub, main branch.
Additional context
Removing the
format="similar-to-document: {document}, 0.60"
on-fail-similar-to-document="filter"
attributes doesn't throw an error, but that would defeat the intended purpose.
FWIW, switching to the saliency-check validator throws the same error.
From discord thread
Description
The output schema may contain variables. Since the schema is transpiled into a string and injected into the prompt at runtime, it should be possible to support variables within the transpiled output schema.
Why is this needed
[If you have a concrete use case, add details here.]
Implementation details
End result
<output>
<string
name="context"
description="Context of the question ${document} asked e.g comparision, recommendation, other, follow_up?"
format="length: 200 240"
on-fail-length="noop"
required="true"
/>
</output>
Open questions:
Should these variables be filled in via prompt_params while calling Guard.__call__?
I think the Discord invite link has expired.
Description
Each stage of the Run class should call a list of optional callbacks before and after the stage is completed.
Why is this needed
Implementation details
- A Callback class that has methods like before_prepare, after_prepare, before_call, etc. Each method should specify the input arguments that are available within those functions.
- The Guard class can take in a list of callbacks at initialization, which it passes to the Run class at initialization.
- The Run class should then call each callback at the appropriate location.
End result
- Users only need to define the methods that they care about in their derived class of the Callback class (e.g. if they care about tracking token usage, they only need to specify before_call and after_call).
- When initializing the Guard class, the user should pass in the list of callbacks, which will be automatically called during the run.
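A minimal sketch of what that could look like (names come from this proposal; nothing here is existing guardrails code):

class Callback:
    """Base class; subclasses override only the hooks they care about."""

    def before_prepare(self, **kwargs):
        pass

    def after_prepare(self, **kwargs):
        pass

    def before_call(self, prompt=None, **kwargs):
        pass

    def after_call(self, output=None, **kwargs):
        pass


class CallCounter(Callback):
    """Example: only cares about the call stage."""

    def __init__(self):
        self.calls = 0

    def before_call(self, prompt=None, **kwargs):
        self.calls += 1

    def after_call(self, output=None, **kwargs):
        print(f"call #{self.calls} returned {len(output or '')} characters")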
This code seems a little bit unintuitive. I would expect ValidUrl to test whether the string URL is valid, not make an API call. Perhaps we need two checks here: ValidUrl and EndpointIsOnline?
Add an Enum datatype to guardrails/datatypes.py. While this could be implemented with a string datatype and a valid-choices formatter, an Enum is a nice abstraction that would be good to have out of the box.
hey, another small parsing issue.
My sample case:
from langchain.output_parsers import GuardrailsOutputParser
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate, LLMChain
rail_spec = """
<rail version="0.1">
<output>
<string
name="message"
description="the message the user wants to send"
/>
<bool
name="is_message_satisfactory"
description="Whether indicated they are satisfied with the draft message."
/>
</output>
<prompt>
Extract the message the user wants to send. If the message does not exist, enter an empty string.
Determine if the user is satisfied with the message. If the user is satisfied, enter `true`. If the user is not satisfied, enter `false`. If the answer does not exist, enter `false`.
```
{{transcript}}
```
@xml_prefix_prompt
{output_schema}
@json_suffix_prompt
</prompt>
</rail>
"""
output_parser = GuardrailsOutputParser.from_rail_string(rail_spec)
transcript = """
Person: Hi, can you help me draft a business letter?
AI Assistant: Of course, what type of letter are you looking to draft?
Person: I need to send a formal email to a potential client to introduce my business.
AI Assistant: Great. Can you give me some more details about your business and the purpose of the email?
Person: Sure, my business provides marketing services and I want to introduce our services to the potential client and request a meeting to discuss potential collaboration.
AI Assistant: Understood. Would you like me to provide you with a template to start with?
Person: Yes, that would be helpful.
AI Assistant: Okay, here's a draft you can start with:
Dear [Client Name],
I hope this email finds you well. My name is [Your Name], and I am reaching out from [Your Business Name], a marketing services company. We specialize in helping businesses like yours increase their brand awareness and drive sales through various marketing strategies.
I would like to take this opportunity to introduce our services to you and discuss potential collaboration opportunities. I would like to request a meeting to discuss in more detail how we can help your business grow.
Please let me know if this is something that interests you, and we can schedule a time that works best for you.
Thank you for considering [Your Business Name]. I look forward to hearing back from you.
Best regards,
[Your Name]
"""
chat_prompt = PromptTemplate(
    template=output_parser.guard.base_prompt, input_variables=["transcript"]
)
chain = LLMChain(llm=ChatOpenAI(temperature=0), prompt=chat_prompt)
output = chain.run(transcript=transcript)
output_parser.parse(output)
I get the following:
File "/usr/local/lib/python3.11/site-packages/guardrails/datatypes.py", line 123, in validate
value = self.from_str(value)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/guardrails/datatypes.py", line 185, in from_str
if s.lower() == "true":
^^^^^^^
AttributeError: 'bool' object has no attribute 'lower'
It's notable that the LLM is providing good output; these are the values within the last frame:
{'s': False, 'self': <class 'guardrails.datatypes.Boolean'>}
I'm not yet too familiar with the codebase, but my hunch is that we don't need to parse bools, assuming the LLM produces valid JSON. Alternatively, if the goal is to support non-JSON formats down the line, it may be worthwhile to pattern match on the type of s to route to the right parsing logic.
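A sketch of that pattern matching (hypothetical; the real logic lives in the Boolean datatype's from_str in datatypes.py):

def bool_from_value(s) -> bool:
    # json.loads may already have produced a real bool; pass it through
    # instead of calling .lower() on it.
    if isinstance(s, bool):
        return s
    if isinstance(s, str):
        return s.lower() == "true"
    raise TypeError(f"cannot interpret {s!r} as a boolean")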
This is really cool!
I've been learning about the method you are using to create the JSON outputs. From what I understand, the method involves asking the model to both generate the content and format it into JSON in one go. This seems like it would result in the answer being biased towards JSON-file vocabulary.
Therefore I'm suggesting that this is split into two LM queries:
1. One query to navigate into a better subset of the LM training data.
2. A second query for the desired output file formatting.
Consider the below instruction + rail spec:
query = "new booking from hong kong to london with 2 pallets in a 40 HC"
rail_spec = """
<rail version="0.1">
<output strict="true">
<object name="shipment" format="length: 4">
<string name="origin_city" description="The city name of the origin of the shipment" />
<string name="orirgin_country" description="The country name of the origin of the shipment" />
<string name="destination_city" description="The city name of the destination of the shipment" />
<string name="destination_country" description="The country name of the destination of the shipment (can be taken from the destination_city)" />
</object>
</output>
<prompt>
Generate a valid JSON object for a shipment given a users query:
{{query}}
@complete_json_suffix
</prompt>
</rail>
"""
This will produce the following JSON:
{'origin_city': 'Hong Kong', 'origin_country': 'China', 'destination_city': 'London', 'destination_country': 'United Kingdom', 'pallets': 2, 'container_type': '40 HC'}
As you can see, pallets and container_type were added to the output, even though they are not defined in my output schema.
I just updated the spec used in this PR.
Instead of using fix as the on-fail action, I used filter.
It failed with the following error:
160 """Validate a output against the schema.
161
162 Args:
(...)
168 path to the reasked element, and the ReAsk object.
169 """
171 validated_response = deepcopy(output)
--> 173 for field, value in validated_response.items():
174 if field not in schema:
175 logger.debug(f"Field {field} not in schema.")
RuntimeError: dictionary changed size during iteration
I checked the codebase, and it looks like it is happening because of the following line:
validated_response = schema[field].validate(
    field, value, validated_response
)
Could we automatically fix this? (Try to put all the other kwargs into the desired choice attribute.)
At the least, it should reask, as my schema specifies.
This is my schema:
<choice
name="action"
on-fail-choice="reask"
>
<case
name="new_file"
>
<object
name="new_file"
>
<string
name="filepath"
description="Path to the newly created file."
required="true"
/>
<string
name="description"
description="Description of the contents of the new file."
required="true"
/>
</object>
</case>
<case
name="edit_file"
>
<object
name="edit_file"
>
<string
name="filepath"
description="Path to the file to be edited."
required="true"
/>
<string
name="description"
description="Description of the changes to be made to the file."
required="true"
/>
<integer
name="start_line"
description="The line number of the first line of the hunk to be edited."
format="positive"
required="false"
on-fail="noop"
/>
<integer
name="end_line"
description="The line number of the last line of the hunk to be edited. Keep the hunk as short as possible while fulfilling the description."
format="positive"
required="false"
on-fail="noop"
/>
</object>
</case>
<case
name="finished"
>
<string
name="commit_message"
description="A more appropriate commit message based on the actions taken."
required="true"
/>
</case>
</choice>
This is the generated schema:
<output>
<string name="action" choices="new_file,edit_file,finished"/>
<object name="new_file" description="new_file" if="action==new_file">
<string name="filepath" description="Path to the newly created file." required="true"/>
<string name="description" description="Description of the contents of the new file." required="true"/>
</object>
<object name="edit_file" description="edit_file" if="action==edit_file">
<string name="filepath" description="Path to the file to be edited." required="true"/>
<string name="description" description="Description of the changes to be made to the file." required="true"/>
<integer name="start_line" description="The line number of the first line of the hunk to be edited." format="positive" required="false" on-fail="noop"/>
<integer name="end_line" description="The line number of the last line of the hunk to be edited. Keep the hunk as short as possible while fulfilling the description." format="positive" required="false" on-fail="noop"/>
</object>
<string name="finished" description="commit_message: A more appropriate commit message based on the actions taken." required="true" if="action==finished"/>
</output>
This is the generated string:
{
    "action": "edit_file",
    "filepath": "autopr/utils/repo.py",
    "description": "Update the function call at line 124 to use the new tiktoken implementation.",
    "start_line": 124,
    "end_line": 124
}
This is the stack trace fragment:
File "/github/workspace/autopr/services/rail_service.py", line 56, in run_rail_object
raw_o, dict_o = pr_guard(
File "/venv/lib/python3.9/site-packages/guardrails/guard.py", line 135, in __call__
guard_history = runner(prompt_params=prompt_params)
File "/venv/lib/python3.9/site-packages/guardrails/run.py", line 86, in __call__
validated_output, reasks = self.step(
File "/venv/lib/python3.9/site-packages/guardrails/run.py", line 136, in step
validated_output = self.validate(index, output_as_dict, output_schema)
File "/venv/lib/python3.9/site-packages/guardrails/run.py", line 237, in validate
validated_output = output_schema.validate(output_as_dict)
File "/venv/lib/python3.9/site-packages/guardrails/schema.py", line 328, in validate
validated_response = self[field].validate(
File "/venv/lib/python3.9/site-packages/guardrails/datatypes.py", line 344, in validate
selected_value = schema[selected_key]
KeyError: 'edit_file'
Suggestion from @devenbhooshan in #50.
Editing to trigger autopr
Logging issue
Current logging is configured to be saved under the project code path, which is bad practice.
When we initialize guardrails, we should be able to define the logging path we want to use.
Currently, a few in-context examples are provided in the @complete_json_suffix primitive. However, these examples may not map very well to the output schema in the spec.
This issue should:
The full logs of the calls of the Guard class are available (see https://shreyar.github.io/guardrails/logs/).
However, these logs are not formatted very well, and it would be helpful to provide abstractions for common workflows like:
Describe the bug
Installed guardrails-ai; this happened with both SQLAlchemy versions 1.4.x and 2.x, after I updated to try to fix this.
179 class SqlDocument(Base):
180 __tablename__ = "documents"
--> 182 id: orm.Mapped[int] = orm.mapped_column(primary_key=True)
183 page_num: orm.Mapped[int] = orm.mapped_column(
184 sqlalchemy.Integer, primary_key=True
185 )
186 text: orm.Mapped[str] = orm.mapped_column(sqlalchemy.String)
AttributeError: module 'sqlalchemy.orm' has no attribute 'mapped_column'
To Reproduce
Steps to reproduce the behavior:
from guardrails.document_store import DocumentStoreBase, EphemeralDocumentStore
Expected behavior
No exception
Library version:
Version 0.1.6 (latest)
Getting the following error when running from guardrails.utils.pydantic_utils import register_pydantic on 0.1.5:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 from guardrails.utils.pydantic_utils import register_pydantic
File ~/Library/Caches/pypoetry/virtualenvs/ar-estimator-lHe8_aOL-py3.9/lib/python3.9/site-packages/guardrails/utils/pydantic_utils.py:15, in <module>
12 import logging
13 from typing import TYPE_CHECKING, Dict
---> 15 from griffe.dataclasses import Docstring
16 from griffe.docstrings.parsers import Parser, parse
18 griffe_docstrings_google_logger = logging.getLogger("griffe.docstrings.google")
ModuleNotFoundError: No module named 'griffe'
Describe the bug
If I create a choice variable in the rail file, but the user does not give any information about this variable, a KeyError is thrown when the choice variable is validated.
Expected behavior
Add the line of code that I suggest below, or create a way to handle Nones
Library version:
0.1.6
Additional context
I put the following line of code in the validation function of choice and this fixed the bug:
if value is None: return schema
Referring to reask: it might be useful to have a max retry count or similar.
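For what it's worth, the tracebacks elsewhere on this page show from_rail_string accepting a num_reasks argument, which appears to bound reask attempts (an assumption; untested):

import guardrails as gd

rail_str = """
<rail version="0.1">
<output>
<string name="test" />
</output>
<prompt>
Fill test string
@complete_json_suffix_v2
</prompt>
</rail>
"""

# Assumption: num_reasks caps how many times guardrails will reask the LLM.
guard = gd.Guard.from_rail_string(rail_str, num_reasks=2)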
First, thanks for putting this library together, it's super useful.
I'm trying to use guardrails with langchain
and am running into a parsing error.
Here's my code:
from langchain.output_parsers import GuardrailsOutputParser
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate, LLMChain
rail_spec = """
<rail version="0.1">
<output>
<string
name="message"
description="the message the user wants to send"
/>
</output>
<prompt>
Given the following transcript, answer the following questions. If the answer doesn't exist in the transcript, enter `None`.
```
{{transcript}}
```
@xml_prefix_prompt
{output_schema}
@json_suffix_prompt
</prompt>
</rail>
"""
output_parser = GuardrailsOutputParser.from_rail_string(rail_spec)
transcript = """
Person: Hi, can you help me draft a business letter?
AI Assistant: Of course, what type of letter are you looking to draft?
Person: I need to send a formal email to a potential client to introduce my business.
AI Assistant: Great. Can you give me some more details about your business and the purpose of the email?
Person: Sure, my business provides marketing services and I want to introduce our services to the potential client and request a meeting to discuss potential collaboration.
AI Assistant: Understood. Would you like me to provide you with a template to start with?
Person: Yes, that would be helpful.
AI Assistant: Okay, here's a draft you can start with:
Dear [Client Name],
I hope this email finds you well. My name is [Your Name], and I am reaching out from [Your Business Name], a marketing services company. We specialize in helping businesses like yours increase their brand awareness and drive sales through various marketing strategies.
I would like to take this opportunity to introduce our services to you and discuss potential collaboration opportunities. I would like to request a meeting to discuss in more detail how we can help your business grow.
Please let me know if this is something that interests you, and we can schedule a time that works best for you.
Thank you for considering [Your Business Name]. I look forward to hearing back from you.
Best regards,
[Your Name]
"""
chat_prompt = PromptTemplate(template=output_parser.guard.base_prompt, input_variables=["transcript"])
chain = LLMChain(llm=ChatOpenAI(temperature=0), prompt=chat_prompt)
output = chain.run(transcript=transcript)
output_parser.parse(output) # <- this is where the error occurs
I get the following:
File "/usr/local/lib/python3.11/site-packages/guardrails/utils/reask_utils.py", line 265, in sub_reasks_with_fixed_values
value[key] = sub_reasks_with_fixed_values(value)
~~~~~^^^^^
TypeError: 'str' object does not support item assignment
The raw unparsed output looks reasonable too:
{
    "output": {
        "string": {
            "@name": "message",
            "@description": "the message the user wants to send",
            "#text": "Dear [Client Name],\n\nI hope this email finds you well. My name is [Your Name], and I am reaching out from [Your Business Name], a marketing services company. We specialize in helping businesses like yours increase their brand awareness and drive sales through various marketing strategies.\n\nI would like to take this opportunity to introduce our services to you and discuss potential collaboration opportunities. I would like to request a meeting to discuss in more detail how we can help your business grow.\n\nPlease let me know if this is something that interests you, and we can schedule a time that works best for you.\n\nThank you for considering [Your Business Name]. I look forward to hearing back from you.\n\nBest regards,\n\n[Your Name]"
        }
    }
}
I'm running langchain==0.0.113 and guardrails-ai==0.1.1. I've tried variations of the schema, including using objects instead of flat values. I also tried different JSON suffixes to no avail.
There's a few key differences in my example compared to the langchain docs for guardrails, though I don't think they should matter: I use the ChatOpenAI rather than the OpenAI model.

From the discord:
How do I create a bool variable that can also be None or NA? Right now, every time I give an input that doesn't contain the variable, it defaults to "False", but I want it to default to "None" when the user doesn't mention the variable.
I tried adding some explanation in the prompt and in the description of the variable, but this doesn't seem to take effect. I also tried playing around with the "on-fail" tag, but to no avail.
When the outcome is not known or not available in the generated text, it should be possible to support None values.
I am following this example and I keep getting this error; it seems the utils file in this repo doesn't actually contain the extract_prompt_from_xml function.
File "/Users/mac/Documents/GitHub/datagpt/guard.py", line 77, in <module>
fmt_qa_tmpl = output_parser.format(DEFAULT_TEXT_QA_PROMPT_TMPL)
File "/Users/mac/Documents/GitHub/datagpt/venv/lib/python3.10/site-packages/gpt_index/output_parsers/guardrails.py", line 84, in format
from guardrails.utils.reask_utils import extract_prompt_from_xml
ImportError: cannot import name 'extract_prompt_from_xml' from 'guardrails.utils.reask_utils' (/Users/mac/Documents/GitHub/datagpt/venv/lib/python3.10/site-packages/guardrails/utils/reask_utils.py)
See #12 for an example where the outermost JSON object is missing the shipment key, and so dictionary validation is skipped.
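A sketch of a pre-validation check that would surface this instead of silently skipping (a hypothetical helper, not the library's API):

def assert_top_level_keys(output_as_dict: dict, schema_keys: set) -> None:
    # If the LLM drops the outermost key (e.g. "shipment"), fail loudly
    # rather than skipping dictionary validation.
    missing = schema_keys - set(output_as_dict)
    if missing:
        raise ValueError(f"LLM output is missing top-level keys: {sorted(missing)}")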
Describe the bug
I am using guardrails-ai for a small Python project. I have followed the getting started guide and modified the .rail spec and prompt as per my requirements.
The code snippet below is taken from the getting started guide.
# Print the validated output from the LLM
print(validated_output)  # prints None
Ideally I wanted a JSON string, but it prints None on stdout.
When I view the logs, I see that guardrails-ai was able to get the output from the LLM but was not able to give it back to my Python code; the logs showed this error: module 'numpy' has no attribute 'bool'.
To Reproduce
Steps to reproduce the behavior:
<rail version="0.1">
<output>
<string name="epf" description="Employee Provident Fund Amount (EPF) per annum" />
<string name="gratuity" description="Gratuity per annum" />
<string name="medialInsurance" description="Medical Insurance per annum" />
<string name="termInsurance" description="Term Insurance per annum" />
<string name="ctc" description="Cost To Company per annum" />
<object name="miscellaneous" description="Cost To Company per annum">
</object>
</output>
<prompt>
I have shared sample data of offer letter which has a CTC amount and it's breakdown in it
{{table}}
@complete_json_suffix_v2
</prompt>
</rail>
import os

import tabula
import openai
import guardrails as gd

# Get the path to the PDF file
pdf_file_path = "/home/sharad/personal/test-python-salary-gpt/test.pdf"

# Extract the table from the PDF file
table = tabula.read_pdf(pdf_file_path)

promt = """
${table}"""
print(promt.format(table=table))

guard = gd.Guard.from_rail('spec.rail')

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = ""

# Wrap the OpenAI API call with the `guard` object
raw_llm_output, validated_output = guard(
    openai.Completion.create,
    prompt_params={"table": promt.format(table=table)},
    engine="text-davinci-003",
    max_tokens=1024,
    temperature=0.5,
)

# Print the validated output from the LLM
print(validated_output)
I get output as None
Expected behavior
I should have gotten JSON output from the print(validated_output) statement.
Library version:
Guardrails version: 0.1.6
Additional context
Here are the logs of guardrails:
Output when I run my program:
Traceback (most recent call last):
File "/Users/dev/Library/Application Support/JetBrains/PyCharm2022.3/scratches/scratch_63.py", line 67, in <module>
guard = gd.Guard.from_rail_string(rail_str)
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/guard.py", line 105, in from_rail_string
return cls(Rail.from_string(rail_string), num_reasks=num_reasks)
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/rail.py", line 106, in from_string
return cls.from_xml(ET.fromstring(string, parser=XMLPARSER))
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/rail.py", line 119, in from_xml
script = cls.load_script(raw_script)
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/rail.py", line 183, in load_script
return Script.from_xml(root)
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/rail.py", line 30, in from_xml
exec(root.text, globals())
File "<string>", line 2, in <module>
File "/Users/dev/.virtualenvs/ddgai-b0692ca2/lib/python3.10/site-packages/guardrails/utils/pydantic_utils.py", line 15, in <module>
from griffe.dataclasses import Docstring
ModuleNotFoundError: No module named 'griffe'
I think griffe is missing from the package requirements.
Description
Apparently, some models are better at following instructions than others. In my own experience, chat models are a bit more likely to butcher JSON output.
With that in mind, it would be cool to use an expensive chat model to process the request and let it respond in spoken text. Then a model that behaves very well w.r.t. output structure can use that text output and generate the structured output.
Why is this needed
This is needed if you need a creative model but you really need structured output.
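A minimal sketch of such a two-stage pipeline (model choices and prompts are illustrative only):

import openai

def creative_then_structured(user_request: str, schema_prompt: str) -> str:
    # Stage 1: let a capable chat model answer in free-form prose.
    prose = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_request}],
    )["choices"][0]["message"]["content"]

    # Stage 2: have a well-behaved completion model convert the prose
    # into the structured output described by schema_prompt.
    return openai.Completion.create(
        engine="text-davinci-003",
        prompt=f"{schema_prompt}\n\nText:\n{prose}\n\nJSON:",
        temperature=0.0,
        max_tokens=512,
    )["choices"][0]["text"]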
The dev extras specify isort>=5.12.0 as a requirement. That version is available only on Python >= 3.8 (https://pypi.org/project/isort/).
This obviously only affects the dev environment, and guardrails-ai can be installed normally on Python 3.7. Maybe just something to document?
The from_str(...) method of the integer and float datatypes fails if the LLM responds with a None. Ideally, they should default to something like math.nan or some other "null" value.
Have guardrails take in a GraphQL schema or Protobuf schema for format validation.
Love this package :) thanks for building this.
Went through the codebase, specifically the validation side. Do you think we could use something like this to speed up the development of validators?
Describe the bug
Currently, when using the OpenAI chat completion API and the API request fails (possible failure reasons: invalid API key, too many requests, network error, etc.), the library throws the following error:
Traceback (most recent call last):
File "/Users/matejm/dev/main.py", line 58, in <module>
specs = call_openai(f.read())
^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/main.py", line 19, in call_openai
_, validated_output = guard(
^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/guardrails/guard.py", line 166, in __call__
guard_history = runner(prompt_params=prompt_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/guardrails/run.py", line 90, in __call__
validated_output, reasks = self.step(
^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/guardrails/run.py", line 144, in step
output, output_as_dict = self.call(index, instructions, prompt, api, output)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/guardrails/run.py", line 236, in call
output = api(prompt.source)
^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
return self(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/tenacity/__init__.py", line 313, in iter
if not (is_explicit_retry or self.retry(retry_state)):
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/tenacity/retry.py", line 76, in __call__
return self.predicate(exception)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matejm/dev/venv/lib/python3.11/site-packages/tenacity/retry.py", line 92, in <lambda>
super().__init__(lambda e: isinstance(e, exception_types))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union
The error is far from descriptive; it actually hides the real problem from the developer.
To Reproduce
Steps to reproduce the behavior:
<rail version="0.1">
<output>
<string name="test" />
</output>
<prompt>
Fill test string
@complete_json_suffix_v2
</prompt>
</rail>
Simply run the request without a valid API key:
import openai
import guardrails as gd

guard = gd.Guard.from_rail('demo.rail')
guard(
    openai.ChatCompletion.create,
    prompt_params={},
    model="gpt-3.5-turbo",
    max_tokens=1024,
    temperature=0.3,
)
Expected behavior
Return invalid response or throw informative error which would allow developers to figure out what went wrong.
Library version:
Version 0.1.6
The prompt contains all of the instructions about the output schema and formatting instructions. Currently, the prompt is always passed in as a user message. However, for ChatCompletion models, it makes sense to pass in the output schema and formatting instructions in the system message instead of in the prompt.
In order to do this, the openai_chat_wrapper should be updated so that it deconstructs the prompt into format instructions and everything else. The format instructions can be extracted from guard.prompt.format_instructions.
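A sketch of that deconstruction (assuming guard.prompt.format_instructions holds the format-instruction text, per the description above; the helper itself is hypothetical):

def split_prompt_for_chat(prompt_text: str, format_instructions: str) -> list:
    # The system message carries the output schema and format instructions;
    # the user message keeps only the task-specific text.
    user_text = prompt_text.replace(format_instructions, "").strip()
    return [
        {"role": "system", "content": format_instructions},
        {"role": "user", "content": user_text},
    ]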