langchain-ai / langchain
🦜🔗 Build context-aware reasoning applications
Home Page: https://python.langchain.com
License: MIT License
Hi, is there any chance you could add support for the suffix parameter to the OpenAI class/call?
https://beta.openai.com/docs/api-reference/completions/create
I could land this if you don't have the time, but it should be simple enough that you could land it faster than I could fork the repo and make a PR.
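A minimal sketch of how the suffix parameter might be threaded through to the request payload. The wrapper class and build_payload helper are hypothetical names; only the suffix field itself comes from the completions API docs linked above.

```python
class OpenAIWithSuffix:
    """Hypothetical wrapper: accepts an optional `suffix` and includes it
    in the completions payload only when it is set."""

    def __init__(self, model="text-davinci-003", suffix=None, **kwargs):
        self.model = model
        self.suffix = suffix
        self.kwargs = kwargs

    def build_payload(self, prompt):
        # Base payload mirrors the completions API shape.
        payload = {"model": self.model, "prompt": prompt, **self.kwargs}
        if self.suffix is not None:
            payload["suffix"] = self.suffix
        return payload

llm = OpenAIWithSuffix(suffix="\nreturn result", max_tokens=16)
payload = llm.build_payload("def add(a, b):")
```

The existing constructor-kwargs pattern would be unchanged; suffix just becomes one more optional field.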
elasticsearch, pinecone
Would be good to have some methods that split on tokens as in the OpenAI example
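A sketch of a sliding-window token splitter. The function name is made up, and a whitespace tokenizer stands in for a real one (in practice you would pass a proper encode/decode pair, e.g. from a tokenizer library).

```python
def split_on_tokens(text, chunk_size, overlap, encode=str.split, decode=" ".join):
    """Split `text` into chunks of `chunk_size` tokens, with `overlap`
    tokens shared between consecutive chunks. The default encode/decode
    pair is a whitespace stand-in for a real tokenizer."""
    assert overlap < chunk_size
    tokens = encode(text)
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(decode(tokens[start:start + chunk_size]))
        start += chunk_size - overlap
    return chunks

chunks = split_on_tokens("a b c d e", chunk_size=2, overlap=1)
```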
I get this error when I run pip install -r requirements.txt
ERROR: Could not find a version that satisfies the requirement faiss (from versions: none)
ERROR: No matching distribution found for faiss
As per this issue, requirements.txt should be updated to faiss-cpu.
It's just a bit annoying; I want to use this library in production, and I currently don't store credentials in the environment.
I think the ideal API is like all the AWS SDKs where you can either stick them in the environment OR pass them as params to the llm constructor.
I can do a PR for this if you're accepting PRs?
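A sketch of the AWS-SDK-style precedence described above: prefer a key passed explicitly to the constructor, fall back to the environment. The function and parameter names here are illustrative, not the library's API.

```python
import os

def resolve_api_key(explicit_key=None, env_var="OPENAI_API_KEY"):
    """Prefer an explicitly passed key; otherwise fall back to the
    environment variable. Raise if neither is available."""
    key = explicit_key or os.environ.get(env_var)
    if key is None:
        raise ValueError(f"Pass an API key or set {env_var}")
    return key
```

An LLM constructor could call this once and store the result, so both usage styles work without further branching.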
I have just updated to version 0.0.7
1. When running a simple question like "What is the capital of Idaho?", the result is OK.
2. When running a question like "What is the hometown of the reigning men's U.S. Open champion?", I get the following error:
What is the hometown of the reigning men's U.S. Open champion?
Are follow up questions needed here: Yes.
Follow up: Who is the reigning men's U.S. Open champion?
(.......)
File ~/anaconda3/envs/nlp/lib/python3.10/site-packages/langchain/chains/self_ask_with_search/base.py:166, in SelfAskWithSearchChain.run(self, question)
152 def run(self, question: str) -> str:
153 """Run self ask with search chain.
154
155 Args:
(...)
...
--> 107 elif "snippet" in res["organic_results"][0].keys():
108 toret = res["organic_results"][0]["snippet"]
109 else:
KeyError: 'organic_results'
Is there a current API for hooking a conversational chain with an agent so, during the conversation, actions can be performed (e.g. db lookup etc)?
If not, how would you see this working architecturally?
The current implementation of HuggingFaceEmbeddings uses local sentence-transformers to derive the encodings. This can be limiting, as it requires a fairly capable machine to download the model, load it, and run inference.
An alternative is to support embeddings derived directly via the HuggingFaceHub; see this blog post for details. This implementation would set expectations similar to the Cohere and OpenAI embeddings APIs.
I would like to track versions of PromptTemplates through time. an optional version attribute would help with this.
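One way an optional version attribute could look. This is a sketch, not the actual PromptTemplate API; all names here are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class VersionedPromptTemplate:
    """Sketch: a PromptTemplate-like object with an optional version tag,
    so templates can be tracked through time."""
    template: str
    input_variables: List[str]
    version: Optional[str] = None  # None keeps current behavior

    def format(self, **kwargs):
        return self.template.format(**kwargs)

p = VersionedPromptTemplate("Hello {name}", ["name"], version="1.2.0")
```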
I'm storing examples in both a vectorstore and a database. I would like to add the vectorstore id field to the database after the example has been successfully indexed. I think I could do this if add_texts in vectorstore could return a list of vectorstore IDs, and add_example in SemanticSimilarityExampleSelector propagated the returned ID back as well.
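A toy stand-in showing the shape of the proposed change: add_texts returns the generated IDs so the caller can write them back to a database row. The class is made up for illustration; it is not the vectorstore interface itself.

```python
import uuid

class InMemoryVectorStore:
    """Toy vectorstore stand-in: add_texts returns one ID per text,
    which callers can persist alongside their own records."""

    def __init__(self):
        self.docs = {}

    def add_texts(self, texts):
        ids = []
        for text in texts:
            doc_id = str(uuid.uuid4())
            self.docs[doc_id] = text
            ids.append(doc_id)
        return ids

store = InMemoryVectorStore()
ids = store.add_texts(["example one", "example two"])
```

An example selector wrapping this could simply propagate the returned list.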
add llm extra dependencies
add all extra dependencies
Running some CRUD-like statements through an agent throws
ResourceClosedError: This result object does not return rows. It has been closed automatically.
From the implementation, it appears that it always expects to see rows, which are then cast to str
and returned as part of the chain. What would the impact of modifying this behaviour be on the expected use case for the SQL chain as it is?
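A sketch of the guard one might add, using sqlite3 as a stand-in for the SQLAlchemy layer the chain actually uses: cursor.description is None when a statement returns no rows, so CRUD statements can report a rowcount instead of raising.

```python
import sqlite3

def run_statement(conn, sql):
    """Return rows for SELECT-like statements, and a rowcount summary
    otherwise, instead of assuming every statement yields rows."""
    cur = conn.execute(sql)
    if cur.description is None:  # no result rows (INSERT/UPDATE/DDL)
        return f"{cur.rowcount} row(s) affected."
    return str(cur.fetchall())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
insert_result = run_statement(conn, "INSERT INTO t VALUES (1)")
select_result = run_statement(conn, "SELECT x FROM t")
```

The equivalent check in SQLAlchemy would be the result object's returns_rows flag.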
When initializing with a SQL database, check to see if it already exists; raise an error if it does not.
to make sure they are up to date
Per #104, we needed to start skipping unit tests due to a segfault. Look into this more and figure out what fixes are needed: https://github.com/hwchase17/langchain/blob/95dd2f140e19d29bdb62d4dae2048e3edf0ee147/tests/integration_tests/embeddings/test_huggingface.py#L7
Sorry to disturb, but I wonder: can langchain process a batch of prompts, or does it just process each text by calling llm(text)?
The current implementation of OpenAI embeddings is hard-coded to return only text embeddings via GPT-3. For example,
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    ...
    responses = [
        self._embedding_func(text, engine=f"text-search-{self.model_name}-doc-001")
        for text in texts
    ]

def embed_query(self, text: str) -> List[float]:
    ...
    embedding = self._embedding_func(
        text, engine=f"text-search-{self.model_name}-query-001"
    )
However, recent literature on reasoning shows Codex to be more powerful than GPT-3 on reasoning tasks. OpenAIEmbeddings
should be modified to support both text and code embeddings.
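A sketch of parameterizing the engine string rather than hard-coding it. The code-search family name is an assumption that follows the same naming pattern as the hard-coded text-search engines above.

```python
def engine_name(model_name, kind="doc", family="text-search"):
    """Build the embedding engine string instead of hard-coding
    'text-search'. family='code-search' is assumed to follow the
    same pattern as the existing text engines."""
    return f"{family}-{model_name}-{kind}-001"

text_engine = engine_name("babbage")
code_engine = engine_name("babbage", family="code-search")
```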
Improved prompts for harrison/combine_documents_chain
"""QUESTION_PROMPT is the prompt used in phase 1 where we run the LLM on each chunk of the doc."""
question_prompt_template = """Use the following portion of a legal contract to see if any of the text is relevant to answer the question.
Return any relevant text verbatim.
{context}
Question: {question}
Relevant text, if any:"""
QUESTION_PROMPT = PromptTemplate(
template=question_prompt_template, input_variables=["context", "question"]
)
""" """
combine_prompt_template = """Given the following extracted parts of a contract and a question, create a final answer with references ("SOURCES").
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
QUESTION: Which state/country's law governs the interpretation of the contract?
=========
Content: This Agreement is governed by English law and the parties submit to the exclusive jurisdiction of the English courts in relation to any dispute (contractual or non-contractual) concerning this Agreement save that either party may apply to any court for an injunction or other relief to protect its Intellectual Property Rights.
Source: 28-pl
Content: No Waiver. Failure or delay in exercising any right or remedy under this Agreement shall not constitute a waiver of such (or any other) right or remedy.\n\n11.7 Severability. The invalidity, illegality or unenforceability of any term (or part of a term) of this Agreement shall not affect the continuation in force of the remainder of the term (if any) and this Agreement.\n\n11.8 No Agency. Except as expressly stated otherwise, nothing in this Agreement shall create an agency, partnership or joint venture of any kind between the parties.\n\n11.9 No Third-Party Beneficiaries.
Source: 30-pl
Content: (b) if Google believes, in good faith, that the Distributor has violated or caused Google to violate any Anti-Bribery Laws (as defined in Clause 8.5) or that such a violation is reasonably likely to occur,
Source: 4-pl
=========
FINAL ANSWER: This Agreement is governed by English law.
SOURCES: 28-pl
QUESTION: What did the president say about Michael Jackson?
=========
Content: Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans. \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n\nGroups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland.
Source: 0-pl
Content: And we won't stop. \n\nWe have lost so much to COVID-19. Time with one another. And worst of all, so much loss of life. \n\nLet's use this moment to reset. Let's stop looking at COVID-19 as a partisan dividing line and see it for what it is: A God-awful disease. \n\nLet's stop seeing each other as enemies, and start seeing each other for who we really are: Fellow Americans. \n\nWe can't change how divided we've been. But we can change how we move forward on COVID-19 and other issues we must face together. \n\nI recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n\nThey were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n\nOfficer Mora was 27 years old. \n\nOfficer Rivera was 22. \n\nBoth Dominican Americans who'd grown up on the same streets they later chose to patrol as police officers. \n\nI spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves.
Source: 24-pl
Content: And a proud Ukrainian people, who have known 30 years of independence, have repeatedly shown that they will not tolerate anyone who tries to take their country backwards. \n\nTo all Americans, I will be honest with you, as I've always promised. A Russian dictator, invading a foreign country, has costs around the world. \n\nAnd I'm taking robust action to make sure the pain of our sanctions is targeted at Russia's economy. And I will use every tool at our disposal to protect American businesses and consumers. \n\nTonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world. \n\nAmerica will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies. \n\nThese steps will help blunt gas prices here at home. And I know the news about what's happening can seem alarming. \n\nBut I want you to know that we are going to be okay.
Source: 5-pl
Content: More support for patients and families. \n\nTo get there, I call on Congress to fund ARPA-H, the Advanced Research Projects Agency for Health. \n\nIt's based on DARPA, the Defense Department project that led to the Internet, GPS, and so much more. \n\nARPA-H will have a singular purpose: to drive breakthroughs in cancer, Alzheimer's, diabetes, and more. \n\nA unity agenda for the nation. \n\nWe can do this. \n\nMy fellow Americans, tonight we have gathered in a sacred space: the citadel of our democracy. \n\nIn this Capitol, generation after generation, Americans have debated great questions amid great strife, and have done great things. \n\nWe have fought for freedom, expanded liberty, defeated totalitarianism and terror. \n\nAnd built the strongest, freest, and most prosperous nation the world has ever known. \n\nNow is the hour. \n\nOur moment of responsibility. \n\nOur test of resolve and conscience, of history itself. \n\nIt is in this moment that our character is formed. Our purpose is found. Our future is forged. \n\nWell I know this nation.
Source: 34-pl
=========
FINAL ANSWER: The president did not mention Michael Jackson.
SOURCES:
QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""
COMBINE_PROMPT = PromptTemplate(
template=combine_prompt_template, input_variables=["summaries", "question"]
)
"""COMBINE_DOCUMENT_PROMPT is fed into CombineDocumentsChain() at langchain/chains/combine_documents.py"""
COMBINE_DOCUMENT_PROMPT = PromptTemplate(
template="Content: {page_content}\nSource: {source}",
input_variables=["page_content", "source"],
)
Given:
Is there any way to do this elegantly in langchain? Perhaps some way to provide an output formatter to a chain, or some for_each pre- / post-processor? Or does this seem like just two independent chains with processing in between?
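One lightweight way this could work without a dedicated API is a wrapper that composes a chain-like callable with a post-processing step. All names here are made up for illustration.

```python
def with_postprocess(chain, post):
    """Wrap a chain-like callable so its output is passed through a
    post-processing/formatting step before anything downstream sees it."""
    def wrapped(inputs):
        return post(chain(inputs))
    return wrapped

# Fake chain standing in for a real one.
fake_chain = lambda q: f"answer to {q}"
formatted = with_postprocess(fake_chain, str.upper)
result = formatted("why?")
```

The alternative framing in the question, two independent chains with processing in between, is the same idea with the composition written out by hand.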
First, when I load them I get a warning:
hf = HuggingFaceHub(repo_id="google/flan-t5-xl")
You're using a different task than the one specified in the repository. Be sure to know what you're doing :)
Then, when I use it in inference, I get gibberish.
hf("The capital of New York is")
'ew York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is'
If I run the API via requests
, I get the expected answer
import requests
API_URL = "https://api-inference.huggingface.co/models/google/flan-t5-xl"
headers = {"Authorization": "Bearer api_org_xxxxxxxxxxxxxxxxxxxxxxxxxxx"}
def query(payload):
response = requests.post(API_URL, headers=headers, json=payload)
return response.json()
output = query({
"inputs": "The capital of New York is",
})
print(output)
[{'generated_text': 'Albany'}]
Any suggestions?
File "C:\ProgramData\Anaconda3\envs\LangChain\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 8: character maps to <undefined>
Right now, some chains print out intermediate steps and some don't. Let's standardize it so that they all have the same flag to turn it on/off, and things are printed out in a standard way, ideally colorized.
pytest tests/integration_tests
============================= test session starts ==============================
platform darwin -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0
rootdir: /Users/delip/workspace/langchain
plugins: anyio-3.5.0, dotenv-0.5.2
collected 14 items / 1 error
==================================== ERRORS ====================================
____________ ERROR collecting tests/integration_tests/test_faiss.py ____________
ImportError while importing test module '/Users/delip/workspace/langchain/tests/integration_tests/test_faiss.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../opt/anaconda3/lib/python3.9/importlib/__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/integration_tests/test_faiss.py:9: in <module>
from langchain.faiss import FAISS
E ModuleNotFoundError: No module named 'langchain.faiss'
I would like to use SequentialChain with the option to use a different LLM class at each step. The rationale behind this is that I am using different temperature settings for different prompts within my chain. I also potentially may use different models for each step in the future.
A rough idea for config: have a JSON dict specifying the LLM config, and pass in a list of configs (or a list of LLM objects) the same length as the number of prompt templates in the chain when you want a different object per step, or a single LLM object or config object when you want to use the same one for all.
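A sketch of the list-of-configs idea. Here llm_factory stands in for the real LLM constructor, and the fake factory below only exists to make the example self-contained; everything is illustrative.

```python
def run_sequential(steps, llm_factory, first_input):
    """steps: list of (template, config) pairs, one config per prompt,
    so each step can use its own temperature or model."""
    text = first_input
    for template, config in steps:
        llm = llm_factory(**config)  # fresh LLM per step
        text = llm(template.format(input=text))
    return text

# Fake factory: "calls" just tag the prompt with the temperature used.
def fake_factory(temperature=0.0):
    return lambda prompt: f"[t={temperature}] {prompt}"

steps = [
    ("Summarize: {input}", {"temperature": 0.0}),
    ("Rewrite playfully: {input}", {"temperature": 0.9}),
]
result = run_sequential(steps, fake_factory, "hello")
```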
make the calls in the "map" part concurrently
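A sketch of how the "map" calls could run concurrently. Since each call is network-bound, a thread pool is a reasonable fit; the helper name is made up.

```python
from concurrent.futures import ThreadPoolExecutor

def map_concurrently(fn, items, max_workers=8):
    """Run the per-document 'map' calls concurrently.
    pool.map preserves input order in its results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))

doubled = map_concurrently(lambda x: x * 2, [1, 2, 3])
```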
For example this list of entities but also the conversation summary
too few free trials, too expensive
Currently it is as below, which is way too ugly:
335 Create a new model by parsing and validating input data from keyword arguments.
336
337 Raises ValidationError if the input data cannot be parsed to form a valid model.
338 """
339 # Uses something other than `self` the first arg to allow "self" as a settable attribute
--> 340 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
341 if validation_error:
342 raise validation_error
...
---> 53 input_variables = values["input_variables"]
54 template = values["template"]
55 template_format = values["template_format"]
KeyError: 'input_variables'
Do you have a bibtex citation for this repo?
e.g. something like the following (from https://github.com/bigscience-workshop/promptsource)
@misc{bach2022promptsource,
title={PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts},
author={Stephen H. Bach and Victor Sanh and Zheng-Xin Yong and Albert Webson and Colin Raffel and Nihal V. Nayak and Abheesht Sharma and Taewoon Kim and M Saiful Bari and Thibault Fevry and Zaid Alyafeai and Manan Dey and Andrea Santilli and Zhiqing Sun and Srulik Ben-David and Canwen Xu and Gunjan Chhablani and Han Wang and Jason Alan Fries and Maged S. Al-shaibani and Shanya Sharma and Urmish Thakker and Khalid Almubarak and Xiangru Tang and Xiangru Tang and Mike Tian-Jian Jiang and Alexander M. Rush},
year={2022},
eprint={2202.01279},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
maybe not the ones that require $$
I want to save prompt templates in a JSONField in a Django DB. The current save() method on BasePromptTemplate outputs to a file rather than a JSON object. I'd prefer a method that serializes the prompt to and from JSON, and I decide what to do with the JSON myself.
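A sketch of the requested shape: serialize to a JSON string with no file I/O, then round-trip it back. The function names are hypothetical, not the library's API.

```python
import json

def prompt_to_json(template, input_variables):
    """Serialize a prompt to a JSON string (no file I/O), suitable
    for storing in e.g. a Django JSONField."""
    return json.dumps({"template": template, "input_variables": input_variables})

def prompt_from_json(blob):
    """Reconstruct the prompt fields from a JSON string."""
    data = json.loads(blob)
    return data["template"], data["input_variables"]

blob = prompt_to_json("Hello {name}", ["name"])
template, variables = prompt_from_json(blob)
```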
Hello,
When running the "Map Reduce" demo, I see the error below:
ImportError Traceback (most recent call last)
Cell In [3], line 1
----> 1 from langchain import OpenAI, PromptTemplate, LLMChain
2 from langchain.text_splitter import CharacterTextSplitter
3 from langchain.chains.mapreduce import MapReduceChain
ImportError: cannot import name 'PromptTemplate' from 'langchain' (/Users/mteoh/projects/langchain_sandbox/.env/lib/python3.9/site-packages/langchain/__init__.py)
I see it defined here: https://github.com/hwchase17/langchain/blob/c02eb199b6587aeeb50fbb083693572bd2f030cc/langchain/prompts/prompt.py#L13
And mentioned here:
https://github.com/hwchase17/langchain/blob/c02eb199b6587aeeb50fbb083693572bd2f030cc/langchain/__init__.py#L35
However, when grepping in the library directory, I do not find it:
:~/projects/langchain_sandbox/.env/lib/python3.9/site-packages/langchain $ grep -r PromptTemplate .
Relevant versions of my packages:
$ pip freeze | grep "langchain\|openai"
langchain==0.0.16
openai==0.25.0
Any advice? Thanks! Excited about this work.
cohere, huggingface, ai21