pre-processing-playground's People

Contributors

ddematheu, hwchase17, rlancemartin

pre-processing-playground's Issues

Error on splitting txt file using UnstructuredIO

The app is installed on Win11 in a conda venv (D:\LLM\ETL\vETL) with Python 3.10.0.
The repo is git-cloned to D:\LLM\ETL\vETL\PyProject.

From GUI
ValueError: Invalid file C:\Users\user\AppData\Local\Temp\tmpeo5yaxcb. The FileType.UNK file type is not supported in partition.

From conda prompt
Traceback (most recent call last):
File "D:\LLM\ETL\vETL\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "D:\LLM\ETL\vETL\PyProject\splitter.py", line 134, in <module>
documents = document_loading(temp_file=file_path, loader_choice=loader_choice)
File "D:\LLM\ETL\vETL\PyProject\utils.py", line 50, in document_loading
return loader.load()
File "D:\LLM\ETL\vETL\lib\site-packages\langchain\document_loaders\unstructured.py", line 86, in load
elements = self._get_elements()
File "D:\LLM\ETL\vETL\lib\site-packages\langchain\document_loaders\unstructured.py", line 172, in _get_elements
return partition(filename=self.file_path, **self.unstructured_kwargs)
File "D:\LLM\ETL\vETL\lib\site-packages\unstructured\partition\auto.py", line 366, in partition
raise ValueError(f"{msg}. The {filetype} file type is not supported in partition.")
ValueError: Invalid file C:\Users\user\AppData\Local\Temp\tmpeo5yaxcb. The FileType.UNK file type is not supported in partition.
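
The temp path in the error (tmpeo5yaxcb) has no file extension, which may be why the file type comes back as FileType.UNK. The following is a minimal sketch, assuming the app writes the uploaded file to a temporary file before loading it; `save_upload_with_suffix` is a hypothetical helper, not something from the repository, and whether it resolves the error here is untested.

```python
import os
import tempfile


def save_upload_with_suffix(data: bytes, original_name: str) -> str:
    """Write uploaded bytes to a temp file that keeps the original extension.

    Hypothetical helper for illustration only.
    """
    # Keep the extension (e.g. ".txt") so extension-based type detection has something to use.
    suffix = os.path.splitext(original_name)[1]
    # delete=False so the file can be reopened by path after it is closed (needed on Windows).
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name
```

The returned path could then be passed as `temp_file` to `document_loading`, as in the call shown in the traceback. The caller is responsible for deleting the file afterwards.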

Error on splitting pdf file

The app is installed on Win11 in a conda venv (D:\LLM\ETL\vETL) with Python 3.10.0.

GUI Error
AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
File "D:\LLM\ETL\vETL\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "D:\LLM\ETL\vETL\PyProject\splitter.py", line 136, in <module>
st.session_state.chunks = text_splitter(splitter_choice=splitter_choice, chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=length_function, documents=documents)
File "D:\LLM\ETL\vETL\PyProject\utils.py", line 22, in text_splitter
splitter_code = llm_based_chunking_prep(documents[0].page_content)
File "D:\LLM\ETL\vETL\PyProject\SemanticHelpers\semantic_chunking.py", line 48, in llm_based_chunking_prep
chunking_strategy = llm_based_chunking_strategy(text=fixed_text)['content']
File "D:\LLM\ETL\vETL\PyProject\SemanticHelpers\semantic_chunking.py", line 20, in llm_based_chunking_strategy
response = openai.ChatCompletion.create(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 149, in create
) = cls.__prepare_create_request(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 106, in __prepare_create_request
requestor = api_requestor.APIRequestor(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_requestor.py", line 138, in __init__
self.api_key = key or util.default_api_key()
File "D:\LLM\ETL\vETL\lib\site-packages\openai\util.py", line 186, in default_api_key
raise openai.error.AuthenticationError(

Conda prompt Error
Traceback (most recent call last):
File "D:\LLM\ETL\vETL\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "D:\LLM\ETL\vETL\PyProject\splitter.py", line 136, in <module>
st.session_state.chunks = text_splitter(splitter_choice=splitter_choice, chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=length_function, documents=documents)
File "D:\LLM\ETL\vETL\PyProject\utils.py", line 22, in text_splitter
splitter_code = llm_based_chunking_prep(documents[0].page_content)
File "D:\LLM\ETL\vETL\PyProject\SemanticHelpers\semantic_chunking.py", line 48, in llm_based_chunking_prep
chunking_strategy = llm_based_chunking_strategy(text=fixed_text)['content']
File "D:\LLM\ETL\vETL\PyProject\SemanticHelpers\semantic_chunking.py", line 20, in llm_based_chunking_strategy
response = openai.ChatCompletion.create(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 149, in create
) = cls.__prepare_create_request(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 106, in __prepare_create_request
requestor = api_requestor.APIRequestor(
File "D:\LLM\ETL\vETL\lib\site-packages\openai\api_requestor.py", line 138, in __init__
self.api_key = key or util.default_api_key()
File "D:\LLM\ETL\vETL\lib\site-packages\openai\util.py", line 186, in default_api_key
raise openai.error.AuthenticationError(
openai.error.AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
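
The traceback shows the splitter calling OpenAI through llm_based_chunking_strategy, so this code path needs an API key even though the task is only splitting a PDF. The following is a minimal sketch of checking for the key up front, assuming it is supplied through the OPENAI_API_KEY environment variable named in the error message; `find_openai_api_key` is a hypothetical helper, not part of the repository.

```python
import os
from typing import Mapping, Optional


def find_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key from OPENAI_API_KEY, or raise a short, readable error.

    Hypothetical helper for illustration only.
    """
    # Default to the real process environment; a dict can be passed in for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; set it before starting the app."
        )
    return key
```

With the openai version in the traceback, the result could be assigned to `openai.api_key` before the first request, which is the option the error message itself suggests. On a conda prompt under Windows the variable would typically be set with `set OPENAI_API_KEY=...` before launching the app.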
