Comments (8)
@Dev-Khant sure go for it.
from embedchain.
Yeah this seems like a reasonable approach to me as well. Please proceed with this approach.
@deshraj Can I pick this up?
Hi @deshraj,
To get data for a repo, a branch, and a specific folder, I think using the get_repo function from the PyGithub library would be easier than the current approach of cloning the repo and then traversing the tree. To extract a specific file, we can use get_contents directly.
Docs:
- get_repo: https://pygithub.readthedocs.io/en/latest/examples/Repository.html#get-all-of-the-contents-of-the-root-directory-of-the-repository
- get_contents: https://pygithub.readthedocs.io/en/latest/examples/Repository.html#get-a-specific-content-file
I have previously worked with this approach: https://github.com/Dev-Khant/Analyze-Github-Code/blob/main/LLM/scrap.py#L33
This change will only affect queries with type=="repo".
Let me know if I can move ahead with this approach.
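A minimal sketch of what this could look like, assuming PyGithub's documented get_repo and get_contents calls. The helper names collect_repo_files and load_from_github, and their parameters, are hypothetical and not part of embedchain. The traversal follows the pattern from the PyGithub docs linked above: get_contents returns a list for a directory, so directories are expanded until only files remain.

```python
def collect_repo_files(repo, folder="", branch=None):
    """Return {path: text} for every file under `folder` (hypothetical helper).

    `repo` is expected to behave like a PyGithub Repository object, i.e. it
    exposes get_contents(path, ref=...) returning items with .type, .path
    and .decoded_content.
    """
    # Only pass `ref` when a branch was requested, so the default branch is used otherwise.
    kwargs = {"ref": branch} if branch else {}
    pending = repo.get_contents(folder, **kwargs)
    if not isinstance(pending, list):
        # A path pointing at a single file yields one item instead of a list.
        pending = [pending]

    files = {}
    while pending:
        item = pending.pop(0)
        if item.type == "dir":
            # Expand the directory in place of cloning and walking the tree.
            pending.extend(repo.get_contents(item.path, **kwargs))
        else:
            files[item.path] = item.decoded_content.decode("utf-8", errors="replace")
    return files


def load_from_github(token, full_name, folder="", branch=None):
    """Hypothetical entry point; requires `pip install PyGithub`."""
    from github import Github  # imported lazily so the helper above has no hard dependency

    repo = Github(token).get_repo(full_name)  # full_name is "owner/name"
    return collect_repo_files(repo, folder=folder, branch=branch)
```

Keeping the traversal separate from the API client means it can be exercised with a stand-in repo object, without network access.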
@deshraj Do we need to store the data from results here? Currently the data variable is already being overwritten by self._get_github_repo_data. Let me know whether we want to add the data from results or just the contents of the repo.
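To illustrate the question, here is a minimal sketch of the two options. The function names are hypothetical and plain lists stand in for the loader's real data, so this does not reproduce the embedchain code.

```python
def load_overwriting(search_results, repo_data):
    """Pattern described above: the second assignment discards the results."""
    data = list(search_results)
    data = list(repo_data)  # reassignment, so the search results are lost
    return data


def load_merging(search_results, repo_data):
    """Alternative: keep the results and append the repo contents."""
    data = list(search_results)
    data.extend(repo_data)
    return data
```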
Ah good catch. This seems like a bug and should be fixed. Can you please fix it in your PR?
Yes, I can fix it. But do we have to add results to data, or just the repo contents?
@deshraj I have raised the PR.