Comments (1)
Comments from our discussion:
Thanks for putting the notebooks together. My main feedback points, summarized:
- add a summary / conclusion for each notebook to the README (including findings and learnings)
- use the same PDF documents for each example
- generate the same machine-processable (JSON) output from each (plus the human baseline) for comparison and further analysis
- use dotenv in the notebooks to load environment variables from a .env file, so we don't have to remember to add / remove credentials
JSON is actually a better fit than CSV, because it lets us handle multiple properties per entity and align the output with the structure we get from Diffbot (nodes / relationships).
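A minimal sketch of what such a shared JSON output could look like, assuming a simple nodes / relationships layout. The field names (id, labels, properties, start, end, type), the helper functions and the sample values are illustrative assumptions, not the actual Diffbot schema or an agreed format:

```python
import json


def make_node(node_id, label, **properties):
    # One entity with any number of properties (hard to express in flat CSV).
    return {"id": node_id, "labels": [label], "properties": properties}


def make_relationship(start_id, rel_type, end_id, **properties):
    # A directed relationship between two node ids.
    return {"start": start_id, "type": rel_type, "end": end_id, "properties": properties}


# Hypothetical result for one document; every notebook would emit this shape.
graph = {
    "source": "example.pdf",  # placeholder file name
    "nodes": [
        make_node("alice", "Person", name="Alice", role="Engineer"),
        make_node("acme", "Organization", name="Acme"),
    ],
    "relationships": [
        make_relationship("alice", "WORKS_FOR", "acme", since=2020),
    ],
}

output = json.dumps(graph, indent=2, sort_keys=True)
print(output)
```

With every notebook (and the human baseline) writing the same shape, the results can be loaded and compared with a few lines of code.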
Just comments, no action needed here:
For the triple-based ones (like Rebel / llama-index), our challenge is that we can't use the results out of the box. We would have to either:
- modify them to output property graph nodes and relationships
- or post-process the triples, aggregating all entity-attribute triples into properties and keeping only the triples that represent actual semantic relationships
- or do this while inserting the data into the graph, aggregating on insert: first create/merge the nodes by their ID, then merge on ID and add the properties; for the relationships, find the start and end nodes by label and ID and create the relationship
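The post-processing option could be sketched roughly as follows. It assumes triples arrive as (subject, predicate, object) tuples and that the set of known entity ids is available; treating "the object is a known entity" as the test for a semantic relationship is a simplifying assumption, and the function name and sample data are hypothetical:

```python
from collections import defaultdict


def split_triples(triples, entity_ids):
    """Separate attribute triples from relationship triples.

    A triple whose object is a known entity is kept as a relationship;
    any other triple is folded into the subject's properties.
    """
    properties = defaultdict(dict)
    relationships = []
    for subject, predicate, obj in triples:
        if obj in entity_ids:
            relationships.append((subject, predicate, obj))
        else:
            # Collect values in a list so repeated predicates are not lost.
            properties[subject].setdefault(predicate, []).append(obj)
    return dict(properties), relationships


triples = [
    ("Alice", "works_for", "Acme"),
    ("Alice", "date_of_birth", "1980"),
    ("Acme", "headquarters", "Berlin"),
]
props, rels = split_triples(triples, entity_ids={"Alice", "Acme"})
print(props)
print(rels)
```

A real pipeline would likely need a better entity test (for example, Berlin could itself be an entity), but the aggregation step stays the same.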
For the Rebel one, in def create_triplets(tx, triplet): if we want to pursue this approach in the future, we should see whether we can carry the entity type over, so that in addition to the generic :Node label we can also set a label for the type, like :Person or :Organization, and then do the attribute aggregation there as well.
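One way this could look, as a sketch only: the helpers below build Cypher strings that merge on the generic :Node label and id, then add a type label and the aggregated properties. Labels and relationship types are validated and interpolated into the query text rather than passed as parameters; the helper names and the $id / $properties / $start_id / $end_id parameter names are assumptions, and this is not the notebook's actual create_triplets code:

```python
import re

# Only plain identifiers are allowed, since the value goes into the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name):
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe label or relationship type: {name!r}")
    return name


def node_merge_query(entity_type):
    # Merge on the generic label + id, then add the type label and properties.
    label = _checked(entity_type)
    return (
        "MERGE (n:Node {id: $id}) "
        f"SET n:`{label}` "
        "SET n += $properties"
    )


def relationship_merge_query(rel_type):
    # Look up both end nodes by id, then merge the typed relationship.
    rel = _checked(rel_type)
    return (
        "MATCH (a:Node {id: $start_id}), (b:Node {id: $end_id}) "
        f"MERGE (a)-[r:`{rel}`]->(b)"
    )


print(node_merge_query("Person"))
print(relationship_merge_query("WORKS_FOR"))
```

Inside a transaction function, the resulting strings would be passed to tx.run together with the matching parameters, for example tx.run(node_merge_query("Person"), id="alice", properties={"name": "Alice"}).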
from llm-graph-builder.