arun1729 / cog
Micro Graph Database for Python Applications
Home Page: http://cogdb.io
License: MIT License
While attempting to install CogDB on a Win11 machine with Python 3.10.1 and pip 22.3.1, I observed that the installation fails due to a dependency on an old xxhash version:
Would it be possible to update the CogDB project to use a more recent version of the xxhash library that doesn't require installing the Microsoft C++ Build Tools?
I get the following message when I try to install with pip. Any ideas for troubleshooting?
bash-3.2$ pip install cogdb
Collecting cogdb
Could not find a version that satisfies the requirement cogdb (from versions: )
No matching distribution found for cogdb
I ran into a problem when trying to install cogdb on Windows: xxhash needs to be built with C++ build tools, and this is not mentioned in the documentation.
I just discovered cogdb and so far it looks like it has potential. I am still looking through everything, but is there a way to automatically set the cog graph view width/height other than manually editing the HTML file? Also, when showing the full graph for books.csv, it renders slowly in the web browser and takes a while to load; I am using Firefox. Is there any way to speed this up? Running the JavaScript source locally didn't help. I thought it might have to do with Cloudflare, but maybe trying on a GPU would help.
Need a compatibility update to be able to install and run cog on Python 3.
Update the render method in Torque to include two arguments, height and width, which should be used to set the iframe height and width in the HTML template. The current values should be the default values.
Render:
Line 318 in 242dbc9
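A minimal sketch of what this request might look like, assuming the render step wraps the view's HTML in an iframe through a format string. The function name `build_iframe` is hypothetical and stands alone here; it is not CogDB's actual render method, and the defaults of 700 mirror the current hard-coded size.

```python
def build_iframe(html, height=700, width=700):
    """Wrap rendered graph HTML in an iframe of the requested size.

    Hypothetical standalone helper, not CogDB's real render method.
    The defaults keep the current 700x700 size.
    """
    template = "<iframe srcdoc='{html}' width=\"{width}\" height=\"{height}\"> </iframe>"
    return template.format(html=html, width=width, height=height)


# Default size, then a custom size.
default_frame = build_iframe("<p>graph</p>")
small_frame = build_iframe("<p>graph</p>", height=300, width=500)
```

Keeping the current values as keyword defaults means existing calls without arguments would behave as before.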
How can in-memory caching be enabled in CogDB? Currently, fetching data takes longer than with other databases.
Environment:
Python 3.8.9 (default, Apr 21 2021, 23:14:29)
[GCC 10.2.0] on cygwin
cogdb==2.0.5
xxhash==2.0.0
Issue:
When running your introductory example from the main page of https://cogdb.io/ (in a jupyter notebook), the following error was thrown:
OSError Traceback (most recent call last)
in
14 g.put("greg","status","cool_person")
15
---> 16 g.v().tag("from").out("follows").tag("to").view("follows").render()
/usr/local/lib/python3.8/site-packages/cog/torque.py in render(self)
324 iframe_html = r""" <iframe srcdoc='{0}' width="700" height="700"> </iframe> """.format(self.html)
325 from IPython.core.display import display, HTML
--> 326 display(HTML(iframe_html))
327
328 def persist(self):
/usr/lib/python3.8/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)
716 if warn():
717 warnings.warn("Consider using IPython.display.IFrame instead")
--> 718 super(HTML, self).__init__(data=data, url=url, filename=filename, metadata=metadata)
719
720 def _repr_html_(self):
/usr/lib/python3.8/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)
628 self.metadata = {}
629
--> 630 self.reload()
631 self._check_data()
632
/usr/lib/python3.8/site-packages/IPython/core/display.py in reload(self)
653 """Reload the raw data from file or URL."""
654 if self.filename is not None:
--> 655 with open(self.filename, self._read_flags) as f:
656 self.data = f.read()
657 elif self.url is not None:
OSError: [Errno 91] File name too long: ' <iframe srcdoc='\n\n\n \n <title>Cog Graph</title>\n <style type="text/css">\n body {\n padding: 0;\n margin: 0;\n width: 100%;!important; \n height: 100%;!important; \n }\n\n #cog-graph-view {\n width: 700px;\n height: 700px;\n }\n </style>\n\n\n <script\n type="text/javascript"\n src="https://cdnjs.cloudflare.com/ajax/libs/vis/4.21.0/vis.min.js"\n ></script>\n \n \n \n
\n\n <script type="text/javascript">\n\n results =[{"id": "fred", "from": "bob", "to": "fred"}, {"id": "fred", "from": "emily", "to": "fred"}, {"id": "dani", "from": "charlie", "to": "dani"}, {"id": "bob", "from": "charlie", "to": "bob"}, {"id": "bob", "from": "alice", "to": "bob"}, {"id": "greg", "from": "dani", "to": "greg"}, {"id": "bob", "from": "dani", "to": "bob"}, {"id": "greg", "from": "fred", "to": "greg"}] \n\n var nodes = new vis.DataSet();\n var edges = new vis.DataSet();\n for (let i = 0; i < results.length; i++) {\n res = results[i];\n nodes.update({\n id: res.from,\n label: res.from\n });\n nodes.update({\n id: res.to,\n label: res.to\n });\n edges.update({\n from: res.from,\n to: res.to\n });\n\n }\n\n var container = document.getElementById("cog-graph-view");\n var data = {\n nodes: nodes,\n edges: edges,\n };\n var options = {\n nodes: {\n font: {\n size: 20,\n color: "black"\n },\n color: "#46944f",\n shape: "dot",\n widthConstraint: 200,\n\n },\n edges: {\n font: "12px arial #ff0000",\n scaling: {\n label: true,\n },\n shadow: true,\n smooth: true,\n arrows: { to: {enabled: true}}\n },\n physics: {\n barnesHut: {\n gravitationalConstant: -30000\n },\n stabilization: {\n iterations: 1000\n },\n }\n\n };\n var network = new vis.Network(container, data, options);\n </script>\n \n\n\n' width="700" height="700"> </iframe> 'Thanks
Toms-MacBook-Pro:graphtest tomsmith$ pip3 install cogdb
Collecting cogdb
Using cached cogdb-0.1.2.tar.gz (9.0 kB)
Building wheels for collected packages: cogdb
Building wheel for cogdb (setup.py) ... done
Created wheel for cogdb: filename=cogdb-0.1.2-py3-none-any.whl size=8679 sha256=4081944f7748d8b81a35d568a923d0ef3b64629ad2182c6f8f545faf7f4d1d04
Stored in directory: /Users/tomsmith/Library/Caches/pip/wheels/8b/31/32/daa3d657e6c6bf56132ccca6081671b85dd37a302302e3cefc
Successfully built cogdb
Installing collected packages: cogdb
Successfully installed cogdb-0.1.2
WARNING: You are using pip version 20.3.1; however, version 20.3.3 is available.
You should consider upgrading via the '/usr/local/opt/[email protected]/bin/python3.9 -m pip install --upgrade pip' command.
Toms-MacBook-Pro:graphtest tomsmith$ python
Python 3.9.1 (default, Dec 17 2020, 03:41:37)
[Clang 12.0.0 (clang-1200.0.32.27)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
from cog.torque import Graph
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python3.9/site-packages/cog/torque.py", line 1, in
from cog.database import Cog
File "/usr/local/lib/python3.9/site-packages/cog/database.py", line 18, in
from core import Table
ModuleNotFoundError: No module named 'core'
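The traceback points at `from core import Table` inside `cog/database.py`. On Python 3 that statement is an absolute import, so it only resolves if a top-level module named `core` exists; a sibling module inside the package has to be imported as `from cog.core import Table` or `from .core import Table`. The sketch below reproduces the difference with a throwaway package. All names in it (`demo_pkg`, `demo_core`, `broken`, `fixed`) are hypothetical stand-ins, not CogDB's real layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical names throughout).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "demo_core.py").write_text("class Table:\n    pass\n")
# Python 2 style implicit relative import: not resolved on Python 3.
(pkg / "broken.py").write_text("from demo_core import Table\n")
# Explicit relative import: resolved on Python 3.
(pkg / "fixed.py").write_text("from .demo_core import Table\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

try:
    importlib.import_module("demo_pkg.broken")
    broken_error = None
except ModuleNotFoundError as exc:
    broken_error = str(exc)

fixed = importlib.import_module("demo_pkg.fixed")
print(broken_error)
print(fixed.Table.__name__)
```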
Alternatives to all-at-once rehashing
Some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If one cannot avoid dynamic resizing, a solution is to perform the resizing gradually:
Disk-based hash tables almost always use some alternative to all-at-once rehashing, since the cost of rebuilding the entire table on disk would be too high.
Incremental resizing
One alternative to enlarging the table all at once is to perform the rehashing gradually:
During the resize, allocate the new hash table, but keep the old table unchanged.
In each lookup or delete operation, check both tables.
Perform insertion operations only in the new table.
At each insertion also move r elements from the old table to the new table.
When all elements are removed from the old table, deallocate it.
To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (r + 1)/r during resizing.
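The steps above can be sketched as a small in-memory chained hash table. This is an illustrative sketch only, not CogDB's index: the class name, the `max_load` threshold, and the doubling growth factor are assumptions (doubling satisfies the (r + 1)/r bound for any r of at least 1).

```python
class IncrementalHashTable:
    """Chained hash table that moves r entries per insertion while resizing."""

    def __init__(self, capacity=4, r=2, max_load=0.75):
        self.r = r
        self.max_load = max_load
        self.new = [[] for _ in range(capacity)]
        self.new_count = 0
        self.old = None       # previous table, present only during a resize
        self.old_count = 0
        self.old_cursor = 0   # buckets before this index are already drained

    @staticmethod
    def _find(table, key):
        bucket = table[hash(key) % len(table)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                return bucket, i
        return bucket, None

    def _migrate(self, n):
        """Move up to n entries from the old table into the new one."""
        while n > 0 and self.old_count > 0:
            while not self.old[self.old_cursor]:
                self.old_cursor += 1
            key, value = self.old[self.old_cursor].pop()
            self.old_count -= 1
            self.new[hash(key) % len(self.new)].append((key, value))
            self.new_count += 1
            n -= 1
        if self.old is not None and self.old_count == 0:
            self.old = None   # old table is empty: deallocate it

    def put(self, key, value):
        # Drop a stale copy from the old table so the key lives in one place.
        if self.old is not None:
            bucket, i = self._find(self.old, key)
            if i is not None:
                bucket.pop(i)
                self.old_count -= 1
        # Insertions go only into the new table.
        bucket, i = self._find(self.new, key)
        if i is not None:
            bucket[i] = (key, value)
        else:
            bucket.append((key, value))
            self.new_count += 1
        if self.old is not None:
            self._migrate(self.r)
        elif self.new_count > self.max_load * len(self.new):
            # Start a resize: allocate a new table, keep the old one unchanged.
            self.old, self.old_count, self.old_cursor = self.new, self.new_count, 0
            self.new = [[] for _ in range(2 * len(self.old))]
            self.new_count = 0

    def get(self, key, default=None):
        # Lookups check both tables while a resize is in progress.
        tables = [self.new] if self.old is None else [self.new, self.old]
        for table in tables:
            bucket, i = self._find(table, key)
            if i is not None:
                return bucket[i][1]
        return default

    def delete(self, key):
        bucket, i = self._find(self.new, key)
        if i is not None:
            bucket.pop(i)
            self.new_count -= 1
            return True
        if self.old is not None:
            bucket, i = self._find(self.old, key)
            if i is not None:
                bucket.pop(i)
                self.old_count -= 1
                return True
        return False

    def __len__(self):
        return self.new_count + self.old_count


table = IncrementalHashTable(capacity=4, r=2)
for n in range(100):
    table.put(f"key{n}", n)
```

No single `put` call rehashes the whole table; each one moves at most `r` old entries, which bounds the pause per operation.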
alice friend bob
alice likes_color blue
bob friend charlie
bob likes_color red
bob friend dani
dani likes_color red
Find all friends of bob who like the color red:
g.v("bob").out().filter("red").all()
this should return charlie, dani
Maybe I am missing it, but how do I get all edges connecting two vertices? Is it possible? I can't figure it out without filtering edges after the query.
Is there any way in CogDB to delete all the created nodes and relationships? This is for an automated testing setup.
Export the graph to an edge list file from any point in the graph traversal.
Export the graph to a triple file at any point in the graph traversal.
I am wondering if any of these topics are on the development roadmap...
parse create table
get column names
save column names in table metadata file.
If no columns are provided, create max of json field names so far, sorted in alphabetical order.
get "key" fields. save that in table metadata.
This is an awesome library.
I just started digging into this, but are there plans to support additional types (other than strings)? It would allow us to extend functionality with more extensive querying, such as getting all nodes with a score attribute greater than x, etc.
Thanks!
create root db dir if it does not exist
This is a question, not an issue. I'm following the basic documentation for creating a graph using put to populate the graph:
import re
import xml.etree.ElementTree as ET

from cog.torque import Graph

# We have a filename and path - begin parsing the XML file
tree = ET.parse(filename)
root = tree.getroot()
datagraph = Graph("graph1")
for child in root:
    if not re.search("OpenPositions$", child.tag):  # skip elements we don't care about
        print('children of root: ', child.tag)
        tradeid = child.find("TradeID")
        marketstate = child.find("MarketState")
        strategy = child.find("Strategy")
        entrydate = child.find("EntryDate")
        exitdate = child.find("ExitDate")
        pctlossgain = child.find("GainLossPercent")
        print('Trade: ', tradeid.text, marketstate.text, strategy.text, entrydate.text, exitdate.text, pctlossgain.text)
        # populate the graph db
        datagraph.put(child.tag, "hastrade", tradeid.text)
        datagraph.put(tradeid.text, "marketstate", marketstate.text)
        datagraph.put(tradeid.text, "strategy", strategy.text)
        datagraph.put(tradeid.text, "entrydate", entrydate.text)
        datagraph.put(tradeid.text, "exitdate", exitdate.text)
        datagraph.put(tradeid.text, "pctlossgain", pctlossgain.text)
The input file for the tree is an XML document being parsed by ElementTree and contains around 3 million rows (114MB file). Of those 3 million rows, I need data from about 100,000 rows. Without populating the graph db, the code will rip through and print out the data in under 10 seconds. However, populating the graph has taken approximately 3 hours. The largest of the put statements is approximately 51 bytes in size. Are there faster ways to populate the db? I'm looking for a method that would load the graph in under a minute.
Also, once the graph is created, the data is on disk. To access it in the future (such as the next day or a week from now), do I use the same Graph(graph_name) to connect to it and begin using it again? Sorry if it is in the doc, I just couldn't find it.
Thanks for creating Cog.
implement sstable index
Currently cog only provides APIs for loading a graph from a file, like load_csv, load_triples, and load_edgelist. Can you provide corresponding APIs for saving, like save_csv, save_triples, and save_edgelist?
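As a rough illustration of what a saving counterpart could do, here is a minimal sketch that writes triples to a text file, one space-separated triple per line. The `save_triples` function below is hypothetical and is not part of CogDB; it also assumes the triples have already been collected as Python tuples and that no value contains spaces.

```python
import os
import tempfile


def save_triples(triples, path):
    """Write (subject, predicate, object) tuples, one space-separated line each.

    Hypothetical helper for illustration; not a CogDB API.
    """
    with open(path, "w", encoding="utf-8") as f:
        for subject, predicate, obj in triples:
            f.write(f"{subject} {predicate} {obj}\n")


triples = [("alice", "friend", "bob"), ("bob", "friend", "charlie")]
path = os.path.join(tempfile.mkdtemp(), "graph.triples")
save_triples(triples, path)

with open(path, encoding="utf-8") as f:
    saved_text = f.read()
```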
Hello,
I have just come across CogDB that I might be trying to use in a cross-platform AI project coming up but ran into an error with one of the tests.
I am working on Windows 10 (x64) with Python 3.11.4
> python -m unittest
.
.
.
.
D:\AI-Some\GraphDB\CogDB\test\cog\torque.py:265: DeprecationWarning: The use of func is deprecated, please use filter instead.
warnings.warn("The use of func is deprecated, please use filter instead.", DeprecationWarning)
.D:\AI-Some\GraphDB\CogDB\test\cog\torque.py:288: DeprecationWarning: The use of func is deprecated, please use filter instead.
warnings.warn("The use of func is deprecated, please use filter instead.", DeprecationWarning)
...*** deleted test data.
......*** deleted test data.
.....*** deleted test data.
======================================================================
ERROR: test_torque_load_csv (test.test_torque2.TorqueTest2.test_torque_load_csv)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\AI-Some\GraphDB\CogDB\test\test\test_torque2.py", line 42, in test_torque_load_csv
g.load_csv(csv_file, "isbn")
File "D:\AI-Some\GraphDB\CogDB\test\cog\torque.py", line 206, in load_csv
self.cog.load_csv(csv_path, id_column_name, graph_name)
File "D:\AI-Some\GraphDB\CogDB\test\cog\database.py", line 425, in load_csv
for row in reader:
File "C:\PythonPython311\Lib\csv.py", line 111, in __next__
row = next(self.reader)
^^^^^^^^^^^^^^^^^
File "C:\PythonPython311\Lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 709: character maps to <undefined>
----------------------------------------------------------------------
Ran 69 tests in 2.896s
FAILED (errors=1)
Any ideas as to what may be happening here?
Thanks in advance
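The traceback shows the CSV being decoded through `cp1252`, which is the default text encoding on many Windows setups when `open()` is called without an `encoding` argument. A likely cause, assuming the test CSV is UTF-8 encoded, is that the file contains a byte sequence that `cp1252` cannot map (0x8d has no mapping there). The sketch below reproduces that with a small stand-in file; the file name and contents are made up for illustration.

```python
import csv
import os
import tempfile

# Write a small UTF-8 CSV. The character "č" encodes to the bytes 0xC4 0x8D,
# and 0x8D is undefined in cp1252.
path = os.path.join(tempfile.mkdtemp(), "books_sample.csv")
with open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerows([["isbn", "author"], ["123", "Tomáš Kučera"]])


def read_rows(path, encoding):
    """Read every CSV row using an explicit text encoding."""
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


try:
    read_rows(path, "cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

rows = read_rows(path, "utf-8")
```

If that assumption holds, opening the file with an explicit `encoding="utf-8"` in the loader would make the behavior the same on Windows, Linux, and macOS.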
At database level, implement column abstraction. Column should be stored as value in the kv store. DB Object parses columns and returns column values.
A straightforward implementation would be to simply store a Python dict or JSON in the values, with each property being a column name.
Columns will eventually be used for "select" queries.
Use dumps to serialize json: https://docs.python.org/2/library/json.html
and then index it.
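A minimal sketch of this idea, using a plain dict as a stand-in for the key-value store. The helper names `put_record` and `get_column` are hypothetical, not existing CogDB functions.

```python
import json

store = {}  # stand-in for the kv store; values are JSON strings


def put_record(key, columns):
    """Serialize a dict of column name -> value and store it under key."""
    store[key] = json.dumps(columns, sort_keys=True)


def get_column(key, column):
    """Parse the stored JSON and return a single column value (None if absent)."""
    return json.loads(store[key]).get(column)


put_record("user:1", {"name": "alice", "age": 30})
name = get_column("user:1", "name")
```

Sorting the keys keeps the serialized form deterministic, which may help if the stored value is later indexed or compared.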
On Python 3.7, importing Graph throws an exception:
`>>> from cog.core import Table
from cog.torque import Graph
Traceback (most recent call last):
File "", line 1, in
File "/Users/vulogov/Library/Python/3.7/lib/python/site-packages/cog/torque.py", line 1, in
from cog.database import Cog
File "/Users/vulogov/Library/Python/3.7/lib/python/site-packages/cog/database.py", line 18, in
from core import Table
ModuleNotFoundError: No module named 'core'`