synbiodex / pysbol2
A pure Python implementation of the SBOL standard.
License: Apache License 2.0
Testing compatibility with etl-to-synbiohub-pipeline yields errors because PartShop does not have a getURL() method.
======================================================================
ERROR: setUpClass (test_sbh_submissions.TestBackBuilder)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/etsp/tests/test_sbh_submissions.py", line 23, in setUpClass
test.sparql_endpoint = SynBioHubQuery(test.sbh.getURL() + '/sparql', False, test.sbh.getUser(),
AttributeError: 'PartShop' object has no attribute 'getURL'
======================================================================
ERROR: setUpClass (test_sbh_submissions.TestOrphans)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/etsp/tests/test_sbh_submissions.py", line 59, in setUpClass
test.sparql_endpoint = SynBioHubQuery(test.sbh.getURL() + '/sparql', False, test.sbh.getUser(),
AttributeError: 'PartShop' object has no attribute 'getURL'
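One way to restore compatibility is to expose the accessors that the pipeline calls. The following is a minimal sketch under stated assumptions: `PartShopSketch` is a hypothetical stand-in rather than the real PartShop class, and the attribute names and example URL are invented for illustration.

```python
class PartShopSketch:
    """Hypothetical stand-in for PartShop; not the library's implementation."""

    def __init__(self, url):
        # Strip trailing slashes so callers can append paths like '/sparql'.
        self._url = url.rstrip('/')
        self._user = ''

    def getURL(self):
        """Return the base URL of the repository."""
        return self._url

    def getUser(self):
        """Return the user name, or an empty string before login."""
        return self._user


shop = PartShopSketch('https://synbiohub.example.org/')
sparql_endpoint = shop.getURL() + '/sparql'
```

With accessors like these in place, the `test.sbh.getURL() + '/sparql'` expression from the traceback would no longer raise AttributeError.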
Python 2 end of life is January 1, 2020. There are just a few places where specific Python 2 compatibility is maintained.
$ grep -r version_info *
sbol/test/test_componentdefinition.py: if sys.version_info[0] < 3:
sbol/test/test_componentdefinition.py: if sys.version_info[0] < 3:
sbol/test/test_sequence.py: if sys.version_info[0] < 3:
sbol/test/unit_tests.py.bak:# # if sys.version_info[0] < 3:
sbol/test/unit_tests.py.bak:# if sys.version_info[0] < 3:
sbol/test/unit_tests.py.bak:# if sys.version_info[0] < 3:
We should eliminate these and declare that this SBOL python module is only compatible with Python 3.
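Once the version checks are removed, the module could state the Python 3 requirement explicitly. A small sketch, assuming a hypothetical helper named `require_python3` (setuptools also offers the `python_requires` argument for declaring this at install time):

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise RuntimeError when running under Python 2."""
    if version_info[0] < 3:
        raise RuntimeError('This package requires Python 3')
    return True


# Called once at import time, this replaces the scattered version checks.
require_python3()
```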
pySBOL defines IGEM_STANDARD_ASSEMBLY as a function, but it is a URIRef in SBOL. Make it a function with the same capabilities as in pySBOL.
Also add some tests (i.e. more than 1) to verify that the functionality of IGEM_STANDARD_ASSEMBLY is the same as it was.
It is likely that IGEM_STANDARD_ASSEMBLY can be pulled out of libsbol.i where it is defined.
lxml and deprecated packages are required but not listed in the install_requires parameter.
The library should also be tested with the SBOL workshop tutorial notebook:
https://github.com/SynBioDex/Community-Media/blob/master/2019/IWBDA19/workshop/solution.ipynb
If this works, then it should support all the features discussed in the pySBOL journal article.
A test exists in test_componentdefinition.py and is marked as an expected failure
Hi Folks
I cloned and imported SBOL into ActivePython 3.6.6. Import appears to work and I can use help(sbol) to get a list of the package contents and functions.
I tried out the test scripts from the ReadMe. I get the following error message:
import sbol
sbol.testSBOL()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\users\myuser\documents\github\pysbol\sbol\sbol\__init__.py", line 35, in testSBOL
import sbol.test as unit_tests
File "c:\users\myuser\documents\github\pysbol\sbol\sbol\test\__init__.py", line 15, in <module>
from .test_roundtrip import TestRoundTripSBOL2, TestRoundTripFailSBOL2
File "c:\users\myuser\documents\github\pysbol\sbol\sbol\test\test_roundtrip.py", line 14, in <module>
FILES_SBOL2 = os.listdir(TEST_LOC_SBOL2)
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'c:\users\myuser\documents\github\pysbol\sbol\sbol\test\SBOLTestSuite\SBOL2'
I notice in the file paths that it is not retaining the capitalization used in the folder names (e.g. users should be Users, documents should be Documents, etc.). What should I try next to correct this?
Thanks!
Large SBOL files (including some in the round-trip test suite) currently bog down during serialization and parsing. There are optimizations for this implemented in libSBOL which should be ported over to this new library.
The SBOL tutorial uses PartShop.search(), which is not defined yet.
This is required for #14
https://gitlab.sd2e.org/rpg/xplan-part-extraction is a new repo that uses pysbol. Test the compatibility and determine if additional work is needed.
test_roundtrip.py compares the RDF after the round trip and finds lots of differences. These are reported by default like this:
WARNING:sbol.test:Detected 24 differences in RDF
WARNING:sbol.test:Set environment variable SBOL_TEST_DEBUG to see details
To see the details run with the SBOL_TEST_DEBUG environment variable set (to any value), like this:
SBOL_TEST_DEBUG= python3 -m unittest sbol/test/test_roundtrip.py
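Because the variable may be set to any value, including the empty string used in the command above, a check has to test for presence rather than truthiness. A minimal sketch, assuming a hypothetical helper named `debug_enabled` (not the test suite's actual code):

```python
import os


def debug_enabled(environ=None):
    """Return True when SBOL_TEST_DEBUG is set, even to an empty string."""
    if environ is None:
        environ = os.environ
    # Test for presence, not truthiness: `SBOL_TEST_DEBUG= cmd` sets ''.
    return 'SBOL_TEST_DEBUG' in environ
```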
The property code repeatedly checks whether the given _rdf_type is in the _sbol_owner's properties or owned_objects. If so, it does one thing; if not, it may initialize the entry or do something else.
Instead, we could use collections.defaultdict to make the property magically appear when accessed so we don't need to do the checks all over the place. We could simply deal with the property as an empty list, or a list with contents.
This might be a graceful switch over, or it might cause a little friction here and there. In the long run we'd have less code, so it would be easier to read and maintain.
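The idea can be sketched as follows. `SketchObject` and the predicate URI are hypothetical and only illustrate how `collections.defaultdict` removes the membership checks.

```python
from collections import defaultdict


class SketchObject:
    """Hypothetical stand-in for an SBOL object with RDF properties."""

    def __init__(self):
        # Missing keys materialize as empty lists on first access.
        self.properties = defaultdict(list)


obj = SketchObject()
# Hypothetical predicate URI, used only for illustration.
role = 'http://example.org/role'

# No membership check is needed before reading or appending.
values_before = list(obj.properties[role])
obj.properties[role].append('http://example.org/promoter')
```

One source of the friction mentioned above: merely reading a missing key creates it, so any code that relies on `key in properties` to detect an unset property would need to check for an empty list instead.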
Method not implemented yet. Related to #13
This is necessary for custom annotations to work. For an example, see 2fb31df
SYNBICT unit tests rely on sbol.Document.addNamespace(), which is not implemented yet.
See title
The SBOL tutorial uses Document.copy(), which is not defined yet.
This is required for #14
object.compare() deems two ModuleDefinitions the same even though they have different identities. This sample program works in pysbol but raises an exception in SBOL:
import sbol
sbol.setHomespace('http://example.org/Unit_Test')
doc = sbol.Document()
md1 = doc.moduleDefinitions.create('Foo1')
md2 = doc.moduleDefinitions.create('Foo2')
The root of this particular problem is that object.compare() passes the property dictionaries to object.compare_unordered_lists(). The lists of keys are then compared and found to be equal. But the values associated with those keys are different, and never checked.
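A comparison that also checks the values might look like the sketch below. `properties_equal` is a hypothetical helper, and the identity URIs are invented for illustration; it is not the library's compare() implementation.

```python
def properties_equal(props_a, props_b):
    """Compare two {predicate: [values]} dicts, ignoring value order."""
    # Comparing keys alone is not enough; that is the failure mode above.
    if set(props_a) != set(props_b):
        return False
    # Compare each value list in sorted form so order does not matter
    # but duplicates still count.
    return all(
        sorted(map(str, props_a[key])) == sorted(map(str, props_b[key]))
        for key in props_a
    )


IDENTITY = 'http://sbols.org/v2#identity'
md1 = {IDENTITY: ['http://example.org/Unit_Test/ModuleDefinition/Foo1/1']}
md2 = {IDENTITY: ['http://example.org/Unit_Test/ModuleDefinition/Foo2/1']}
different = not properties_equal(md1, md2)
```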
test_roundtrip.py has 114 hand-coded tests to invoke run_round_trip() on files in sbol/test/SBOLTestSuite/SBOL2. There are 190 files there though, so 76 test cases are missed.
Use unittest subtests to dynamically test all files in that directory even if the directory contents change in the future.
#17 is a bug that was missed because the test file is not tested with the current test_roundtrip.py.
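The subtest approach can be sketched as follows. A temporary directory stands in for sbol/test/SBOLTestSuite/SBOL2, and `run_round_trip` here is only a placeholder check, so this shows the test structure rather than the real round-trip logic.

```python
import os
import tempfile
import unittest


class TestAllFilesSketch(unittest.TestCase):
    """Run one subtest per file found in a directory at test time."""

    def setUp(self):
        # A temporary directory stands in for sbol/test/SBOLTestSuite/SBOL2.
        self.tmp = tempfile.TemporaryDirectory()
        for name in ('a.xml', 'b.xml', 'c.xml'):
            with open(os.path.join(self.tmp.name, name), 'w') as f:
                f.write('<rdf/>')

    def tearDown(self):
        self.tmp.cleanup()

    def run_round_trip(self, path):
        # Placeholder check; the real test would parse and re-serialize.
        self.assertTrue(os.path.getsize(path) > 0)

    def test_all_files(self):
        for name in sorted(os.listdir(self.tmp.name)):
            # Each file is reported separately; one failure does not
            # stop the remaining files from being tested.
            with self.subTest(filename=name):
                self.run_round_trip(os.path.join(self.tmp.name, name))
```

Because the file list is read with os.listdir() when the test runs, files added to the directory later are picked up without editing the test module.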
There should be an example of how to run the unit tests in README.md
Another test script for validating SBOL core functionality is here:
https://github.com/SynBioDex/libSBOL/blob/master/wrapper/CRISPR_example.py
This may be the best script for evaluating coverage of the core data model.
The pre-commit hook script fails during style checking.
Error:
Checking style...
./dev/hooks/pre-commit: line 20: [: 2
51
1: integer expression expected
Found 2
51
1 style violations
applyToModuleHierarchy is not yet implemented but is used in an early example program, so implement it.
Saw this error via Document.append():
======================================================================
ERROR: test_AAA (test.test_roundtrip.TestRoundTripSBOL2) [pICSL50014.xml] (filename='pICSL50014.xml')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/SBOL/sbol/test/test_roundtrip.py", line 82, in test_AAA
self.run_round_trip(f)
File "/SBOL/sbol/test/test_roundtrip.py", line 45, in run_round_trip
split_path[0] + split_path[1]))
File "/SBOL/sbol/document.py", line 330, in read
self.append(filename)
File "/SBOL/sbol/document.py", line 381, in append
self.graph.parse(f, format="application/rdf+xml")
File "/usr/local/lib/python3.6/dist-packages/rdflib/graph.py", line 1043, in parse
parser.parse(source, self, **args)
File "/usr/local/lib/python3.6/dist-packages/rdflib/plugins/parsers/rdfxml.py", line 578, in parse
self._parser.parse(source)
File "/usr/lib/python3.6/xml/sax/expatreader.py", line 111, in parse
xmlreader.IncrementalParser.parse(self, source)
File "/usr/lib/python3.6/xml/sax/xmlreader.py", line 123, in parse
buffer = file.read(self._bufsize)
File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 8027: ordinal not in range(128)
The file sbol/test/SBOLTestSuite/SBOL2/pICSL50014.xml has non-ASCII characters (see lines 136 and 137). This file passes SBOL validation.
Two alternatives seem to work, one of which is opening the file in binary mode, with open(filename, 'rb'), which is what RDFLib does (see parser.py).
Both approaches pass the tests in test_roundtrip.py.
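The binary-mode alternative can be illustrated with the standard library alone. In this sketch xml.sax stands in for RDFLib's parser, and the file contents are invented; the point is only that a file opened with 'rb' lets the XML parser apply the declared encoding itself.

```python
import os
import tempfile
import xml.sax

# Write a small XML file containing a non-ASCII character (UTF-8 encoded).
xml_text = '<?xml version="1.0" encoding="UTF-8"?><name>caf\u00e9</name>'
with tempfile.NamedTemporaryFile(suffix='.xml', delete=False) as tmp:
    tmp.write(xml_text.encode('utf-8'))
    path = tmp.name


class TextCollector(xml.sax.ContentHandler):
    """Collect character data so the parse result can be inspected."""

    def __init__(self):
        super().__init__()
        self.text = []

    def characters(self, content):
        self.text.append(content)


handler = TextCollector()
# Binary mode hands raw bytes to the parser, which honors the encoding
# declared in the XML prolog instead of the locale's default codec.
with open(path, 'rb') as f:
    xml.sax.parse(f, handler)
os.remove(path)

parsed_text = ''.join(handler.text)
```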
We have a sample SD2 notebook that relies on the Python sbol module. Support any features it needs.
Document.find() should return an SBOLObject, or None if the object is not found. Instead it returns -1.
Document.find() appears to be iterating over the keys in a dictionary, which are strings, instead of the objects in the dictionary, which are the values.
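The intended behavior can be sketched like this. `FindableDocument` and `SketchSBOLObject` are hypothetical stand-ins, and the sketch ignores any searching of child objects that the real method may need to do.

```python
class SketchSBOLObject:
    """Hypothetical stand-in for an SBOLObject with an identity URI."""

    def __init__(self, identity):
        self.identity = identity


class FindableDocument:
    """Hypothetical stand-in for Document; objects are keyed by URI."""

    def __init__(self):
        self.objects = {}

    def add(self, obj):
        self.objects[obj.identity] = obj

    def find(self, uri):
        """Return the object whose identity is `uri`, or None."""
        # Iterate over the stored objects (values), not the URI keys.
        for obj in self.objects.values():
            if obj.identity == uri:
                return obj
        return None


doc = FindableDocument()
thing = SketchSBOLObject('http://examples.org/ModuleDefinition/thing1/1')
doc.add(thing)
```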
This is in support of #6
This underlies a bunch of code, including assemble().
In pySBOL:
>>> import sbol
>>> doc = sbol.Document()
>>> md = doc.moduleDefinitions.create('thing1')
>>> md.identity
'http://examples.org/ModuleDefinition/thing1/1'
>>> len(md.modules)
0
>>> md.modules.create('thing2')
Module
>>> len(md.modules)
1
But in SBOL, that last len call results in zero.
@bbartley says that proper stringified SBOL will not contain the identity relation (http://sbols.org/v2#identity). Do not write the identity relation when serializing a Document.
This is actually serialized in sbol/SBOL2Serialize.py.
Add a test to confirm that the identity relation is not output when serializing an SBOL graph.
Debug logging messages do not show up when they should. Running sbol.testRoundTrip() should show multiple debug log messages. It does not. To repeat this behavior, run that test in the sbol/test directory. That directory contains logging_config.ini which will get loaded by various classes (Document, SBOLObject, Property).
# Run in sbol/test because it contains logging_config.ini
cd sbol/test
python3 -c "import sbol; sbol.testRoundTrip()"
Notice that you don't see any debug log messages. At a minimum, you should see messages from Document.doc_serialize_rdf2xml().
self.logger
A quick test adding disable_existing_loggers=False shows the log messages that had been missing in testRoundTrip().
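The mechanism behind that observation can be reproduced with the standard library. In the sketch below the configuration text is a minimal stand-in for logging_config.ini (its real contents are not shown here), and the logger name is hypothetical. It shows only that `logging.config.fileConfig` disables loggers created before it runs unless `disable_existing_loggers=False` is passed.

```python
import configparser
import logging
import logging.config

# Minimal stand-in for logging_config.ini; it configures only the root logger.
CONFIG_TEXT = """
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=DEBUG
handlers=console

[handler_console]
class=StreamHandler
level=DEBUG
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(name)s %(levelname)s %(message)s
"""


def load_config(disable_existing_loggers):
    parser = configparser.ConfigParser()
    parser.read_string(CONFIG_TEXT)
    # fileConfig accepts a ConfigParser instance as well as a file name.
    logging.config.fileConfig(
        parser, disable_existing_loggers=disable_existing_loggers)


# A logger created before the configuration is loaded, as happens when a
# module calls logging.getLogger() at import time. The name is hypothetical.
early_logger = logging.getLogger('sbol_sketch.document')

load_config(disable_existing_loggers=True)   # the default behavior
disabled_by_default = early_logger.disabled

load_config(disable_existing_loggers=False)
disabled_when_kept = early_logger.disabled
```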
Adding disable_existing_loggers=False is not the right fix. Here are some thoughts about how to improve logging:
SYNBICT unit tests need ComponentDefinition.copy()
Sub-collection submission is not working. This needs investigation.
synbiohub_adapter/tests/test_sbh_submissions.py has test_submit_sub_collection() which demonstrates the issue. This is currently on branch upload_test of synbiohub_adapter.
Participation.participant is defined as a ReferencedObject. In pySBOL when the participant attribute on a Participation instance is accessed, a string (type str) is returned. In SBOL, when the participant attribute on a Participation instance is accessed, a ReferencedObject is returned.
Make the attribute access return a string to be backward compatible with pySBOL.
libSBOL uses the SBOLError class to define different types of exceptions. These can mostly be mapped to standard Python Exception classes. The SWIG bindings make this mapping here (but this mapping was never fully completed):
https://github.com/SynBioDex/libSBOL/blob/master/wrapper/libsbol.i#L134
For example, the LiteralProperty constructor led to perplexing downstream effects when an invalid RDF predicate was provided. To prevent this, perhaps a regular expression could be used, or a more rudimentary check for a proper URI scheme and delimiters.
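The rudimentary check could be sketched with urllib.parse from the standard library. `looks_like_uri` is a hypothetical helper; it is deliberately loose and does not attempt full URI validation.

```python
from urllib.parse import urlparse


def looks_like_uri(candidate):
    """Rudimentary check: a scheme plus a network location or path."""
    if not isinstance(candidate, str) or ' ' in candidate:
        return False
    parts = urlparse(candidate)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)
```

A constructor could call a helper like this on its predicate argument and raise ValueError early, instead of letting an invalid predicate cause problems downstream.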
Locations, Ranges, and Cuts do not appear to be supported yet, and are failing round-trip. See #38
The Document constructor takes an optional filename argument. It is ignored.
An example drawn from the SBOL workshop and tutorial materials:
# Load some generic parts from `parts.xml` into another Document
generic_parts = Document('parts.xml')
The expected behavior is to load the specified file. We probably need to add a call to Document.read(filename) to the end of the method.
This is related to #14
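The expected constructor behavior can be sketched as below. `LoadingDocument` is a hypothetical stand-in, and its read() method only records the file name instead of parsing RDF.

```python
class LoadingDocument:
    """Hypothetical stand-in showing the constructor honoring `filename`."""

    def __init__(self, filename=None):
        self.loaded_from = None
        # Other initialization would happen first; reading comes last so
        # the object is fully set up before any content is loaded.
        if filename is not None:
            self.read(filename)

    def read(self, filename):
        # Placeholder: the real method would parse the file's RDF.
        self.loaded_from = filename


generic_parts = LoadingDocument('parts.xml')
empty_doc = LoadingDocument()
```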
See title
A SYNBICT test fails because ComponentDefinition has no wasDerivedFrom attribute.
When loading the `crispr_example.xml` test file, the roles of the component definitions are not associated with the objects. Similarly, the relationships between objects (for example components, functionalComponents, interactions) are not populated.
Existing code uses defined constants like sbol.SBO_PRODUCT to test for membership in a list like Participation.roles. This membership test fails because the participation roles are strings, not URIRefs.
Make the constants strings instead of URIRefs to match pySBOL and to be backward compatible.
Use rdflib.URIRef as the internal representation of URIs, and expose that choice by returning rdflib.URIRefs instead of strings where appropriate. There are many cases, like URIProperty and ReferencedObject, where URIs are stored in data structures. Consistently represent these as rdflib.URIRef and return the URIRef instead of converting it to a Python str.
Note: rdflib.URIRef extends type str, so we are still returning a string in that sense. What really changes are membership tests.
>>> import sbol
INFO:rdflib:RDFLib Version: 4.2.2
>>> import rdflib
>>> isinstance(rdflib.URIRef('http://example.com/foo'), str)
True
>>> rdflib.URIRef('http://example.com/foo') in ['http://example.com/foo']
False
>>> 'http://example.com/foo' in [rdflib.URIRef('http://example.com/foo')]
False
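The session above is consistent with an equality check that compares types as well as text. The following stand-in reproduces that behavior without rdflib. `StrictURI` is hypothetical and is an illustration of the effect, not rdflib's source.

```python
class StrictURI(str):
    """Hypothetical str subclass that is only equal to its own type."""

    def __eq__(self, other):
        if type(self) is not type(other):
            return False
        return str.__eq__(self, other)

    def __ne__(self, other):
        return not self.__eq__(other)

    # Defining __eq__ clears the inherited hash, so restore it.
    __hash__ = str.__hash__


uri = StrictURI('http://example.com/foo')
in_plain_list = uri in ['http://example.com/foo']
plain_in_uri_list = 'http://example.com/foo' in [uri]
converted_in_plain_list = str(uri) in ['http://example.com/foo']
```

Under this kind of equality, membership tests only succeed when both sides use the same representation, which is why code mixing plain strings and URI objects has to convert one side first.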
A common pattern of use is doc.xxx.create('foo') to create new objects of a given type. Here are two examples from the SBOL tutorial:
my_device = doc.componentDefinitions.create('my_device')
design = doc.designs.create('my_design')
doc.designs and doc.componentDefinitions are both of type OwnedObject.
create() method knows what type of object to create.
OwnedPythonObject for backward compatibility (see pySBOLx.py)
Extensions are not currently supported. Extensions are used in a number of projects we know about. Add support for extensions.
The configuration (see config.py, Config.getOption(), etc.) uses the ConfigOptions .value under the covers. Why not just use the ConfigOptions themselves?
Config.getOption(ConfigOptions.SBOL_COMPLIANT_URIS.value) would become Config.getOption(ConfigOptions.SBOL_COMPLIANT_URIS), which seems more intuitive.
Why am I asking? Because this was the source of a bug where code did not have the .value suffix and was failing without an exception, so more or less silently. Using strings also allows users to rely on strings for core configuration options, which lends itself to typos and forces backward compatibility for what should be internal values.
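The proposal can be sketched as follows. The `ConfigOptions` and `Config` classes here are simplified stand-ins that reuse the names from the text; the option value string and the stored default are assumptions made for illustration.

```python
from enum import Enum


class ConfigOptions(Enum):
    # Only one option is shown; the value string is illustrative.
    SBOL_COMPLIANT_URIS = 'sbol_compliant_uris'


class Config:
    """Stand-in configuration store keyed by enum members, not strings."""

    _options = {ConfigOptions.SBOL_COMPLIANT_URIS: True}

    @staticmethod
    def _check(option):
        # Reject plain strings so typos fail loudly instead of silently.
        if not isinstance(option, ConfigOptions):
            raise TypeError(
                'expected a ConfigOptions member, got %r' % (option,))

    @staticmethod
    def getOption(option):
        Config._check(option)
        return Config._options[option]

    @staticmethod
    def setOption(option, value):
        Config._check(option)
        Config._options[option] = value


compliant = Config.getOption(ConfigOptions.SBOL_COMPLIANT_URIS)
```

Keying on the enum members means a misspelled option name raises AttributeError at the call site, and a stray string raises TypeError, rather than being looked up and quietly missed.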
It would be helpful if README.md contained some additional information:
git submodule init
git submodule update
SBOLObject.find_property_value() returns all values of the given property instead of only returning those that match the passed value.
>>> import sbol
INFO:rdflib:RDFLib Version: 4.2.2
>>> doc = sbol.Document()
>>> md = doc.moduleDefinitions.create('foo')
>>> test_uri = 'http://examples.org/does/not/exist/1'
>>> matches = doc.find_property_value(sbol.SBOL_IDENTITY, test_uri)
>>> matches
[rdflib.term.URIRef('http://examples.org/ModuleDefinition/foo/1'), rdflib.term.URIRef('http://examples.org/Document/1')]
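The expected filtering can be sketched with plain dictionaries. This `find_property_value` is a hypothetical free function operating on a `{identity: {predicate: [values]}}` mapping, which is an assumed data layout and not the library's internal structure.

```python
def find_property_value(objects, predicate, value):
    """Return identities of objects whose `predicate` list holds `value`."""
    matches = []
    for identity, properties in objects.items():
        # Keep only objects whose property actually contains the value.
        if value in properties.get(predicate, []):
            matches.append(identity)
    return matches


SBOL_IDENTITY = 'http://sbols.org/v2#identity'
foo_uri = 'http://examples.org/ModuleDefinition/foo/1'
objects = {foo_uri: {SBOL_IDENTITY: [foo_uri]}}
no_matches = find_property_value(
    objects, SBOL_IDENTITY, 'http://examples.org/does/not/exist/1')
```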
One of the SYNBICT unit tests (see below for which one) triggers an IndexError in OwnedObject.find_resource().
======================================================================
ERROR: test_pruning_annotating (test_curation.CurationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/SYNBICT/test/test_curation.py", line 46, in test_pruning_annotating
pruned_definition = target_doc.getComponentDefinition('/'.join([HOMESPACE, 'UnnamedPart', '1']))
File "/SBOL/sbol/document.py", line 305, in getComponentDefinition
return self.componentDefinitions.get(uri)
File "/SBOL/sbol/property.py", line 534, in get
return self.__getitem__(uri)
File "/SBOL/sbol/property.py", line 443, in __getitem__
return self.get_uri(id)
File "/SBOL/sbol/property.py", line 478, in get_uri
object_store, parent_obj, typedURI=False)
File "/SBOL/sbol/property.py", line 516, in find_resource
persistentIdentity = parent_obj.properties[SBOL_PERSISTENT_IDENTITY][0]
IndexError: list index out of range
Another new pySBOL user from SD2: https://gitlab.sd2e.org/rmoseley/build_request_parser
Fix several issues revealed by synbiohub_adapter upload tests:
Collection class is not exported from module sbol
Document.addCollection() is not defined
PartShop.submit() raises SBOLError instead of urllib3.exceptions.HTTPError, breaking backward compatibility.
Name | Comments |
---|---|
sbh-adapter | Needs upload sub collections (#57) |
sbol-dictionary-writer | does not use pySBOL |
REDOER | does not use pySBOL |
sbh-prospector | works |
intent-parser | works |
Robert Goldman's script | Works since #66 |
etl-to-synbiohub-pipeline | Works as of 1.0 beta 5 |
SYNBICT | Works as of 1.0 beta 5 |