scossu / lakesuperior
Lakesuperior, an alternative Fedora Repository implementation
Home Page: http://lakesuperior.readthedocs.io/
License: Apache License 2.0
If an LDP-RS describes an LDP-NR, add a rel="describes" Link header as per https://fedora.info/2018/11/22/spec/#http-get-ldprs, second paragraph.
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: 1.0.0-alpha12 from pypi
Given a file in file.txt
with the content:
this is a file.
And this request:
curl localhost:8000/ldp/ -XPOST -H"Content-Type: text/plain" -i --data-binary @file.txt
Then issue a GET request on the newly created resource.
curl -i <the resource>
HTTP/1.1 500 INTERNAL SERVER ERROR
Server: gunicorn/19.7.1
Date: Mon, 09 Apr 2018 19:28:41 GMT
Connection: keep-alive
Content-Type: text/html
Content-Length: 291
The file contents. Or at least logs describing the error (there was nothing of interest in the error logs or in the console of the running application).
Requests to the metadata (e.g. /fcr:metadata) respond correctly with RDF.
It is also worth noting that the binary is present on the filesystem in ./data/ldpnr_store
Operating system: OS X
Python version: 3.6.4
LAKEsuperior release, branch, or commit #: 1.0.0-alpha12 from pypi
curl -XPATCH -i localhost:8000/ldp/ -H"Content-Type: application/sparql-update" --data-binary @sparql.txt
(The same sparql update applied to a non-root resource succeeds)
INSERT {
<> <http://purl.org/dc/terms/subject> [
a <http://example.org/Subject> ;
<http://www.w3.org/2000/01/rdf-schema#label> "A subject" ]
} WHERE {}
Response:
HTTP/1.1 405 METHOD NOT ALLOWED
Server: gunicorn/19.7.1
Date: Mon, 09 Apr 2018 18:51:11 GMT
Connection: keep-alive
Content-Type: text/html
Allow: HEAD, POST, GET, OPTIONS
Content-Length: 178
A successful response (e.g. 200 or 204)
I notice that PATCH is not included in the Allow header, but the root resource also advertises the Allow-Patch header, so it's a little inconsistent.
check_fixity (not yet implemented), check_refint, migrate.
stats may not need a report file since its output is predictably contained and may be more likely used for piping console output directly.
Alignment with Fedora4 is spotty in regard to header support. The following headers need to be addressed:
Prefer headers to either show all children or none)
Include a Preference-Applied header in the response to a GET request, as per https://fedora.info/2018/11/22/spec/#http-get-ldprs
LDP containers seem to work but have zero test coverage.
An unsupported digest algorithm on creation raises a ValueError that results in a 500 error. It should produce a 400 error according to https://fcrepo.github.io/fcrepo-specification/#http-post-ldpnr
A malformed digest string containing an = sign results in a 500 error:
File "lsup/src/lakesuperior/endpoints/ldp.py", line 287, in post_resource
request.headers['digest'].split('=')
ValueError: too many values to unpack (expected 2)
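The two digest failures above (unsupported algorithm, extra = sign) could both be turned into a 400 response by validating the header before use. The following is a minimal sketch, not the project's code: parse_digest_header, BadDigestHeader and the SUPPORTED_ALGORITHMS set are hypothetical names, and the accepted algorithms are an assumption.

```python
# Hypothetical allow-list; the algorithms LAKEsuperior accepts may differ.
SUPPORTED_ALGORITHMS = {'md5', 'sha1', 'sha256', 'sha512'}


class BadDigestHeader(ValueError):
    """Malformed or unsupported Digest header; a caller would map this to 400."""


def parse_digest_header(value):
    """Split a Digest header value into (algorithm, checksum)."""
    # Split on the first "=" only: the checksum itself may contain "=".
    algo, sep, checksum = value.partition('=')
    algo = algo.strip().lower().replace('-', '')
    if not sep or not checksum:
        raise BadDigestHeader('Malformed Digest header: {!r}'.format(value))
    if algo not in SUPPORTED_ALGORITHMS:
        raise BadDigestHeader('Unsupported digest algorithm: {!r}'.format(algo))
    return algo, checksum
```

An endpoint could catch BadDigestHeader and return a 400 response instead of letting the ValueError surface as a 500.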
Implement a tool that allows to:
For line 6:
Install dependencies: pip install -r requirements.txt
should it be?
Install dependencies: pip3 install -r requirements.txt
If a request fails due to one or more constraints, these MUST be indicated in a Link response header. See https://fedora.info/2018/11/22/spec/#constraints-document
Partly resolves #92
Implement messaging using ActiveStreams, which is halfway there.
Delta messaging optional for now if not too much of a hassle.
Allow changing LDP type of an existing resource to a subtype of the current one, as per https://fedora.info/2018/11/22/spec/#http-put
Operating system: RHEL 6
Python version: 3.5.1
LAKEsuperior release, branch, or commit #: 1.0.0a16
Perform a large batch of SPARQL queries on the server. The repository is quite large (~170M triples), however the queries are not very demanding (<1s response time).
Occasionally the following error is raised:
2018-05-07 17:33:03,437 ERROR flask.app - Exception on /query/sparql [POST]
Traceback (most recent call last):
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/lakesuperior/endpoints/query.py", line 86, in sparql
out_stream = query_api.sparql_query(qstr, fmt)
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/lakesuperior/api/query.py", line 128, in sparql_query
with TxnManager(rdf_store) as txn:
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/lakesuperior/store/ldp_rs/lmdb_store.py", line 62, in __enter__
self.store.begin(write=self.write)
File "/data/local/lake/lsup_env/lib64/python3.5/site-packages/lakesuperior/store/ldp_rs/lmdb_store.py", line 341, in begin
self.data_txn = self.data_env.begin(buffers=True, write=write)
lmdb.BadRslotError: mdb_txn_begin: MDB_BAD_RSLOT: Invalid reuse of reader locktable slot
No error should be raised.
This exception comes from the LMDB store and seems to be related to thread handling. See https://www.openldap.org/lists/openldap-devel/201409/msg00001.html
Change the include value of the Prefer header from http://fedora.info/definitions/v4/repository#InboundReferences to http://fedora.info/definitions/fcrepo#PreferInboundReferences.
See https://fedora.info/2018/11/22/spec/#additional-prefer-values, first bullet
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: 1.0.0-alpha12 from pypi
curl -i <any resource>
The response is Turtle, but the Content-Type header is "text/html"
The Content-Type header should be "text/turtle"
The headers seem to correspond but there is no test to verify at least the mandatory fields.
The underlying LAKEsuperior data model allows for fairly rich provenance tracking. Even more fine-grained information, such as per-statement provenance complete with the added and removed triples needed to build a full log-like provenance trail, should be delegated to a specific subsystem such as the messenger.
Some work has already been done to support full delta logging in the messenger but is not complete. This ticket is to complete that work and write tests for it.
Provide access to the triples method of the underlying storage to enable high-performance simple term queries.
Expose via Python API and UI.
UI should have some user-friendly facility to combine multiple term queries via AND.
Line 7 Run ./lsup_admin bootstrap to initialize the binary and graph stores
should read:
Run ./lsup-admin bootstrap to initialize the binary and graph stores
Reportedly, some operations on blank nodes are already handled by the underlying RDFLib implementation.
This ticket is to assess how complete this support is and to write tests to verify compliance.
Currently, direct containment is determined by the presence of ldp:membershipResource and ldp:hasMemberRelation predicates, and indirect containment by the additional ldp:insertedContentRelation predicate. In these cases, the ldp:DirectContainer or ldp:IndirectContainer RDF types are inferred.
The logic should work the other way around: the RDF types determine the type of container and must be explicitly stated, and the membership predicates are given a default object if one is not provided.
Also include support for the inverse relationship predicate ldp:isMemberOfRelation.
Allowing to change membership triples in an LDP-DC or LDP-IC is TBD, especially in regard to how to handle already established relationships.
See https://fedora.info/2018/11/22/spec/#ldpdc and https://fedora.info/2018/11/22/spec/#ldpic
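The proposed logic (explicit RDF type first, then defaults for missing membership predicates) might look like the sketch below. It uses plain sets and dicts rather than an RDF library; resolve_container is a hypothetical name, and the default objects chosen here (the container itself, ldp:member, ldp:MemberSubject) are assumptions that should be checked against the two spec sections linked above.

```python
LDP = 'http://www.w3.org/ns/ldp#'


def resolve_container(uri, rdf_types, props):
    """Return (kind, membership properties) for a resource.

    rdf_types: set of explicitly stated RDF type URIs.
    props: dict mapping predicate URI to object URI, as provided by the user.
    kind is 'direct', 'indirect', or None when neither type is stated.
    """
    if LDP + 'IndirectContainer' in rdf_types:
        kind = 'indirect'
    elif LDP + 'DirectContainer' in rdf_types:
        kind = 'direct'
    else:
        # Membership predicates alone no longer imply a container type.
        return None, dict(props)

    out = dict(props)
    # Assumed defaults; verify against the specification before relying on them.
    out.setdefault(LDP + 'membershipResource', uri)
    if LDP + 'isMemberOfRelation' not in out:
        out.setdefault(LDP + 'hasMemberRelation', LDP + 'member')
    if kind == 'indirect':
        out.setdefault(LDP + 'insertedContentRelation', LDP + 'MemberSubject')
    return kind, out
```

The sketch leaves user-provided predicates untouched and only fills in the ones that are missing.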
Currently versions are quite wasteful because they back up a lot of server-managed properties.
Explore the possibility of only versioning parts of a resource that are meaningful.
As part of this ticket, the performance of fcr:versions should also be improved; it is currently very slow even in 10K-resource repositories.
Operating system: N/A
Python version: N/A
LAKEsuperior release, branch, or commit #: 1.0.0a13 and later
500 Internal Server Error
Home page should be displayed.
The VERSION file is not included in the wheel distribution. The home page uses this file to display the revision version, so the application raises an error when it cannot find it.
It's worth finding a better place for the version number (e.g. a module variable).
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: alpha 14
ldp:membershipResource pointing to a hash URI on that membership resource.
For example: given a membership resource of http://localhost:8000/ldp/resource, the DC would include the triple: <> ldp:membershipResource <http://localhost:8000/ldp/resource#members> .
When adding child resources to the DC, there are no membership triples generated for the member resource.
The member resource would contain triples with the DC child resources.
You may want to take a look at some of the "bug tracker" examples in the LDP primer: https://www.w3.org/TR/ldp-primer/
Calculate the checksum of individual LDP-RS, store them and expose them on the LDP API.
The checksum should be used both for the Digest header (base64-encoded, as per RFC 3230) and for the ETag header.
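As a rough illustration of deriving both headers from one checksum, the sketch below hashes an already serialized resource. The function name checksum_headers, the choice of SHA-256, and the hex form of the ETag are assumptions made for the example; producing a stable serialization of an RDF graph to hash is the harder part and is not addressed here.

```python
import base64
import hashlib


def checksum_headers(serialized):
    """Build Digest and ETag header values from serialized resource bytes."""
    digest = hashlib.sha256(serialized).digest()
    return {
        # Digest carries the base64-encoded checksum.
        'Digest': 'SHA-256=' + base64.b64encode(digest).decode('ascii'),
        # ETag is a quoted opaque string; hex is used here for readability.
        'ETag': '"{}"'.format(digest.hex()),
    }
```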
Containment triples are considered server-managed and should not be allowed. This is already the case, but the response for user-provided containment triples is 412 and includes the offending triples in the body rather than in the headers.
The response for such a request should instead be 409, and the offending triples should be listed in a constraints document.
This satisfies https://fedora.info/2018/11/22/spec/#ldpc
Support the If-None-Match HTTP header for GET/HEAD and PUT, as per RFC 7232, section 3.2.
With GET requests, the server will return a 304 Not Modified if the resource ETag corresponds to any of the client-provided ETags.
With PUT requests, the server will return a 412 Precondition Failed and the resource will not be updated if a match is found. This can be used in combination with the special * value to prevent an update of a location that contains an existing resource.
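The decision described above can be sketched as a small pure function. The names parse_etags and if_none_match_status are hypothetical, and the parsing is deliberately simple: it splits on commas, so it would mishandle an ETag that itself contains a comma.

```python
def parse_etags(header_value):
    """Return the set of opaque tags in an If-None-Match value, or '*'."""
    value = header_value.strip()
    if value == '*':
        return '*'
    tags = set()
    for item in value.split(','):
        item = item.strip()
        # If-None-Match uses weak comparison, so the W/ prefix is dropped.
        if item.startswith('W/'):
            item = item[2:]
        tags.add(item.strip('"'))
    return tags


def if_none_match_status(method, header_value, current_etag):
    """Return the status code to respond with, or None to proceed normally.

    current_etag is the stored resource's ETag without quotes, or None if
    no resource exists at the requested location.
    """
    if header_value is None or current_etag is None:
        return None
    tags = parse_etags(header_value)
    matched = tags == '*' or current_etag in tags
    if not matched:
        return None
    return 304 if method in ('GET', 'HEAD') else 412
```

With this shape, a PUT carrying If-None-Match: * proceeds only when current_etag is None, which is the "do not overwrite an existing resource" case.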
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: alpha 14
Create an Indirect Container with the triple:
<> ldp:insertedContentRelation ldp:MemberSubject .
Now add child resources to the indirect container via PUT or POST.
500 Error Response
200 Response. But more importantly, the membership resource should then be populated with member triples exactly as if the indirect container were a direct container.
You may want to read up on the use of ldp:MemberSubject in the LDP spec.
Docstrings have been entered with a format that doesn't help generating automatic API docs.
All docstrings should be reformatted to allow processing with Sphinx.
@scossu: If you follow the steps to configure an automated build on Docker Hub, I will submit another PR to make the Quick Start even quicker. (quickerer?)
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: 1.0.0alpha12 from pypi
Create an LDP-NR as with #47
The response includes a response header such as:
Link: <http://localhost:8000/ldp/ee7d7990-24d5-41d3-bc3b-0a59cfd2b739/fcr:metadata>; rel="describedby"; anchor="<http://localhost:8000/ldp/ee7d7990-24d5-41d3-bc3b-0a59cfd2b739>"
I would not expect the anchor parameter to be enclosed in <> characters.
That is, I would expect:
Link: <http://localhost:8000/ldp/ee7d7990-24d5-41d3-bc3b-0a59cfd2b739/fcr:metadata>; rel="describedby"; anchor="http://localhost:8000/ldp/ee7d7990-24d5-41d3-bc3b-0a59cfd2b739"
https://tools.ietf.org/html/rfc8288 -- this is the specification for web linking (which defines the anchor param)
The ABNF for the anchor param is "URI Reference", as defined by https://tools.ietf.org/html/rfc3986#section-4.1
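For illustration, a header value in the expected form could be assembled as below. link_header is a hypothetical helper, not a function from the code base; only the link target is wrapped in angle brackets, while the anchor is emitted as a plain quoted URI reference.

```python
def link_header(target, rel, anchor=None):
    """Format a Link header value; only the target goes inside <>."""
    parts = ['<{}>'.format(target), 'rel="{}"'.format(rel)]
    if anchor is not None:
        # The anchor parameter is a quoted URI reference, without <>.
        parts.append('anchor="{}"'.format(anchor))
    return '; '.join(parts)


base = 'http://localhost:8000/ldp/ee7d7990-24d5-41d3-bc3b-0a59cfd2b739'
value = link_header(base + '/fcr:metadata', 'describedby', anchor=base)
```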
LAKEsuperior release, branch, or commit #: master
POST http://localhost:8000/ldp/
http://localhost:8000/ldp0383c3b1-62da-4f94-b9f3-f7c3a778340d
http://localhost:8000/ldp/0383c3b1-62da-4f94-b9f3-f7c3a778340d
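The two URIs above differ only in the slash between ldp and the UUID. One way to make the join insensitive to a trailing slash on the parent is sketched below; child_uri is a hypothetical helper and is not taken from the code base.

```python
def child_uri(parent_uri, slug):
    """Join a parent URI and a child slug with exactly one slash."""
    return parent_uri.rstrip('/') + '/' + slug.lstrip('/')
```

Both http://localhost:8000/ldp and http://localhost:8000/ldp/ then produce the same child URI.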
Implement an API method and command-line utility for checking the referential integrity of a repository.
This is critical to verify that a migration produces a consistent result.
A created resource MUST be an LDP-NR if the creation request includes a Link header with a value of http://www.w3.org/ns/ldp#NonRDFSource, independently of the content type specified.
When any of the Gunicorn workers restarts (after 256 requests in the default configuration), LMDB readers are not released, i.e. the store is not closed properly. These readers eventually accumulate and the application quits when the max readers limit (126 by default) is reached.
Reverting to a version has very light test coverage and has apparently been broken by previous refactorings.
The "resurrect" feature, which is reverting a tombstone to the latest active version, should also be completed as part of this.
Honor the Want-Digest GET request header for LDP-NR as per https://fedora.info/2018/11/22/spec/#http-get-ldpnr
Implement fixity checks in Python API, REST API and CLI.
Retroactively opening for the record. Part of alpha 8.
Fixed in d2d7431
Operating system: OS X
Python version: 3.6
LAKEsuperior release, branch, or commit #: 1.0.0-alpha12 from pypi
curl -i <any resource> -H"Accept: application/ld+json"
The response is Turtle
The response should be JSON-LD
LDP requires that servers support JSON-LD serializations (I personally see no reason to support RDF+XML). And if you are already supporting Turtle, N-Triples tends also to be really easy to support.
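A simplified view of the negotiation being asked for is sketched below. The negotiate function, the RDF_FORMATS table and the Turtle default are assumptions for the example; the format names on the right are the ones RDFLib uses for its serializers, and older RDFLib versions need a separate plugin for JSON-LD. Media ranges such as text/* are not handled.

```python
# MIME type -> RDFLib serializer name (assumed set of supported formats).
RDF_FORMATS = {
    'text/turtle': 'turtle',
    'application/ld+json': 'json-ld',
    'application/n-triples': 'nt',
}
DEFAULT_MIMETYPE = 'text/turtle'


def negotiate(accept_header):
    """Pick a supported MIME type for an Accept header, or None (406)."""
    if not accept_header:
        return DEFAULT_MIMETYPE
    candidates = []
    for position, item in enumerate(accept_header.split(',')):
        fields = [f.strip() for f in item.split(';')]
        mimetype, q = fields[0].lower(), 1.0
        for param in fields[1:]:
            name, _, val = param.partition('=')
            if name.strip() == 'q':
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0
        # Sort by descending q, then by order of appearance.
        candidates.append((-q, position, mimetype))
    for neg_q, _, mimetype in sorted(candidates):
        if neg_q == 0:
            continue  # q=0 means "not acceptable"
        if mimetype in RDF_FORMATS:
            return mimetype
        if mimetype == '*/*':
            return DEFAULT_MIMETYPE
    return None
```

The returned MIME type would be used both to select the serializer and to set the Content-Type response header, so the two cannot disagree.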
PUT requests must always fail if server-managed triples are included. The handling option of the Prefer header must be deprecated. See https://fedora.info/2018/11/22/spec/#http-put-ldprs
Operating system: Various
Python version: N/A
LAKEsuperior release, branch, or commit #: 1.0.0a13
Create a direct container with PUT (currently POST is unavailable due to #56) and attempt to insert child resources.
(from @acoburn 's post)
Once the direct container is successfully created, I am unable to create child resources in it (neither via PUT or POST). This is the response:
"The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."
But there is nothing in any of the logs, other than these lines in the access log:
127.0.0.1 - - [13/Apr/2018:11:24:50 -0400] "POST /ldp/dc4 HTTP/1.1" 500 291 "-" "curl/7.54.0"
127.0.0.1 - - [13/Apr/2018:11:35:11 -0400] "PUT /ldp/dc4/child1 HTTP/1.1" 500 291 "-" "curl/7.54.0"
Resources should be created under the container with the correct relationships.
Indirect containers might exhibit similar behavior. Needs testing.
Following README.MD instructions at line 7, when running the command:
./lsup-admin bootstrap
Fails with the following error for me
Reading configuration at /root/lake/lakesuperior/etc.defaults
Traceback (most recent call last):
File "./lsup-admin", line 7, in
import lakesuperior.env_setup
File "/root/lake/lakesuperior/lakesuperior/env_setup.py", line 2, in
from lakesuperior.globals import AppGlobals
File "/root/lake/lakesuperior/lakesuperior/globals.py", line 6, in
from lakesuperior.dictionaries.namespaces import ns_collection as nsc
ImportError: No module named dictionaries.namespaces
Operating system: various
Python version: N/A
LAKEsuperior release, branch, or commit #: 1.0.0a13
(from @acoburn 's email:)
[...] issue that I ran into related to trying to create a direct container via POST (PUT seems to work correctly). This is the file I'm using:
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX ldp: <http://www.w3.org/ns/ldp#>
<> dcterms:title "Direct Container" ;
ldp:membershipResource <http://localhost:8000/ldp/member> ;
ldp:hasMemberRelation dcterms:relation .
(assume the pre-existing presence of /ldp/member)
Whenever I try to create a new resource via POST, the new resource is stored as an LDP-NR. Here are the various commands I've used:
curl -i localhost:8000/ldp -XPOST --data-binary @dc.ttl
curl -i localhost:8000/ldp -XPOST --data-binary @dc.ttl -H"Content-Type: text/turtle"
curl -i localhost:8000/ldp -XPOST --data-binary @dc.ttl -H"Content-Type: text/turtle" -H"Slug: dc"
curl -i localhost:8000/ldp -XPOST --data-binary @dc.ttl -H"Content-Type: text/turtle" -H"Slug: dc" -H"Link: <http://www.w3.org/ns/ldp#DirectContainer>; rel=\"type\""
In every case, the new resource is an LDP-NR. Though when I use equivalent commands to create a resource via PUT, everything works correctly. This same issue seems to apply to regular RDF source/container resources as well: they are created correctly as an LDP-RS under PUT but not under POST.
Containers should be created with POST the same way they are created with PUT.
This may be the case with indirect containers as well. Needs testing.