mhgrove / empire
JPA implementation for RDF
Home Page: http://groups.google.com/group/empire-rdf/
License: Apache License 2.0
missing a jar from the dist apparently
i know that at least the example setup/init is now out of date given the working config files and guice injection support for PersistenceContext & PersistenceUnit, so that needs to get updated. the longer introduction to the various annotations and such could probably also stand to be updated.
Those annotations clue in about the desired use of the property in the data. i think systems like hibernate do validation on beans before/during startup, so we could make sure that if you have a collection, you have a OneToMany on it, or make sure if you have a ManyToMany you're using a collection, that sort of thing.
jena, sesame, 4store, et al. have only the minimum number of jars required for each distro
From the mailing list:
"if I query by interface, get back a synthesised object that correctly represents the properties in the interface, update it, then merge back, I'm finding that the fields and properties that aren't in the interface have disappeared. I expected them to remain."
Their code to reproduce:
Lost fields and values.
public interface X {
    public String getValue();
}
public class Y implements X {
    public String getValue() {...}
    public String getOtherValue() {...}
}
i) Persist inst of Y - a sparql query shows value,otherValue, type=X, type=Y
ii) Read back via X into X' (get back synthesised instance of Ximpl)
iii) Update
iv) Merge X' back
Do a sparql query and see that there is no otherValue field or value, or type=Y. I see instead only value, type=X. X' is what is passed to the merge() call.
deprecated as of 0.6.4, need to remove later. this behavior is now controlled via the multiplicity annotations in JPA like OneToMany and their fetch() property.
Probably should put this on the @RdfProperty annotation, though i think for normal jpa stuff you can put it on like OneToMany/ManyToOne etc. So maybe we should look at RdfProperty first (no reason you can't have two different cascade strategies) and then if it's not there, look at any normal JPA style annotations which are attached.
Hello,
as I already mentioned in #49, on a fresh checkout of master and the 0.7 branch the testQuerying test cases are failing:
...
java.lang.IllegalArgumentException: Query named 'sovietSpacecraft' does not exist.
at com.clarkparsia.empire.impl.RdfQueryFactory.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.impl.EntityManagerImpl.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.test.TestJPA.testQuerying(Unknown Source)
...
java.lang.IllegalArgumentException: Query named 'sovietSpacecraft' does not exist.
at com.clarkparsia.empire.impl.RdfQueryFactory.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.impl.EntityManagerImpl.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.test.TestJPA.testQuerying(Unknown Source)
...
java.lang.IllegalArgumentException: Query named 'sovietSpacecraftSPARQL' does not exist.
at com.clarkparsia.empire.impl.RdfQueryFactory.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.impl.EntityManagerImpl.createNamedQuery(Unknown Source)
at com.clarkparsia.empire.test.TestJPA.testQuerying(Unknown Source)
...
I read the documentation about the APT (Annotation Processing Tool) stuff in docs/using-apt.texile. At the moment I expect that this tool must be used to generate an empire.config file, and that this file will fix the failing test cases.
I looked in "test.empire.config.properties" and noticed that it refers to a file "empire.annotation.config", which is not available.
I will try to figure out how this apt tool works and then try to integrate it into the ant test target.
Any help is appreciated.
/Manuel
if you could write jpql queries and we could compile that into sparql effectively (reverse d2rq), that'd make it easier for people to migrate from existing jpa applications and lessen the learning curve. would also help integration into a framework like play since they have a lot of core bits which use jpql.
there's a jpql parser out there, sun or apache i think, that might be useful for this
Transient lets you get read only properties, but we're making everything marked w/ transient completely invisible to the rdf marshalling/unmarshalling, which is the wrong behavior.
We add prefix definitions for the user based on @namespaces annotations to be helpful.
but if the user adds their own prefix definition, this may very well result in a parse error -- a report on the mailing list seems to indicate this could happen. even if there is some other bug, this might be helpful to have.
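One way to avoid the duplicate declaration is to skip any prefix the query already defines. A minimal sketch, assuming the namespaces are available as a prefix-to-URI map; the class and method names here are hypothetical and not part of Empire, and the regex is a simplification that a real SPARQL parser would replace:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Empire API.
final class PrefixHelper {
    // Matches "PREFIX foaf:" or "PREFIX :" (the keyword is case-insensitive in SPARQL).
    // Simplification: this can also match the word PREFIX outside a real declaration,
    // e.g. inside a string literal.
    private static final Pattern PREFIX_DECL =
            Pattern.compile("(?i)\\bPREFIX\\s+([A-Za-z][\\w.-]*)?\\s*:");

    /** Prepends a PREFIX line for each namespace the query does not already declare. */
    static String addMissingPrefixes(String query, Map<String, String> namespaces) {
        Set<String> declared = new HashSet<>();
        Matcher m = PREFIX_DECL.matcher(query);
        while (m.find()) {
            declared.add(m.group(1) == null ? "" : m.group(1));
        }
        StringBuilder header = new StringBuilder();
        for (Map.Entry<String, String> ns : namespaces.entrySet()) {
            if (!declared.contains(ns.getKey())) {
                header.append("PREFIX ").append(ns.getKey())
                      .append(": <").append(ns.getValue()).append(">\n");
            }
        }
        return header + query;
    }
}
```

The user's own declaration always wins here, even if it binds the prefix to a different URI than the annotation does; whether that should be an error is a separate question.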
I was not calculating the diff and updating the live object, i was hoping i could cheat and just swap the reference to the object but that does not work. So we'll have to do a find to get the object from the db, then iterate over the fields so they get their reverted values.
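The "find, then iterate over the fields" step could look roughly like this. A sketch under the assumption that the fresh copy and the live object are the same concrete class; `RefreshSketch` and `Person` are hypothetical names for illustration, and the fresh instance stands in for whatever find() returns:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical helper, not Empire API.
final class RefreshSketch {
    /** Hypothetical bean used only for illustration. */
    static final class Person {
        String name;
        int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    /** Copies every non-static, non-final field from fresh onto live, including inherited fields. */
    static <T> void copyFields(T fresh, T live) {
        if (fresh.getClass() != live.getClass()) {
            throw new IllegalArgumentException("fresh and live must be the same class");
        }
        for (Class<?> c = live.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isFinal(mods)) {
                    continue;
                }
                try {
                    f.setAccessible(true);
                    f.set(live, f.get(fresh));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot copy field " + f.getName(), e);
                }
            }
        }
    }
}
```

Because the live reference is updated in place, callers holding that reference see the reverted values, which is what swapping the reference could not achieve.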
zip up docs, src jars and dist jars into versioned, dated release bundle
can shim this into the same place we pull this information out of the annotation index in the query factory. will need to update the implementations of the annotation provider though to grab this information.
we have one for properties, one for XML would be good too. possibly even RDF =)
basically any of these modifiers mean the field/property does not participate in the persistence fun
It will be great if the debug="true" parameter is added to all the javac tasks in build.xml so that the produced class files can contain information needed for debugging.
i think it's normal behavior to have an entity be attached to its entity manager, and you can't normally persist an entity in one db, and then in another w/ a different entity manager. though that is kind of nice behavior.
use the existing proxy stuff to lazy load each individual field. might want a flag to enable this. The JPA annotations like OneToMany or ManyToMany can specify this, but they don't really make sense for primitive fields, so we might want to introduce a flag on RdfProperty or a new annotation to control this. This field/annotation should probably also override anything specified by OneToMany/ManyToMany, etc.
http://jcp.org/en/jsr/detail?id=317
came out in december, should see what changed, what's new, what is relevant for pulling into empire.
According to the EJB3 specification, section 2.1.7:
"If there are no associated entities for a multi-valued relationship of an entity fetched from the database, the persistence provider is responsible for returning an empty collection as the value of the relationship."
This needs to be fixed in ToObjectFunction.apply()
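A sketch of how an empty collection could be chosen from the declared type of the field when no values come back. This is an illustration only: `EmptyCollections` is a hypothetical helper, not the actual ToObjectFunction code, and it covers only the common JDK collection types:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.TreeSet;

// Hypothetical helper, not Empire API.
final class EmptyCollections {
    /** Returns a new empty collection that can be assigned to a field of the given declared type. */
    static Collection<Object> emptyFor(Class<?> declaredType) {
        if (!Collection.class.isAssignableFrom(declaredType)) {
            throw new IllegalArgumentException(declaredType + " is not a collection type");
        }
        // Collection, List, ArrayList
        if (declaredType.isAssignableFrom(ArrayList.class)) {
            return new ArrayList<>();
        }
        // Set, HashSet, LinkedHashSet
        if (declaredType.isAssignableFrom(LinkedHashSet.class)) {
            return new LinkedHashSet<>();
        }
        // SortedSet, NavigableSet, TreeSet
        if (declaredType.isAssignableFrom(TreeSet.class)) {
            return new TreeSet<>();
        }
        throw new IllegalArgumentException("no default empty collection for " + declaredType);
    }
}
```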
either use javassist or guice to catch changes to persistent fields by wrapping the setters in something that will set a dirty flag, or do some sort of bean-style listener scheme to listen for changes to fields. this way we can avoid unnecessary persists when the object is not dirty.
this will be useful once cascading of persists is working to help avoid extraneous work.
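To make the dirty-flag idea concrete, here is a sketch that uses a plain JDK dynamic proxy rather than javassist or guice, so it only works for beans accessed through an interface. `Dirtyable`, `Named` and `NamedImpl` are hypothetical names, and treating any one-argument `set*` method as a mutation is an assumption:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch, not Empire API.
final class DirtyTracking {
    /** Extra interface every tracked proxy implements. */
    interface Dirtyable {
        boolean isDirty();
        void clearDirty();
    }

    /** Hypothetical bean interface used only for illustration. */
    interface Named {
        String getName();
        void setName(String name);
    }

    static final class NamedImpl implements Named {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    /** Wraps target so that calling any one-argument setter marks the proxy dirty. */
    static <T> T track(final T target, Class<T> iface) {
        InvocationHandler handler = new InvocationHandler() {
            private boolean dirty;

            public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
                if (m.getDeclaringClass() == Dirtyable.class) {
                    if (m.getName().equals("isDirty")) {
                        return dirty;
                    }
                    dirty = false; // clearDirty()
                    return null;
                }
                if (m.getName().startsWith("set") && args != null && args.length == 1) {
                    dirty = true;
                }
                try {
                    return m.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface, Dirtyable.class }, handler);
        return iface.cast(proxy);
    }
}
```

A persist call could then cast the entity to `Dirtyable` and skip the write when `isDirty()` is false, calling `clearDirty()` after a successful save.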
From the mailing list, good suggestion for using the JDK instrumentation:
The implementation I've built uses JDK instrumentation to find all loaded classes and inspect for annotations. It seems to work, and takes about 1sec to do its job.
i) The Instrument class - export as jar file with MANIFEST.mf like in ii)
import java.lang.instrument.*;
public class Instrument {
public Instrument() {
super();
// TODO Auto-generated constructor stub
}
private static Instrumentation inst;
public static Instrumentation getInstrumentation() { return inst; }
public static void premain(String agentArgs, Instrumentation inst) {
System.out.println(inst.getClass() + ": " + inst);
Instrument.inst = inst;
}
}
ii) the Manifest
Manifest-Version: 1.0
Implementation-Vendor: ArchiveLink
Implementation-Title: ArchiveLink Instrument
Implementation-Version: 0.0.1
Specification-Vendor: ArchiveLink
Name: net/archivelink/instrument/
Premain-Class: net.archivelink.instrument.Instrument
Specification-Title: ArchiveLink Instrument Class
Specification-Version: 0.0.1
iii) jvm option : -javaagent:c:\tmp\instrument.jar
iv) From the EmpireAnnotationProvider impl
@Override
public Collection<Class<?>> getClassesWithAnnotation(Class<? extends Annotation> theAnnotation) {
Set<Class<?>> aClasses = new HashSet<Class<?>>();
Instrumentation inst = Instrument.getInstrumentation();
//*** use -javaagent:pathToJarFile
//depends on Instrument being loaded as an agent to the JVM
//
//http://www.coderanch.com/t/329407/java/java/find-all-loaded-classes-classloaders
//http://download.oracle.com/javase/6/docs/api/java/lang/instrument/package-summary.html
Class<?>[] css = inst.getAllLoadedClasses();
for ( Class<?> aClass : css ) {
try {
if (aClass.isAnnotationPresent(theAnnotation)) {
aClasses.add(aClass);
}
else {
if ( LOGGER.isDebugEnabled() )
LOGGER.debug("Class '" + aClass.getName() + "' " +
"does not actually have the specified annotation '" + theAnnotation + "'");
}
} catch (Exception e) {
if ( LOGGER.isDebugEnabled() )
LOGGER.debug("Exception during annotation inspection : " + e.toString() );
}
}
return aClasses;
}
will need to figure out lazy loading of primitives and basic types, for this, but would be nice to have
right now we don't do anything other than validate that the EntityManager is not closed and that you're not locking on something that is not a manageable entity.
in the short term we might want to throw a persistence exception so users know that the lock was not obtained rather than silently ignoring the request.
this will have to be pending approval and wider acceptance of sparql 1.1 though...
so if you have a Person bean mapped to foaf:Person, you can write things like foaf:name in your queries. as it stands now, you'd have to put in the namespace prefix mappings yourself, or use full uri's.
I wanted to use an XML configuration but it seems that the DefaultEmpireModule class doesn't use it at all.
as a hook to get instances of an appropriate EntityManagerFactory/EntityManager etc. this looks to be the normal entry point into jpa managed systems that we should provide code for
both through our bytecode generation stuff w/ javassist and with the Guice stuff using providers in a module and/or @implementedby
Hello,
I just obtained a copy of empire and built it successfully.
$ git clone http://github.com/clarkparsia/Empire.git
$ git checkout HEAD .
$ ant dist
...
BUILD SUCCESSFUL
Total time: 7 seconds
To prove that my copy is working I ran the tests.
$ ant test
...
test:
[junit] Running com.clarkparsia.empire.test.EmpireTestSuite
[junit] Tests run: 133, Failures: 0, Errors: 3, Time elapsed: 153.038 sec
[junit] Test com.clarkparsia.empire.test.EmpireTestSuite FAILED
[junitreport] Processing /Users/manuel/Projects/Masterarbeit/workspace/Empire/dist/test-reports/TESTS-TestSuites.xml to /var/folders/L2/L2i4C3uNEvua2CsujXO3DU+++TI/-Tmp-/null1897162432
[junitreport] Loading stylesheet jar:file:/opt/local/share/java/apache-ant/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl
[junitreport] Transform time: 1433ms
[junitreport] Deleting: /var/folders/L2/L2i4C3uNEvua2CsujXO3DU+++TI/-Tmp-/null1897162432
BUILD SUCCESSFUL
Total time: 2 minutes 37 seconds
$ cat dist/test-reports/TESTS-TestSuites.xml
<?xml version="1.0" encoding="UTF-8" ?>
<testsuites />
$ open dist/test-reports/index.html
Besides the fact that 3 tests around NamedQueries are failing (Query named 'sovietSpacecraft' does not exist; I will open another bug for that), the HTML reports did not get generated properly.
I added another line to the build.xml:
$ git diff
diff --git a/build.xml b/build.xml
index 4126ca0..976f3de 100644
--- a/build.xml
+++ b/build.xml
@@ -322,6 +322,7 @@
<junitreport todir="dist/test-reports">
<fileset dir="dist/test-reports">
<include name="TEST-*.xml" />
+ <include name="report.xml" />
</fileset>
<report todir="dist/test-reports" />
</junitreport>
Afterwards it generated the HTML files.
I am using a Mac running Mac OS X Snow Leopard (64-bit) on an Intel Core Duo.
$ uname -a
Darwin Manuel-Blechschmidts-MacBook-Pro.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386
I use the newest Java Version available:
$ java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02-279-10M3065)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01-279, mixed mode)
Further I use ant 1.7.1:
$ ant -version
Apache Ant version 1.7.1 compiled on June 27 2008
I hope that helps.
/Manuel
they're unreasonably slow, and we have to use them all the time for serialization/deserialization, so lets cache the results and speed things up a bit.
impl that is just SPARQL via sparql protocol and SPARQL update (can use jena for now to validate queries), no API dependency.
can probably just base this off the 4store API.
It will only generate bytecode for the given interface and does not look at the parent interfaces. So you can easily end up with an implementation of a class that's useable for the given interface, but when you call any methods on parent interfaces, you get AbstractMethodErrors, which is not good.
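The fix amounts to walking the super-interfaces when collecting the methods to implement. A small sketch of that walk using plain reflection; `InterfaceMethods`, `Parent` and `Child` are hypothetical names and this is not the actual bytecode generation code:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical sketch, not Empire API.
final class InterfaceMethods {
    /** Hypothetical interfaces used only for illustration. */
    interface Parent { String getId(); }
    interface Child extends Parent { String getValue(); }

    /** Signatures of all abstract methods on iface and every interface it extends. */
    static Set<String> allAbstractMethods(Class<?> iface) {
        Set<String> out = new TreeSet<>();
        collect(iface, out);
        return out;
    }

    private static void collect(Class<?> iface, Set<String> out) {
        // getDeclaredMethods() only sees methods declared directly on this interface...
        for (Method m : iface.getDeclaredMethods()) {
            if (Modifier.isAbstract(m.getModifiers())) {
                String params = Arrays.stream(m.getParameterTypes())
                        .map(Class::getSimpleName)
                        .collect(Collectors.joining(","));
                out.add(m.getName() + "(" + params + ")");
            }
        }
        // ...so the parents have to be visited explicitly.
        for (Class<?> parent : iface.getInterfaces()) {
            collect(parent, out);
        }
    }
}
```

For `Child` this yields both `getId()` and `getValue()`, whereas looking only at the declared methods of `Child` would miss `getId()` and leave it unimplemented.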
been operating under the assumption that it will be properties based, which is fine, but i need to hammer out the details for what should be in the properties file, how to have properties specified for different persistence units etc.
might make sense to have a little interface to the properties and have a reader for both xml and .properties formats.
i like properties format because it's lightweight and easy to edit, but if you have a couple different persistence units in the same file, i can see where that will be more confusing to edit than xml. if we ever get some users, would be interested in finding out what they'd prefer
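A sketch of what a single reader for both formats might look like, built on java.util.Properties, which can load both .properties and its own XML format. The `unitName.key` naming convention for separating persistence units is purely an assumption for illustration, as are the class and method names:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch, not Empire API.
final class ConfigSketch {
    /** Reads configuration from either format, choosing by file extension. */
    static Properties read(String fileName, InputStream in) {
        Properties p = new Properties();
        try {
            if (fileName.endsWith(".xml")) {
                p.loadFromXML(in); // expects the java.util.Properties XML format
            } else {
                p.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Returns the keys that start with "unit." with that prefix stripped (assumed convention). */
    static Properties forUnit(Properties all, String unit) {
        Properties out = new Properties();
        String prefix = unit + ".";
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                out.setProperty(key.substring(prefix.length()), all.getProperty(key));
            }
        }
        return out;
    }
}
```

With this shape the rest of the code only ever sees a Properties object per persistence unit, so the choice of file format stays a detail of the reader.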
Alibaba in sesame has this sort of system:
@sparql("SELECT ?member "+
"WHERE { $this eg:bloodPressureHistory ?member . " +
" ?member eg:measurementDate ?date } ORDER BY ?date")
public abstract Result findBloodPressureByDate();
calling that method will execute that query. might be a useful non-JPA specific feature to add, though we can probably overload using namedqueries for this stuff
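A rough sketch of how such annotation-driven query methods could be wired up with a JDK dynamic proxy. Everything here is hypothetical: the `@Sparql` annotation is a local stand-in, the executor function stands in for whatever actually runs the query, and unlike the Alibaba example above it only works for interfaces, not abstract classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical sketch, not Empire or Alibaba API.
final class AnnotatedQueries {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Sparql {
        String value();
    }

    /** Hypothetical query interface used only for illustration. */
    interface PatientQueries {
        @Sparql("SELECT ?member WHERE { $this eg:bloodPressureHistory ?member }")
        Object findBloodPressure();
    }

    /** Creates an instance of iface whose annotated methods hand their query to executor. */
    static <T> T bind(Class<T> iface, Function<String, Object> executor) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                (p, method, args) -> {
                    Sparql query = method.getAnnotation(Sparql.class);
                    if (query == null) {
                        // Also hit for toString()/hashCode()/equals() in this sketch.
                        throw new UnsupportedOperationException(
                                method.getName() + " has no @Sparql query");
                    }
                    return executor.apply(query.value());
                });
        return iface.cast(proxy);
    }
}
```

Binding `$this` to the current entity and mapping the results to a typed return value are left out; those are the parts where reusing the named-query machinery would probably pay off.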
ENABLE_PROXY_OBJECTS controls basically the same thing as specifying FetchType on the mapping, except that it does it globally rather than per class. should be straightforward to support this based on our existing code.
Please consider maven support. This project has quite a few dependencies and maven would really make life easier for developers and users.
Besides that - this is a great implementation.
Thanks a lot!
in EntityManagerImpl.merge the last step is to return the merged object, i'm calling find and pulling it fresh out of the db rather than returning the value passed in. this is causing testPreAndPostHooks to fail because i'm returning a different object than I saved (though they are .equals()) -- i'm not sure if the test is wrong, or if i should just return the same object i was given in merge, which would probably make the test pass.
scan a class and make sure that not only it has the minimum required annotations, but that they are used correctly, for example, ensure that they did not use a OneToOne on a Collection relation, that sort of thing.
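A sketch of what such a scan might check. To keep it self-contained the two annotations below are local stand-ins for the JPA ones (real code would use the javax.persistence annotations instead), and `MappingValidator`, `Good` and `Bad` are hypothetical names:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch, not Empire API.
final class MappingValidator {
    // Local stand-ins for javax.persistence.OneToOne / OneToMany.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD) @interface OneToOne {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD) @interface OneToMany {}

    /** Returns one message per field whose multiplicity annotation contradicts its type. */
    static List<String> validate(Class<?> bean) {
        List<String> problems = new ArrayList<>();
        for (Field f : bean.getDeclaredFields()) {
            boolean isCollection = Collection.class.isAssignableFrom(f.getType());
            if (isCollection && f.isAnnotationPresent(OneToOne.class)) {
                problems.add(f.getName() + ": OneToOne used on a collection");
            }
            if (!isCollection && f.isAnnotationPresent(OneToMany.class)) {
                problems.add(f.getName() + ": OneToMany used on a non-collection");
            }
        }
        return problems;
    }

    /** Hypothetical beans used only for illustration. */
    static class Good {
        @OneToMany List<String> tags;
        @OneToOne String owner;
    }

    static class Bad {
        @OneToOne List<String> tags;
        @OneToMany String owner;
    }
}
```

Run at startup, a non-empty result could be logged or turned into a persistence exception, depending on how strict the validation should be.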
we read from the db and populate the object, but we don't persist. i think that's useful because it gives you readonly properties.
however, hibernate and probably others don't read either. so our implementation is different, which is probably not desirable. do we keep the different implementation and document it in bold letters, or do we add a @readonly annotation or something similar, implement transient the same way as everyone else, and be done with it? i like having my cake and eating it too, but i also don't want to add more non-JPA annotations if avoidable.
this is probably a punt until people actually start using empire.
first thing to do is write our current implicit mapping out as a normal result mapping and get it working with that, and then go from there.
If you have triples asserted for an object which is mapped by jpa/empire, but are not mapped themselves, via @RdfProperty, these are lost when the RDF graph is converted into a Java object because there is nothing to map them onto. If you don't save the object, you won't lose the triples. But since we do a wipe and load for save operations rather than a diff, you can lose these unmapped triples.
So we need to track these unmapped triples or start doing saves as a diff operation. The diff route might be easier to do once SPARQL 1.1 is in the mix, but for now, it might just be easiest to add to the bytecode at runtime a collection of these unmapped triples, which upon save will be asserted directly into the result graph.
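The "carry the unmapped triples along" option can be sketched with plain sets. This is an illustration only: triples are modelled as three-element string lists rather than a real RDF statement type, and all names are hypothetical:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Empire API.
final class UnmappedTriples {
    /** A triple as [subject, predicate, object]; a stand-in for a real RDF statement type. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** On load: the stored triples whose predicate has no mapping on the bean. */
    static Set<List<String>> unmapped(Set<List<String>> stored, Set<String> mappedPredicates) {
        Set<List<String>> out = new LinkedHashSet<>();
        for (List<String> t : stored) {
            if (!mappedPredicates.contains(t.get(1))) {
                out.add(t);
            }
        }
        return out;
    }

    /** On save: what the bean produces plus the triples it could not hold. */
    static Set<List<String>> graphToSave(Set<List<String>> fromBean, Set<List<String>> unmapped) {
        Set<List<String>> out = new LinkedHashSet<>(fromBean);
        out.addAll(unmapped);
        return out;
    }
}
```

With the wipe-and-load save, writing the result of `graphToSave` instead of only the bean's own triples keeps the unmapped data intact; the set returned by `unmapped` is what would be stashed on the generated object at load time.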