Comments (12)
There are lots of problems with the built-in parser, including incorrect interpretation of non-ASCII and/or encoded characters. My suggested course of action would be to get rid of the built-in parser and only use SERD. SERD gives spec-compatible results, and encoding is handled fine.
Only downside: direct reading of gzipped files only works with the built-in parser; piping is not supported either.
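One possible workaround for the gzip limitation is to decompress to a temporary file first and then hand the plain file to the converter. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the `rdf2hdt` tool is on the PATH and takes the input and output paths as positional arguments after `-f turtle`.

```python
import gzip
import shutil
import subprocess
import tempfile
from pathlib import Path


def gunzip_to_temp(gz_path):
    """Decompress gz_path into a new temporary file and return its path."""
    # Keep the inner suffix (".nt", ".ttl", ...) so the file name stays meaningful.
    suffix = Path(gz_path).with_suffix("").suffix or ".ttl"
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        with gzip.open(gz_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
    return Path(tmp.name)


def build_rdf2hdt_command(rdf_path, hdt_path, rdf2hdt="rdf2hdt"):
    """Build the argument list; the positional argument order is an assumption."""
    return [rdf2hdt, "-f", "turtle", str(rdf_path), str(hdt_path)]


def convert_gzipped(gz_path, hdt_path):
    """Decompress, convert, and always clean up the temporary file."""
    plain = gunzip_to_temp(gz_path)
    try:
        subprocess.run(build_rdf2hdt_command(plain, hdt_path), check=True)
    finally:
        plain.unlink()
```

Since N-Triples is a syntactic subset of Turtle, the same `-f turtle` call should also accept decompressed `.nt` input. The cost of this approach is the extra disk space for the uncompressed copy.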
from hdt-cpp.
Have you used SERD and -f turtle to create the HDT file? All other parsers are broken.
from hdt-cpp.
The N-Triples parser is broken (especially encoding things). SERD is in the Makefile, but might need to be switched on.
from hdt-cpp.
No experience with Raptor. Would be great if SERD supported quads as well… can't be too hard to implement.
from hdt-cpp.
Did you manage to compile the code, @m1ci?
from hdt-cpp.
yes, I use the hdt-cpp lib.
from hdt-cpp.
yeah, HDT has encoding issues. Are you using UTF-8?
from hdt-cpp.
yeah, HDT has encoding issues. Are you using UTF-8?
The subject and the object of the triples I use are IRIs, not URIs. For example:
<http://ru.dbpedia.org/resource/Список_римско-католических_епархий_(структурный_вид)> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://ru.dbpedia.org/resource/Епархия_Абаэтетубы> .
Not sure if this answers your question.
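To answer the UTF-8 question for a given input file before converting it, a quick check along these lines might help. It is a minimal sketch with hypothetical function names, not part of hdt-cpp: it reports lines that fail to decode as UTF-8 and counts lines that contain non-ASCII bytes, such as the Cyrillic IRIs above.

```python
def find_encoding_problems(path, max_reports=10):
    """Return (line_number, message) pairs for lines that are not valid UTF-8."""
    problems = []
    # Read bytes, not text, so that invalid sequences are reported, not hidden.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as error:
                problems.append((number, str(error)))
                if len(problems) >= max_reports:
                    break
    return problems


def count_non_ascii_lines(path):
    """Count lines that contain at least one non-ASCII byte."""
    with open(path, "rb") as handle:
        return sum(1 for raw in handle if not raw.isascii())
```

If `find_encoding_problems` returns an empty list while `count_non_ascii_lines` is above zero, the input is valid UTF-8 with non-ASCII IRIs, which would point at the parser rather than the data.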
Have you used SERD and -f turtle to create the HDT file? All other parsers are broken.
Oh, I used the default n-triples. Maybe -f turtle will fix the problem. Not sure if I used SERD.
from hdt-cpp.
Have you used SERD and -f turtle to create the HDT file? All other parsers are broken.
@RubenVerborgh I'd take a stab this week at addressing the parser situation. Could you elaborate, please, on your experience and observations regarding the different parsers that the library (ostensibly) supports at present?
from hdt-cpp.
@RubenVerborgh I'd also be in favor of removing the built-in parser. Serd is a high-quality library and the right choice for Turtle and N-Triples support. Perhaps the gzip support could be contributed to Serd directly, benefiting a wider base of users.
Any observations regarding the Raptor integration? As HDT moves to support quads, we'll need N-Quads format support in hdt-cpp, and it probably makes sense to keep the optional Raptor dependency around for that, unless it is significantly problematic.
from hdt-cpp.
@RubenVerborgh True, I might contribute N-Quads support to Serd just so as to avoid the complicated and tangled web of Raptor's dependencies, which runs rather deep.
from hdt-cpp.
Fixed in #31.
from hdt-cpp.
Related Issues (20)
- Unused TABLESUM and coversizes in suffixtree
- Removed unneeded exception in BasicHDT
- Consolidate rdf2hdt Windows-specific implementation and base implementation
- Replace use of deprecated ftime()
- Resolve "delete called on non-final" warnings.
- Test dumpDictionary not being called with an input HDT file
- Test case "properties" fails
- Code formatting / beautifier needed.
- Evaluate Parallel Hashmap for potential performance benefits
- Add option to ignore error instead of throwing error
- `make install` does not install triples/ directory -- hdt-it still active?
- clang-format of libdcs [sic]
- hdt::QueryProcessor.searchJoin() gives incorrect results
- Compile error on macOS with "make -j2" command
- rdf2hdt stops without error message
- Add encryption-at-rest to libraries
- rdf2hdt produces invalid UTF8 values?
- undefined reference to `hdt::HDTManager::mapHDT(char const*, hdt::ProgressListener*)'
- support for quads/named graphs
- Memcpy to nullptr in CSD_HTFC::CSD_HTFC()