clownface's Introduction

clownface

Clownface is a graph traversal library inspired by Gremlin that lets you query any RDF dataset in a concise and readable way.

Clownface greatly simplifies interacting with RDF data in JavaScript.

Quick start

The recommended way is to use clownface with an RDF/JS environment. It also requires a DataFactory and a DatasetFactory, for example those provided by the @rdfjs/data-model and @rdfjs/dataset packages respectively, as well as @rdfjs/namespace.

npm install clownface @rdfjs/environment @rdfjs/data-model @rdfjs/dataset @rdfjs/namespace

import Environment from '@rdfjs/environment/Environment.js'
import NamespaceFactory from '@rdfjs/namespace/Factory.js'
import DatasetFactory from '@rdfjs/dataset/Factory.js'
import DataFactory from '@rdfjs/data-model/Factory.js'
import ClownfaceFactory from 'clownface/Factory.js'

const $rdf = new Environment([
  NamespaceFactory,
  DatasetFactory,
  DataFactory,
  ClownfaceFactory
])

const graph = $rdf.clownface()

Alternatively, if you already use @zazuko/env, it comes bundled with clownface and its dependencies.

import $rdf from '@zazuko/env'

const graph = $rdf.clownface()

Learn more

If you are new to RDF and JavaScript, consider our Getting Started guide that also covers Clownface basics.

For API documentation and examples, see http://zazuko.github.io/clownface/.

clownface's Issues

Add .distinct method to reduce result to unique terms/dataset/graph

After traversing a graph, the result often contains duplicates. Sometimes this is wanted (e.g. when counting matching paths), so it can't be the default, but often only the unique set is required.

A .distinct method should be added that returns a new object with a reduced context with unique terms/dataset/graph.
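
To illustrate what "unique terms" could mean here, below is a minimal sketch that de-duplicates RDF/JS-like term objects by term type, value, language and datatype. The helpers termKey and distinctTerms are hypothetical names used for illustration only; they are not part of clownface, and the terms are plain stand-in objects rather than real data-factory terms.

```javascript
// Hypothetical helper, not clownface API: build a comparison key for a term.
function termKey (term) {
  // Literals also need language and datatype to be told apart
  const language = term.language || ''
  const datatype = term.datatype ? term.datatype.value : ''
  return [term.termType, term.value, language, datatype].join('\u0000')
}

// Hypothetical helper: keep the first occurrence of every distinct term.
function distinctTerms (terms) {
  const seen = new Map()
  for (const term of terms) {
    const key = termKey(term)
    if (!seen.has(key)) seen.set(key, term)
  }
  return [...seen.values()]
}

// Plain stand-in objects shaped like RDF/JS named nodes
const alice = { termType: 'NamedNode', value: 'http://example.org/alice' }
const bob = { termType: 'NamedNode', value: 'http://example.org/bob' }

// alice appears twice (once as a copy) and is reduced to one entry
const unique = distinctTerms([alice, bob, { ...alice }])
console.log(unique.map(term => term.value))
```

A real .distinct would presumably return a new pointer whose context holds only these unique terms; the sketch only covers the comparison step.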

Feature request: Provide some immutable update methods

Thanks for the great lib! I'm using it with React and it works fairly seamlessly after some figuring out. One nice-to-have would be some immutable update methods similar to the query methods. That is, instead of adding to or removing from a dataset in place, they would return a new instance of clownface with a cloned version of the dataset with the changes applied. This could then be used in setState and useReducer, and it would make it less likely to silently update the dataset. At the moment I'm just running the updates and then calling setState to trigger an update, along with some memoization to minimize component updates.

Improve list iteration

I was just handling an RDF List and I noticed that it returns only an iterator of pointers. I would like to propose extending it with additional getters which would likewise iterate terms and values, similarly to how a pointer itself has the .terms and .values properties.

This will allow for simpler usage, for instance with spread and map. For example, when getting all unique terms from a list:

import TermSet from '@rdfjs/term-set'

let excludedPointer

-const set = new TermSet([...excludedPointer.list()].map(({ term }) => term))
+const set = new TermSet([...excludedPointer.list().terms])
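
As a rough sketch of the proposed shape, the following wraps a sequence of pointer-like objects in an iterable that also exposes terms and values getters. The function listView and the plain pointer objects are assumptions made for illustration; the actual list() implementation may look quite different.

```javascript
// Hypothetical wrapper, not clownface API: an iterable of pointers
// that additionally exposes .terms and .values getters.
function listView (pointers) {
  const items = [...pointers]
  return {
    // Iterating the view yields the pointers, as list() does today
    [Symbol.iterator] () { return items[Symbol.iterator]() },
    // Proposed additions: direct access to terms and their string values
    get terms () { return items.map(ptr => ptr.term) },
    get values () { return items.map(ptr => ptr.term.value) }
  }
}

// Stand-in pointers: objects with a .term, like single-term pointers
const list = listView([
  { term: { termType: 'Literal', value: 'a' } },
  { term: { termType: 'Literal', value: 'b' } }
])

const viaMap = [...list].map(({ term }) => term) // current style
const viaGetter = [...list.terms] // proposed style
console.log(list.values)
```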

Support language tags in `out`

We would like to add an optional parameter to narrow down the object returned by ptr.out() to specific language(s).

The new signature might look like the one below, accepting a single language tag string or an array thereof:

out(term: NamedNode, options: { language?: string | string[] } = { })

Examples

Given RDF like

ex:ananas a ex:Fruit ;
  rdfs:label "Pineapple" ;
  rdfs:label "Ananas"@pl ;
  rdfs:label "Ananas"@de ;
  rdfs:label "Ananász"@hu ;
  rdfs:label "Ananas"@sr-Latn ;
  rdfs:label "Ананас"@sr-Cyrl ;
  rdfs:label _:foo .

ex:apple a ex:Fruit ;
  rdfs:label "Apple"@en ;
  rdfs:label "Apfel"@de ;
  rdfs:label "Јабука"@sr-Cyrl ;.

ex:eggplant a ex:Vegetable ;
  rdfs:label "Psianka podłużna"@pl, "Bakłażan"@pl, "Oberżyna"@pl .

Blank nodes and named nodes are never returned when language is used.

To get only plain strings (no language)

// "Pineapple"
ananas.out(rdfs.label, { language: '' })

// also only "Pineapple" because apple has only langStrings
fruit.out(rdfs.label, { language: '' })

To get only a specific language

// "Ananas"@de
ananas.out(rdfs.label, { language: 'de' })

// [ "Ananas"@de, "Apfel"@de ]
fruit.out(rdfs.label, { language: 'de' })

// [ "Apple"@en ]
// no pineapple because there is no @en label
fruit.out(rdfs.label, { language: 'en' })

// return empty (no matching language)
ananas.out(rdfs.label, { language: 'fr' })

To get one of multiple languages

An array can be used. For every input node, the languages are evaluated in their order in the array

// "Apple"@en
// only one language returned
apple.out(rdfs.label, { language: [ 'en', 'de' ] })

For multiple input nodes, only one language is returned per node

// [ "Ananász"@hu, "Apfel"@de ]
// Hungarian for pineapple but German for apple
fruit.out(rdfs.label, { language: [ 'hu', 'de' ] })

To get any language

A wildcard (or undefined) can be used to select any language

// No Hungarian for apple
// An unspecified language will be selected (en or de)
apple.out(rdfs.label, { language: [ 'hu', '*' ] })

Support for secondary language

Exact match can be used all the same

// "Ананас"@sr-Cyrl 
ananas.out(rdfs.label, { language: [ 'sr-Cyrl' ] })

A primary language will match an arbitrary secondary tag. Below, sr-Latn is not found, so sr matches sr-Cyrl (an exact match would still come first)

// "Јабука"@sr-Cyrl
apple.out(rdfs.label, { language: [ 'sr-Latn', 'sr' ] })

Multiple values for a language

All will be returned

// [ "Psianka podłużna"@pl, "Bakłażan"@pl, "Oberżyna"@pl ]
eggplant.out(rdfs.label, { language: 'pl' })
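
The selection rules described above can be sketched in plain JavaScript for a single input node. This is only an illustration of the proposed behaviour, not the library's implementation: filterByLanguage, matchLanguage and the literal/blankNode stand-ins are hypothetical names, and the wildcard handling (take the language of the first literal encountered, plain literals included) is a guess at what "an unspecified language" could mean.

```javascript
// Tiny stand-ins for RDF/JS terms; real code would use a data factory.
const literal = (value, language = '') => ({ termType: 'Literal', value, language })
const blankNode = value => ({ termType: 'BlankNode', value })

// Hypothetical helper: literals matching one requested language.
function matchLanguage (literals, language) {
  if (language === '*') {
    // Assumption: the wildcard picks the language of the first literal seen
    const first = literals[0]
    return first ? literals.filter(l => l.language === first.language) : []
  }
  const wanted = language.toLowerCase()
  // An exact match wins; '' selects plain literals only
  const exact = literals.filter(l => l.language.toLowerCase() === wanted)
  if (exact.length > 0 || wanted === '') return exact
  // Otherwise a primary language matches any secondary tag (sr -> sr-Cyrl)
  return literals.filter(l => l.language.toLowerCase().startsWith(wanted + '-'))
}

// Hypothetical helper: try the requested languages in order for one node.
function filterByLanguage (objects, language) {
  const languages = Array.isArray(language) ? language : [language]
  // Blank nodes and named nodes are never returned
  const literals = objects.filter(o => o.termType === 'Literal')
  for (const candidate of languages) {
    const matches = matchLanguage(literals, candidate)
    if (matches.length > 0) return matches
  }
  return []
}

const ananasLabels = [literal('Pineapple'), literal('Ananas', 'de'), literal('Ananász', 'hu'), blankNode('foo')]
const appleLabels = [literal('Apple', 'en'), literal('Apfel', 'de'), literal('Јабука', 'sr-Cyrl')]

const plain = filterByLanguage(ananasLabels, '')
const firstMatch = filterByLanguage(appleLabels, ['en', 'de'])
const secondary = filterByLanguage(appleLabels, ['sr-Latn', 'sr'])
const none = filterByLanguage(ananasLabels, 'fr')
console.log(plain, firstMatch, secondary, none)
```

For multiple input nodes the same function would be applied to each node's objects separately, which is what gives "one language per node".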

Does clownface understand the abbreviation `a` instead of `rdf:type`?

TLDR: A simple yes or no answer on the title will satisfy me.

I'm currently having an issue with a text/turtle response that looks something like this:

@prefix schema: <http://schema.org/> .

<https://some.iri.com/for-example-1>
    a            schema:Dataset;
    schema:name  "Nice example name" .

<https://some.iri.com/for-example-2>
    a            schema:Dataset;
    schema:name  "Other example name" .

However, from what I understand of the examples:

I see that they use the data from https://github.com/zazuko/tbbt-ld/,
where a is used in text/turtle for rdf:type.

So I might be using the following clownface call incorrectly in the project:

data.has(Ns.rdf.Type, Ns.schema.Dataset)

Now our simplified code (I do not expect you to look at this)

import namespace, { NamespaceBuilder } from '@rdfjs/namespace'
import defaultFormats from '@rdfjs/formats-common'
import fetch from '@rdfjs/fetch'
import rdfExt from 'rdf-ext'
import DatasetExt from 'rdf-ext/lib/Dataset'
import clownface from 'clownface'

class Ns {
  static schema: NamespaceBuilder = namespace('http://schema.org/')
  static rdf: NamespaceBuilder = namespace(
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  )
}

const RdfClientGet = async (
  url: string,
  params?: URLSearchParams,
  contentType = 'text/turtle',
): Promise<DatasetExt> => {
  let newUrl = url
  if (params) newUrl = url + '?' + params
  const response = await fetch(newUrl, {
    method: 'get',
    headers: {
      Accept: contentType,
      Authorization: 'Bearer ' + getToken(),
    },
  })

  if (response?.status == 404) Router.replace('404')

  const format = response.headers.get('content-type') as string
  const parser = defaultFormats.parsers.get(format)
  const dataset = rdfExt.dataset()
  if (response.body) {
    const stream = parser?.import(response.body as any)
    stream?.on('data', (quad) => dataset.add(quad))
    const streamPromise = new Promise((resolve, reject) => {
      stream?.on('end', () => resolve(stream.read))
      stream?.on('error', () => reject)
    })
    await streamPromise
  }

  if (response.status > 299) {
    throw new BadRequest(response, dataset)
  }
  return dataset
}

const getData = () => {
  RdfClientGet('https://example.com/')
    .catch((error) => {
      // custom error stuff
    })
    .then((value) => {
      if (value) {
        const data = clownface({ dataset: value })

        const quads = data.has(Ns.rdf.Type, Ns.schema.Dataset)

        quads.forEach((quad) => {
          const source = {
            iri: quad.term.value,
            name: {
              en: quad.out(Ns.schema('name'), { language: 'en' }).value,
              nl: quad.out(Ns.schema('name'), { language: 'nl' }).value,
            },
          }

          console.log(source)
          // Expected:
          // {
          //   iri: https://some.iri.com/for-example-1 // or -2
          //   name: {
          //     en: 'Nice example name' // or 'Other example name'
          //   }
          // }
          // Result:
          // Never able to loop over quads
          //
        })
      }
    })
}

Empty strings are mapped to undefined

When passing an empty string to create a node with clownface, undefined is returned, which causes hard-to-detect errors down the line.

The problem lies on this line: https://github.com/rdf-ext/clownface/blob/master/lib/term.js#L53-L55

I noticed that the check is really meant to handle null and undefined values. Thus, I'd propose the following change so that empty strings and zeros are not ignored.

-  if (!value) {
+  if (value === null || typeof value === 'undefined') {
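
The difference between the two checks can be shown in isolation. The sketch below does not reproduce lib/term.js; isMissingLoose and isMissingStrict are hypothetical names for the old and the proposed condition.

```javascript
// Old condition: treats every falsy value as missing
const isMissingLoose = value => !value

// Proposed condition: only null and undefined count as missing
const isMissingStrict = value => value === null || typeof value === 'undefined'

const samples = ['', 0, null, undefined, 'http://example.org/']

// Compare both checks for each sample value
const report = samples.map(value => ({
  value,
  loose: isMissingLoose(value),
  strict: isMissingStrict(value)
}))

console.log(report)
```

The empty string and 0 are reported as missing only by the loose check, which is the behaviour the issue describes.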

Fix graph support

The factory doesn't provide the functionality to give a graph to the Clownface object. Also only a single graph should be supported until there is a concept for multiple graph support.

Add .graph and .graphs property

It should be possible to read the graph from the context. A readable .graph property like .term should be added. To handle cases with more than one graph, a .graphs property should also be added.

Cannot delete specific objects

It is only possible to call cf.deleteOut(predicates), which removes all matching triples.

The methods should be extended to take a second parameter with the object(s) to delete:

cf.deleteOut(predicate, objects)
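
A minimal sketch of the intended semantics, working on a plain array of quad-like objects instead of a real dataset. The function deleteOut below is a stand-alone illustration with a hypothetical signature (it takes the quads and subject explicitly and returns a new array); sameTerm compares only termType and value, so it ignores language and datatype.

```javascript
// Simplified term comparison (assumption: ignores language and datatype)
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value

// Illustrative stand-in: remove subject/predicate triples,
// optionally restricted to the given object(s). Returns a new array.
function deleteOut (quads, subject, predicate, objects) {
  const targets = objects === undefined ? null : [].concat(objects)
  return quads.filter(quad => {
    // Keep everything that is not subject/predicate of interest
    if (!sameTerm(quad.subject, subject) || !sameTerm(quad.predicate, predicate)) return true
    // Without objects, all matching triples go (the current behaviour)
    if (targets === null) return false
    // With objects, only triples pointing at one of them go
    return !targets.some(object => sameTerm(quad.object, object))
  })
}

const node = value => ({ termType: 'NamedNode', value })
const s = node('http://example.org/s')
const p = node('http://example.org/p')
const q = node('http://example.org/q')
const o1 = node('http://example.org/o1')
const o2 = node('http://example.org/o2')

const quads = [
  { subject: s, predicate: p, object: o1 },
  { subject: s, predicate: p, object: o2 },
  { subject: s, predicate: q, object: o1 }
]

const withoutO1 = deleteOut(quads, s, p, o1) // keeps p -> o2 and q -> o1
const withoutP = deleteOut(quads, s, p) // keeps only q -> o1
console.log(withoutO1.length, withoutP.length)
```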

Add method(s) to filter the results based on termType

There should be one or multiple methods to filter the results based on the termType. This can be done already with the .filter method, but there should be a shorter, more readable option to handle such cases. The filter code for a NamedNode would look like this:

cf.filter(ptr => ptr.term.termType === 'NamedNode')
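
Until such methods exist, a small predicate factory could shorten the filter call. The sketch assumes the filter callback receives a pointer that exposes .term; byTermType is a hypothetical helper, and the pointers are plain stand-in objects filtered with Array.prototype.filter.

```javascript
// Hypothetical helper: build a predicate that checks a pointer's term type
const byTermType = termType => ptr => ptr.term.termType === termType

// Stand-in pointers; with clownface these would come from a traversal
const pointers = [
  { term: { termType: 'NamedNode', value: 'http://example.org/a' } },
  { term: { termType: 'Literal', value: 'a' } },
  { term: { termType: 'BlankNode', value: 'b0' } }
]

const namedNodes = pointers.filter(byTermType('NamedNode'))
const literals = pointers.filter(byTermType('Literal'))
console.log(namedNodes.length, literals.length)
```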

Allow creating empty NamedNodes

The namedNode method is causing problems when it's called with an empty string. The example below doesn't create a triple:

const resource = clownface({ dataset })
resource.addOut(ns.rdfs.label, resource.namedNode(''))

Rename?

If people search for "clownface," they're going to get "A deranged serial killer known as "Clownface" terrorises the residents of a small town. killer clown, mask, masked killer, b horror, slasher."

How about this for a name: graphlim

list() does not always return an iterator

The function description of list() says that it always returns an iterator. In reality, it does not:

 if (this.term) {
      if (this.term.termType !== 'NamedNode' && this.term.termType !== 'BlankNode') {
        return null
      }

      if (!this.term.equals(this.namespace.nil) && !this.out(this.namespace.first).term) {
        return null
      }
    }

This causes issues in e.g. the shacl validator where a spread operator is applied to the list() result (see here)

Depending on the structure of your data, this leads to obscure errors such as listNode.list is not a function or its return value is not iterable
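
On the consumer side, a guard around the result avoids the "not iterable" error when null comes back. This is a sketch of a workaround under the behaviour described above, not a fix in the library; listToArray is a hypothetical helper and the inputs are stand-ins for a list() result.

```javascript
// Hypothetical helper: spread a list() result, treating null/undefined as empty
function listToArray (listResult) {
  return listResult == null ? [] : [...listResult]
}

// null stands in for a list() call on a non-list node
const fromNull = listToArray(null)

// Any iterable stands in for a successful list() call
const fromIterable = listToArray(new Set(['first', 'second']))

console.log(fromNull, fromIterable)
```

Note that this hides the distinction between an empty list and a non-list value, which callers may still want to check separately.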

Add option to provide custom factory for terms

Currently all terms are generated using the @rdfjs/data-model factory. It should be possible to use a custom factory. The factory should be given as option to the constructor and the class instance keeps the factory and uses only the given factory to create Term instances.

Add Symbol.iterator to Clownface class

This could be used similarly to toArray(), but also in for..of loops and for direct destructuring

const people = cf({ dataset }).has(rdf.type, schema.Person)

for (const person of people) {
 ...
}
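
A bare-bones sketch of how a Symbol.iterator method makes an object usable in for..of, spread and destructuring. MultiPointer is a hypothetical stand-in class, not the Clownface class; it assumes, like toArray(), that each iteration step yields a pointer for a single term.

```javascript
// Stand-in for a multi-term pointer; not the real Clownface class
class MultiPointer {
  constructor (terms) {
    this.terms = terms
  }

  // Yield one single-term pointer per term in the context
  * [Symbol.iterator] () {
    for (const term of this.terms) {
      yield new MultiPointer([term])
    }
  }
}

const people = new MultiPointer([
  { termType: 'NamedNode', value: 'http://example.org/alice' },
  { termType: 'NamedNode', value: 'http://example.org/bob' }
])

// for..of loop
const seen = []
for (const person of people) {
  seen.push(person.terms[0].value)
}

// Direct destructuring and spread work as well
const [first] = people
const all = [...people]
console.log(seen, first.terms[0].value, all.length)
```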

ptr.deleteList does not return when trying to delete something that's not a list.

ptr.deleteList does not return when trying to delete something that's not a list.

Example:

const parse = require('./support/parse.js')
const rdf = require('./support/factory')

const data = `
<http://buggy> <https://cube.link/view/argument> "2019-01-01T23:00:00.000Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
`

async function main () {
  const cf = await parse(data)
  const ptr = cf.node(rdf.namedNode('http://buggy'))
  ptr.deleteList(rdf.namedNode('https://cube.link/view/argument'))
}

main()

Perhaps some checking is required?

better handling of non-list or empty list object values in .list()

With the current code the following example creates an array with undefined as the only element.

const rdf = require('rdf-ext')
const clownface = require('.')

const subject = rdf.namedNode('http://example.org/subject')
const predicate = rdf.namedNode('http://example.org/predicate')
const ptr = clownface({ dataset: rdf.dataset(), term: subject })
const list = ptr.list(predicate)
console.log([...list].map(e => e.term))

The return value should be null for this case.

Here is a full list of edge cases and the expected return values:

empty list

<subject> <predicate> ().
[]

non list object

<subject> <predicate> "test".
null

no matching triple

# no triple
null

The documentation should be extended to show how a non-list object can be combined with .out:

const list = shape.list(ns.sh.path)
const values = (list && [...list]) || [shape.out(ns.sh.path)]

.out with language not returning expected literals

Consider a resource with only tagged labels, with primary tags

<> rdfs:label "foo"@en , "le foo"@fr , "das Foo"@de .

What should be the output of .out(rdfs.label, { language: [ 'en-US' ] }) ?

Right now it returns nothing, but I think that a more specific language should implicitly be followed by less specific ones. So for example, an out filtered by a tertiary tag

ptr.out(rdfs.label, { language: [ 'de-DE-1990', 'en' ] })

should be equivalent to

ptr.out(rdfs.label, { language: [ 'de-DE-1990', 'de-DE', 'de', 'en' ] })
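
The proposed equivalence amounts to expanding each requested tag into itself plus its progressively shorter prefixes. The sketch below shows that expansion as a stand-alone function; expandLanguages is a hypothetical name, and the approach is only one possible way to get the suggested behaviour.

```javascript
// Hypothetical helper: follow each tag with its less specific variants
function expandLanguages (languages) {
  const expanded = []
  for (const language of languages) {
    const parts = language.split('-')
    // de-DE-1990 -> de-DE-1990, de-DE, de
    while (parts.length > 0) {
      const tag = parts.join('-')
      // Skip tags that an earlier entry already produced
      if (!expanded.includes(tag)) expanded.push(tag)
      parts.pop()
    }
  }
  return expanded
}

const expanded = expandLanguages(['de-DE-1990', 'en'])
console.log(expanded)
// → [ 'de-DE-1990', 'de-DE', 'de', 'en' ]
```

With this expansion, a request for en-US against the labels above would fall back to the en label.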

Without context, addOut creates blank subjects

Same would likely apply to addIn and the created objects

const cf = require("clownface")
const { dataset } = require('rdf-ext')
const { rdf, schema } = require('@tpluscode/rdf-ns-builders')

const graph = cf({ dataset: dataset() })

graph.addOut(rdf.type, schema.Person)

;[...graph.dataset][0].subject

Current

The code above creates an undefined subject in that quad

Expected

Initially I thought that maybe this should be a silent no-op but then maybe it would just generate hard to track errors down the line for consumers.

Should it just throw?

add .literal and .namedNode methods

There should be a simpler way to create CF objects than calling .node with type, datatype or language parameters when a Named Node context or a Literal context with a specific datatype or language should be created. For this use case a slightly more explicit API would be useful. For Named Nodes it would simply be:

.namedNode(string iri)

for a Literal:

.literal(string value, string|NamedNode|ClownFace languageOrDatatype)

As there should be no : in the language string, the usage of the languageOrDatatype parameter can be detected automatically.
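
The automatic detection could look roughly like the sketch below. The function literalOptions is a hypothetical helper that only classifies the second argument; it assumes that a string containing ":" is a datatype IRI, that any other string is a language tag, and that term-like or pointer-like objects carry the datatype IRI in .value or .term.value.

```javascript
// Hypothetical helper: decide whether the argument is a language or a datatype
function literalOptions (languageOrDatatype) {
  if (languageOrDatatype == null) return {}

  if (typeof languageOrDatatype === 'string') {
    // Language tags never contain ':', IRIs always do
    return languageOrDatatype.includes(':')
      ? { datatype: languageOrDatatype }
      : { language: languageOrDatatype }
  }

  // Term-like object, or pointer-like object exposing .term
  const term = languageOrDatatype.term || languageOrDatatype
  return { datatype: term.value }
}

const xsdDate = 'http://www.w3.org/2001/XMLSchema#date'

const asLanguage = literalOptions('de')
const asDatatype = literalOptions(xsdDate)
const fromTerm = literalOptions({ termType: 'NamedNode', value: xsdDate })
const fromPointer = literalOptions({ term: { termType: 'NamedNode', value: xsdDate } })
console.log(asLanguage, asDatatype, fromTerm, fromPointer)
```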
