
publisci's Introduction

SciRuby meta gem

Tools for Scientific Computing in Ruby

Description

This gem acts as a meta gem which collects and provides multiple scientific gems, including numeric and visualization libraries.

Getting started

Installation:

gem install sciruby
gem install sciruby-full

If you want to have a full-blown installation, install sciruby-full.

Start a notebook server:

iruby notebook

Enter commands:

require 'sciruby'
# Scientific gems are auto loaded, you can use them directly!
plot = Nyaplot::Plot.new
sc = plot.add(:scatter, [0,1,2,3,4], [-1,2,-3,4,-5])

Take a look at gems.yml or the list of gems for interesting gems which are included in sciruby-full.

License

Copyright (c) 2010 onward, The Ruby Science Foundation.

All rights reserved.

SciRuby is licensed under the BSD 3-clause license. See LICENSE for details.

Donations

Support a SciRuby Fellow via Pledgie.

publisci's People

Contributors

wstrinz


publisci's Issues

Optionally validate output with ruby-rdf

Currently the gem does not check whether its output is syntactically valid Turtle RDF. Ruby-rdf's Repository class normally just ignores invalid statements, but it can be set to raise an error on invalid input by passing the validate: true parameter to an RDF::Repository.load() call.

The gem DSL's to_repository method (see here) uses the RDF::Repository interface, so adding the option to validate output should be as simple as adding the validate: true flag to the repo.load call, then handling any errors and reporting them to users.
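A minimal sketch of how such a wrapper might look, assuming only what the issue states: that the repository object responds to load(source, validate: true). The helper name load_validated, the ValidationResult struct, and the FakeRepository stand-in (used here in place of an RDF::Repository so the sketch runs without extra gems) are all hypothetical. The exact error class ruby-rdf raises is not assumed, so the sketch rescues StandardError.

```ruby
# Hypothetical result object reported back to the user.
ValidationResult = Struct.new(:valid, :message)

# Loads `source` into `repo` with validation switched on and converts any
# failure into a readable message instead of letting it propagate.
# `repo` is assumed to respond to load(source, validate: true).
def load_validated(repo, source)
  repo.load(source, validate: true)
  ValidationResult.new(true, "#{source}: output is valid")
rescue StandardError => e
  ValidationResult.new(false, "#{source}: invalid output (#{e.message})")
end

# Stand-in for an RDF repository (an assumption for illustration only).
# It rejects any source ending in ".bad" when validation is requested.
class FakeRepository
  def load(source, validate: false)
    raise ArgumentError, 'unterminated statement' if validate && source.end_with?('.bad')

    self
  end
end

result = load_validated(FakeRepository.new, 'cube.bad')
puts result.message unless result.valid
```

Returning a result object rather than re-raising keeps the validation optional: callers that do not care can ignore it, while the DSL can print the message.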

Plotrb writer

Plotrb is a Ruby plotting library, and a part of SciRuby. It offers an efficient DSL for creating JavaScript plots using Vega.

CubeViz shows the potential for using Data Cube RDF to explore and visualize multi-dimensional data. Some similar applications could be achieved by creating a PubliSci Writer that outputs graphs using Plotrb.

The writer should take data formatted in the RDF Data Cube vocabulary and use it to create plots with the Plotrb DSL.

See here for an example of creating bar charts with Plotrb.

For examples of Writer classes, see the csv and arff writers.

Redesign publisci-server web interface to be less ugly

The HTML interface for the server extension to the gem is very rudimentary at the moment. It would be nice if it were somewhat friendlier, rather than just a mess of links and content.

We don't need anything too fancy, but any interface changes that would make the service easier to interact with are much appreciated.

Create an NMatrix transformer and writer

NMatrix is a linear algebra library written in Ruby, and one of the core gems of SciRuby. A PubliSci::Writers class that takes Data Cube RDF as input and creates NMatrix objects, or a PubliSci::Readers class that does the opposite, would help integrate this gem with the rest of SciRuby, and open up interesting possibilities for sharing and discovering datasets.

For examples of Writer classes, see the csv and arff writers. A transformer that takes matrices as input (from R) is also available for reference.

Allow custom transformers to be used in the dataset DSL

PubliSci's RDFization classes all use the Data Cube vocabulary to represent their input. The DSL and server extension don't necessarily require this, however, and could serve as a useful interface for accessing other conversion tools.

Currently, the data function of the DSL is used by specifying the input source and various Data Cube or parser-specific parameters, for example:

data do
  source 'https://github.com/wstrinz/publisci/raw/master/spec/csv/bacon.csv'

  dimension 'producer', 'pricerange'
  measure 'chunkiness'

  option 'label_column', 'producer'
end

Resolving this issue would involve, for example, modifying dataset_dsl.rb so that a block like the following could convert a GFF3 file using BioInterchange:

data do
  engine :'bio-interchange'
  source 'https://gist.github.com/wstrinz/7165201/raw/0f688ba7041e828d3336bb530aa7495d94022af1/example.gff3'
end
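One way the engine option could be wired into a DSL is sketched below. This is not the gem's dataset_dsl.rb; the DataBlock class, the ENGINES table, and register_engine are hypothetical names, and the converters are placeholder lambdas rather than real calls into PubliSci or BioInterchange.

```ruby
# Hypothetical sketch; none of these names come from the publisci gem.
class DataBlock
  # Maps an engine name to a callable that converts a source into output.
  ENGINES = {
    data_cube: ->(source, _options) { "data cube conversion of #{source}" }
  }

  def self.register_engine(name, converter)
    ENGINES[name.to_sym] = converter
  end

  def initialize
    @engine = :data_cube # the default keeps existing behaviour
    @options = {}
  end

  def engine(name)
    @engine = name.to_sym
  end

  def source(location)
    @source = location
  end

  def option(key, value)
    @options[key] = value
  end

  # Looks up the selected engine and hands it the source and options.
  def run
    converter = ENGINES.fetch(@engine) do
      raise ArgumentError, "unknown engine: #{@engine}"
    end
    converter.call(@source, @options)
  end
end

# Evaluates the block against a fresh DataBlock, then runs the conversion.
def data(&block)
  dsl = DataBlock.new
  dsl.instance_eval(&block)
  dsl.run
end

# A third-party tool would register itself once (placeholder converter).
DataBlock.register_engine(:'bio-interchange',
                          ->(source, _options) { "GFF3 conversion of #{source}" })
```

Because the engine defaults to :data_cube, existing data blocks that never call engine would keep working unchanged under this design.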

Refactor and improve structured node generation

Allowing individual datapoint values to be RDF nodes with their own structured information (as opposed to simple literals) would significantly increase the expressive power of the gem's tools, and improve compatibility with ontologies such as SIO.

A simple way to do this would be to use blank nodes; however, this introduces problems in dataset sharing and interoperability, and is generally considered "bad style" in the Semantic Web community.

The alternative is generating unique URIs yourself. This should be possible using the dataset name and namespace, the label of the observation, the position of the value within the observation, and the depth of the structured node. A basic implementation is in the gem right now, but it still has a few kinks and only works one node deep. For an example of the output, see this gist.

A proper implementation should interpret arrays in its input as nodes, and generate URIs and RDF for them. Whatever method is involved should be recursive or have some other way to allow nodes of arbitrary depth as well.
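The recursive approach could be sketched roughly as follows. This is an illustration rather than the gem's current implementation: the method names, the URI layout, and the ex: predicates are all hypothetical. Each nested array becomes a node whose URI is built from the namespace, dataset name, observation label, and position, and each level of nesting appends another path segment.

```ruby
# Hypothetical sketch: turns every nested array into a node with its own
# URI, recursing to arbitrary depth. Triples are collected as
# [subject, predicate, object] arrays; the ex: predicates are placeholders.
# Returns [object, triples], where object is the literal or the node URI.
def structured_value(value, uri, triples = [])
  return [value, triples] unless value.is_a?(Array)

  value.each_with_index do |child, index|
    child_uri = "#{uri}/#{index}" # position within the parent node
    object, = structured_value(child, child_uri, triples)
    triples << [uri, "ex:value#{index}", object]
  end
  [uri, triples]
end

# Builds triples for one observation; `values` may mix literals and arrays.
def observation_nodes(namespace, dataset, label, values)
  observation = "#{namespace}/#{dataset}/#{label}"
  triples = []
  values.each_with_index do |value, position|
    uri = "#{observation}/value#{position}"
    object, = structured_value(value, uri, triples)
    triples << [observation, "ex:component#{position}", object]
  end
  triples
end

triples = observation_nodes('http://example.org', 'bacon', 'obs1',
                            [3, ['a', %w[b c]]])
triples.each { |subject, predicate, object| puts "#{subject} #{predicate} #{object}" }
```

Since the URIs are derived only from stable inputs, regenerating the dataset yields the same node identifiers, which is the main advantage over blank nodes for sharing.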

Replace manual generation of turtle strings with ruby-rdf where possible

Any part of the core data_cube.rb module that can be rewritten to use ruby-rdf rather than string manipulation would help with future development of the gem (as long as the tests still pass and performance doesn't take too big a hit).

Early on I made the decision to generate the RDF Turtle output for the gem using mostly string manipulation. Ruby makes this easy and pretty, but ultimately it is less flexible, less testable, and less reliable than using good object-oriented design to build up an output graph.

In retrospect, I should have used the excellent ruby-rdf gem more often for building statements, converting Ruby literals to RDF literals, and serializing or loading the output.

Port existing parsers over to new parser/generator interface

I've recently added a more standardized interface for the process of transforming input data to RDF, which existing PubliSci::Readers classes should eventually be converted to.

Having a controlled interface will allow me to continue adding features such as new serialization formats without breaking existing transformers, and will help others understand and improve transformer code more effectively.

Currently this new format is not documented anywhere, but if someone is interested in tackling this issue I can make sure to write some documentation!

Register Writers

Currently, new transformers can be registered with the gem DSL using PubliSci::Dataset.register_reader(extension, class), but there is no equivalent method to register Writer classes.
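A writer registry mirroring the reader one might look like the sketch below. Only register_reader(extension, class) is mentioned in the issue; register_writer, writer_for, the WriterRegistry module, and the Example classes are assumptions made for illustration and do not exist in the gem.

```ruby
# Hypothetical sketch of a writer registry keyed by file extension.
module WriterRegistry
  # Lazily created table mapping an extension string to a writer class.
  def writers
    @writers ||= {}
  end

  def register_writer(extension, klass)
    writers[normalize(extension)] = klass
  end

  # Picks the writer class from the output path's file extension.
  def writer_for(path)
    extension = normalize(File.extname(path))
    writers.fetch(extension) do
      raise ArgumentError, "no writer registered for .#{extension}"
    end
  end

  private

  # Accepts :csv, "csv", or ".CSV" and stores them all as "csv".
  def normalize(extension)
    extension.to_s.downcase.sub(/\A\./, '')
  end
end

# Stand-ins for PubliSci::Dataset and a Writer class.
class ExampleDataset
  extend WriterRegistry
end

class ExampleCsvWriter; end

ExampleDataset.register_writer(:csv, ExampleCsvWriter)
```

Keeping the registry in a module means the same pattern could back both readers and writers, so the two registration methods would behave consistently.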
