The VIVO Pump

A general tool set for managing data in VIVO using data rectangles (rows and columns, including spreadsheets).

The Pump uses a definition file in JSON format that describes the nature of the rows -- entities in VIVO -- and the relationship of the columns to the graph of data in VIVO. Each row/column intersection is an "instruction" to the Pump:

  • blank or empty means do nothing
  • None means remove any value found in VIVO
  • a value means replace the value in VIVO with the value in the rectangle.
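
The three cell rules above can be sketched in a few lines of Python. This is only an illustration, not code from the Pump: the helper name `cell_action` and the verb labels are hypothetical, and it assumes that a cell holding only whitespace counts as blank and that the removal marker is the literal text `None`.

```python
def cell_action(cell):
    """Classify one spreadsheet cell as a Pump-style instruction.

    Returns a (verb, value) tuple. Hypothetical helper for illustration;
    it is not part of the Pump.
    """
    text = cell.strip()
    if text == "":
        return ("skip", None)      # blank or empty: do nothing
    if text == "None":
        return ("remove", None)    # None: remove any value found in VIVO
    return ("replace", text)       # a value: replace the value in VIVO


# One row of a rectangle, keyed by column name (sample data is invented).
row = {"name": "Chemistry", "phone": "None", "url": ""}
for column, cell in row.items():
    print(column, cell_action(cell))
```

Under these assumptions, `name` would be replaced, `phone` removed, and `url` left untouched.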

The Pump has two major operators:

  1. Get -- gets values from VIVO according to the definition and returns a rectangle
  2. Update -- uses a rectangle and updates VIVO according to the definition

Simple VIVO

Before using this tool, install the requirements:

pip install -r requirements.txt

Simple VIVO, a command-line tool built on the Pump, is delivered with the Pump. Simple VIVO supports data management of VIVO data from a "spreadsheet" -- a delimited file of rows and columns -- and a corresponding definition file. Simple VIVO supports get and update, along with some reporting operators. For example:

python sv.py -a get -d org_def.json -s orgs.txt

will use the definition file org_def.json to get data from VIVO and return it in orgs.txt.

python sv.py -a update -d person_def.json -s people.txt

will use the definition file person_def.json and the source data in people.txt to make updates in VIVO.

Additional Features

  1. Enumerations -- each column can have a defined "enumeration" or substitution list that translates to and from codes you might find easier to use than internal VIVO codes.
  2. Filters -- each column can have an automated filter that takes the value in VIVO and "improves" it, providing a standardized representation. Phone numbers, for example, may be filtered to ensure that each conforms to standard formatting.
  3. Set management -- VIVO supports multiple values for many of its attributes -- research areas for a faculty member, for example. For such attributes, the Pump supports comparing the set of values in VIVO to the set provided in the spreadsheet, adding and removing values as needed to ensure that the final set in VIVO is the set specified in the spreadsheet.
  4. Handlers (coming soon) -- permit additional operations to be performed on column values. A photo handler might make a thumbnail and ensure that the original and its thumbnail are placed in the filesystem where VIVO expects them.
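
As a rough sketch of the first three features, the Python below shows one way an enumeration, a filter, and set management could behave. None of it is taken from the Pump's source: the function names, the `ORG_TYPE_ENUM` table, the example.org identifiers, and the choice of a 10-digit US phone format are all assumptions made for illustration.

```python
import re

# Hypothetical enumeration: short codes on the spreadsheet side,
# internal identifiers on the VIVO side (both invented for this sketch).
ORG_TYPE_ENUM = {
    "dept": "http://example.org/ontology#Department",
    "college": "http://example.org/ontology#College",
}


def to_vivo(code, enum):
    """Translate a spreadsheet code to its internal value."""
    return enum[code]


def from_vivo(value, enum):
    """Translate an internal value back to its spreadsheet code."""
    reverse = {v: k for k, v in enum.items()}
    return reverse[value]


def phone_filter(raw):
    """Format a 10-digit number as (XXX) XXX-XXXX; return other input unchanged."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return raw
    return "({}) {}-{}".format(digits[:3], digits[3:6], digits[6:])


def set_changes(in_vivo, in_sheet):
    """Return (to_add, to_remove) so that VIVO ends up holding in_sheet."""
    current, wanted = set(in_vivo), set(in_sheet)
    return wanted - current, current - wanted


to_add, to_remove = set_changes(["genetics", "ecology"], ["ecology", "botany"])
print(phone_filter("352.555.1234"), sorted(to_add), sorted(to_remove))
```

In the set example, `botany` would be added and `genetics` removed, while `ecology` is left alone because it is already present on both sides.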

Use Cases

  1. Enterprise data management of data in VIVO. The "enterprise" produces data in rectangles -- lists of people who work at the institution, lists of holdings in the institutional repository for people at the institution, etc. These rectangles can be used by the Pump to update data in VIVO on an established schedule.
  2. Distributed data management. Definitions can be structured to provide subsets of entities for management at a local level. Separate spreadsheets of faculty per college, for example, would allow college offices to manage attributes locally.
  3. Special collections of data in VIVO. VIVO is often used to track data that is not otherwise available at the enterprise level. VIVO might maintain photographs of each building on campus. A simple spreadsheet can be maintained with the names of the photographs of each building. As buildings come and go, or photographs improve, the collection of buildings and their photos in VIVO can be updated.
  4. Data cleanup. Using the get functionality, particular attributes can be retrieved from VIVO for review by a data manager, for automated improvement using filters (see Additional Features), or both.
  5. Upgrades. Use JSON definitions for your current version to pull data out of an old VIVO and into spreadsheets.
    Use JSON definitions for the upgraded version to put data from your spreadsheets into your new VIVO.
