
cascade's Introduction

[Cascade]


The main goal of this gem is to provide a template for parsing files. File parsing usually consists of the following steps:

  1. Retrieve info from file
  2. Distinguish content from each file line
  3. Parse each column with corresponding parser
  4. Generate some kind of data record
  5. Save obtained record
  6. Handle errors
  7. Generate report

Cascade aims to simplify most of these steps to save you time.
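The steps above can be sketched in plain Ruby without the gem. This is a minimal illustration only: the `COLUMNS` table, the `import` method, and the report format are hypothetical names chosen for this sketch, not part of Cascade's API, and the comma splitting is deliberately naive (no quoting support).

```ruby
# Hypothetical column layout and parsers for this sketch (not Cascade's API).
COLUMNS = {
  name:   ->(value) { value.to_s.strip },
  amount: ->(value) { Float(value) }
}.freeze

def import(lines, saver:)
  report = { saved: 0, errors: [] }
  lines.each_with_index do |line, index|
    begin
      # Step 2: split the line into raw column values (naive, no quoting support).
      values = line.chomp.split(",")
      # Steps 3-4: run each column through its parser and build a record.
      record = COLUMNS.keys.zip(values).to_h { |key, raw| [key, COLUMNS[key].call(raw)] }
      # Step 5: hand the record to the saver.
      saver.call(record)
      report[:saved] += 1
    rescue ArgumentError, TypeError => e
      # Step 6: record the failure and keep going.
      report[:errors] << "line #{index + 1}: #{e.message}"
    end
  end
  # Step 7: return a simple report.
  report
end

saved = []
# Step 1: any enumerable of lines works, for example File.foreach(path).
report = import(["Ann,10.5\n", "Bob,oops\n"], saver: ->(record) { saved << record })
```

Here the second line fails in the `amount` parser, so it ends up in `report[:errors]` while the first line is still saved.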

Installation

Install the cascade-rb package from Rubygems:

gem install cascade-rb

or add it to your Gemfile for Bundler:

gem 'cascade-rb'

Usage

Require the gem:

require 'cascade'

Provide an enumerable object for parsing and run it:

require 'csv' # CSV is part of Ruby's standard library
Cascade::DataParser.new(data_provider: CSV.open("data_test.csv")).call

Columns mapping

The file that describes the parsing should have the following structure (example):

mapping:
  name: type
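As a rough illustration of that structure, the sketch below loads a mapping with Ruby's standard YAML library. The column names (`name`, `salary`, `active`) are made up for the example, the types are the ones this README names, and where Cascade expects the description file to live is not covered here.

```ruby
require "yaml"

# Hypothetical column names; each value is a parser type.
description = <<~YAML
  mapping:
    name: string
    salary: currency
    active: boolean
YAML

# The result is a plain Hash of column name => type name.
mapping = YAML.safe_load(description).fetch("mapping")
mapping.each { |column, type| puts "#{column} -> #{type}" }
```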

Columns parsing

Several field parsers (types) are already defined:

  • currency
  • boolean
  • string

Feel free to add new field parsers via a pull request.
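To give a feel for what a field parser does, here is a sketch of the three types as plain lambdas kept in a lookup table. These are illustrative stand-ins only: `FIELD_PARSERS` and `parse_field` are hypothetical names, and Cascade's built-in parsers may accept different inputs or return different types.

```ruby
# Illustrative stand-ins only; Cascade's built-in parsers may behave differently.
FIELD_PARSERS = {
  string:   ->(value) { value.to_s.strip },
  boolean:  ->(value) { %w[true yes 1].include?(value.to_s.strip.downcase) },
  # Drop currency symbols and thousands separators, keep an exact Rational.
  currency: ->(value) { Rational(value.to_s.gsub(/[^\d.\-]/, "")) }
}.freeze

# Look up the parser for a type and apply it; unknown types raise KeyError.
def parse_field(type, value)
  FIELD_PARSERS.fetch(type.to_sym).call(value)
end

name   = parse_field(:string, "  Ann ")
active = parse_field("boolean", "Yes")
salary = parse_field(:currency, "$1,234.50")
```

A new type then amounts to one more callable in the table, which is the same shape as the `ext_parsers` option shown later in this README.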

Components replaceability

This gem relies heavily on dependency injection (DI), so you can replace each component of the parser. Suppose you want to parse JSON files instead of CSV, save the records to an ActiveRecord model, and parse Date fields. Start by writing a new data provider:

require 'json'

class ParserJSON
  # Returns the array stored under the "rows" key of the JSON file.
  def open(file)
    JSON.parse(File.read(file))["rows"]
  end
end

Then write a new data saver:

class PersonDataSaver
  def call(person_data)
    Person.create!(person_data)
  end
end

or, since there is not much logic here, a lambda is even better:

PERSON_SAVER = -> (person_data) { Person.create!(person_data) }

Then write a date parser:

require 'date'

class DateParser
  def call(value)
    Date.parse(value)
  end
end

or use a lambda for such simple logic:

DATE_PARSER = -> (value) { Date.parse(value) }
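The class and the lambda are interchangeable because both respond to `call`. The sketch below shows this with a hypothetical helper, `parse_with`, which stands in for whatever code ends up invoking the parser; it is not a Cascade method.

```ruby
require "date"

class DateParser
  def call(value)
    Date.parse(value)
  end
end

DATE_PARSER = ->(value) { Date.parse(value) }

# Hypothetical consumer: it relies only on #call, so either form works.
def parse_with(parser, value)
  parser.call(value)
end

from_object = parse_with(DateParser.new, "2015-03-01")
from_lambda = parse_with(DATE_PARSER, "2015-03-01")
```

Note that `Date.parse` raises an error for strings it cannot interpret, so a production parser may want to rescue it or validate the input first.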

Pass all of these components to the data parser:

Cascade::DataParser.new(
  data_provider: ParserJSON.new.open("data_test.json"),
  row_processor: Cascade::RowProcessor.new(ext_parsers: { date: DateParser.new }),
  data_saver: PERSON_SAVER
).call

And that's all!
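To see why this kind of replacement works, here is a small sketch of constructor injection with swappable defaults. `TinyDataParser` is a hypothetical stand-in that only borrows the keyword names used above; it is not how `Cascade::DataParser` is implemented.

```ruby
# Hypothetical stand-in that mirrors the keyword names used above.
# It is not the implementation of Cascade::DataParser.
class TinyDataParser
  def initialize(data_provider:,
                 row_processor: ->(row) { row },
                 data_saver: ->(record) { puts record.inspect })
    @data_provider = data_provider
    @row_processor = row_processor
    @data_saver = data_saver
  end

  # Feed every row through the processor, then hand the result to the saver.
  def call
    @data_provider.each { |row| @data_saver.call(@row_processor.call(row)) }
  end
end

saved = []
TinyDataParser.new(
  data_provider: [{ "name" => "Ann" }, { "name" => "Bob" }],
  row_processor: ->(row) { row.transform_keys(&:to_sym) },
  data_saver: ->(record) { saved << record }
).call
```

Every collaborator is just an object that responds to `each` or `call`, so tests can pass in arrays and lambdas while production code passes in file readers and database savers.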


cascade's Issues

TODO List

Phase 1

  • Write README
  • Exceptions throwing mechanism
  • Exceptions handling mechanism
  • Statistics collecting (Singleton + StatisticsCollectable)
  • Gem?
  • Examples
  • True/False value parser
  • Autopulling keys list (use fetch for columns instead of public_send)
  • cascade/lib/columns_values.rb YAML::load -> YAML.load_file
  • Require grouping
  • Configure HoundCI

Phase 2

  • Parser config per instance
  • Remove iso_countries_codes dependency
  • Remove CascadeCSV
  • Use DataProvider without any kind of additional operations (just waiting for iterable object)
  • Ability to mark unused columns
  • Use Registry pattern for default impls
  • Separate libraries using and requiring cascade files
  • Use partial functions to limit wrong argument usage
  • Move all related code into GitHub organization
  • Use Refinements for #reverse_merge
  • Use AAA pattern (Arrange Act Assert) in tests
  • Use more simple abstractions
  • Try to move smart fields mapping
  • Rename RowProcessor to EntityProcessor
  • Allow to use nested mapping
  • Allow to use recursive mapping

Backlog

  • Result report generator
  • Documentation
  • First line omitting
  • [Performance issue] Do not create parsers for each row (generate an array of parsers and run through all of them) (7% win, does it make sense?)
