
Comments (4)

rsyring commented on July 22, 2024

Please consider adding a section that compares bonobo to PETL.

from bonobo.

hartym commented on July 22, 2024

Yes, comparisons to other tools are planned.

In the list (feel free to complete it) :

  • airflow
  • bubbles
  • dataiku
  • dataprep
  • dask
  • hadoop and ecosystem
  • luigi
  • pandas
  • pentaho
  • petl
  • pygrametl
  • pypes
  • pytoolz
  • talend
  • ...

If an expert on any of those tools is available to help me make the most honest comparison possible, that would be amazing.


funkyfuture commented on July 22, 2024

ciao, bonobo might be something that i need as a pythonic replacement of xslt, thus i consulted the docs to get a grip of it. i didn't find out whether it fits, but i found some questions that would help me to figure it out. maybe that helps you when you update the docs (which i would strongly suggest as the library looks promising, but it's hard to judge if it'd be suited for a task.)

  • what exact facilities are available to control the evaluation logic of a graph?
  • can a graph contain another graph?
  • how would one access contextual data from a transformation?
    • are there parameter injections like pytest's fixtures?
  • are there yet any concepts how to process trees, like xml?
  • how is a plugin distinguished from a python import in a module that contains transformation callables?

on a sidenote, what the heck is marketing-automation? how would that make the world a better place?


hartym commented on July 22, 2024

Hi @funkyfuture

It's not easy to understand what you're looking for. You say "pythonic replacement of xslt", and bonobo can transform XML into something else (or into other XML). That sounds like what you describe, but I'm not certain about your use case or whether bonobo would be worth considering for it.

I'll try to answer your questions here, even though this would probably be better suited to a discussion on Slack than to comments on an unrelated ticket. I'll consider your questions for a future F.A.Q. section in the docs (along with others, of course).

What exact facilities are available to control the evaluation logic of a graph?
I don't understand this question. Graphs are not "evaluated"; they are a tool to define the flow of data. Nodes in a graph are linked directionally, and when the graph is executed there is a FIFO queue between the output of each node and the input of the next (those queues are only created by the executor, so executions are isolated from each other). Feel free to rephrase what you meant if this does not answer it.
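To make the queue idea concrete, here is a minimal sketch in plain Python, not bonobo's actual executor. The names `run_chain` and `STOP` are made up for illustration, and it assumes one thread per node and a simple linear chain.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker, not a bonobo concept


def extract():
    yield from ["a", "b", "c"]


def transform(value):
    return value.upper()


def run_chain(source, *nodes):
    """Run each node in its own thread, linked by fresh FIFO queues."""
    # Queues are created per call, so two executions never share state.
    queues = [queue.Queue() for _ in range(len(nodes) + 1)]

    def worker(node, inbox, outbox):
        # Read from the input queue until the stream ends, then pass the marker on.
        while (item := inbox.get()) is not STOP:
            outbox.put(node(item))
        outbox.put(STOP)

    threads = [
        threading.Thread(target=worker, args=(node, queues[i], queues[i + 1]))
        for i, node in enumerate(nodes)
    ]
    for thread in threads:
        thread.start()

    # Feed the first queue from the source, then signal the end of the stream.
    for item in source():
        queues[0].put(item)
    queues[0].put(STOP)

    # Drain the last queue in arrival order.
    results = []
    while (item := queues[-1].get()) is not STOP:
        results.append(item)
    for thread in threads:
        thread.join()
    return results
```

Calling `run_chain(extract, transform)` builds its own queues each time, which is the sense in which executions are isolated.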

Can a graph contain another graph?
There are no tools in bonobo today to insert a graph as a subgraph. It would be great to allow it, but there are a few design questions behind this, such as which nodes serve as the input and output of the subgraph. It is probably something that will come well after 1.0.

How would one access contextual data from a transformation? / are there parameter injections like pytest's fixtures?
You have both the question and the answer here. There are parameter injections similar to pytest fixtures, and they are the way to access contextual data in a transformation. The API may evolve a bit though, because it feels a bit hackish as it is. It's the right concept, but the exact syntax makes me feel it's not the best experience we can offer. To understand how it works today, look at https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/io/csv.py#L63 and its class hierarchy.

Are there yet any concepts how to process trees, like xml?
There was an "xml mapper" in bonobo's ancestor that had a bit of logic to describe how to go from an XML "blob" to lines of data (cf. https://github.com/hartym/rdc.etl/blob/dev/rdc/etl/transform/map/xml.py). It's not exactly "tree processing", but since an ETL is a line-by-line processor, you need to be able to transform your tree into something flatter, and there are a lot of different ways to do so: depth first, breadth first, skipping items or not, preprocessing depending on type, etc. It may be better to just write your flattening logic in a function, then process the result with regular tools, as it's not a tree anymore.
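As an illustration of "write your flattening logic in a function", here is a small sketch using only the standard library. The XML layout and the `flatten_orders` name are invented for the example; it walks the tree depth first and emits one flat row per leaf item.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, made up for this example.
XML_BLOB = """
<orders>
  <order id="1"><item sku="A" qty="2"/><item sku="B" qty="1"/></order>
  <order id="2"><item sku="C" qty="5"/></order>
</orders>
"""


def flatten_orders(blob):
    """Yield one flat dict per <item>, carrying the parent order id along."""
    root = ET.fromstring(blob.strip())
    for order in root.iter("order"):
        for item in order.iter("item"):
            yield {
                "order_id": order.get("id"),
                "sku": item.get("sku"),
                "qty": int(item.get("qty")),
            }


rows = list(flatten_orders(XML_BLOB))
```

Because `flatten_orders` is a generator of plain dicts, its output can be handed to any line-by-line step afterwards.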

How is a plugin distinguished from a python import in a module that contains transformation callables?
Transformation callables are just regular callables, and nothing differentiates them from other Python callables. You can even use the same callable both in an imperative programming context and in a transformation graph, no problem. Plugins in bonobo are a different concept that allows one to "enhance" executions in a generic way. For example, the console plugin enhances execution with a nice ANSI output that displays statistics while the execution is running (https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/ext/console/plugin.py). I'd say there is no need to think about this for standard ETL cases; it's more a way to extend the framework itself than something for userland.
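A rough sketch of the "regular callable" point, in plain Python rather than bonobo's API: `normalize_email` is an ordinary function, and `run_pipeline` is a hypothetical stand-in for whatever chains the steps together.

```python
def normalize_email(row):
    """An ordinary function: returns a copy of the row with a cleaned email."""
    return {**row, "email": row["email"].strip().lower()}


def run_pipeline(rows, *steps):
    """Hypothetical stand-in for a graph: apply each step to each row in order."""
    for row in rows:
        for step in steps:
            row = step(row)
        yield row


# Imperative use: call it directly.
single = normalize_email({"email": " Bob@Example.COM "})

# Pipeline use: the very same function is passed as a step.
many = list(run_pipeline([{"email": "ALICE@example.com "}], normalize_email))
```

Nothing about `normalize_email` marks it as a transformation; it only becomes one by being placed in a chain.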

On a sidenote, what the heck is marketing-automation? how would that make the world a better place?
It is tagged as such because I have use cases where I use bonobo for marketing automation. It's probably a derivative usage and not the main point, but I guess there is such a use case (think IFTTT or Zapier, but programmatic).
Bonobo never promised to "make the world a better place", but I'd say it's a good thing for you if you're wasting time on repetitive marketing tasks and bonobo helps you automate them. My own sidenote: I don't understand why people tend to think marketing is a bad thing.

I hope this answers your questions. If not, let's have a chat on Slack so I can better understand your points.

