Comments (6)

kieferk avatar kieferk commented on September 2, 2024

Looks similar for sure. This is the first I've heard of this one. All of these packages (dfply, dplython, plydata) are Python ports of the dplyr package, so they are going to be pretty similar in syntax.

from dfply.

Make42 avatar Make42 commented on September 2, 2024

https://github.com/coursera/pandas-ply would be another one. They (except the new plydata) are summarized in http://fastml.com/piping-in-r-and-in-pandas/, which also mentions dfply. I am not quite sure what the "graph of inspiration" is. I think your project is the newest, inspired by dplython, right?

I am a bit baffled that all these very similar projects pop up, and I do not quite understand why the respective open source programmers do not collaborate instead. Can you shed some light on that?

kieferk avatar kieferk commented on September 2, 2024

OK I'll give you the breakdown, from my perspective at least. All of these packages are trying to port the incredible dplyr package from R to Python. Because Python is a significantly different language from R, people have taken different approaches to doing this (the lack of lazy evaluation, for example, makes the porting less than trivial).
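To illustrate the lazy-evaluation point: Python evaluates `carat > 1` immediately, so a port needs some stand-in that records the expression and runs it later against the data. One common workaround is a placeholder object with overloaded operators. The sketch below is only an illustration under that assumption; the `Deferred` class and the row dictionaries are hypothetical and not taken from any of the packages discussed here.

```python
class Deferred:
    """Hypothetical placeholder that records an expression to run later on a row."""

    def __init__(self, evaluate):
        self._evaluate = evaluate

    def __getattr__(self, column):
        # Only reached for names not found normally, e.g. X.carat.
        if column.startswith("_"):
            raise AttributeError(column)
        return Deferred(lambda row: self._evaluate(row)[column])

    def __gt__(self, other):
        # Build a new deferred comparison instead of comparing right away.
        return Deferred(lambda row: self._evaluate(row) > other)

    def __call__(self, row):
        # Evaluate the recorded expression against one concrete row.
        return self._evaluate(row)


X = Deferred(lambda row: row)  # stands for "the current row"

rows = [{"carat": 0.5}, {"carat": 1.2}]
is_big = X.carat > 1           # nothing is computed yet
big_rows = [row for row in rows if is_big(row)]
print(big_rows)
```

The expression `X.carat > 1` only produces another `Deferred`; the comparison itself runs when a row is finally supplied.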

When I first started making this package the two I was aware of were pandas-ply and dplython. The latter was close to what I was hoping for, but it did not appear to be maintained actively anymore and I didn't like the fact that you were required to first convert your pandas DataFrame into the special DplyFrame object before piping would work.
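As a rough illustration of how piping can work without converting the data into a special frame class first: Python falls back to the right-hand operand's `__rrshift__` method when the left-hand object does not implement `>>`. The sketch below assumes that mechanism; `Pipe`, `head`, and the dict-of-lists table are hypothetical stand-ins, not dfply's actual classes.

```python
class Pipe:
    """Hypothetical wrapper so that `data >> Pipe(f)` calls `f(data)`."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, left):
        # Called when the left operand does not handle `>>` itself,
        # so the left side can stay a plain, unconverted object.
        return self.func(left)


def head(n):
    """Return a pipe step that keeps the first n values of every column."""
    return Pipe(lambda table: {name: values[:n] for name, values in table.items()})


# A plain dict of columns stands in for a data frame here.
table = {"carat": [0.23, 0.21, 0.29], "cut": ["Ideal", "Premium", "Good"]}

first_two = table >> head(2)
first_one = table >> head(2) >> head(1)
print(first_two)
```

Because all the logic lives on the right-hand side of `>>`, the data object itself needs no wrapper class.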

Now it seems there are some even newer ones like plydata that I'm not super familiar with. Obviously I am biased, but I think that of all the options dfply is the "truest" to the dplyr syntax and is the most fleshed out. For example, doing something like >> select(starts_with('c')) is how it works in dplyr and only possible in dfply AFAIK.

As for why people aren't collaborating, I think it's mostly timing. The fact that dplython appeared dead inspired me to make my own. There are also some differences of opinion w/r/t syntax and how closely it should follow dplyr. On one extreme would be pandas-ply, which entirely forgoes the dplyr piping syntax and naming. On the other extreme is dfply, which is as close to dplyr as possible.

Hope that helps clear things up!

Make42 avatar Make42 commented on September 2, 2024

It does nicely. Thanks!

pandas-ply looks pretty dead to me, too, to be honest.

PS: Now you only have to change mask into filter ;-). This always bugs me.

kieferk avatar kieferk commented on September 2, 2024

Can't change that one since filter is a standard and commonly-used function in Python!
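A small sketch of the naming clash, assuming a verb called `filter` were defined in (or star-imported into) a module: the new name hides Python's built-in `filter`, so existing built-in-style calls break unless they go through the `builtins` module. The one-argument `filter` verb below is hypothetical and exists only to show the shadowing.

```python
import builtins


def filter(predicate):
    """Hypothetical data-frame verb that shares its name with the built-in."""
    return lambda rows: [row for row in rows if predicate(row)]


def is_even(n):
    return n % 2 == 0


numbers = [1, 2, 3, 4]

try:
    filter(is_even, numbers)  # built-in call style now hits the verb above
    builtin_style_works = True
except TypeError:
    builtin_style_works = False

# The built-in is still reachable, but only by spelling it out explicitly.
via_builtins = list(builtins.filter(is_even, numbers))
via_verb = filter(is_even)(numbers)
print(builtin_style_works, via_builtins, via_verb)
```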

I'll close this issue now, cheers.

has2k1 avatar has2k1 commented on September 2, 2024

@Make42, thank you for making me aware of dfply, I did not know about it. I have chosen to reply here instead of at has2k1/plydata#3 since I can do a little more than just acknowledge learning about dfply.

I knew about dplython when I created plydata. In fact, I had thought about it long before and had a mock implementation with which I tried to influence the direction of dplython. It did not work, and I could not adapt to dplython.

Specifically, I did not like the conversion to a special dataframe and I felt that the manager variable X was clunky. A string evaluation based implementation helps get around both issues.
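A minimal sketch of what a string-evaluation approach can look like, assuming expressions are given as strings and evaluated with the column values as the only visible names. The `add_columns` function and the dict-of-lists table are hypothetical and not plydata's implementation; note also that `eval` should never be fed untrusted strings.

```python
def add_columns(table, **expressions):
    """Return a copy of `table` with new columns computed from string expressions."""
    n_rows = len(next(iter(table.values())))
    result = {name: list(values) for name, values in table.items()}
    for new_name, expression in expressions.items():
        code = compile(expression, "<column expression>", "eval")
        new_values = []
        for i in range(n_rows):
            # Column names become plain variables inside the expression,
            # so no placeholder object and no special frame class is needed.
            row = {name: values[i] for name, values in result.items()}
            new_values.append(eval(code, {"__builtins__": {}}, row))
        result[new_name] = new_values
    return result


table = {"x": [1, 2, 3], "y": [10, 20, 30]}
extended = add_columns(table, z="x + y", w="z * 2")
print(extended)
```

The trade-off is that expressions are opaque strings, so editors and linters cannot check them.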

Nonetheless, it does pain me to see a duplication of efforts in the open-source world. Concerning dfply and plydata, the efforts will probably go on since both are mature enough and have distinct design choices. That is, dfply goes for near syntax compatibility with dplyr e.g.

>> select(starts_with('c'))

Whereas plydata (>> notwithstanding) tries to be more "pythonic" e.g.

>> select(startswith='c')

The reason for the difference is that in Python, str.startswith does not have an underscore, and using a keyword argument reduces the number of variables in the global namespace.
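The two styles can be compared side by side in a simplified sketch that operates on a plain list of column names. All three functions below are hypothetical illustrations of the design choice, not the real dfply or plydata code.

```python
def starts_with(prefix):
    """Helper-function style: an extra public name that builds a matcher."""
    return lambda name: name.startswith(prefix)


def select_with_helper(columns, matcher):
    """Keep the column names accepted by a matcher such as starts_with('c')."""
    return [name for name in columns if matcher(name)]


def select_with_keyword(columns, startswith=None):
    """Keyword style: mirrors str.startswith and needs no extra helper name."""
    if startswith is None:
        return list(columns)
    return [name for name in columns if name.startswith(startswith)]


columns = ["carat", "cut", "color", "price"]

helper_result = select_with_helper(columns, starts_with("c"))
keyword_result = select_with_keyword(columns, startswith="c")
print(helper_result, keyword_result)
```

Both calls pick the same columns; the difference is only in how many names a user has to import.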

Anyhow, if I had seen this project get off the ground I would have tried (with more zeal this time) to infect @kieferk with string evaluation ideas.
