Comments (4)

yshysh commented on August 24, 2024

I don't understand what you mean.

from bigflow.

linearhinos commented on August 24, 2024

I mean: in addition to sum() and count(), could bigflow support mean(), variance(), and other popular statistical functions for PCollection?

acmol commented on August 24, 2024

Actually, you can use:

def mean(p):
    return p.sum() / p.count()
    # Shorthand for p.sum().map(lambda s, c: s / c, p.count())

to implement mean in one line.

You can then use it in apply_values, e.g.:

p.group_by_key()\
  .apply_values(mean)

If you want to apply it to a whole (ungrouped) PCollection instead, you can use apply:

p.apply(mean)

or just call it directly:

mean(p)

Because these functions are easy to implement, we don't provide them as built-in methods.
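
To make the difference between the grouped and the global call concrete, here is a minimal plain-Python sketch of what the two are meant to compute. It does not use bigflow at all: lists of (key, value) pairs stand in for a PCollection, and mean_per_key is a hypothetical helper that only imitates the idea of group_by_key().apply_values(mean), not its actual return type.

```python
from collections import defaultdict

def mean(values):
    """Plain-Python stand-in: sum divided by count, as a float."""
    values = list(values)
    return sum(values) / float(len(values))

def mean_per_key(pairs):
    """Group (key, value) pairs by key, then apply mean to each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: mean(vals) for key, vals in groups.items()}

pairs = [("a", 1), ("a", 3), ("b", 10)]
per_key = mean_per_key(pairs)          # grouped: one mean per key
overall = mean(v for _, v in pairs)    # global: one mean for everything
```

The same function is reused in both cases; only the scope it is applied to changes.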

If you find these functions difficult to write, you can always combine intermediate results with transforms.make_tuple(pobject1, pobject2).
For example, you can use transforms.make_tuple to implement mean like this:

from bigflow import transforms

def mean(p):
    return transforms.make_tuple(p.sum(), p.count()).map(lambda t: t[0] / t[1])

You can also write a function that returns both the sum and the mean, and use it in apply_values like this:

def sum_and_mean(p):
    return transforms.make_tuple(p.sum(), p.apply(mean))

p.group_by_key().apply_values(sum_and_mean)
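
The variance() asked about above can be built from the same kind of aggregates. The sketch below is plain Python, not bigflow code: it only shows the arithmetic, assuming population variance computed as the mean of squares minus the square of the mean. The function name and the sample data are illustrative.

```python
def variance(values):
    """Population variance from three aggregates: count, sum, sum of squares."""
    values = list(values)
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    m = total / float(n)
    # E[x^2] - (E[x])^2
    return total_sq / float(n) - m * m

data = [2, 4, 4, 4, 5, 5, 7, 9]
var = variance(data)
```

Note that this one-pass formula can lose precision when the values are large relative to their spread, so treat it as a starting point rather than a numerically robust implementation.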

chunyang-wen commented on August 24, 2024

I think there should be a module that provides these and other useful functions.
