pelegm / drv
Discrete random variables in Python made easy
Home Page: http://pelegm.github.io/drv
License: The Unlicense
For example,
In [1]: import drv.real
In [2]: ps = [1, 1, 2, 3, 5, 0, 13, 21, 34, 0]
In [3]: fp = drv.pspace.FDPSpace(ps)
In [4]: fr = drv.real.FRDRV('fr', fp, lambda x: x ** 2)
In [5]: fr.max
Out[5]: 81
In [6]: fr.xs
Out[6]: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [7]: fr.ps
Out[7]: [0.0125, 0.0125, 0.025, 0.0375, 0.0625, 0.0, 0.1625, 0.2625, 0.425, 0.0]
Since 81 has probability 0, the maximum of the random variable should clearly be 64.
The following:
drv.core.DiscreteRandomVariable(name='test', xs=range(-3, 4), ps=[0,0,0,1,0,0,0]).max
should return 0, but it returns 3.
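Both reports above expect the same rule: extremes should be taken only over values with positive probability. A minimal sketch of that rule on plain lists, not on the drv classes (the helper names support, support_max and support_min are hypothetical and not part of drv):

```python
def support(xs, ps):
    """Return the values whose probability is strictly positive."""
    return [x for x, p in zip(xs, ps) if p > 0]


def support_max(xs, ps):
    """Maximum over the support only."""
    return max(support(xs, ps))


def support_min(xs, ps):
    """Minimum over the support only."""
    return min(support(xs, ps))


# Same data as the first report: squares of 0..9 with the given weights.
xs = [x ** 2 for x in range(10)]
weights = [1, 1, 2, 3, 5, 0, 13, 21, 34, 0]
total = sum(weights)
ps = [w / total for w in weights]
print(support_max(xs, ps))  # 64

# Same data as the second report: all mass sits on 0.
print(support_max(list(range(-3, 4)), [0, 0, 0, 1, 0, 0, 0]))  # 0
```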
The first goal would be to allow Rational(1, 6) instead of 1./6 as the probability to roll 6 on a 6-sided die.
The second goal would be to allow ndk(n, k) for n and k which are symbols.
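To illustrate why an exact rational matters here, the sketch below uses the standard library's fractions.Fraction as a stand-in for SymPy's Rational (that substitution is an assumption made to keep the example dependency-free). It also shows that building the value from the float quotient does not give one sixth:

```python
from fractions import Fraction

# Exact probability of rolling a 6 on a fair 6-sided die.
p_six = Fraction(1, 6)

# Six such probabilities sum to exactly 1, with no rounding error.
total = sum([p_six] * 6)
print(total)  # 1

# Passing the float quotient converts the already-rounded float instead.
from_float = Fraction(1 / 6)
print(from_float == p_six)  # False
```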
Should be implemented. So far:
In [2]: import drv.real
In [3]: r = drv.real.FRDRV('r', drv.pspace.FDPSpace(range(10)[::-1]), lambda x: x**2)
In [4]: (r+1).mean
raises
Traceback (most recent call last):
File "<ipython-input-4-b4acbe6f318b>", line 1, in <module>
(r+1).mean
File "drv/real.py", line 62, in mean
return self.moment(1)
File "drv/real.py", line 123, in moment
return self.pspace.integrate(self._moment_func(n))
AttributeError: 'ProductDPSpace' object has no attribute 'integrate'
The following code:
b = dists.Binomial(10000, 0.1)
raises the following error:
--> 136 return 1.0 * choose(n, k) * p ** k * (1 - p) ** (n - k)
OverflowError: long int too large to convert to float
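One possible way around the overflow is to evaluate the pmf in log space, so the huge integer choose(n, k) is never converted to a float. This is only a sketch under that assumption, not the drv implementation; the function name binomial_pmf is hypothetical:

```python
import math


def binomial_pmf(n, k, p):
    """Binomial pmf computed in log space to avoid huge intermediates."""
    if not 0 <= k <= n:
        return 0.0
    # Handle the degenerate cases, where log(p) or log(1 - p) is undefined.
    if p == 0.0:
        return 1.0 if k == 0 else 0.0
    if p == 1.0:
        return 1.0 if k == n else 0.0
    # log(choose(n, k)) via the log-gamma function.
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    log_pmf = log_choose + k * math.log(p) + (n - k) * math.log1p(-p)
    return math.exp(log_pmf)


# The parameters from the report no longer overflow.
total = sum(binomial_pmf(10000, k, 0.1) for k in range(10001))
print(total)
```

The trade-off is that the result is a float approximation, so this route would not suit exact or symbolic probabilities.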
Test dump:
_________________________________________________________________________ test_sfunc __________________________________________________________________________
def test_sfunc():
with pytest.raises(ValueError):
> drv.sfunc({})
tests/test_rv.py:43:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = DRV(drv), sample = {}
def sfunc(self, sample):
""" Return the result of ``func`` on the sampled data, where *sample*
must contain all coordinates of self's probability space, but may also
contain coordinates of other probability spaces, which should be
ignored. In case not all coordinates are included in *sample*, raise
``ValueError``.
*sample* is a dictionary pointing probability spaces to contained
outcomes. """
ws = []
> for pspace in self.pspace.pspaces:
try:
ws.append(sample[pspace])
E AttributeError: 'DPSpace' object has no attribute 'pspaces'
drv/rv.py:40: AttributeError
Currently only FRDRV addition is implemented.
We treat sympy/sympy#8251 in CDPspace.integrate; can we skip that?
Currently, the following happens:
In [1]: import sympy
In [2]: p = lambda n: 1 if n == 0 else 0
In [3]: n = sympy.Symbol('n', integer=True, nonnegative=True)
In [4]: p(n)
Out[4]: 0
So, instead, p should be defined as a piecewise function, somehow.
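One way this might look, using SymPy's Piecewise and Eq so that the comparison stays symbolic instead of being decided by Python's ==. This is a sketch of the idea only, not a decision on how drv should do it:

```python
import sympy

n = sympy.Symbol('n', integer=True, nonnegative=True)


def p(k):
    """Point mass at 0, written so that symbolic arguments stay symbolic."""
    return sympy.Piecewise((1, sympy.Eq(k, 0)), (0, True))


expr = p(n)             # remains an unevaluated Piecewise in n
print(expr.subs(n, 0))  # 1
print(expr.subs(n, 3))  # 0
```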
With the following setup:
b = dists.Binomial(1000, 0.1)
the operations b.mean and b.variance take 48μs and 110μs respectively. However, b.mean is simply 100 and b.variance is simply 90, so much quicker methods should be written for Binomial. A quicker b.mean should take about 300ns, about 150 times faster.
Similar quick methods should be implemented for other distributions.
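A minimal sketch of what such quick methods could look like for the binomial case, using the closed forms n * p and n * p * (1 - p). The class QuickBinomial is a hypothetical stand-in and does not reflect how dists.Binomial is actually structured:

```python
class QuickBinomial:
    """Hypothetical stand-in for dists.Binomial with closed-form moments."""

    def __init__(self, n, p):
        self.n = n
        self.p = p

    @property
    def mean(self):
        # E[X] = n * p, no summation over the support needed.
        return self.n * self.p

    @property
    def variance(self):
        # Var[X] = n * p * (1 - p)
        return self.n * self.p * (1 - self.p)


b = QuickBinomial(1000, 0.1)
print(b.mean, b.variance)
```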
For example,
In [1]: import drv.real
In [2]: ps = [0, 1, 2, 3, 5, 0, 13, 21, 34, 0]
In [3]: fp = drv.pspace.FDPSpace(ps)
In [4]: fr = drv.real.FRDRV('fr', fp, lambda x: x ** 2 + 7)
In [5]: fr.min
Out[5]: 7
In [6]: fr.xs
Out[6]: [7, 8, 11, 16, 23, 32, 43, 56, 71, 88]
In [7]: fr.ps
Out[7]:
[0.0,
0.012658227848101266,
0.02531645569620253,
0.0379746835443038,
0.06329113924050633,
0.0,
0.16455696202531644,
0.26582278481012656,
0.43037974683544306,
0.0]
Since 7 has probability 0, the minimum of the random variable should be 8.
Related: #6.
One option is to force all categories to be comparable, for instance by using pairs (order, value). Another possible solution is not to implement cdf for general RVs; it would then be possible to implement a "not-so-general" mid-class whose values have to be comparable, and in fact linearly ordered.
I think I like the last idea the most.
A general PMF method for random variables should look as follows:
def indicator(self, k):
    """ Return the indicator function of X=k. """
    def ind(w, k=k, func=self.func):
        return 1 if func(w) == k else 0
    return ind

def pmf(self, k):
    return self.pspace.integrate(self.indicator(k))
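To see the two methods above working end to end, here is a self-contained sketch. The classes FiniteSpace and RV are hypothetical, and the integrate method is assumed to return the expected value of a function over the space; neither is taken from drv:

```python
from fractions import Fraction


class FiniteSpace:
    """Hypothetical finite probability space over outcomes 0..len(ps)-1."""

    def __init__(self, weights):
        total = sum(weights)
        self.ps = [Fraction(w, total) for w in weights]

    def integrate(self, func):
        # Expected value of func over the space.
        return sum(p * func(w) for w, p in enumerate(self.ps))


class RV:
    """Hypothetical random variable: a function on a probability space."""

    def __init__(self, pspace, func):
        self.pspace = pspace
        self.func = func

    def indicator(self, k):
        """Return the indicator function of X=k."""
        def ind(w, k=k, func=self.func):
            return 1 if func(w) == k else 0
        return ind

    def pmf(self, k):
        # P(X = k) is the expectation of the indicator of X=k.
        return self.pspace.integrate(self.indicator(k))


# Outcomes 0, 1, 2 with probabilities 1/4, 1/4, 1/2; X is the parity.
x = RV(FiniteSpace([1, 1, 2]), lambda w: w % 2)
print(x.pmf(0))  # 3/4
print(x.pmf(1))  # 1/4
```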
For example, make the pmf of the binomial distribution be something like
def pmf(self, k):
    n, p = self.n, self.p
    return choose(n, k) * p ** k * (1 - p) ** (n - k)
A few things to note:
The choose function is called binomial in SymPy
k before putting it inside the calculation (not sure, we may wish to verify this)
The sample space, if such exists, will be called Omega. So in general, any probability space should have an Omega attribute, which may raise NotImplementedError if the sample space cannot be represented (as is the case, at the moment, for infinite probability spaces).
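A rough sketch of how the Omega convention could be laid out, with a base class that raises by default and a finite subclass that overrides it. All three class names are hypothetical, and identifying outcomes with their indices is an assumption:

```python
class ProbabilitySpace:
    """Hypothetical base class: every space exposes an Omega attribute."""

    @property
    def Omega(self):
        raise NotImplementedError("sample space cannot be represented")


class FiniteProbabilitySpace(ProbabilitySpace):
    def __init__(self, ps):
        self.ps = list(ps)

    @property
    def Omega(self):
        # Outcomes are identified with their indices (an assumption).
        return range(len(self.ps))


class InfiniteProbabilitySpace(ProbabilitySpace):
    """Inherits the default Omega, which raises NotImplementedError."""


print(list(FiniteProbabilitySpace([0.5, 0.5]).Omega))  # [0, 1]
```

Note that hasattr(space, 'Omega') would propagate the NotImplementedError rather than return False, so callers would need to catch the exception explicitly.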
The following:
In [4]: ps = drv.pspace.FDPSpace([1] * 8 + [0] + [1]*2 + [0])
In [5]: d = drv.rv.FDRV('d', ps, lambda x: x**2)
In [6]: d.cdf(0)
raises
Traceback (most recent call last):
File "<ipython-input-6-66ba4bb58e45>", line 1, in <module>
d.cdf(0)
File "drv/rv.py", line 151, in cdf
return self._cdf(self.xs.index(k))
AttributeError: 'FDRV' object has no attribute 'xs'
The following
In [4]: ps = drv.pspace.FDPSpace([1] * 8 + [0] + [1]*2 + [0])
In [5]: d = drv.rv.FDRV('d', ps, lambda x: x**2)
In [6]: d.flatten()
raises
Traceback (most recent call last):
File "<ipython-input-6-a091bca363af>", line 1, in <module>
d.flatten()
File "drv/rv.py", line 187, in flatten
for p, x in zip(self.ps, self.xs):
AttributeError: 'FDRV' object has no attribute 'ps'