
gpfit's People

Contributors

1ozturkbe, bqpd, nanjekyejoannah, pgkirsch, whoburg

gpfit's Issues

plot_fit function

I think GPfit would benefit from a plot_fit method for 1D, and perhaps 2D, functions. This method should plot both the original data and the fitted function, and have the option of plotting in log space too.

unit tests are failing

Traceback below.

======================================================================
FAIL: test_MA (t_print_fit.t_print_MA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 17, in test_MA
    'w = 8.1e+03 * (u_0)**10 * (u_1)**11 * (u_2)**12'])
AssertionError: Lists differ: ['w = 2.72 * (u_1)**2 * (u_2)*... != ['w = 2.72 * (u_0)**2 * (u_1)*...

First differing element 0:
w = 2.72 * (u_1)**2 * (u_2)**3 * (u_3)**4
w = 2.72 * (u_0)**2 * (u_1)**3 * (u_2)**4

- ['w = 2.72 * (u_1)**2 * (u_2)**3 * (u_3)**4',
-  'w = 148 * (u_1)**6 * (u_2)**7 * (u_3)**8',
-  'w = 8.1e+03 * (u_1)**10 * (u_2)**11 * (u_3)**12']
+ ['w = 2.72 * (u_0)**2 * (u_1)**3 * (u_2)**4',
+  'w = 148 * (u_0)**6 * (u_1)**7 * (u_2)**8',
+  'w = 8.1e+03 * (u_0)**10 * (u_1)**11 * (u_2)**12']

======================================================================
FAIL: test_SMA (t_print_fit.t_print_SMA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 32, in test_SMA
    '    + 2 * (u_0)**0.769 * (u_1)**0.846 * (u_2)**0.923'])
AssertionError: Lists differ: ['w**0.0769 = 1.08 * (u_1)**0.... != ['w**0.0769 = 1.08 * (u_0)**0....

First differing element 0:
w**0.0769 = 1.08 * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308
w**0.0769 = 1.08 * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308

- ['w**0.0769 = 1.08 * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308',
-  '    + 1.47 * (u_1)**0.462 * (u_2)**0.538 * (u_3)**0.615',
-  '    + 2 * (u_1)**0.769 * (u_2)**0.846 * (u_3)**0.923']
+ ['w**0.0769 = 1.08 * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308',
+  '    + 1.47 * (u_0)**0.462 * (u_1)**0.538 * (u_2)**0.615',
+  '    + 2 * (u_0)**0.769 * (u_1)**0.846 * (u_2)**0.923']

======================================================================
FAIL: test_ISMA (t_print_fit.t_print_ISMA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 47, in test_ISMA
    '    + (1.82/w**0.0667) * (u_0)**0.667 * (u_1)**0.733 * (u_2)**0.8'])
AssertionError: Lists differ: ['1 = (1.08/w**0.0769) * (u_1)... != ['1 = (1.08/w**0.0769) * (u_0)...

First differing element 0:
1 = (1.08/w**0.0769) * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308
1 = (1.08/w**0.0769) * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308

Diff is 803 characters long. Set self.maxDiff to None to see it.

----------------------------------------------------------------------
Ran 47 tests in 0.136s

FAILED (failures=3)

make unit tests deterministic

Unit tests should use fixed seeds to prevent the non-determinism associated with initial guesses and similar random inputs.

For example, this failure is sporadic: sometimes the test passes, sometimes it fails:

FAIL [0.000s]: test_rms_error (gpfit.tests.t_ex6_3.t_ex6_3_ISMA)

Traceback (most recent call last):
  File "/home1/jenkins/workspace/gpfit_PullRequest/buildnode/reynolds/gpfit/tests/t_ex6_3.py", line 21, in test_rms_error
    self.assertTrue(self.rms_error < 5e-4)
AssertionError: False is not true
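
A minimal sketch of seeding in a test, with the standard-library `random` module standing in for whatever generator produces the initial guesses (an assumption; if the guesses come from NumPy's global generator, `numpy.random.seed` plays the same role). The `initial_guess` helper and the test class are hypothetical and not part of gpfit.

```python
import random
import unittest


def initial_guess(n_params, rng):
    """Hypothetical stand-in for a randomized initial guess."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n_params)]


class TestDeterministicGuess(unittest.TestCase):
    def setUp(self):
        # A fixed seed makes every run start from the same guess.
        self.rng = random.Random(33404)

    def test_guess_is_reproducible(self):
        first = initial_guess(4, self.rng)
        second = initial_guess(4, random.Random(33404))
        self.assertEqual(first, second)
```

With the seed fixed in `setUp`, a threshold such as `rms_error < 5e-4` is evaluated against the same starting point on every run, so a failure points to a code change rather than an unlucky draw.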

Should input data be pre- or post- log-transformation?

Given a data set with large numbers, GPfit runs into numerical overflow issues with exp().

x = [1200,
       13000,
       15000,
       16000,
       17000,
       18000,
       19000,
       30000,
       32000,
       34000]

y = [325000,
       250000,
       750000,
       2E6,
       7E6,
       750000,
       8E6,
       6E6,
       2E6,
       13E6,
       ]

Gives results like:

/Users/philippekirschen/Documents/MIT/Research/GPfit/gpfit/gpfit/fit.py:127: RuntimeWarning: overflow encountered in exp
  w_SMA = exp(y_SMA)
/Users/philippekirschen/Documents/MIT/Research/GPfit/gpfit/gpfit/fit.py:130: RuntimeWarning: overflow encountered in exp
  w = (exp(ydata)).T[0]


w**0.1 = 0 * (u_1)**34.9
    + inf * (u_1)**-5.28
    + 0 * (u_1)**198

Wondering if anything clever can be done here.
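
The warnings above come from `exp(ydata)` inside fit.py, which suggests (an assumption, not confirmed here) that `fit` expects data that has already been log-transformed. A minimal sketch of transforming the data first, using only the standard library:

```python
import math

x = [1200, 13000, 15000, 16000, 17000, 18000, 19000, 30000, 32000, 34000]
y = [325000, 250000, 750000, 2e6, 7e6, 750000, 8e6, 6e6, 2e6, 13e6]

# Log-transform before fitting: the values stay small, so exp() of them is safe.
x_log = [math.log(v) for v in x]
y_log = [math.log(v) for v in y]

print(max(x_log), max(y_log))

# exp() of the raw data overflows a double; exp() of the logs round-trips.
try:
    math.exp(max(y))
    raw_overflows = False
except OverflowError:
    raw_overflows = True

recovered = [math.exp(v) for v in y_log]
```

If the exponents are still large after the transform, dividing each variable by a reference value before taking logs is another option to try.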

examples should be run as unit tests

... just like we do in gpkit. There's currently duplicate code such as t_ex6_1 living in both examples/ and in tests -- that code should live in examples/ and be called in t_examples just like in gpkit.

gpfit not updated to match gpkit

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/Users/mjburton11/Documents/SuperUROP/gpkit-models/1682/gas_male/Datasets/fitDF70.py in <module>()
      2 
      3 import gpfit
----> 4 from gpfit.fit import fit
      5 import numpy as np
      6 import pandas as pd

/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.py in <module>()
      6 from max_affine_init import max_affine_init
      7 from print_fit import print_ISMA, print_SMA, print_MA
----> 8 from gpkit.nomials import Posynomial, Monomial, Constraint, MonoEQConstraint
      9 from numpy import append, ones, exp, sqrt, mean, square
     10 

ImportError: cannot import name Constraint

different attributes in different installations

Hi all,

I'm using gpfit on two different machines. After I run a fitting routine on my data, I get certain attributes for the output. The "good" machine shows:

pr.pprint(smafit.__dict__.keys())
['posymap',
 'mfac',
 'ivar',
 'constraint',
 'evaluate',
 'dvars',
 'numpy_bools',
 'bounds',
 'substitutions',
 'max_err',
 'fitdata',
 'varkeys',
 'rms_err']

Whereas the "bad" machine shows:

pr.pprint(smafit.__dict__.keys())
['oper',
 'unsubbed',
 'right',
 'evaluate',
 'nomials',
 'substitutions',
 'p_lt',
 'varkeys',
 'm_gt',
 'last_used_substitutions',
 'left']

I use the fitdata attribute to extract the coefficients in a convenient form and import them into MATLAB for post-processing. Why is the attribute list so different between these two installations? I cloned gpfit today on the old ("bad") machine to see if there was any update, but it doesn't seem to affect the attribute list. None of the attributes available on the "bad" machine contains the coefficients in a convenient format for extraction.

Returned fit causes overflow error

Again, when trying to fit the compressor maps, I've gotten a number of returned fits with very large exponents. Sometimes, when these are plotted outside of log space, overflow errors occur. I'm not sure this can be avoided, and I don't think the data I'm using is very well conditioned, but it would be nice if there were some way to control how large the exponents in the fit can be.

w**0.231 = 0.00187 * (u_1)**-302 * (u_2)**58.2
    + 3.75e-12 * (u_1)**-2.67e+03 * (u_2)**496
    + 0.326 * (u_1)**-7.81 * (u_2)**0.962
    + 0.465 * (u_1)**1.97 * (u_2)**-0.525

regularization

... we should implement it, to avoid huge fitted parameters.

  • ridge
  • lasso

Bug when K==1

Near line 156 of fit.py, cstrt = MonomialEquality(cstrt, "=", 1) should, I think, be cstrt = (mono == posy)?

index error

@bqpd I'm doing some D8 fits (the file is naca_cl0_fits.py in the Tail Fits folder at commit convexengineering/SPaircraft@851b1d4 on D8 master). GPfit is throwing the following error, and neither I nor @1ozturkbe knows what it is.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/Users/mayork/Documents/GpGit/d8/Tail Fits/naca_cl0_fits.py in <module>()
    100     X, Y = fit_setup(NACA, Re) # call fit(X, Y, 4, "SMA") to get fit
    101     F, A = plot_fits(NACA, Re)
--> 102     make_fit(NACA, Re)
    103     F.savefig("tail_fits/taildragpolar.pdf",
    104               bbox_inches="tight")

/Users/mayork/Documents/GpGit/d8/Tail Fits/naca_cl0_fits.py in make_fit(naca_range, re_range)
     65     print np.size(x)
     66     print np.size(y)
---> 67     fit(x, y, 3)
     68 
     69 def plot_fits(naca_range, re_range):

/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.pyc in fit(xdata, ydata, K, ftype)
     70         w = Variable("w")
     71 
---> 72     params = get_params(ftype, K, xdata, ydata)
     73 
     74     # A: exponent parameters, B: coefficient parameters

/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.pyc in get_params(ftype, K, xdata, ydata)
     24         return r, drdp
     25 
---> 26     ba = ba_init(xdata, ydata.reshape(ydata.size, 1), K).flatten('F')
     27 
     28     if ftype == "ISMA":

/Users/mayork/Documents/GpGit/gpfit/gpfit/ba_init.pyc in ba_init(x, y, K)
     77                       "full rank for local fitting." % (i-iinit, k))
     78         # now create the local fit
---> 79         b[:, k] = lstsq(X[inds.nonzero()], y[inds.nonzero()])[0][:, 0]
     80 
     81     return b

IndexError: index 59 is out of bounds for axis 0 with size 59

Can't import fit in the latest commit

I am on commit 648d114 and was about to try to fit something very basic when I got the following error from import fit.

In [1]: import fit
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-b7a6b9d011b5> in <module>()
----> 1 import fit

C:\Users\Berk\Dropbox (MIT)\MIT Senior Year\16.82\gpfit\gpfit\fit.py in <module>()
      2 from numpy import ones, exp, sqrt, mean, square, hstack
      3 from gpkit import NamedVariables, VectorVariable, Variable, NomialArray
----> 4 from .implicit_softmax_affine import implicit_softmax_affine
      5 from .softmax_affine import softmax_affine
      6 from .max_affine import max_affine

ValueError: Attempted relative import in non-package

??? Help?

Decide what (if anything) should print to screen during fitting

Currently the bverbose option is set to False in the code. This suppresses all print messages, which is nice and clean, but it also means that the user doesn't know if the fitting process reaches max iterations, or reaches max time etc.

A user may also want to know about the rate of residual convergence.
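
One possible middle ground, sketched with the standard `logging` module: always warn when a limit is hit, and report normal convergence only on request. The function names, the logger name, and the arguments are hypothetical; gpfit's solver may expose this information differently.

```python
import logging

logger = logging.getLogger("gpfit_sketch")  # hypothetical logger name


def termination_reason(n_iter, max_iter, elapsed, max_time):
    """Return a short description of why the solver stopped."""
    if n_iter >= max_iter:
        return "reached max iterations (%d)" % max_iter
    if elapsed >= max_time:
        return "reached max time (%.1f s)" % max_time
    return "converged after %d iterations" % n_iter


def report(n_iter, max_iter, elapsed, max_time, verbose=False):
    """Log the termination reason and return it."""
    reason = termination_reason(n_iter, max_iter, elapsed, max_time)
    # Hitting a limit always deserves a warning; convergence only if verbose.
    if reason.startswith("reached"):
        logger.warning(reason)
    elif verbose:
        logger.info(reason)
    return reason
```

Because the messages go through a logger rather than `print`, users who want a quiet run can silence them without a code change.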

duplicate examples directory

examples in gpfit/examples/ should be updated / renamed as appropriate, moved to docs/source/examples, and added to tests/t_examples.

Empty array error

I've been trying to fit some compressor maps and am getting what appears to be a non-deterministic linear algebra error: I can get the error, then run the exact same code and it won't throw an error the second time. I'm not too familiar with gpfit, so I'm not sure why this would occur.

Below is the error and some code which causes said error.

Traceback (most recent call last):
  File "/Users/mayork/Documents/GpGit/gpfit/gpfit/compressor_map_REAL_DATA_fitting.py", line 140, in <module>
    r,const = fit.fit(independent, dependent, 4, 'SMA')
  File "/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.py", line 71, in fit
    bainit = max_affine_init(xdata, ydata, K)
  File "/Users/mayork/Documents/GpGit/gpfit/gpfit/max_affine_init.py", line 58, in max_affine_init
    if matrix_rank(X[inds, :]) < dimx + 1:
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 1543, in matrix_rank
    S = svd(M, compute_uv=False)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 1338, in svd
    _assertNoEmpty2d(a)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 222, in _assertNoEmpty2d
    raise LinAlgError("Arrays cannot be empty")
LinAlgError: Arrays cannot be empty

import numpy as np
import matplotlib.pyplot as plt
import fit

#variable to control if plotting occurs
PLOT = True

#-----------------------------------
#Compressor map
#make the data
#real data
##N = [.5, .6, .7, .75, .8, .85, .875, .9, .925, .95, .975, .985, 1, 1.025]
##pi= [[2.27690,2.21610,2.15520,2.00330,1.91210,1.66880],[3.21930,3.12810,3.00650,2.82410,2.61130,2.21610],[4.86100,4.58740,4.31380,4.16180,3.76660,3.18890],
##     [5.95550,5.65150,5.40830,5.19540,4.73940,4.00980],[7.68840,7.56680,7.32360,7.14120,6.62430,5.65150],[10.2117,9.72530,9.26930,8.93490,8.23560,7.20200],
##     [11.7318,11.3062,10.7590,10.2725,9.60370,8.38760],[13.6775,13.1303,12.5223,11.9446,11.2757,9.99890],[16.3529,15.8056,15.0760,14.4072,13.5255,12.1574],
##     [19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[23.4365,22.8284,22.0076,20.8523,19.8187,18.3898],[24.8350,24.3181,23.6493,22.5548,21.4604,20.1227],
##     [27.0239,26.2942,25.3822,24.1357,22.7372,21.5212],[29.0304,28.2704,27.2975,26.3246,24.5309,23.3453]]
##mbar = [[0.114208,0.118718,0.123241,0.126468,0.128404,0.129694],[0.155504,0.161956,0.165183,0.167763,0.170345,0.170345],[0.220674,0.229707,0.230998,0.232933,0.235515,0.236805],
##        [0.261969,0.274874,0.280681,0.282617,0.285843,0.287779],[0.345206,0.351659,0.364563,0.371015,0.376177,0.378759],[0.425216,0.436185,0.443283,0.448445,0.451026,0.452962],
##        [0.486515,0.496838,0.505227,0.509743,0.512324,0.512324],[0.560072,0.572332,0.580075,0.585237,0.586528,0.588463],[0.651697,0.667183,0.676216,0.682024,0.683314,0.684605],
##        [0.751710,0.770422,0.783972,0.792360,0.792360,0.792360],[0.871081,0.886569,0.896886,0.904634,0.907211,0.907862],[0.918187,0.934317,0.947862,0.958187,0.962057,0.963992],
##        [0.993033,0.997545,1.00142,1.00464,1.00593,1.00787],[1.05433,1.05820,1.06014,1.06337,1.06659,1.06659]]

##N = [.95, .985]
##pi= [[20.4125,20.4325,19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[25.8250,25.8350,24.8350,24.3181,23.6493,22.5548,21.4604,20.1227]]
##mbar = [[.552,.652,0.751710,0.770422,0.783972,0.792360,0.792360,0.79442360],[.7182,.8182,0.918187,0.934317,0.947862,0.958187,0.962057,0.964]]

uppi=[]
upm=[]
centerpi=26.19
centerm=1
for i in range(8):
    if i==0:
        uppi.extend([centerpi+.06])
        upm.extend([centerm-.217])
    if i==1:
        uppi.extend([centerpi+.063])
        upm.extend([centerm-.1368])
    if i==2:
        uppi.extend([centerpi+.042])
        upm.extend([centerm-.0618])
    if i==3:
        uppi.extend([centerpi])
        upm.extend([centerm])
    if i==4:
        uppi.extend([centerpi-.06])
        upm.extend([centerm+.0447])
    if i==5:
        uppi.extend([centerpi-.102])
        upm.extend([centerm+.0664])
    if i==6:
        uppi.extend([centerpi-.184])
        upm.extend([centerm+.0972])
    if i==7:
        uppi.extend([centerpi-.44])
        upm.extend([centerm+.0972])

uppi2=[]
upm2=[]
centerpi=17.9
centerm=.8
for i in range(8):
    if i==0:
        uppi2.extend([centerpi+.06])
        upm2.extend([centerm-.217])
    if i==1:
        uppi2.extend([centerpi+.063])
        upm2.extend([centerm-.1368])
    if i==2:
        uppi2.extend([centerpi+.042])
        upm2.extend([centerm-.0618])
    if i==3:
        uppi2.extend([centerpi])
        upm2.extend([centerm])
    if i==4:
        uppi2.extend([centerpi-.06])
        upm2.extend([centerm+.0447])
    if i==5:
        uppi2.extend([centerpi-.102])
        upm2.extend([centerm+.0664])
    if i==6:
        uppi2.extend([centerpi-.184])
        upm2.extend([centerm+.0972])
    if i==7:
        uppi2.extend([centerpi-.44])
        upm2.extend([centerm+.0972])

N=[1,.925]
pi=[uppi,uppi2]
mbar=[upm,upm2]
if PLOT == True:
#plot of data used in gpfit
    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        piplot = pi[i]
        mbarplot = mbar[i]
        plt.plot(mbarplot,piplot, '-r')
    plt.xlabel('Normalized Corrected Mass Flow')
    plt.ylabel('Fan Pressure Ratio')
    plt.title('E3 Fan Map')
    plt.show()

    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        piplot = pi[i]
        mbarplot = mbar[i]
        plt.plot(np.log(mbarplot),np.log(piplot), '-r')
    plt.xlabel('Log of Normalized Corrected Mass Flow')
    plt.ylabel('Log of Fan Pressure Ratio')
    plt.title('E3 Fan Map in Log Space')
    plt.show()

    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        invpiplot = np.ones(len(pi[i]))
        for j in range(len(pi[i])):
            invpiplot[j] = 1/(pi[i][j])
        mbarplot = mbar[i]
        plt.plot(np.log(mbarplot),np.log(invpiplot), '-r')
    plt.xlabel('Log of Normalized Corrected Mass Flow')
    plt.ylabel('Log of Fan Pressure Ratio')
    plt.title('E3 Fan Map in Log Space')
    plt.show()

#set up the data for the fit
Nfit = []
mbarfit = []
pifit = []

for i in range(len(N)):
    Nfit.extend(N[i]*np.ones(len(mbar[i])))
    for j in range(len(pi[i])):
        hold=pi[i][j]
        pifit.extend([1/hold])
    mbarfit.extend(np.divide(mbar[i],[1]))

#create the fit
independent = np.array([np.log(Nfit),np.log(mbarfit)])
dependent = np.log(pifit)
r,const = fit.fit(independent, dependent, 4, 'SMA')
print const

#plot the fit
nvec = np.linspace(.8, 1, 10)
mbarvec = np.linspace(.8,1,100)

for i in range(len(nvec)):
    N = nvec[i]
    pi=[]
    for j in range(len(mbarvec)):
        mbar = mbarvec[j]
        #fit to the tweaked data
        pi.extend([(0.282 * (N)**-3.56 * (mbar)**0.132
        + 9.75e-06 * (N)**-133 * (mbar)**49.7
        + 0.3 * (N)**-0.59 * (mbar)**-0.184
        + 0.306 * (N)**2.8 * (mbar)**0.0678)**(-1/.124)])
        #fit to the original data
##        pi.extend([(0.106 * (N)**-0.0299 * (mbar)**-0.129
##        + 0.119 * (N)**-0.0527 * (mbar)**-0.123
##        + 0.107 * (N)**0.028 * (mbar)**-0.146
##        + 0.102 * (N)**-0.0231 * (mbar)**-0.131
##        + 0.123 * (N)**-0.0233 * (mbar)**-0.131
##        + 0.136 * (N)**0.082 * (mbar)**-0.161)**(-1/.116)])

    plt.plot(mbarvec, pi, '-r')

#code for adding in the actual fan map data
##N = [.5, .6, .7, .75, .8, .85, .875, .9, .925, .95, .975, .985, 1, 1.025]
##pi= [[2.27690,2.21610,2.15520,2.00330,1.91210,1.66880],[3.21930,3.12810,3.00650,2.82410,2.61130,2.21610],[4.86100,4.58740,4.31380,4.16180,3.76660,3.18890],
##     [5.95550,5.65150,5.40830,5.19540,4.73940,4.00980],[7.68840,7.56680,7.32360,7.14120,6.62430,5.65150],[10.2117,9.72530,9.26930,8.93490,8.23560,7.20200],
##     [11.7318,11.3062,10.7590,10.2725,9.60370,8.38760],[13.6775,13.1303,12.5223,11.9446,11.2757,9.99890],[16.3529,15.8056,15.0760,14.4072,13.5255,12.1574],
##     [19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[23.4365,22.8284,22.0076,20.8523,19.8187,18.3898],[24.8350,24.3181,23.6493,22.5548,21.4604,20.1227],
##     [27.0239,26.2942,25.3822,24.1357,22.7372,21.5212],[29.0304,28.2704,27.2975,26.3246,24.5309,23.3453]]
##mbar = [[0.114208,0.118718,0.123241,0.126468,0.128404,0.129694],[0.155504,0.161956,0.165183,0.167763,0.170345,0.170345],[0.220674,0.229707,0.230998,0.232933,0.235515,0.236805],
##        [0.261969,0.274874,0.280681,0.282617,0.285843,0.287779],[0.345206,0.351659,0.364563,0.371015,0.376177,0.378759],[0.425216,0.436185,0.443283,0.448445,0.451026,0.452962],
##        [0.486515,0.496838,0.505227,0.509743,0.512324,0.512324],[0.560072,0.572332,0.580075,0.585237,0.586528,0.588463],[0.651697,0.667183,0.676216,0.682024,0.683314,0.684605],
##        [0.751710,0.770422,0.783972,0.792360,0.792360,0.792360],[0.871081,0.886569,0.896886,0.904634,0.907211,0.907862],[0.918187,0.934317,0.947862,0.958187,0.962057,0.963992],
##        [0.993033,0.997545,1.00142,1.00464,1.00593,1.00787],[1.05433,1.05820,1.06014,1.06337,1.06659,1.06659]]
##for i in range(len(N)):
##        Nplot = N[i]*np.ones(len(mbar[0]))
##        piplot = pi[i]
##        mbarplot = mbar[i]
##        plt.plot(mbarplot,piplot, '-b')

plt.xlabel('Normalized Corrected Mass Flow')
plt.ylabel('Pressure Ratio')
plt.title('Fan Map')
plt.show()

fit.py import error

Line 8 of fit.py reads

from gpkit.nomials import Posynomial, Monomial, Constraint, MonoEQConstraint

I'm getting an error that Constraint can't be imported. I'm guessing it was moved during a gpkit refactor. I edited only my local copy to make line 8 read import gpkit, and it works, but that is obviously not a good long-term fix.

fits with poorly conditioned data

I'm trying to fit this data and, for reasons I don't understand, I'm really struggling. If I fit the full range of data, the fit isn't even close. Fitting to a subset of the range yields a much better result, but the returned fit still consistently underestimates the pressure ratio. Any ideas for data manipulation that could help? I've tried fitting to log(1/(p**2)), log(1/(2p)), and log(1/(10p)) with fits ranging from 3 to 20 terms. I've included some example plots below. I'm fairly confident my fitting code is correct, since some of the fits are close.

What I want to fit:
[image]

The data I am trying to fit, in log space:
[image]

Fit from a subset of the data range. The longer vertical tails are expected, since I plotted over a slightly larger range:
[image]

Fit from the entire data range:
[image]

printed solution significantly different from posynomial inequality

I think that significant figures on the auto printed fit equation and the posynomial output equation should be the same.

This is what I see when I run fit:

In [4]: fit(X,Y, 4, "SMA")
w**3.72 = 6.35e+10 * (u_1)**-0.243 * (u_2)**-3.43
    + 0.0247 * (u_1)**2.49 * (u_2)**-1.11
    + 2.03e-07 * (u_1)**12.7 * (u_2)**-0.338
    + 6.49e-06 * (u_1)**-1.9 * (u_2)**-0.681
Out[4]: 
(gpkit.PosynomialInequality(w**3.7 >= 0.0247*u_1**2.5*u_2**-1.1 + 2.03e-07*u_1**13*u_2**-0.34
+ 6.35e+10*u_1**-0.24*u_2**-3.4 + 6.49e-06*u_1**-1.9*u_2**-0.68),
 0.0048930297487385886)

The difference in significant figures between the printed solution and the gpkit.PosynomialInequality resulted in pretty drastic gaps when I compared the fits to the actual data. jh01polarfit.pdf uses all the significant figures; jh01polarfit1.pdf uses the posynomial equation.
jh01polarfit.pdf
jh01polarfit1.pdf
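
To see how much the rounding can matter, here is a small check on the third term of the output above, where the printed exponent 12.7 appears as 13 in the PosynomialInequality display. The evaluation point is hypothetical, and whether the underlying object stores the full-precision values is not something the output shows.

```python
def term_printed(u1, u2):
    """Third term as printed by fit (three significant figures)."""
    return 2.03e-07 * u1**12.7 * u2**-0.338


def term_rounded(u1, u2):
    """Same term as displayed in the PosynomialInequality (two figures)."""
    return 2.03e-07 * u1**13 * u2**-0.34


u1, u2 = 3.0, 2.0  # hypothetical evaluation point
ratio = term_rounded(u1, u2) / term_printed(u1, u2)
print("rounded / printed = %.3f" % ratio)
```

Since the error grows like `u1**0.3`, copying the rounded display into other code gets worse the further `u1` is from 1.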

list index out of range

I keep getting this error and I'm not sure why. I would be grateful for any help.

In [19]: fit(x_log,y_log,2,"SMA")
w**424 = 0 * (u_1)**1.5e+03
    + 0 * (u_1)**1.36e+03
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-920e59fab3f5> in <module>()
----> 1 fit(x_log,y_log,2,"SMA")

/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.pyc in fit(xdata, ydata, K, ftype, varNames)
    156         # Create gpkit objects
    157         # SMA returns a constraint of the form w^alpha >= c1*u1^exp1 + c2*u2^exp2 +....
--> 158         posy  = Posynomial(exps, cs)
    159         mono = Monomial(w_exp,1)
    160         cstrt = (mono >= posy)

/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomials.pyc in __init__(self, exps, cs, require_positive, simplify, **descr)
    104 
    105         # init NomialData to create self.exps, self.cs, and so on
--> 106         super(Signomial, self).__init__(exps, cs, simplify=simplify)
    107 
    108         if self.any_nonpositive_cs:

/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomial_data.pyc in __init__(self, exps, cs, simplify)
     26             return
     27         if simplify:
---> 28             exps, cs = simplify_exps_and_cs(exps, cs)
     29         self.exps, self.cs = exps, cs
     30         self.any_nonpositive_cs = any(mag(c) <= 0 for c in self.cs)

/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomial_data.pyc in simplify_exps_and_cs(exps, cs, return_map)
    177     exps_ = tuple(matches.keys())
    178     cs_ = list(matches.values())
--> 179     if isinstance(cs_[0], Quantity):
    180         units = Quantity(1, cs_[0].units)
    181         cs_ = [c.to(units).magnitude for c in cs_] * units

IndexError: list index out of range

Here are my x_log and y_log arrays:

In [17]: x_log
Out[17]: 
array([ 1.60943791,  1.62964062,  1.64944325,  1.66886133,  1.68790953,
        1.70660166,  1.7249508 ,  1.74296931,  1.76066888,  1.77806062,
        1.79515506,  1.81196218,  1.82849148,  1.844752  ,  1.86075234,
        1.8765007 ,  1.89200488,  1.90727236,  1.92231023,  1.93712532,
        1.95172412,  1.96611286,  1.98029749,  1.99428373,  2.00807706,
        2.02168271,  2.03510573,  2.04835095,  2.06142304,  2.07432644,
        2.08706547,  2.09964425,  2.11206677,  2.12433686,  2.13645822,
        2.14843441,  2.16026887,  2.17196491,  2.18352573,  2.19495443,
        2.20625398,  2.21742728,  2.22847712,  2.23940619,  2.25021711,
        2.2609124 ,  2.27149451,  2.28196581,  2.29232859,  2.30258509])

In [18]: y_log
Out[18]: 
array([ 1.16385884,  1.21664897,  1.27131266,  1.32741476,  1.3845526 ,
        1.4423605 ,  1.50051215,  1.55872099,  1.61673936,  1.67435666,
        1.73139691,  1.78771599,  1.84319872,  1.89775602,  1.95132211,
        2.00385192,  2.05531868,  2.10571175,  2.15503457,  2.20330291,
        2.25054315,  2.29679084,  2.34208929,  2.38648828,  2.43004285,
        2.4728122 ,  2.51485854,  2.5562461 ,  2.59704002,  2.63730544,
        2.67710647,  2.71650529,  2.75556121,  2.79432986,  2.83286237,
        2.87120462,  2.90939664,  2.94747203,  2.98545754,  3.02337269,
        3.06122963,  3.09903299,  3.13677999,  3.17446052,  3.21205749,
        3.24954715,  3.2868996 ,  3.3240793 ,  3.3610457 ,  3.39775386])

add installation instructions to docs

Just received a request from a user who was confused about how to install GPfit. I've responded, but it made me realize we don't have install docs.

E-mail copied below:

Hi,

I've recently installed your GPkit python tool and I would like to test it by fitting my data as (I)SMA functions. However, when I try to run the example given for GPfit in:

http://gpfit.readthedocs.io/en/latest/examples.html

I get:

ImportError: No module named gpfit.fit

In the GPkit installation there's no reference to GPfit, so I imagine this is a standalone package. How can it be installed?

Thanks in advance.

Refactor goals

  • top-level support of Fit Constraint Sets
  • reduce the amount of code
  • be more consistent with numpy/scipy convention
    • such as by having variables in different rows, not columns, or automatically transposing (#72)

All terms in multi-term SMA fits have very similar coefficients and exponents

This could just be the models I'm fitting to, but it seems like every time I try to use SMA fits with multiple (K) terms, the result is a sum of K nearly identical terms, e.g. w**0.149 = 1.2 * (u_1)**0.0106 + 1.22 * (u_1)**0.0105. This has been my experience with a wide variety of relationship types, and it seems dubious.

Example t_ex6_1.py issue

Here is the current issue with t_ex6_1, probably due to a missed update in gpfit.

In [2]: %run gpfit/tests/t_e
gpfit/tests/t_ex6_1.py  gpfit/tests/t_ex6_3.py  

In [2]: %run gpfit/tests/t_ex6_1.py
1 = (0.95/w**0.0961) * (u_1)**0.0161
    + (0.996/w**0.165) * (u_1)**-0.0958
    + (0.975/w**0.112) * (u_1)**-0.0166
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in <module>()
      3 from gpfit.fit import fit
      4 
----> 5 class t_ex6_1_ISMA(unittest.TestCase):
      6     '''
      7     ISMA unit tests based on example 6.1 from GPfit paper

/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in t_ex6_1_ISMA()
     14     K = 3
     15 
---> 16     cstrt, rms_error = fit(x, y, K, "ISMA")
     17 
     18     def test_rms_error(self):

/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.py in fit(xdata, ydata, K, ftype, varNames)
    110         # ISMA returns a constraint of the form 1 >= c1*u1^exp1*u2^exp2*w^(-alpha) + ....
    111         posy  = Posynomial(exps, cs)
--> 112         cstrt = Constraint(posy,1)
    113 
    114         # # If only one term, automatically make an equality constraint

/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomials.pyc in __init__(self, left, right, oper_ge)
    572         self.left, self.right = (pgt, plt) if oper_ge else (plt, pgt)
    573 
--> 574         p = plt / pgt
    575 
    576         if isinstance(p.cs, Quantity):

TypeError: unsupported operand type(s) for /: 'Monomial' and 'Posynomial'

Ability to pre-specify the power on an independent variable

This might sound like a strange request, and I haven't really thought about how feasible it is, but it would be nice if a user could specify the power on a certain independent variable based on a priori knowledge of the underlying relationships.

For example, if I am fitting a function, z = f(x, y) and I know that z should go with x^0.5, it would be nice to "fix" that part of the regression, so I end up with something like z = 4.58*x^0.5*y^0.234.
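
A possible workaround that needs no change to GPfit, sketched under the assumption that the known factor is a pure monomial in x: remove it from the dependent data (a subtraction in log space), then fit the remainder against the other variables. The single-term least-squares fit below is only a stand-in for the GPfit call, and the data is synthetic.

```python
import math

# Synthetic data from z = 4.58 * x**0.5 * y**0.234 (the example above).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [0.5, 1.5, 2.5, 3.5, 4.5]
data = [(x, y, 4.58 * x**0.5 * y**0.234) for x in xs for y in ys]

known_power = 0.5  # a priori exponent on x

# Remove the known dependence: log(z) - 0.5*log(x) depends on y only.
log_y = [math.log(y) for _, y, _ in data]
log_z_adj = [math.log(z) - known_power * math.log(x) for x, _, z in data]

# Ordinary least squares for a line: log_z_adj = log(c) + a * log_y.
n = len(data)
mean_y = sum(log_y) / n
mean_z = sum(log_z_adj) / n
slope = (sum((u - mean_y) * (v - mean_z) for u, v in zip(log_y, log_z_adj))
         / sum((u - mean_y) ** 2 for u in log_y))
coeff = math.exp(mean_z - slope * mean_y)

print("z = %.3g * x**%.3g * y**%.3g" % (coeff, known_power, slope))
```

The fixed factor is multiplied back in afterwards, so the final expression keeps the prescribed exponent exactly.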

Make GPfit more object oriented(?)

It might be more elegant to make classes of fits, e.g. ISMA_fit, SMA_fit, MA_fit.

These classes could then have functions such as plot_fit (for 1D and 2D functions) and print_fit (which already exists).

Example code t_ex6_1.py not working

I just tried importing t_ex6_1.py into ipython and got the following error:

In [1]: import t_ex6_1
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-eff11dff08f3> in <module>()
----> 1 import t_ex6_1

/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in <module>()
      1 import unittest
      2 from numpy import logspace, log, exp, log10
----> 3 from gpfit.fit import fit
      4 
      5 class t_ex6_1_ISMA(unittest.TestCase):

ImportError: No module named gpfit.fit

ValueError('Not enough data points')

Hi,

I'm trying to run an SMA fit on my data, but I can't seem to enter the data correctly. My x-data is a 1062x4 matrix and, correspondingly, my y-data is a 1062x1 vector.

x
array([[ -0.69314718, -1.2039728 , -13.81551056, -13.81551056],
[ -0.65392647, -1.2039728 , -13.81551056, -13.81551056],
[ -0.61618614, -1.2039728 , -13.81551056, -13.81551056],
...,
[ 0.37843644, 0.18232156, -11.51292546, -9.21034037],
[ 0.39204209, 0.18232156, -11.51292546, -9.21034037],
[ 0.40546511, 0.18232156, -11.51292546, -9.21034037]])

x.shape
(1062, 4)

y
array([-10.09725113, -10.0955659 , -10.09396532, ..., -5.87544124,
-5.87526203, -5.87509244])

y.shape
(1062,)

When I try to run the fit, I get the following error:

cSMA, errorSMA = fit(x,y,K,"SMA")
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python2.7/dist-packages/gpfit/fit.py", line 72, in fit
    params = get_params(ftype, K, xdata, ydata)
  File "/usr/local/lib/python2.7/dist-packages/gpfit/fit.py", line 26, in get_params
    ba = ba_init(xdata, ydata.reshape(ydata.size, 1), K).flatten('F')
  File "/usr/local/lib/python2.7/dist-packages/gpfit/ba_init.py", line 36, in ba_init
    raise ValueError('Not enough data points')
ValueError: Not enough data points

I know that this is probably not a tool error but a user-keyboard bug, but I just don't get what could be the problem here. Any pointers are much appreciated.

Regards,

Lucho.
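
One thing that may be worth checking, assuming this version of GPfit expects one row per independent variable (the refactor goals on this page mention "variables in different rows, not columns", and the compressor-map script on this page builds its input as a 2-row array): a 1062x4 array would then be read as 1062 variables with 4 data points each, which could trip the "Not enough data points" check. The arrays below are zero-filled stand-ins for the real data.

```python
import numpy as np

x = np.zeros((1062, 4))  # stand-in: one row per data point, as in the report
y = np.zeros(1062)

# Transpose so each row holds one variable and each column one data point.
x_rows = x.T

print(x_rows.shape, y.shape)
# The call from the report would then become: fit(x_rows, y, K, "SMA")
```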

Clean up pylintrc

  • ensure consistency with gpkit (especially in method / class naming)
  • put fixme in the command-line flags
  • remove duplicate-code from the disable

Saving and loading fitted nomials

There should be some way of saving/loading/manipulating nomials because it can take a long time to generate them.
During loads, it would be convenient to be able to rename variables.

(Pickling worked in the old implementation of nomials, but it doesn't work anymore.)
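
Until something is built in, one workaround is to store just the numbers. The sketch below assumes a fit can be reduced to a list of coefficients plus one `{variable name: exponent}` dict per term; that layout, the file format, and the helper names are hypothetical and are not gpkit's internal representation.

```python
import json
import os
import tempfile


def save_fit(path, coeffs, exponents):
    """Write coefficients and per-term exponent dicts to a JSON file."""
    with open(path, "w") as f:
        json.dump({"cs": coeffs, "exps": exponents}, f)


def load_fit(path, rename=None):
    """Read a saved fit, optionally renaming variables via a {old: new} dict."""
    rename = rename or {}
    with open(path) as f:
        data = json.load(f)
    exps = [{rename.get(var, var): power for var, power in term.items()}
            for term in data["exps"]]
    return data["cs"], exps


# Round trip with a two-term fit, renaming u_1 on load.
path = os.path.join(tempfile.mkdtemp(), "fit.json")
save_fit(path, [1.2, 1.22], [{"u_1": 0.0106}, {"u_1": 0.0105}])
cs, exps = load_fit(path, rename={"u_1": "Re"})
```

The loaded coefficients and exponents can then be used to rebuild the nomial with whatever constructor the installed gpkit version provides.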

index errors with large datasets?

Reported by Tony Tao:

"on the surface it looks like a data size issue ("MemoryError" and "Iterator too large" errors) but when I truncate the data set to something that already worked before, it returns indexing errors which makes me believe it's something in GPfit, but I can't figure out what it is. "

"Actually (as usual, problem is solved after calling mayday) I may have figured it out and now I have a Cd model as well.

The training input data set is around 40,000 data points over 7 dimensions (originally 80,000), so it takes around 16 GB of memory to build the model, which explains the memory error running in Python(x,y). Running it in Ubuntu and deleting about half the training data seems to have fixed it.

The index error is caused by line 73 of the max_affine_init.py script where if the while loop isn't fulfilled by the end of the dataset, it calls the next index location which is out of bounds. "
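
The failure mode described in the last paragraph (a loop that keeps adding points until a condition holds, then steps past the end of the data) can be guarded as in the generic sketch below. This is not the code in max_affine_init.py; `grow_until` and its arguments are hypothetical.

```python
def grow_until(order, start, enough):
    """Add points from `order`, beginning at `start`, until `enough(selected)`.

    Raises ValueError if the data runs out first, instead of indexing
    one past the end of the list.
    """
    selected = list(order[:start])
    i = start
    while not enough(selected):
        if i >= len(order):  # guard: no more points to add
            raise ValueError("ran out of data points before condition was met")
        selected.append(order[i])
        i += 1
    return selected


picked = grow_until([3, 1, 2, 0], 1, lambda s: len(s) >= 3)
```

Raising a descriptive error at this point tells the user that the data cannot satisfy the condition, which is easier to act on than a bare IndexError.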
