convexengineering / gpfit
Fit posynomials to data
Home Page: http://gpfit.readthedocs.io/en/latest/
License: MIT License
Traceback below.
======================================================================
FAIL: test_MA (t_print_fit.t_print_MA)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 17, in test_MA
'w = 8.1e+03 * (u_0)**10 * (u_1)**11 * (u_2)**12'])
AssertionError: Lists differ: ['w = 2.72 * (u_1)**2 * (u_2)*... != ['w = 2.72 * (u_0)**2 * (u_1)*...
First differing element 0:
w = 2.72 * (u_1)**2 * (u_2)**3 * (u_3)**4
w = 2.72 * (u_0)**2 * (u_1)**3 * (u_2)**4
- ['w = 2.72 * (u_1)**2 * (u_2)**3 * (u_3)**4',
- 'w = 148 * (u_1)**6 * (u_2)**7 * (u_3)**8',
- 'w = 8.1e+03 * (u_1)**10 * (u_2)**11 * (u_3)**12']
+ ['w = 2.72 * (u_0)**2 * (u_1)**3 * (u_2)**4',
+ 'w = 148 * (u_0)**6 * (u_1)**7 * (u_2)**8',
+ 'w = 8.1e+03 * (u_0)**10 * (u_1)**11 * (u_2)**12']
======================================================================
FAIL: test_SMA (t_print_fit.t_print_SMA)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 32, in test_SMA
' + 2 * (u_0)**0.769 * (u_1)**0.846 * (u_2)**0.923'])
AssertionError: Lists differ: ['w**0.0769 = 1.08 * (u_1)**0.... != ['w**0.0769 = 1.08 * (u_0)**0....
First differing element 0:
w**0.0769 = 1.08 * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308
w**0.0769 = 1.08 * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308
- ['w**0.0769 = 1.08 * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308',
- ' + 1.47 * (u_1)**0.462 * (u_2)**0.538 * (u_3)**0.615',
- ' + 2 * (u_1)**0.769 * (u_2)**0.846 * (u_3)**0.923']
+ ['w**0.0769 = 1.08 * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308',
+ ' + 1.47 * (u_0)**0.462 * (u_1)**0.538 * (u_2)**0.615',
+ ' + 2 * (u_0)**0.769 * (u_1)**0.846 * (u_2)**0.923']
======================================================================
FAIL: test_ISMA (t_print_fit.t_print_ISMA)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/whoburg/MIT/dev/gpfit/gpfit/tests/t_print_fit.py", line 47, in test_ISMA
' + (1.82/w**0.0667) * (u_0)**0.667 * (u_1)**0.733 * (u_2)**0.8'])
AssertionError: Lists differ: ['1 = (1.08/w**0.0769) * (u_1)... != ['1 = (1.08/w**0.0769) * (u_0)...
First differing element 0:
1 = (1.08/w**0.0769) * (u_1)**0.154 * (u_2)**0.231 * (u_3)**0.308
1 = (1.08/w**0.0769) * (u_0)**0.154 * (u_1)**0.231 * (u_2)**0.308
Diff is 803 characters long. Set self.maxDiff to None to see it.
----------------------------------------------------------------------
Ran 47 tests in 0.136s
FAILED (failures=3)
I've been trying to fit some compressor maps and am getting what appears to be a non-deterministic linear algebra error: the code throws the error on one run, then the exact same code runs without error the next time. I'm not too familiar with gpfit, so I'm not sure why this would occur.
Below is the error and some code which causes said error.
Traceback (most recent call last):
File "/Users/mayork/Documents/GpGit/gpfit/gpfit/compressor_map_REAL_DATA_fitting.py", line 140, in <module>
r,const = fit.fit(independent, dependent, 4, 'SMA')
File "/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.py", line 71, in fit
bainit = max_affine_init(xdata, ydata, K)
File "/Users/mayork/Documents/GpGit/gpfit/gpfit/max_affine_init.py", line 58, in max_affine_init
if matrix_rank(X[inds, :]) < dimx + 1:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 1543, in matrix_rank
S = svd(M, compute_uv=False)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 1338, in svd
_assertNoEmpty2d(a)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 222, in _assertNoEmpty2d
raise LinAlgError("Arrays cannot be empty")
LinAlgError: Arrays cannot be empty
import numpy as np
import matplotlib.pyplot as plt
import fit
#variable to control if plotting occurs
PLOT = True
#-----------------------------------
#Compressor map
#make the data
#real data
##N = [.5, .6, .7, .75, .8, .85, .875, .9, .925, .95, .975, .985, 1, 1.025]
##pi= [[2.27690,2.21610,2.15520,2.00330,1.91210,1.66880],[3.21930,3.12810,3.00650,2.82410,2.61130,2.21610],[4.86100,4.58740,4.31380,4.16180,3.76660,3.18890],
## [5.95550,5.65150,5.40830,5.19540,4.73940,4.00980],[7.68840,7.56680,7.32360,7.14120,6.62430,5.65150],[10.2117,9.72530,9.26930,8.93490,8.23560,7.20200],
## [11.7318,11.3062,10.7590,10.2725,9.60370,8.38760],[13.6775,13.1303,12.5223,11.9446,11.2757,9.99890],[16.3529,15.8056,15.0760,14.4072,13.5255,12.1574],
## [19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[23.4365,22.8284,22.0076,20.8523,19.8187,18.3898],[24.8350,24.3181,23.6493,22.5548,21.4604,20.1227],
## [27.0239,26.2942,25.3822,24.1357,22.7372,21.5212],[29.0304,28.2704,27.2975,26.3246,24.5309,23.3453]]
##mbar = [[0.114208,0.118718,0.123241,0.126468,0.128404,0.129694],[0.155504,0.161956,0.165183,0.167763,0.170345,0.170345],[0.220674,0.229707,0.230998,0.232933,0.235515,0.236805],
## [0.261969,0.274874,0.280681,0.282617,0.285843,0.287779],[0.345206,0.351659,0.364563,0.371015,0.376177,0.378759],[0.425216,0.436185,0.443283,0.448445,0.451026,0.452962],
## [0.486515,0.496838,0.505227,0.509743,0.512324,0.512324],[0.560072,0.572332,0.580075,0.585237,0.586528,0.588463],[0.651697,0.667183,0.676216,0.682024,0.683314,0.684605],
## [0.751710,0.770422,0.783972,0.792360,0.792360,0.792360],[0.871081,0.886569,0.896886,0.904634,0.907211,0.907862],[0.918187,0.934317,0.947862,0.958187,0.962057,0.963992],
## [0.993033,0.997545,1.00142,1.00464,1.00593,1.00787],[1.05433,1.05820,1.06014,1.06337,1.06659,1.06659]]
##N = [.95, .985]
##pi= [[20.4125,20.4325,19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[25.8250,25.8350,24.8350,24.3181,23.6493,22.5548,21.4604,20.1227]]
##mbar = [[.552,.652,0.751710,0.770422,0.783972,0.792360,0.792360,0.79442360],[.7182,.8182,0.918187,0.934317,0.947862,0.958187,0.962057,0.964]]
uppi=[]
upm=[]
centerpi=26.19
centerm=1
for i in range(8):
    if i==0:
        uppi.extend([centerpi+.06])
        upm.extend([centerm-.217])
    if i==1:
        uppi.extend([centerpi+.063])
        upm.extend([centerm-.1368])
    if i==2:
        uppi.extend([centerpi+.042])
        upm.extend([centerm-.0618])
    if i==3:
        uppi.extend([centerpi])
        upm.extend([centerm])
    if i==4:
        uppi.extend([centerpi-.06])
        upm.extend([centerm+.0447])
    if i==5:
        uppi.extend([centerpi-.102])
        upm.extend([centerm+.0664])
    if i==6:
        uppi.extend([centerpi-.184])
        upm.extend([centerm+.0972])
    if i==7:
        uppi.extend([centerpi-.44])
        upm.extend([centerm+.0972])
uppi2=[]
upm2=[]
centerpi=17.9
centerm=.8
for i in range(8):
    if i==0:
        uppi2.extend([centerpi+.06])
        upm2.extend([centerm-.217])
    if i==1:
        uppi2.extend([centerpi+.063])
        upm2.extend([centerm-.1368])
    if i==2:
        uppi2.extend([centerpi+.042])
        upm2.extend([centerm-.0618])
    if i==3:
        uppi2.extend([centerpi])
        upm2.extend([centerm])
    if i==4:
        uppi2.extend([centerpi-.06])
        upm2.extend([centerm+.0447])
    if i==5:
        uppi2.extend([centerpi-.102])
        upm2.extend([centerm+.0664])
    if i==6:
        uppi2.extend([centerpi-.184])
        upm2.extend([centerm+.0972])
    if i==7:
        uppi2.extend([centerpi-.44])
        upm2.extend([centerm+.0972])
N=[1,.925]
pi=[uppi,uppi2]
mbar=[upm,upm2]
if PLOT == True:
    #plot of data used in gpfit
    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        piplot = pi[i]
        mbarplot = mbar[i]
        plt.plot(mbarplot,piplot, '-r')
    plt.xlabel('Normalized Corrected Mass Flow')
    plt.ylabel('Fan Pressure Ratio')
    plt.title('E3 Fan Map')
    plt.show()
    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        piplot = pi[i]
        mbarplot = mbar[i]
        plt.plot(np.log(mbarplot),np.log(piplot), '-r')
    plt.xlabel('Log of Normalized Corrected Mass Flow')
    plt.ylabel('Log of Fan Pressure Ratio')
    plt.title('E3 Fan Map in Log Space')
    plt.show()
    for i in range(len(N)):
        Nplot = N[i]*np.ones(len(mbar[0]))
        invpiplot = np.ones(len(pi[i]))
        for j in range(len(pi[i])):
            invpiplot[j] = 1/(pi[i][j])
        mbarplot = mbar[i]
        plt.plot(np.log(mbarplot),np.log(invpiplot), '-r')
    plt.xlabel('Log of Normalized Corrected Mass Flow')
    plt.ylabel('Log of Fan Pressure Ratio')
    plt.title('E3 Fan Map in Log Space')
    plt.show()
#set up the data for the fit
Nfit = []
mbarfit = []
pifit = []
for i in range(len(N)):
    Nfit.extend(N[i]*np.ones(len(mbar[i])))
    for j in range(len(pi[i])):
        hold=pi[i][j]
        pifit.extend([1/hold])
    mbarfit.extend(np.divide(mbar[i],[1]))
#create the fit
independent = np.array([np.log(Nfit),np.log(mbarfit)])
dependent = np.log(pifit)
r,const = fit.fit(independent, dependent, 4, 'SMA')
print const
#plot the fit
nvec = np.linspace(.8, 1, 10)
mbarvec = np.linspace(.8,1,100)
for i in range(len(nvec)):
    N = nvec[i]
    pi=[]
    for j in range(len(mbarvec)):
        mbar = mbarvec[j]
        #fit to the tweaked data
        pi.extend([(0.282 * (N)**-3.56 * (mbar)**0.132
                    + 9.75e-06 * (N)**-133 * (mbar)**49.7
                    + 0.3 * (N)**-0.59 * (mbar)**-0.184
                    + 0.306 * (N)**2.8 * (mbar)**0.0678)**(-1/.124)])
        #fit to the original data
##        pi.extend([(0.106 * (N)**-0.0299 * (mbar)**-0.129
##                    + 0.119 * (N)**-0.0527 * (mbar)**-0.123
##                    + 0.107 * (N)**0.028 * (mbar)**-0.146
##                    + 0.102 * (N)**-0.0231 * (mbar)**-0.131
##                    + 0.123 * (N)**-0.0233 * (mbar)**-0.131
##                    + 0.136 * (N)**0.082 * (mbar)**-0.161)**(-1/.116)])
    plt.plot(mbarvec, pi, '-r')
#code for adding in the actual fan map data
##N = [.5, .6, .7, .75, .8, .85, .875, .9, .925, .95, .975, .985, 1, 1.025]
##pi= [[2.27690,2.21610,2.15520,2.00330,1.91210,1.66880],[3.21930,3.12810,3.00650,2.82410,2.61130,2.21610],[4.86100,4.58740,4.31380,4.16180,3.76660,3.18890],
## [5.95550,5.65150,5.40830,5.19540,4.73940,4.00980],[7.68840,7.56680,7.32360,7.14120,6.62430,5.65150],[10.2117,9.72530,9.26930,8.93490,8.23560,7.20200],
## [11.7318,11.3062,10.7590,10.2725,9.60370,8.38760],[13.6775,13.1303,12.5223,11.9446,11.2757,9.99890],[16.3529,15.8056,15.0760,14.4072,13.5255,12.1574],
## [19.4235,18.9978,18.3594,17.4473,16.4745,14.9544],[23.4365,22.8284,22.0076,20.8523,19.8187,18.3898],[24.8350,24.3181,23.6493,22.5548,21.4604,20.1227],
## [27.0239,26.2942,25.3822,24.1357,22.7372,21.5212],[29.0304,28.2704,27.2975,26.3246,24.5309,23.3453]]
##mbar = [[0.114208,0.118718,0.123241,0.126468,0.128404,0.129694],[0.155504,0.161956,0.165183,0.167763,0.170345,0.170345],[0.220674,0.229707,0.230998,0.232933,0.235515,0.236805],
## [0.261969,0.274874,0.280681,0.282617,0.285843,0.287779],[0.345206,0.351659,0.364563,0.371015,0.376177,0.378759],[0.425216,0.436185,0.443283,0.448445,0.451026,0.452962],
## [0.486515,0.496838,0.505227,0.509743,0.512324,0.512324],[0.560072,0.572332,0.580075,0.585237,0.586528,0.588463],[0.651697,0.667183,0.676216,0.682024,0.683314,0.684605],
## [0.751710,0.770422,0.783972,0.792360,0.792360,0.792360],[0.871081,0.886569,0.896886,0.904634,0.907211,0.907862],[0.918187,0.934317,0.947862,0.958187,0.962057,0.963992],
## [0.993033,0.997545,1.00142,1.00464,1.00593,1.00787],[1.05433,1.05820,1.06014,1.06337,1.06659,1.06659]]
##for i in range(len(N)):
## Nplot = N[i]*np.ones(len(mbar[0]))
## piplot = pi[i]
## mbarplot = mbar[i]
## plt.plot(mbarplot,piplot, '-b')
plt.xlabel('Normalized Corrected Mass Flow')
plt.ylabel('Pressure Ratio')
plt.title('Fan Map')
plt.show()
... we should implement it, to avoid huge fitted parameters.
This might sound like a strange request, and I haven't really thought about how feasible it is, but it would be nice if a user could specify the power on a certain independent variable based on a priori knowledge of the underlying relationships.
For example, if I am fitting a function, z = f(x, y)
and I know that z should go with x^0.5
, it would be nice to "fix" that part of the regression, so I end up with something like z = 4.58*x^0.5*y^0.234
.
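One way this could be approximated today for a single monomial term, without any change to gpfit, is to move the known factor to the left-hand side and regress the remainder in log space. The sketch below is only that: `fit_with_fixed_exponent` is a hypothetical helper (not a gpfit function), it handles one term rather than a K-term posynomial, and the data is synthetic, generated from the example relation above.

```python
import numpy as np

def fit_with_fixed_exponent(x, y, z, x_exponent):
    """Fit z = c * x**x_exponent * y**b with x_exponent held fixed.

    Hypothetical helper, not part of gpfit. The known factor is moved to
    the left-hand side and the rest is an ordinary least-squares problem
    in log space.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    # log z - p*log x = log c + b*log y
    target = np.log(z) - x_exponent * np.log(x)
    design = np.column_stack([np.ones_like(y), np.log(y)])
    (log_c, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.exp(log_c), b

# Synthetic, noise-free data generated from z = 4.58 * x**0.5 * y**0.234
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, 50)
y = rng.uniform(1.0, 10.0, 50)
z = 4.58 * x**0.5 * y**0.234

c, b = fit_with_fixed_exponent(x, y, z, 0.5)
print(c, b)
```

Extending the same idea to SMA or ISMA fits would presumably mean holding the corresponding exponent parameters constant inside the optimizer, which this sketch does not attempt.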
I keep getting this error and I'm not sure why. I would be grateful for any help.
In [19]: fit(x_log,y_log,2,"SMA")
w**424 = 0 * (u_1)**1.5e+03
+ 0 * (u_1)**1.36e+03
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-19-920e59fab3f5> in <module>()
----> 1 fit(x_log,y_log,2,"SMA")
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.pyc in fit(xdata, ydata, K, ftype, varNames)
156 # Create gpkit objects
157 # SMA returns a constraint of the form w^alpha >= c1*u1^exp1 + c2*u2^exp2 +....
--> 158 posy = Posynomial(exps, cs)
159 mono = Monomial(w_exp,1)
160 cstrt = (mono >= posy)
/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomials.pyc in __init__(self, exps, cs, require_positive, simplify, **descr)
104
105 # init NomialData to create self.exps, self.cs, and so on
--> 106 super(Signomial, self).__init__(exps, cs, simplify=simplify)
107
108 if self.any_nonpositive_cs:
/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomial_data.pyc in __init__(self, exps, cs, simplify)
26 return
27 if simplify:
---> 28 exps, cs = simplify_exps_and_cs(exps, cs)
29 self.exps, self.cs = exps, cs
30 self.any_nonpositive_cs = any(mag(c) <= 0 for c in self.cs)
/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomial_data.pyc in simplify_exps_and_cs(exps, cs, return_map)
177 exps_ = tuple(matches.keys())
178 cs_ = list(matches.values())
--> 179 if isinstance(cs_[0], Quantity):
180 units = Quantity(1, cs_[0].units)
181 cs_ = [c.to(units).magnitude for c in cs_] * units
IndexError: list index out of range
Here are my x_log
and y_log
arrays:
In [17]: x_log
Out[17]:
array([ 1.60943791, 1.62964062, 1.64944325, 1.66886133, 1.68790953,
1.70660166, 1.7249508 , 1.74296931, 1.76066888, 1.77806062,
1.79515506, 1.81196218, 1.82849148, 1.844752 , 1.86075234,
1.8765007 , 1.89200488, 1.90727236, 1.92231023, 1.93712532,
1.95172412, 1.96611286, 1.98029749, 1.99428373, 2.00807706,
2.02168271, 2.03510573, 2.04835095, 2.06142304, 2.07432644,
2.08706547, 2.09964425, 2.11206677, 2.12433686, 2.13645822,
2.14843441, 2.16026887, 2.17196491, 2.18352573, 2.19495443,
2.20625398, 2.21742728, 2.22847712, 2.23940619, 2.25021711,
2.2609124 , 2.27149451, 2.28196581, 2.29232859, 2.30258509])
In [18]: y_log
Out[18]:
array([ 1.16385884, 1.21664897, 1.27131266, 1.32741476, 1.3845526 ,
1.4423605 , 1.50051215, 1.55872099, 1.61673936, 1.67435666,
1.73139691, 1.78771599, 1.84319872, 1.89775602, 1.95132211,
2.00385192, 2.05531868, 2.10571175, 2.15503457, 2.20330291,
2.25054315, 2.29679084, 2.34208929, 2.38648828, 2.43004285,
2.4728122 , 2.51485854, 2.5562461 , 2.59704002, 2.63730544,
2.67710647, 2.71650529, 2.75556121, 2.79432986, 2.83286237,
2.87120462, 2.90939664, 2.94747203, 2.98545754, 3.02337269,
3.06122963, 3.09903299, 3.13677999, 3.17446052, 3.21205749,
3.24954715, 3.2868996 , 3.3240793 , 3.3610457 , 3.39775386])
Line 8 of fit.py reads
from gpkit.nomials import Posynomial, Monomial, Constraint, MonoEQConstraint
I'm getting an error that Constraint can't be imported. I'm guessing this has been moved during a gpkit refactor. I edited my local copy only to make line 8 read import gpkit and it works, obviously not a good long term fix.
I think that significant figures on the auto printed fit equation and the posynomial output equation should be the same.
This is what I see when I run fit
:
In [4]: fit(X,Y, 4, "SMA")
w**3.72 = 6.35e+10 * (u_1)**-0.243 * (u_2)**-3.43
+ 0.0247 * (u_1)**2.49 * (u_2)**-1.11
+ 2.03e-07 * (u_1)**12.7 * (u_2)**-0.338
+ 6.49e-06 * (u_1)**-1.9 * (u_2)**-0.681
Out[4]:
(gpkit.PosynomialInequality(w**3.7 >= 0.0247*u_1**2.5*u_2**-1.1 + 2.03e-07*u_1**13*u_2**-0.34
+ 6.35e+10*u_1**-0.24*u_2**-3.4 + 6.49e-06*u_1**-1.9*u_2**-0.68),
0.0048930297487385886)
The difference in significant figures from the printed solution and the gpkit.PosynomialInequality
resulted in pretty drastic gaps when I compared the fits to the actual data. jh01polarfit.pdf
uses all the significant figures, jh01polarfit1.pdf
uses the posynomial equation.
jh01polarfit.pdf
jh01polarfit1.pdf
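To get a feel for how much two-significant-figure exponents can matter, here is a rough sketch that evaluates the printed fit above twice, once with the exponents as printed and once with them rounded to two significant figures (which is what the PosynomialInequality output shows). The evaluation point is arbitrary and chosen only for illustration, and `round_sig` and `evaluate` are helper names made up for this example. It may also be that the gpkit object keeps full precision internally and only its printed form is rounded; the sketch says nothing about that.

```python
def round_sig(value, digits):
    """Round value to the given number of significant figures."""
    return float(f"{value:.{digits}g}")

# (coefficient, u_1 exponent, u_2 exponent) copied from the printed fit above.
terms = [
    (6.35e+10, -0.243, -3.43),
    (0.0247, 2.49, -1.11),
    (2.03e-07, 12.7, -0.338),
    (6.49e-06, -1.9, -0.681),
]
alpha = 3.72

def evaluate(u_1, u_2, digits=None):
    """Evaluate w from w**alpha = sum of terms, optionally rounding exponents."""
    r = (lambda v: round_sig(v, digits)) if digits else (lambda v: v)
    total = sum(c * u_1 ** r(a1) * u_2 ** r(a2) for c, a1, a2 in terms)
    return total ** (1.0 / r(alpha))

# Arbitrary evaluation point, not taken from the original data set.
u_1, u_2 = 2.0, 1000.0
w_full = evaluate(u_1, u_2)
w_rounded = evaluate(u_1, u_2, digits=2)
rel_gap = abs(w_rounded - w_full) / w_full
print(w_full, w_rounded, rel_gap)
```

Because the leading coefficient is of order 1e10, a change of a few hundredths in an exponent is enough to shift the result by several percent at this point.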
I think GPfit would benefit from a plot_fit
method for 1D, and perhaps 2D, functions. This method should plot both the original data and the fitted function, and have the option of plotting in log space too.
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
/Users/mjburton11/Documents/SuperUROP/gpkit-models/1682/gas_male/Datasets/fitDF70.py in <module>()
2
3 import gpfit
----> 4 from gpfit.fit import fit
5 import numpy as np
6 import pandas as pd
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.py in <module>()
6 from max_affine_init import max_affine_init
7 from print_fit import print_ISMA, print_SMA, print_MA
----> 8 from gpkit.nomials import Posynomial, Monomial, Constraint, MonoEQConstraint
9 from numpy import append, ones, exp, sqrt, mean, square
10
ImportError: cannot import name Constraint
examples in gpfit/examples/
should be updated / renamed as appropriate, moved to docs/source/examples
, and added to tests/t_examples
.
The font changes 1/3 of the way through "fit", at least on my browser: http://gpfit.readthedocs.org/en/latest/
Set up CI just like in gpkit.
GitHub is currently advertising a lot of CI integrations: https://github.com/integrations/feature/code. @galbramc, any thoughts on whether those are worth looking into, or is another Jenkins setup our best bet?
I have been trying to fit a SMA function with two terms to the data attached, but been getting the index error above. Don't exactly know what I am doing wrong. The data, code, and specific error are attached.
Hi all,
I'm using gpfit in two different machines. After I run a fitting routine in my data I get certain attributes for the output. The "good" machine shows:
pr.pprint(smafit.__dict__.keys())
['posymap',
'mfac',
'ivar',
'constraint',
'evaluate',
'dvars',
'numpy_bools',
'bounds',
'substitutions',
'max_err',
'fitdata',
'varkeys',
'rms_err']
Whereas the "bad" machine shows:
pr.pprint(smafit.__dict__.keys())
['oper',
'unsubbed',
'right',
'evaluate',
'nomials',
'substitutions',
'p_lt',
'varkeys',
'm_gt',
'last_used_substitutions',
'left']
I use the fitdata attribute to extract the coefficients in a nice way and import them into MATLAB for post-processing. Why is the attribute list so different in these two installations? I cloned gpfit today on the old (bad) machine to see if there was any update, but it doesn't seem to affect the attribute list. None of the attributes available on the "bad" machine contains the coefficients in a nice format for extraction.
overall system models: https://nwtc.nrel.gov/WISDEM
Cost models: https://nwtc.nrel.gov/taxonomy/term/23
Probably should fix this. Seems important if anyone wants to understand gpfit.
reported by tony tao:
"on the surface it looks like a data size issue ("MemoryError" and "Iterator too large" errors) but when I truncate the data set to something that already worked before, it returns indexing errors which makes me believe it's something in GPfit, but I can't figure out what it is. "
"Actually (as usual, problem is solved after calling mayday) I may have figured it out and now I have a Cd model as well.
The training input data set is around 40,000 data points over 7 dimensions (originally 80,000), so it takes around 16 GB of memory to build the model, which explains the memory error running in Python(x,y). Running it in Ubuntu and deleting about half the training data seems to have fixed it.
The index error is caused by line 73 of the max_affine_init.py script where if the while loop isn't fulfilled by the end of the dataset, it calls the next index location which is out of bounds. "
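For illustration, a bounds-checked version of that kind of loop might look like the sketch below. This is not the code from max_affine_init.py; `grow_until_full_rank` is a hypothetical stand-in that keeps adding points until the selected rows have full column rank, and raises a clear error when the data runs out instead of indexing past the end.

```python
import numpy as np

def grow_until_full_rank(X, order, start_count):
    """Add points in the given order until X[selected] has full column rank.

    Hypothetical stand-in for the loop described above, not gpfit's code.
    Raises ValueError rather than indexing past the end of the data.
    """
    n_points, n_cols = X.shape
    # Start from at least one point so the rank check never sees an empty array.
    count = max(1, start_count)
    while np.linalg.matrix_rank(X[order[:count]]) < n_cols:
        if count >= n_points:
            raise ValueError("ran out of points before reaching full rank")
        count += 1
    return order[:count]

# Small example: the first two rows are identical, so a third is needed.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
selected = grow_until_full_rank(X, np.arange(3), 1)
print(selected)
```

Starting from at least one point also sidesteps the "Arrays cannot be empty" failure that an empty selection can trigger in the rank check.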
seemingly because which coverage
is returning a blank string!
Relatively straightforward fitting problems seem to cause GPfit to reach the max time limit (5 seconds). Examples of such problems are ex61.py and ex63.py, which are both taken from the GPfit paper.
The fact that they are reaching max time can be found by re-enabling the verbose option.
There should be some way of saving/loading/manipulating nomials because it can take a long time to generate them.
During loads, it would be convenient to be able to rename variables.
(Pickling worked in the old implementation of nomials, but it doesn't work anymore.)
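Until something like this exists, one workaround could be to store the raw fit parameters rather than the nomial objects. The sketch below writes coefficients and exponents to JSON and lets variables be renamed on load. Everything here is an assumption for illustration: `save_fit`, `load_fit` and the file layout are made up, not a gpfit or gpkit API, and rebuilding an actual nomial from the loaded values is left out.

```python
import json
import os
import tempfile

def save_fit(path, alpha, coefficients, exponents, var_names):
    """Write fit parameters to JSON. Hypothetical format, not a gpfit API."""
    payload = {
        "alpha": alpha,
        "coefficients": list(coefficients),
        # one {variable name: exponent} mapping per term
        "terms": [dict(zip(var_names, row)) for row in exponents],
    }
    with open(path, "w") as handle:
        json.dump(payload, handle, indent=2)

def load_fit(path, rename=None):
    """Read fit parameters back, optionally renaming variables on load."""
    rename = rename or {}
    with open(path) as handle:
        payload = json.load(handle)
    payload["terms"] = [
        {rename.get(name, name): exp for name, exp in term.items()}
        for term in payload["terms"]
    ]
    return payload

# Round trip with made-up numbers, renaming u_1 to Re on load.
path = os.path.join(tempfile.mkdtemp(), "fit.json")
save_fit(path, 0.149, [1.2, 1.22], [[0.0106], [0.0105]], ["u_1"])
loaded = load_fit(path, rename={"u_1": "Re"})
print(loaded)
```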
Here is the current issue with t_ex6_1. Probably due to a missed update with gpfit.
In [2]: %run gpfit/tests/t_e
gpfit/tests/t_ex6_1.py gpfit/tests/t_ex6_3.py
In [2]: %run gpfit/tests/t_ex6_1.py
1 = (0.95/w**0.0961) * (u_1)**0.0161
+ (0.996/w**0.165) * (u_1)**-0.0958
+ (0.975/w**0.112) * (u_1)**-0.0166
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in <module>()
3 from gpfit.fit import fit
4
----> 5 class t_ex6_1_ISMA(unittest.TestCase):
6 '''
7 ISMA unit tests based on example 6.1 from GPfit paper
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in t_ex6_1_ISMA()
14 K = 3
15
---> 16 cstrt, rms_error = fit(x, y, K, "ISMA")
17
18 def test_rms_error(self):
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/fit.py in fit(xdata, ydata, K, ftype, varNames)
110 # ISMA returns a constraint of the form 1 >= c1*u1^exp1*u2^exp2*w^(-alpha) + ....
111 posy = Posynomial(exps, cs)
--> 112 cstrt = Constraint(posy,1)
113
114 # # If only one term, automatically make an equality constraint
/Users/mjburton11/Documents/SuperUROP/gpkit/gpkit/nomials.pyc in __init__(self, left, right, oper_ge)
572 self.left, self.right = (pgt, plt) if oper_ge else (plt, pgt)
573
--> 574 p = plt / pgt
575
576 if isinstance(p.cs, Quantity):
TypeError: unsupported operand type(s) for /: 'Monomial' and 'Posynomial'
I am on commit 648d114, and was about to try to fit something very basic when I got the following error from trying import fit.
In [1]: import fit
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-b7a6b9d011b5> in <module>()
----> 1 import fit
C:\Users\Berk\Dropbox (MIT)\MIT Senior Year\16.82\gpfit\gpfit\fit.py in <module>()
2 from numpy import ones, exp, sqrt, mean, square, hstack
3 from gpkit import NamedVariables, VectorVariable, Variable, NomialArray
----> 4 from .implicit_softmax_affine import implicit_softmax_affine
5 from .softmax_affine import softmax_affine
6 from .max_affine import max_affine
ValueError: Attempted relative import in non-package
??? Help?
It might be more elegant to make classes of fits, e.g. ISMA_fit
, SMA_fit
, MA_fit
.
These classes could then have functions such as plot_fit
(for 1D and 2D functions) and print_fit
(which already exists).
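A very rough sketch of what such classes might look like is given here, covering only evaluation in log space for the MA and SMA forms (y = max_k(b_k + a_k.x) and y = (1/alpha) log sum_k exp(alpha (b_k + a_k.x))). The class names, constructor arguments and array shapes are assumptions made for this sketch; the fitting itself, ISMA, plot_fit and print_fit are all left out.

```python
import numpy as np

class MaxAffineFit:
    """Hypothetical container for a max-affine fit in log space."""

    def __init__(self, A, b):
        self.A = np.asarray(A, dtype=float)  # exponents, shape (K, d)
        self.b = np.asarray(b, dtype=float)  # log coefficients, shape (K,)

    def affine_terms(self, x):
        # x has shape (n, d); the result has shape (n, K)
        return np.asarray(x, dtype=float) @ self.A.T + self.b

    def evaluate(self, x):
        """y = max_k (b_k + a_k . x)"""
        return self.affine_terms(x).max(axis=1)

class SoftmaxAffineFit(MaxAffineFit):
    """Hypothetical container for a softmax-affine fit in log space."""

    def __init__(self, A, b, alpha):
        super().__init__(A, b)
        self.alpha = float(alpha)

    def evaluate(self, x):
        """y = (1/alpha) * log(sum_k exp(alpha * (b_k + a_k . x)))"""
        scaled = self.alpha * self.affine_terms(x)
        return np.logaddexp.reduce(scaled, axis=1) / self.alpha

# Two terms in one variable: y = max(x, -x), and its smooth SMA counterpart.
x = np.array([[2.0], [-3.0]])
ma = MaxAffineFit([[1.0], [-1.0]], [0.0, 0.0])
sma = SoftmaxAffineFit([[1.0], [-1.0]], [0.0, 0.0], alpha=50.0)
print(ma.evaluate(x), sma.evaluate(x))
```

Sharing `affine_terms` between the two classes is one reason a class hierarchy might be tidier than separate functions; plotting and printing methods could then live on the base class.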
Massive overhaul of docs needed:
There is a wealth of data at http://web.mit.edu/airlinedata/www/default.html
Speak to Luke Jensen and co. about interesting correlations.
unit tests should use fixed seeds to prevent non-determinism associated with initial guesses etc.
For example, this error seems to be sporadic; sometimes it passes, sometimes it fails:
FAIL [0.000s]: test_rms_error (gpfit.tests.t_ex6_3.t_ex6_3_ISMA)
Traceback (most recent call last):
File "/home1/jenkins/workspace/gpfit_PullRequest/buildnode/reynolds/gpfit/tests/t_ex6_3.py", line 21, in test_rms_error
self.assertTrue(self.rms_error < 5e-4)
AssertionError: False is not true
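A minimal sketch of the fixed-seed idea follows. It assumes that whatever randomness the fit uses is drawn from NumPy's global random state, which has not been checked against the gpfit source; if the initial guesses come from somewhere else, the seed would need to be set there instead. The test class name is made up for this example.

```python
import unittest
import numpy as np

class SeededFitTest(unittest.TestCase):
    """Hypothetical test case showing a per-test fixed seed."""

    def setUp(self):
        # Reset NumPy's global generator before every test so any random
        # initial guess drawn through np.random is the same on each run.
        np.random.seed(0)

    def test_initial_guess_is_repeatable(self):
        first = np.random.rand(5)
        np.random.seed(0)
        second = np.random.rand(5)
        np.testing.assert_array_equal(first, second)
```

A real test would call the fit after `setUp` and compare rms_error against the tolerance, as t_ex6_3 already does.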
I just tried importing t_ex6_1.py
into ipython
and got the following error:
In [1]: import t_ex6_1
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-eff11dff08f3> in <module>()
----> 1 import t_ex6_1
/Users/mjburton11/Documents/SuperUROP/gpfit/gpfit/tests/t_ex6_1.py in <module>()
1 import unittest
2 from numpy import logspace, log, exp, log10
----> 3 from gpfit.fit import fit
4
5 class t_ex6_1_ISMA(unittest.TestCase):
ImportError: No module named gpfit.fit
Given a data set with large numbers, GPfit runs into numerical overflow issues with exp().
x = [1200,
13000,
15000,
16000,
17000,
18000,
19000,
30000,
32000,
34000]
y = [325000,
250000,
750000,
2E6,
7E6,
750000,
8E6,
6E6,
2E6,
13E6,
]
Gives results like:
/Users/philippekirschen/Documents/MIT/Research/GPfit/gpfit/gpfit/fit.py:127: RuntimeWarning: overflow encountered in exp
w_SMA = exp(y_SMA)
/Users/philippekirschen/Documents/MIT/Research/GPfit/gpfit/gpfit/fit.py:130: RuntimeWarning: overflow encountered in exp
w = (exp(ydata)).T[0]
w**0.1 = 0 * (u_1)**34.9
+ inf * (u_1)**-5.28
+ 0 * (u_1)**198
Wondering if anything clever can be done here.
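One possible reading, assuming fit() expects log-transformed data (the other reports on this page pass x_log, y_log or np.log(...) arrays), is that the raw values are being exponentiated directly. The sketch below only checks that reading with NumPy on the data above: exp() of the raw y values overflows float64, while the log-transformed values are well scaled and round-trip cleanly. It does not call gpfit.

```python
import numpy as np

x = np.array([1200, 13000, 15000, 16000, 17000,
              18000, 19000, 30000, 32000, 34000], dtype=float)
y = np.array([325000, 250000, 750000, 2e6, 7e6,
              750000, 8e6, 6e6, 2e6, 13e6])

# exp() of the raw values is far outside float64 range
# (np.exp overflows for arguments above roughly 709).
with np.errstate(over="ignore"):
    raw_overflows = np.isinf(np.exp(y)).all()

# Log-transformed data is well scaled, and exp() recovers the original values.
x_log, y_log = np.log(x), np.log(y)
round_trip_ok = np.allclose(np.exp(y_log), y)
print(raw_overflows, y_log.max(), round_trip_ok)
```

If the data was already meant to be in log space, then rescaling it (for example dividing by a reference value before taking logs) would be a different remedy, not covered here.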
Just received a request from a user who was confused about how to install GPfit. I've responded, but it made me realize we don't have install docs.
E-mail copied below:
Hi,
I've recently installed your GPkit python tool and I would like to test it by fitting my data as (I)SMA functions. However, when I try to run the example given for GPfit in:
http://gpfit.readthedocs.io/en/latest/examples.html
I get:
ImportError: No module named gpfit.fit
In the GPkit installation there's no reference to GPfit so I imagine this is a standalone package. How can it be installed?
Thanks in advance.
@bqpd I'm doing some D8 fits (the file is naca_cl0_fits.py in the Tail Fits folder at commit convexengineering/SPaircraft@851b1d4 on D8 master). GPfit is throwing the following error and neither I nor @1ozturkbe know what it is.
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/Users/mayork/Documents/GpGit/d8/Tail Fits/naca_cl0_fits.py in <module>()
100 X, Y = fit_setup(NACA, Re) # call fit(X, Y, 4, "SMA") to get fit
101 F, A = plot_fits(NACA, Re)
--> 102 make_fit(NACA, Re)
103 F.savefig("tail_fits/taildragpolar.pdf",
104 bbox_inches="tight")
/Users/mayork/Documents/GpGit/d8/Tail Fits/naca_cl0_fits.py in make_fit(naca_range, re_range)
65 print np.size(x)
66 print np.size(y)
---> 67 fit(x, y, 3)
68
69 def plot_fits(naca_range, re_range):
/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.pyc in fit(xdata, ydata, K, ftype)
70 w = Variable("w")
71
---> 72 params = get_params(ftype, K, xdata, ydata)
73
74 # A: exponent parameters, B: coefficient parameters
/Users/mayork/Documents/GpGit/gpfit/gpfit/fit.pyc in get_params(ftype, K, xdata, ydata)
24 return r, drdp
25
---> 26 ba = ba_init(xdata, ydata.reshape(ydata.size, 1), K).flatten('F')
27
28 if ftype == "ISMA":
/Users/mayork/Documents/GpGit/gpfit/gpfit/ba_init.pyc in ba_init(x, y, K)
77 "full rank for local fitting." % (i-iinit, k))
78 # now create the local fit
---> 79 b[:, k] = lstsq(X[inds.nonzero()], y[inds.nonzero()])[0][:, 0]
80
81 return b
IndexError: index 59 is out of bounds for axis 0 with size 59
Currently the bverbose
option is set to False
in the code. This suppresses all print messages, which is nice and clean, but it also means that the user doesn't know if the fitting process reaches max iterations, or reaches max time etc.
A user may also want to know about the rate of residual convergence.
This could just be the models I'm fitting to, but it seems like every time I try to use SMA fits with multiple (K) terms, the result is a sum of K nearly identical terms, e.g. w**0.149 = 1.2 * (u_1)**0.0106 + 1.22 * (u_1)**0.0105. This has been my experience with a wide variety of relationship types, and it seems dubious.
just like in gpkit, as discussed here: https://github.com/hoburg/gpfit/pull/26/files#r51971484
@bqpd: which code in gpkit makes this happen?
I'm trying to fit this data. Not sure why, but I'm really struggling. If I fit the full range of data, the fit isn't even close. Fitting to a subset of the entire range yields a much better result, but the returned fit still consistently underestimates the pressure ratio. Any ideas for some data manipulation that could help? I've tried fitting to log(1/(p**2)), log(1/(2p)), log(1/(10p)) with fits ranging from 3 to 20 terms. I included some example plots below. I'm fairly confident my fitting code is correct, because some of the fits are close.
the data I am trying to fit in log space
fit from a subset of data range. The longer vertical tails are anticipated, I plotted over slightly larger range.
Currently auto-doc seems to work when the html files are made locally using make html
command, but doesn't work on the live version of the documentation site.
per @Ltrollinger, they are a nice interpretable transpose-invariant input.
... leading to inaccurate total times being output, among other issues.
Again, when trying to fit the compressor maps, I've gotten a number of returned fits with very large exponents. Sometimes, when these are plotted outside of log space, overflow errors occur. I'm not sure this can be avoided, and I don't think the data I'm using is very well conditioned, but it would be nice if there was some way to control how large the exponents were in the fit.
'''
w**0.231 = 0.00187 * (u_1)**-302 * (u_2)**58.2
+ 3.75e-12 * (u_1)**-2.67e+03 * (u_2)**496
+ 0.326 * (u_1)**-7.81 * (u_2)**0.962
+ 0.465 * (u_1)**1.97 * (u_2)**-0.525
'''
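This does not limit the exponents themselves, but the overflow when evaluating such a fit can at least be sidestepped by staying in log space. The sketch below takes the fit printed above and computes log(w) with a log-sum-exp, so the large powers are never formed explicitly; `log_w` and `naive_w` are names made up for this example, and the evaluation points are arbitrary.

```python
import numpy as np

# Fit copied from the report above:
# w**alpha = sum_k c_k * u_1**A[k, 0] * u_2**A[k, 1]
alpha = 0.231
c = np.array([0.00187, 3.75e-12, 0.326, 0.465])
A = np.array([[-302.0, 58.2],
              [-2.67e+03, 496.0],
              [-7.81, 0.962],
              [1.97, -0.525]])

def log_w(u):
    """Return log(w) at u = (u_1, u_2) without forming large powers."""
    log_terms = np.log(c) + A @ np.log(np.asarray(u, dtype=float))
    return np.logaddexp.reduce(log_terms) / alpha

def naive_w(u):
    """Direct evaluation, which can overflow to inf for these exponents."""
    with np.errstate(over="ignore"):
        total = np.sum(c * np.prod(np.asarray(u, dtype=float) ** A, axis=1))
        return total ** (1.0 / alpha)

# Arbitrary points: the first is harmless, the second overflows when
# evaluated directly but stays finite in log space.
print(log_w((1.0, 1.0)), np.log(naive_w((1.0, 1.0))))
print(log_w((0.7, 1.0)), naive_w((0.7, 1.0)))
```

Plotting log(w) against log(u) directly, rather than exponentiating first, keeps the values finite even when w itself is not representable.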
And @pgkirsch's docstrings are so lovely, they deserve a glossary.
Hi,
I'm trying to run a SMA fit on my data but I can't seem to enter it correctly. My x-data is a 1062x4 matrix and correspondingly my y-data is a 1062x1 vector.
x
array([[ -0.69314718, -1.2039728 , -13.81551056, -13.81551056],
[ -0.65392647, -1.2039728 , -13.81551056, -13.81551056],
[ -0.61618614, -1.2039728 , -13.81551056, -13.81551056],
...,
[ 0.37843644, 0.18232156, -11.51292546, -9.21034037],
[ 0.39204209, 0.18232156, -11.51292546, -9.21034037],
[ 0.40546511, 0.18232156, -11.51292546, -9.21034037]])
x.shape
(1062, 4)
y
array([-10.09725113, -10.0955659 , -10.09396532, ..., -5.87544124,
-5.87526203, -5.87509244])
y.shape
(1062,)
When I try to run the fit, I get the following error:
cSMA, errorSMA = fit(x,y,K,"SMA")
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/gpfit/fit.py", line 72, in fit
params = get_params(ftype, K, xdata, ydata)
File "/usr/local/lib/python2.7/dist-packages/gpfit/fit.py", line 26, in get_params
ba = ba_init(xdata, ydata.reshape(ydata.size, 1), K).flatten('F')
File "/usr/local/lib/python2.7/dist-packages/gpfit/ba_init.py", line 36, in ba_init
raise ValueError('Not enough data points')
ValueError: Not enough data points
I know that this is probably not a tool error but a user-keyboard bug, but I just don't get what could be the problem here. Any pointers are much appreciated.
Regards,
Lucho.
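One thing that might be worth checking, purely as a guess from the compressor-map report earlier on this page (which builds its input as np.array([np.log(Nfit), np.log(mbarfit)]), i.e. one row per variable), is whether fit() expects x with shape (number of variables, number of points). If so, a 1062x4 array would be read as 4 points in 1062 dimensions, which would explain the "Not enough data points" message. The sketch below only shows the transpose on placeholder arrays of the reported shapes; it does not call gpfit.

```python
import numpy as np

n_points, n_vars = 1062, 4
x = np.zeros((n_points, n_vars))  # placeholder with the reported shape
y = np.zeros(n_points)

# Assumption: fit() wants one row per variable, i.e. shape (n_vars, n_points).
x_for_fit = x.T
print(x_for_fit.shape, y.shape)
```

With that orientation the call would become fit(x.T, y, K, "SMA"), assuming the guess about the expected layout is right.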
near line 156 of fit.py, cstrt = MonomialEquality(cstrt, "=", 1) should, I think, be cstrt = (mono == posy)?
... just like we do in gpkit. There's currently duplicate code such as t_ex6_1
living in both examples/ and in tests -- that code should live in examples/ and be called in t_examples just like in gpkit.
duplicate-code
from the disable:(