nslatysheva / data_science_blogging
Code and markdown files for blog posts
License: GNU General Public License v3.0
I have swapped them in my working version and I think it flows better. We should discuss it though
one illustration could show the nested nature of using an RF in ensembling, as it is an ensemble model itself
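To make the "ensemble within an ensemble" idea concrete, here is a minimal sketch (not from the post) that nests a random forest as one base model inside a voting ensemble; the synthetic data and the choice of co-models are illustrative assumptions:

```python
# Sketch: a random forest (itself an ensemble of trees) nested as one
# base learner inside a hard-voting ensemble. Data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=1)

ensemble = VotingClassifier(estimators=[
    ('lr', LogisticRegression(max_iter=1000)),
    ('knn', KNeighborsClassifier()),
    ('rf', RandomForestClassifier(n_estimators=100, random_state=1)),  # nested ensemble
])

scores = cross_val_score(ensemble, X, y, cv=5)
print("Mean CV accuracy:", scores.mean())
```

An illustration could then show the forest's individual trees as the inner layer and the voting classifier as the outer layer.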
write a system command for people to download the dataset from the source
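Something like the following could work; the URL below is a placeholder, since the actual dataset source isn't given here:

```shell
# Hypothetical source URL -- substitute the dataset's real location
DATA_URL="https://example.com/dataset.csv"

# Download to dataset.csv; fall back to wget if curl is unavailable
if command -v curl >/dev/null; then
    curl -fsSL "$DATA_URL" -o dataset.csv || echo "download failed"
else
    wget -q "$DATA_URL" -O dataset.csv || echo "download failed"
fi
```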
stick the optimization code there and go into more detail about optimization tradeoffs and such
I think that would be a great plot to show why it would be a good idea to do the ensembling.
for example when getting into random forests, we can refer to Guiseppe's post
instead of the custom multilayer perceptron
I didn't quite have the time to figure out the randomized searching code, maybe have a look?
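A possible starting point for the randomized search: `RandomizedSearchCV` has the same interface as `GridSearchCV` but samples a fixed number of candidates from the parameter space. This sketch recreates the setup on synthetic data (the real post would use its own `XTrain`/`yTrain`):

```python
# Randomized search over k for KNN: sample 20 candidate values of
# n_neighbors instead of exhaustively trying every one.
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, random_state=1)

# sample uniformly from the odd values 3, 5, ..., 149
param_dist = {'n_neighbors': list(range(3, 151, 2))}
search = RandomizedSearchCV(KNeighborsClassifier(), param_dist,
                            n_iter=20, cv=5, random_state=1)
search.fit(X, y)
print(search.best_params_)
```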
I think this could be at the top of the document, with an explanation of why it makes sense to mix these models in particular.
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation accuracy for the two KNN models
knn3scores = cross_val_score(knn3, XTrain, yTrain, cv=5)
print(knn3scores)
print("Mean of scores KNN3:", knn3scores.mean())
# [ 0.85714286  0.8206278   0.85201794  0.87892377  0.86936937]
# Mean of scores KNN3: 0.855616346648

knn99scores = cross_val_score(knn99, XTrain, yTrain, cv=5)
print(knn99scores)
print("Mean of scores KNN99:", knn99scores.mean())
# [ 0.85267857  0.83856502  0.82511211  0.9058296   0.87387387]
# Mean of scores KNN99: 0.859211834352
import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

XTrain, XTest, yTrain, yTest = train_test_split(X, y, random_state=1)  # seed 1

# Grid search over odd values of k with 10-fold CV
knn = KNeighborsClassifier()
n_neighbors = np.arange(3, 151, 2)
grid = GridSearchCV(knn, {'n_neighbors': n_neighbors}, cv=10)
grid.fit(XTrain, yTrain)
cv_scores = grid.cv_results_['mean_test_score']  # grid_scores_ was removed from sklearn

# Training and test accuracy at each value of k
train_scores = []
test_scores = []
for n in n_neighbors:
    knn.n_neighbors = n
    knn.fit(XTrain, yTrain)
    train_scores.append(metrics.accuracy_score(yTrain, knn.predict(XTrain)))
    test_scores.append(metrics.accuracy_score(yTest, knn.predict(XTest)))

plt.plot(n_neighbors, train_scores, c="blue", label="Training Scores")
plt.plot(n_neighbors, test_scores, c="brown", label="Test Scores")
plt.plot(n_neighbors, cv_scores, c="black", label="CV Scores")
plt.xlabel('Number of K nearest neighbors')
plt.ylabel('Classification Accuracy')
plt.gca().invert_xaxis()  # larger k (lower model complexity) on the left
plt.legend(loc="upper left")
plt.show()
Hyperparameter optimization strategies to cover:
- grid search
- random search
- Bayesian optimization
- gradient descent (check whether implemented)