HelloWorld.md
This is a markdown file
Test Code
library(diffusionMap)
library(scatterplot3d)
library(svd)
library(cluster)
library(fpc)
Basic Diffusion Mapping Algorithm for Dimensionality Reduction
Input: High-dimensional data set phy_test
Define a kernel k(x,y) for x,y in phy_test
Create the kernel matrix K with entries k(x,y)
Create the diffusion matrix by normalising the rows of the kernel matrix
Calculate the eigenvectors of the diffusion matrix
Map to the d-dimensional diffusion space at time t, using the d dominant eigenvectors and eigenvalues
Output: Lower-dimensional data set Y (with the same number of records)
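The steps above can be sketched as a single function. This is a minimal illustration under assumptions, not the implementation used in these notes: the Gaussian kernel, the scale parameter `a`, the name `diffusion_map_sketch` and the random stand-in data are all hypothetical, since `phy_test` is not available here. For numerical stability the sketch takes the eigenvectors from the symmetric matrix that shares its eigenvalues with the diffusion matrix, and it drops the trivial constant eigenvector; both are common choices, not something the notes prescribe.

```r
# Minimal diffusion-map sketch (hypothetical helper, Gaussian kernel assumed).
diffusion_map_sketch <- function(X, d = 2, t = 1, a = 1) {
  X <- as.matrix(X)
  # Kernel matrix from pairwise squared Euclidean distances between rows
  K <- exp(-as.matrix(dist(X))^2 / a)
  deg <- rowSums(K)
  # The diffusion matrix is P = K / deg (each row divided by its sum).
  # S = D^(-1/2) K D^(-1/2) is symmetric and has the same eigenvalues as P.
  S <- K / sqrt(outer(deg, deg))
  e <- eigen(S, symmetric = TRUE)      # eigenvalues in decreasing order
  # Right eigenvectors of P: row i of the eigenvectors of S divided by sqrt(deg[i])
  psi <- e$vectors / sqrt(deg)
  # Skip the first pair (eigenvalue 1, constant eigenvector), keep the next d
  keep <- 2:(d + 1)
  # Diffusion coordinates: each kept eigenvector scaled by its eigenvalue^t
  Y <- sweep(psi[, keep, drop = FALSE], 2, e$values[keep]^t, `*`)
  list(Y = Y, values = e$values)
}

set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100)  # stand-in for the 100 x 5 data
res <- diffusion_map_sketch(X, d = 2, t = 1)
dim(res$Y)
```

Raising `t` shrinks the coordinates that belong to smaller eigenvalues faster, so larger `t` emphasises the dominant directions.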
Data Preparation
Select 5 of the 80 variables and the first 100 records
myvars<- phy_test[c(11,12,21,66,67)]
mydata <- myvars[1:100,]
head(mydata)
nrow(mydata) #100
ncol(mydata) #5
The matrix is transposed so that distances are measured row-wise
B=t(mydata)
nrow(B)
#5
ncol(B)
#100
Check Euclidean distance as a reference
D=as.matrix(dist(B))
D
nrow(D)
#5
ncol(D)
#5 - but transposed back (records as rows) it would be 100
D=as.matrix(dist(mydata))
nrow(D) #100
ncol(D) #100
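`dist()` computes distances between the rows of its argument, which is why the transposed matrix gives a 5 x 5 result and the original data a 100 x 100 one. A tiny sketch with a made-up 3 x 2 matrix `m` (not the course data) shows the effect:

```r
# 3 records, 2 variables; matrix() fills column by column,
# so the rows are (0, 0), (3, 4) and (0, 1)
m <- matrix(c(0, 3, 0,
              0, 4, 1), nrow = 3)
d_records   <- as.matrix(dist(m))     # distances between the 3 rows
d_variables <- as.matrix(dist(t(m)))  # distances between the 2 columns
dim(d_records)
dim(d_variables)
d_records[1, 2]                       # distance between (0, 0) and (3, 4)
```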
Preliminary step: PCA
PCA_mydata=prcomp(mydata)
print(PCA_mydata)
plot(PCA_mydata, type = "l")
summary(PCA_mydata)
Taking the first three components covers 96% of the variance in the data.
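The cumulative share that `summary()` reports can be recomputed from the component standard deviations in `prcomp`'s `sdev` field. A small sketch on made-up data (the matrix `demo` and its spreads are assumptions, so the numbers will differ from the 96% above):

```r
set.seed(42)
# Stand-in data: 100 records, 5 variables with different spreads (hypothetical)
demo <- sapply(c(5, 3, 2, 0.5, 0.1), function(s) rnorm(100, sd = s))
pca <- prcomp(demo)
# Share of the total variance per component and its running total
var_share <- pca$sdev^2 / sum(pca$sdev^2)
cum_share <- cumsum(var_share)
round(cum_share, 3)
# Smallest number of components covering at least 95% of the variance
which(cum_share >= 0.95)[1]
```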
In high dimensions, while small distances are meaningful, large distances are (almost) meaningless.
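One common way to illustrate this is to look at how distances concentrate as the dimension grows: the gap between the nearest and the farthest point shrinks relative to the nearest distance. The sketch below is an added illustration on uniform random points, with `relative_contrast` a hypothetical helper name.

```r
set.seed(7)
# Relative contrast (max - min) / min of the distances from one point to all others
relative_contrast <- function(dims, n = 200) {
  pts <- matrix(runif(n * dims), nrow = n)
  d <- as.matrix(dist(pts))[1, -1]   # distances from point 1 to the rest
  (max(d) - min(d)) / min(d)
}
contrast_low  <- relative_contrast(2)
contrast_high <- relative_contrast(1000)
c(low = contrast_low, high = contrast_high)
```

A much smaller contrast in 1000 dimensions means that "near" and "far" are hard to tell apart there, which motivates a kernel that only trusts local similarities.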
Connectivity kernel (defines a local measure of similarity within a certain neighbourhood)
initial set-up: scale parameter a=1
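kernel <- function(x,y,a=1){
#Gaussian kernel: sum of the squared differences (not the square of the summed differences), divided by the scale a
exp(-sum((x-y)^2)/a)
}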
kernel(c(1,2), c(3,4))
#does not work, presumably because A, i and j are not defined at this point in the script
K=as.matrix(kernel(A[i,],A[j,]))
A "subscript out of bounds" error will be shown, but it can be ignored since the results seem to be correct
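#despite its name, this function returns the kernel matrix K; the row normalisation is a separate step
diffusion_matrix <- function(DifMat){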
K=matrix(nrow=nrow(DifMat),ncol=nrow(DifMat))
for (i in 1:nrow(DifMat)) {
#go directly into the column loop
for (j in 1:nrow(DifMat)) {
if (i==j) {
K[i,j]= 1
}
else
K[i,j]=kernel(DifMat[i,],DifMat[j,])
}
}
return(K)
}
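The double loop can also be written without explicit loops. The sketch below assumes a Gaussian kernel exp(-||x - y||^2 / a) with a = 1; `gauss_kernel`, `kernel_matrix_loop` and `kernel_matrix_vectorised` are hypothetical names, and the loop version is repeated so that the block runs on its own.

```r
gauss_kernel <- function(x, y, a = 1) exp(-sum((x - y)^2) / a)

# Loop version: one kernel evaluation per pair of rows
kernel_matrix_loop <- function(X, a = 1) {
  n <- nrow(X)
  K <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      K[i, j] <- gauss_kernel(X[i, ], X[j, ], a)
    }
  }
  K
}

# Vectorised version: dist() supplies all pairwise Euclidean distances at once
kernel_matrix_vectorised <- function(X, a = 1) {
  K <- exp(-as.matrix(dist(X))^2 / a)
  dimnames(K) <- NULL   # drop the row/column labels added by as.matrix(dist())
  K
}

set.seed(3)
X <- matrix(rnorm(20 * 5), nrow = 20)   # made-up data: 20 records, 5 variables
K_loop <- kernel_matrix_loop(X)
K_vec  <- kernel_matrix_vectorised(X)
all.equal(K_loop, K_vec)
```

The diagonal needs no special case here, because the distance of a record to itself is 0 and exp(0) is 1.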
test=t(B)
(diffusion_matrix(test)[1:6,1:6])
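Create the normalised matrix (by row)
library(som)
L= diffusion_matrix(A)
L
#note: according to the som documentation, normalize() standardises each row to mean 0 and variance 1; it does not rescale rows to sum to 1
P= normalize(L,byrow=TRUE)
P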
eigen(P)$values
[1] 6.825530e+00 4.613574e+00 1.964817e+00 9.890712e-01 9.083846e-02 1.083078e-02 3.758092e-04 1.022369e-06
[9] 6.625611e-09 -9.310607e-13
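For a diffusion (Markov) matrix, each row is divided by its own sum, so the rows sum to 1 and the largest eigenvalue equals 1. The eigenvalues listed above exceed 1, which suggests that the matrix used there was not row-stochastic. A sketch of the row-sum normalisation in base R, on a made-up Gaussian kernel matrix (data and scale are assumptions):

```r
set.seed(11)
X <- matrix(rnorm(10 * 3), nrow = 10)   # made-up data: 10 records, 3 variables
K <- exp(-as.matrix(dist(X))^2)         # Gaussian kernel matrix, scale 1
P <- K / rowSums(K)                     # row i divided by the sum of row i
rowSums(P)                              # every row sums to 1
# Re() only drops zero imaginary parts in case eigen() returns complex values
lambda <- Re(eigen(P, only.values = TRUE)$values)
max(lambda)                             # leading eigenvalue of a row-stochastic matrix
```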
ev=cbind(eigen(P)$values)[c(1:4)]
vec=cbind(eigen(P)$vectors)
V=vec[1:10,1:4]
V
QC: A x = \lambda x
here (non-symmetric matrix): (A - \lambda I) x = 0, with I the identity matrix of size nrow(A)
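#eigenvectors are the columns of vec, so the check pairs the first column with the first eigenvalue and uses the full matrix P
x1=cbind(vec[,1])
x1
Matrix multiplication is done with the operator %*%
P%*%x1
ev[1]*x1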
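A hedged sketch of the same check for every eigenpair at once, on a made-up row-stochastic matrix rather than the matrix from these notes. `eigen()` returns the eigenvectors as the columns of `$vectors`, so column k is paired with eigenvalue k:

```r
set.seed(5)
X <- matrix(rnorm(8 * 2), nrow = 8)   # made-up data: 8 records, 2 variables
K <- exp(-as.matrix(dist(X))^2)       # Gaussian kernel matrix, scale 1
P <- K / rowSums(K)                   # row-stochastic matrix
e <- eigen(P)
# Largest entry of P x - lambda x for each eigenpair; Mod() works for
# real and complex results alike
residuals <- sapply(seq_along(e$values), function(k) {
  x <- e$vectors[, k]
  max(Mod(P %*% x - e$values[k] * x))
})
max(residuals)
```

Residuals close to machine precision indicate that each column really is an eigenvector for its eigenvalue.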
To do
How is the dimensionality reduction done?
Simply keep using the columns that belong to the largest eigenvalues (compare PCA),
or is there a specific mapping procedure?
Find a suitable data set and run it for t>1.
Visualisation