hyu-kim / mds-hypothesis-testing

Multidimensional scaling method for F-informed hypothesis testing

License: MIT License

R 2.13% Python 1.29% PureBasic 1.83% Jupyter Notebook 4.12% PostScript 3.81% HTML 86.82%

multivariate-statistics

mds-hypothesis-testing's Introduction

mds-hypothesis-testing

Multidimensional scaling method for F-informed hypothesis testing.
Paper: https://arxiv.org/abs/2308.00354

Summary

Multidimensional scaling (MDS) is an unsupervised learning technique that preserves pairwise distances between observations and is commonly used for analyzing multivariate biological datasets. Recent advances in MDS have achieved successful classification results, but the resulting configurations depend heavily on the choice of hyperparameters, which limits broader application. Here, we present a self-supervised MDS approach informed by the dispersions of observations that share a common binary label (F-ratio). Our visualization accurately reflects the F-ratio while consistently preserving the global structure with low data distortion compared to existing dimensionality reduction tools. Using an algal microbiome dataset, we show that this new method better illustrates the community's response to the host, suggesting its potential impact on microbiology and ecology data analysis.
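
The summary does not define the F-ratio. As a rough illustration only, the sketch below assumes it is the PERMANOVA-style pseudo-F computed from a distance matrix and a group label. The function name `pseudo_F` and the toy data are hypothetical and are not taken from the repository.

```r
# Hypothetical helper: pseudo-F from a distance matrix D and group labels y.
# Assumes the PERMANOVA definition (sums of squared distances), which may
# differ from what the repository's own code computes.
pseudo_F <- function(D, y) {
  D2 <- as.matrix(D)^2            # squared pairwise distances
  N <- nrow(D2)                   # number of observations
  groups <- unique(y)
  a <- length(groups)             # number of groups
  # Total sum of squares: squared distances over all pairs, divided by N
  ss_total <- sum(D2[upper.tri(D2)]) / N
  # Within-group sum of squares: same quantity computed inside each group
  ss_within <- sum(sapply(groups, function(g) {
    idx <- which(y == g)
    sub <- D2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }))
  ss_among <- ss_total - ss_within
  (ss_among / (a - 1)) / (ss_within / (N - a))
}

# Toy data: two well-separated groups on a line
x <- c(0, 1, 4, 5)
y <- c(1, 1, 2, 2)
print(pseudo_F(dist(x), y))  # 32
```

For Euclidean distances this matches the classical one-way ANOVA F statistic, which is why the toy example can be checked by hand.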

mds-hypothesis-testing's Issues

Case with a large lambda

See the reply/comment on PR #13.

Issue

The configuration diverges when the code is run with λ = 10 on Sites 1 and 2.

Idea

The confirmatory term in our objective function is not regularized and is proportional to the hyperparameter λ. This may be what causes the problem when λ is large.

I will come back to this after completing the outline document.
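
To make the idea concrete, here is a sketch of how such a term would scale. It assumes the objective is a raw-stress term plus an unnormalized penalty on the F-ratio mismatch. This form and the symbols $S$, $F_z$, $F_0$ are assumptions for illustration; the repository's exact objective is not shown in this issue.

```latex
% Assumed form of the objective: stress plus a lambda-weighted confirmatory term
O(z) = \underbrace{\sum_{i<j} \bigl(d_{ij} - \lVert z_i - z_j \rVert\bigr)^2}_{S(z)}
       + \lambda \,\bigl\lvert F_z - F_0 \bigr\rvert

% Its gradient: the confirmatory part grows linearly with lambda
\nabla_z O(z) = \nabla_z S(z)
       + \lambda \,\operatorname{sign}(F_z - F_0)\, \nabla_z F_z
```

Under this assumption, the second gradient term is ten times larger at λ = 10 than at λ = 1 while the stress gradient is unchanged, so a fixed step size that is stable for small λ could overshoot for large λ.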

y1 vs y1s in mm.R

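I opened this issue because I'm wondering how the label (y) values are set in mm.R.
I thought the labels in the data were the ones that take the values pt+ and pt-. Is that not the case?
But y1s, which the code uses as the label, is read from result/labels_site1.txt.

I'd like to know what the values in each column of this text file are,
and how the values in the text file differ from y1 taken from the data (ifelse(site1@sam_data$Treatment == "Pt +", 1, 2)).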

derivative for line 115 in gd.R

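Soobin, I was checking the form of the derivative in the gradient descent function, and it looks like the denominator is missing from the derivative term of the F statistic, so I'm leaving a comment.

115 d_g <- 4 * (N-a)/a * sign(Fz_cur - F0) * (tmp11 * tmp12 - tmp21 * tmp22) (link)

Could you check whether I've read this correctly?

I left the same comment on formulation.tex, the document you wrote on Overleaf (link).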

@soob-kim
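
For reference, a sketch of where a denominator would be expected. It assumes $F_z$ is a PERMANOVA-style pseudo-F built from total and within-group sums of squares of the configuration $z$. The symbols $SS_T$ and $SS_W$ are assumptions and do not appear in gd.R.

```latex
% Assumed definition of the F-ratio of the configuration z
F_z = \frac{N-a}{a-1}\,\frac{SS_T(z) - SS_W(z)}{SS_W(z)}

% Quotient rule: the derivative carries SS_W^2 in the denominator
\frac{\partial F_z}{\partial z_k}
  = \frac{N-a}{a-1}\,
    \frac{SS_W\,\dfrac{\partial SS_T}{\partial z_k}
          - SS_T\,\dfrac{\partial SS_W}{\partial z_k}}{SS_W^{2}}

% Chain rule for the absolute-value term
\frac{\partial}{\partial z_k}\bigl\lvert F_z - F_0 \bigr\rvert
  = \operatorname{sign}(F_z - F_0)\,\frac{\partial F_z}{\partial z_k}
```

Whether the tmp terms on line 115 already absorb the $SS_W^{2}$ factor cannot be determined from that line alone.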

Extrapolation with loess needed

In the pair_by_rank function in mm.R, because we use loess, predicting on new data that falls outside the range of the fitted data returns NA.
As a result, running mm_cmds(nit=15, lambda=0.5, z0=zmds1, D=distmat1, y=y1) returns NA.

As shown here,

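pair_by_rank <- function(D, z, y, fun){
  # F-ratios from the original distances (F0) and from the configuration z (Fz)
  f0_sorted <- get_p(d=D, trt=y, fun=fun)$ratio_all
  fz_sorted <- get_p(mat=z, trt=y, fun=fun)$ratio_all
  N <- length(f0_sorted)
  # Pair the two sets of ratios column-wise: column 1 = F0, column 2 = Fz
  mat_pair <- matrix(0, nrow=N, ncol=2)
  mat_pair[,1] <- f0_sorted
  mat_pair[,2] <- fz_sorted
  df_pair <- data.frame(data=mat_pair)
  colnames(df_pair) <- c('F0','Fz')
  # surface="direct" is the added part: it lets predict() extrapolate
  # beyond the observed range of F0 instead of returning NA
  loess_f <- loess(Fz ~ F0, data=df_pair, span=0.10,
                   control=loess.control(surface="direct"))
  return(list(pair=mat_pair, model=loess_f))
}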

With this change the problem goes away for now. Please keep this in mind when you revise the function!
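
The behavior described above can be reproduced with a small, self-contained sketch. The data here are synthetic and the variable names (`df`, `new_pts`, `fit_interp`, `fit_direct`) are made up for illustration. Only `loess`, `loess.control`, and `predict` come from base R's stats package.

```r
# Synthetic data: F0 spans [1, 10]; Fz is a noisy linear function of F0
set.seed(1)
df <- data.frame(F0 = seq(1, 10, length.out = 50))
df$Fz <- 2 * df$F0 + rnorm(50, sd = 0.1)

# One point inside the fitted range (5) and one outside it (12)
new_pts <- data.frame(F0 = c(5, 12))

# Default surface ("interpolate"): no extrapolation outside the data range
fit_interp <- loess(Fz ~ F0, data = df, span = 0.5)

# surface = "direct": the local fit is evaluated exactly at each new point,
# so predictions are also available outside the data range
fit_direct <- loess(Fz ~ F0, data = df, span = 0.5,
                    control = loess.control(surface = "direct"))

pred_interp <- predict(fit_interp, newdata = new_pts)
pred_direct <- predict(fit_direct, newdata = new_pts)

print(is.na(pred_interp))  # second entry is NA (F0 = 12 is out of range)
print(is.na(pred_direct))  # no NA values
```

Direct evaluation is slower than interpolation on large datasets, and extrapolated values from a local regression should be treated with caution far from the observed range.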

MM diagnosis

Why do \Phi_z and the estimated \Phi (i.e., f_z of \Phi_o) not differ from each other?

I created a branch to diagnose the issue.

can't find zmds1

Soobin (@soob-kim), I'm going through the gradient descent code you wrote (data/gd.R) and got stuck on one part. Line 130 reads as follows (link):

tmp <- gd_mds(nit = 100, D = as.matrix(dist1), z0 = zmds1)

Could you explain where zmds1 comes from?

Hyungseok
