Giter VIP home page Giter VIP logo

cluster-analysis-using-vmd-tcl's Introduction

CLUSTER-ANALYSIS-USING-VMD-TCL

TCL VMD SCRIPT TO DO CLUSTER ANALYSIS

 Authour : ANJI BABU KAPAKYALA,
           Dept. of Chemistry,
           C/O Dr.Nisanth N.Nair,
           IIT KANPUR, INDIA.
         ([email protected])


PURPOSE : To Perform the cluster analysis

USAGE   : source clustering.tcl in VMD Tk console.
     : clustering {Atomselection} {rmsd_cutoff} {step size} {frame_args}

Arguments :

 Atomselect  : Any atom selection

 rmsd_cutoff : RMSD cutoff

 Step_size   : STEP SIZE ( nothing but skip)

 frame_args  : Initial & final frame numbers

Default Arguments :

 Dist_func         : Distance Function is set to rmsd as default.

 num               : No. of Clusters are fixed to default vaule 3

EXAMPLES :

 EXAMPLE1 : clustering "(protein) and backbone" 1.0 2
 EXAMPLE2 : clustering "(protein) and backbone" 1.0 2 5
 EXAMPLE3 : clustering "(protein) and backbone" 1.0 2 5 25

Example1, measures the clusters of given selections with rmsd cutoff 1.0 and step size 2 for zeroth frame to final frame in trajectory.

Example2, measures the clusters of given selections with rmsd cutoff 1.0 and step size 2 from frame 5 to final frame in trajectory.

Example3, measures the clusters of given selections with rmsd cutoff 1.0 and step size 2 from frame 5 to frame 25 in trajectory.

(At present No. of clusters are fixed to 3) Default Distance function is rmsd. but you can change to fitrmsd or rgyd or rmsd

OUTPUT FILES : It shows the clusters on New graphical representation and stores the data in 4 files as well.

CLUSTER-A.pdb
CLUSTER-B.pdb
CLUSTER-C.pdb
UNCLUSTER.pdb
cluster.log

EXAMPLE OUTPUT of this code looks like this. ( on Tk console) #===================================================================#

WELCOME TO CLUSTERING ANALYSIS PLUGIN 

No. of frames Found in Trajectory: 99


Given Data :
=============

Atomselection     : (protein) and backbone 
Distance Function : rmsd (default) 
No. of clusters   : 3    (default) 
Rmsd Cutoff       : 1.0      
Step size         : 2 
Starting Frame    : 0 
End Frame         : 99 

Analysis will be performed on 0 to 99 frame(s) 

CLUSTER-A (41) :
18 0 2 8 10 12 14 16 20 22 24 26 28 30 32 34 36 38 40......etc 

CLUSTER-B (4)  :
 94 90 92 96 

CLUSTER-C (2)  :
4 6 

UNCLUSTRED (2) : 
64 82


Writing OUTPUT files.........

10 %    20 %    30 %    40 %    50 %    60 %    70 %    80 %    90 %    100 %      

OUTPUT files :
---------------
CLUSTER-A.pdb
CLUSTER-B.pdb
CLUSTER-C.pdb 
UNCLUSTER.pdb 
cluster.log

TCL Code :

    
proc clustering {sel1 rcutoff step_size args } {
#=====================Allignment=====================================
set n [molinfo top get numframes]
for {set i 0} {$i <= $n} {incr i} {
set ref_sel [ atomselect top  "protein and backbone" frame 0]
set compare_sel [ atomselect top "protein and backbone" frame $i]
set transform_matrx [ measure fit $compare_sel $ref_sel ]
$compare_sel  move $transform_matrx
}
puts "Aligned w.r.t protein backbone"
#=====================CLUSTERING==================================
#puts -nonewline "\nEnter Your Selction :"
#flush stdout
#gets stdin sel
#puts "$sel"
#-----default number of clusters
set number 3
#---------------------------------------#
set nframes [molinfo top get numframes]
#---------------Opening OUTPUT files
set file1 [open "CLUSTER-A.pdb" w]
set file2 [open "CLUSTER-B.pdb" w]
set file3 [open "CLUSTER-C.pdb" w]
set file4 [open "UNCLUSTER.pdb" w]
set logfile [open "cluster.log" w]
puts $logfile "\n Aligned w.r.t protein backbone \n"
#---------------Settingup arguments----#
  set sel $sel1
  set selA [atomselect top $sel ]
#---------------frames details----------#
  if {[llength $args] == 0} {
      set inf 0
      set nf $nframes
}
 if {[llength $args] == 1} {
    set inf [lindex $args 0]
    set nf $nframes
  }
  if {[llength $args] > 1} {
    set inf [lindex $args 0]
    set nf [lindex $args 1]
    }
#------------------total frames---------#
  set totframes [expr $nf - 1 ]
#------------------PRINT Given DATA-----#
 puts " \n \t  \t  WELCOME TO CLUSTERING ANALYSIS PLUGIN "
 puts "    \t  \t  =====================================\n\n"
 puts " \n No. of frames Found in Trajectory: $nframes\n\n"
 puts " Given Data :\n =============\n"
 puts " Atomselection     : $sel1 "
 puts " Distance Function : rmsd (default) "
 puts " No. of clusters   : 3    (default) "
 puts " Rmsd Cutoff       : $rcutoff      "
 puts " Step size         : $step_size "
 puts " Starting Frame    : $inf "
 puts " End Frame         : $nf \n"
 puts " Analysis will be performed on $inf to $nframes frame(s) \n\n\n"
#------------------------------------------#
#------------------PRINT Given DATA in logfile-----#
 puts $logfile " \n \t  \t  WELCOME TO CLUSTERING ANALYSIS PLUGIN "
 puts $logfile "    \t  \t  =====================================\n\n"
 puts $logfile " \n No. of frames Found in Trajectory: $nframes\n\n"
 puts $logfile " Given Data :\n =============\n"
 puts $logfile " Atomselection     : $sel1 "
 puts $logfile " Distance Function : rmsd (default) "
 puts $logfile " No. of clusters   : 3    (default) "
 puts $logfile " Rmsd Cutoff       : $rcutoff      "
 puts $logfile " Step size         : $step_size "
 puts $logfile " Starting Frame    : $inf "
 puts $logfile " End Frame         : $nf \n"
 puts $logfile " Analysis will be performed on $inf to $nframes frame(s) \n\n\n"
#------------------------------------------#
  # Cluster
  #set result [measure cluster $selA num $number cutoff $rmsdcutoff first 0 last $totframes  step 2 distfunc rmsd ]
#selupdate $calc_selupdate weight $calc_weight]
#  set nclusters [llength $result]
#  puts "$nclusters"
#----------------------------------------
# foreach {listA listB listC listD} [measure cluster $selA num $number cutoff $rmsdcutoff first $inf last $totframes  step 1 distfunc rmsd ] break
 foreach {listA listB listC listD} [measure cluster $selA num $number cutoff $rcutoff first $inf last $totframes  step $step_size distfunc rmsd ] break
    set nclustera [llength $listA] ; set nclusterb [llength $listB] ; set nclusterc [llength $listC] ; set nclusterd [llength $listD]
 puts " CLUSTER-A ($nclustera) :\n $listA \n\n CLUSTER-B ($nclusterb)  :\n $listB \n\n CLUSTER-C ($nclusterc)  :\n $listC \n\n UNCLUSTRED ($nclusterd) : \n $listD\n"
 puts $logfile  " CLUSTER-A ($nclustera) :\n $listA \n\n CLUSTER-B ($nclusterb)  :\n $listB \n\n CLUSTER-C ($nclusterc)  :\n $listC \n\n UNCLUSTRED ($nclusterd) : \n $listD\n"
#-----------------------------------------------
# update on graphical representation of the above clusters
#------ListA
menu graphics on
#mol delrep replica_number Mol_number
mol delrep 0 0
mol delrep 0 0
mol delrep 0 0
mol delrep 0 0
mol representation lines
mol selection $sel
mol addrep 0
mol drawframes 0 0 $listA
mol modcolor 0 0 ColorID 0
#------ListAB
mol representation lines
mol selection $sel
mol addrep 0
mol drawframes 0 1 $listB
mol modcolor 1 0 ColorID 1
#------ListC
mol representation lines
mol selection $sel
mol addrep 0
mol drawframes 0 2 $listC
mol modcolor 2 0 ColorID 4
#------ListD
mol representation lines
mol selection $sel
mol addrep 0
mol drawframes 0 3 $listD
mol modcolor 3 0 ColorID 7
mol showrep 0 2 0
mol showrep 0 0
#mol showrep 0 2 1
#mol showrep 0 2 0
#---------------WRITING OUTPUT----------------------------------------
puts " \n Writing OUTPUT files.........\n"
#--------------------Cycle starts
for {set i 0 ; set d 1} { $i<=$nframes} {incr i; incr d}  {
#------------------STATUS BAR
# show activity
#    if { [expr $d % 10] == 0 } {
#doble (number) gives floating point
     set percentage [expr double($d)*100/double($totframes )]
#int(number) gives intger
    if { [expr int($percentage) % 10 ] == 0 } {
      puts -nonewline " [expr int($percentage)] %     "
 flush stdout
     }
#   if { [expr $d % 500] == 0 } { puts " " }
#    flush stdout
#   }
#-------STORING CLUSTER-A
  foreach list1 $listA {
    if {$i == $list1 } {
#puts " Writing CLUSTER-A.xyz :$i"
#puts "$i : $list1 "
#set selA [atomselect top $sel frame $i ]
[atomselect top "all" frame $i] writepdb cluster$i.pdb
exec cat cluster$i.pdb >> CLUSTER-A.pdb
exec rm cluster$i.pdb
}
}
#-------STORING CLUSTER-B
  foreach list2 $listB {
    if {$i == $list2 } {
#puts "$i : $list2 "
#puts " Writing CLUSTER-B.xyz :$i"
#set selA [atomselect top $sel frame $i ]
[atomselect top "all" frame $i] writepdb cluster$i.pdb
exec cat cluster$i.pdb >> CLUSTER-B.pdb
exec rm cluster$i.pdb
}
}
#-------STORING CLUSTER-C
  foreach list3 $listC {
    if {$i == $list3 } {
#puts "$i : $list3 "
#puts " Writing CLUSTER-C.xyz :$i"
#set selA [atomselect top $sel frame $i ]
[atomselect top "all" frame $i] writepdb cluster$i.pdb
exec cat cluster$i.pdb >> CLUSTER-C.pdb
exec rm cluster$i.pdb
}
}
#-------STORING UN CLUSTER-D
  foreach list4 $listD {
    if {$i == $list4 } {
#puts "$i : $list4 "
#puts " Writing UNCLUSTER.xyz :$i"
#set selA [atomselect top $sel frame $i ]
[atomselect top "all" frame $i] writepdb cluster$i.pdb
exec cat cluster$i.pdb >> UNCLUSTER.pdb
exec rm cluster$i.pdb
}
}
}
close $file1
close $file2
close $file3
close $file4
#-----------Printing OUTPUT files
puts " \n\n OUTPUT files :\n---------------\n"
puts $logfile   " \n\n OUTPUT files :\n---------------\n"
puts " \n CLUSTER-A.pdb\n CLUSTER-B.pdb\n CLUSTER-C.pdb \n UNCLUSTER.pdb \n cluster.log \n"
puts $logfile " \n CLUSTER-A.pdb\n CLUSTER-B.pdb\n CLUSTER-C.pdb \n UNCLUSTER.pdb \n cluster.log \n"

#
puts "\n\n\n \t  \t $********** ANJI BABU KAPAKAYALA **********$\n\n\n"
puts $logfile "\n\n\n \t  \t $********** ANJI BABU KAPAKAYALA **********$\n\n\n"
close $logfile
}
#====================================================#
# TCL VMD Script to measure average strucure from given frames
#
# Usage  : avg_str <molid> <atom selection>
#
# Example : avg_str 0 "protein"
# Example : avg_str 1 "backbone"
#
#
proc avg_str {id sel} {
set nframes [molinfo $id get numframes]
puts " No. of frames : $nframes"
#--------------------#
set sel1 [atomselect $id $sel]
set avgpos [measure avpos $sel1]
# Moves the selected atoms to the average positions computed
 $sel1 set {x y z} $avgpos
# Write into pdb
$sel1 writepdb Avg-Pos-$id.pdb
puts "\n ********** KAPAKAYALA ANJI BABU **********$ "
}
#------------------------------------------------------#
#        WRITTEN BY ANJI BABU KAPAKAYALA             #
#====================================================#

$********** ANJI BABU KAPAKAYALA **********$

CHEERS ....!!!!!!

cluster-analysis-using-vmd-tcl's People

Contributors

anjibabuiitk avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

cluster-analysis-using-vmd-tcl's Issues

Invalid command name "clustering"

I am trying to use the script to perform clustering on my trajectory, however I need to run the script on a supercomputer. In order to do this, I have added the following lines at the beginning to load my PSF and trajectory:


mol new protein.psf type psf
type dcd first 0 last -1 step 2 waitfor all

However, I then get the error invalid command name "clustering". I then tried the script on my local computer by loading the psf and dcd file in the GUI with a very high step value and then running the script without those two lines, and it does not seem to have an issue. How can I load the psf and dcd file within the tcl script without triggering this issue? I need to be able to run it on a supercomputer as the trajectory is too large to do on my computer.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.