ts404 / alignstat
A tool for the statistical comparison of alternative multiple sequence alignments
Home Page: http://AlignStat.science.latrobe.edu.au
Rename to match_summary and add a number giving global average match score
Check that degapped sequences in both alignments are identical
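One way such a check could look in base R is sketched below. It assumes each alignment is available as a named character vector of gapped sequences and that gaps are written as "-" or "."; the helper name same_degapped is hypothetical and is not part of AlignStat.

```r
# Hypothetical helper, not part of AlignStat: TRUE when both alignments hold
# the same named sequences once gap characters are stripped.
same_degapped <- function(ref, com, gap_chars = "[-.]") {
  stopifnot(!is.null(names(ref)), !is.null(names(com)))
  degap <- function(aln) {
    seqs <- toupper(gsub(gap_chars, "", aln))
    names(seqs) <- names(aln)
    seqs[order(names(seqs))]  # compare by name, ignoring input order
  }
  ref_seqs <- degap(ref)
  com_seqs <- degap(com)
  identical(names(ref_seqs), names(com_seqs)) && all(ref_seqs == com_seqs)
}

# Made-up example data: same residues, different gap placement and order.
ref <- c(seqA = "MK-TAY", seqB = "MKQ-AY")
com <- c(seqB = "MKQA-Y", seqA = "M-KTAY")
same_degapped(ref, com)  # TRUE
```

Comparing by sequence name rather than by position means the two alignments may list their sequences in a different order and still pass.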
Use more understandable names for $means and $results in align_alignments output.
$means --> $identity
$results --> $classification
Will need to update plot arguments and readme to reflect changes
Add item $score with global mean score
Fix examples for plots so they work with this structure
Generate a heatmap that simply uses colour to indicate Match / Merge / Split / Shift / Gapcon for each position in the reference alignment.
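A rough sketch of such a heatmap with ggplot2 follows. The data frame layout (one row per sequence and position, with a category column), the colours, and the randomly generated classifications are all assumptions for illustration; they do not reflect the actual structure of the AlignStat output.

```r
library(ggplot2)

# Made-up classification of 3 sequences x 6 reference positions.
categories <- c("Match", "Merge", "Split", "Shift", "Gapcon")
set.seed(1)
heat_df <- expand.grid(position = 1:6, sequence = c("seqA", "seqB", "seqC"))
heat_df$category <- factor(
  sample(categories, nrow(heat_df), replace = TRUE),
  levels = categories
)

# One tile per sequence/position, coloured by category only.
p <- ggplot(heat_df, aes(x = position, y = sequence, fill = category)) +
  geom_tile(colour = "white") +
  scale_fill_manual(
    values = c(Match = "grey80", Merge = "#1b9e77", Split = "#d95f02",
               Shift = "#7570b3", Gapcon = "#e7298a"),
    drop = FALSE
  ) +
  labs(x = "Position in reference alignment", y = "Sequence", fill = "Category") +
  theme_minimal()
print(p)
```

Using a discrete fill scale with drop = FALSE keeps all five categories in the legend even when one of them does not occur in a given comparison.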
Moved PAC$results[10,1] (the score) to be its own item in the PAC list, i.e. PAC$score.
However, this means that naming the rows now raises an error, since the row count is wrong. This probably needs a fix in the Rcpp code.
To:
plot_alignment_heatmap
plot_match_summary
plot_category_proportions
So that the naming is a little more consistent
Hello
I am interested in using AlignStat to compare various alignment algorithms.
I tried the online version, but it does not seem to work: FASTA alignments are loaded, but nothing happens afterward.
I also installed the R package, but I get an error message.
I tried loading my alignment with phylotools::read.fasta or with ape, but I still get this error:
Error in path.expand(file) : invalid 'path' argument
In addition: Warning messages:
1: In if (tools::file_ext(alignment) == "clustal" | tools::file_ext(alignment) == :
the condition has length > 1 and only the first element will be used
2: In if (tools::file_ext(alignment) == "msf" | tools::file_ext(alignment) == :
the condition has length > 1 and only the first element will be used
3: In if (tools::file_ext(alignment) == "mase" | tools::file_ext(alignment) == :
the condition has length > 1 and only the first element will be used
I tried with alignments in msa or fasta format...
class(myFirstAlignment)
[1] "MsaAAMultipleAlignment"
attr(,"package")
[1] "msa"
or with
ref_df=phylotools::read.fasta(ref)
class(ref_df)
[1] "data.frame"
any suggestions?
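The warnings above suggest that tools::file_ext() was applied to an object with more than one element, which is what would happen if an in-memory alignment were passed where a single file path is expected. The base R sketch below reproduces that behaviour and shows one possible guard. The helper is_single_path is hypothetical, and whether compare_alignments also accepts data frames depends on the installed AlignStat version.

```r
# An alignment held in memory (two gapped sequences), not a file path.
alignment <- c(s1 = "MK-TAY", s2 = "MKQ-AY")

# One result per element, so this is not a valid single if() condition.
ext_is_clustal <- tools::file_ext(alignment) == "clustal"
length(ext_is_clustal)  # 2

# Hypothetical guard: treat the input as a path only if it is a single
# string naming an existing file.
is_single_path <- function(x) {
  is.character(x) && length(x) == 1L && !is.na(x) && file.exists(x)
}

# Write the same made-up alignment to a temporary FASTA file.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">s1", "MK-TAY", ">s2", "MKQ-AY"), tmp)

is_single_path(tmp)        # TRUE
is_single_path(alignment)  # FALSE
```

If the installed version only accepts file paths, passing the path to the FASTA file itself, rather than an object read by msa, ape or phylotools, may be one workaround.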
For Plots.R and compare_alignments.R
Do this plot with ggplot geom_tile
When first installing:
library("AlikeAlignmentAligner", lib.loc="~/R/win-library/3.1")
Attaching package: ‘AlikeAlignmentAligner’
The following object is masked by ‘.GlobalEnv’:
percent
Warning message:
"package 'AlignStat' is not available for this version of R
A version of this package for your version of R might be available elsewhere,
see the ideas at https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages"
Place the match_summary_plot "Av= xx.x%" score next to the legend?
Edit the plot functions so that they take only res_list as an argument, rather than specific list items (res_list$xxx). No inputs other than the output of align_alignments really make sense.
Hello!
After modifying my alignments as best as possible to match the data.frame format of the example dataset ("reference_alignment"), I am receiving an error stating that the alignments do not contain the same set of sequences, even though they do. Below is the code I used to convert the alignments to data frames, and I have attached the FASTA files (/Data) I would like to compare. Any help would be appreciated!
Data.zip
## R script ######
library("AlignStat")
library("phylotools")
library("stringr")
fas_dir <- file.path("~/Desktop/test")
# list.files() takes a regular expression, not a glob
fas_files <- list.files(path = fas_dir, pattern = "\\.fasta$")
# Read each FASTA file into a data frame with one column per sequence
# and one row per alignment position
list_df <- lapply(setNames(nm = fas_files), function(x) {
  # Build the full path so the script does not depend on the working directory
  y <- as.data.frame(phylotools::read.fasta(file.path(fas_dir, x)))
  z <- stringr::str_split(y$seq.text, "")
  a <- data.frame(matrix(unlist(z), nrow = length(z), byrow = TRUE))
  rownames(a) <- as.character(unlist(y[, 1]))
  as.data.frame(t(a), row.names = FALSE, stringsAsFactors = TRUE)
})
PAC.vm <- compare_alignments(list_df[["HIVenv_valign_cut.fasta"]],
                             list_df[["HIVenv_malign_cut.fasta"]])
plot_similarity_summary(PAC.vm, scale = TRUE, CS = FALSE, cys = FALSE, display = TRUE)