marekborowiec / amas
Calculate summary statistics and manipulate multiple sequence alignments
License: Other
This is the script:
from amas import AMAS
sequences_path = './concatenated_alignment_copy.phy'
partitions_path = './partition_scheme_2.txt'
meta_aln = AMAS.MetaAlignment(in_files=[sequences_path], data_type="dna", in_format="phylip", cores=10)
parsed_parts = meta_aln.get_partitions(partitions_path)
partitions = meta_aln.get_partitioned(partitions_path)
The function meta_aln.get_partitioned(partitions_path) gives me the following error:
AttributeError: 'MetaAlignment' object has no attribute 'remove_empty'
Hello,
I love AMAS for my phylogenomics projects. I've had it running for 2 years, and I just realized that it broke sometime in the last month. This is true for several existing installations as well as a fresh one. I'm not sure whether I broke a Python 3 dependency. I'm running the latest Python, so that might be the cause.
Here's the error I get running on Linux (Debian):
Traceback (most recent call last):
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 2017, in <module>
main()
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 1985, in main
meta_aln.write_summaries(kwargs["summary_out"])
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 1514, in write_summaries
summary_out = self.get_summaries()
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 1452, in get_summaries
summaries = [alignment.get_summary() for alignment in alignments]
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 1452, in <listcomp>
summaries = [alignment.get_summary() for alignment in alignments]
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 953, in get_summary
data = self.summarize_alignment()
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 706, in summarize_alignment
self.no_missing_ambiguous = self.get_sites_no_missing_ambiguous()
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 784, in get_sites_no_missing_ambiguous
no_missing_ambiguous_sites = [self.get_site_no_missing_ambiguous(column) for column in range(self.get_alignment_length())]
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 784, in <listcomp>
no_missing_ambiguous_sites = [self.get_site_no_missing_ambiguous(column) for column in range(self.get_alignment_length())]
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 788, in get_site_no_missing_ambiguous
site = self.get_column(column)
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 776, in get_column
return [row[i] for row in self.matrix]
File "/mnt/griffin/chrwhe/software/AMAS/amas/AMAS.py", line 776, in <listcomp>
return [row[i] for row in self.matrix]
IndexError: list index out of range
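The last frame indexes every row of the matrix at the same column, so one possible cause (an assumption, not a confirmed diagnosis) is that at least one sequence is shorter than the others. A minimal sketch for spotting such rows before running the summary; `find_ragged_rows` is a hypothetical helper, not part of AMAS:

```python
def find_ragged_rows(records):
    """Return (name, length) pairs for sequences whose length differs from the first.

    records: list of (name, sequence) tuples, e.g. parsed from a FASTA file.
    """
    if not records:
        return []
    expected = len(records[0][1])
    return [(name, len(seq)) for name, seq in records if len(seq) != expected]


# Toy example: the second sequence is one character short.
records = [("taxon_a", "ACGT"), ("taxon_b", "ACG"), ("taxon_c", "ACGT")]
for name, length in find_ragged_rows(records):
    print(f"{name} has length {length}, expected {len(records[0][1])}")
```

An empty result would rule out this explanation and point to something else, such as a parsing problem.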
Hi,
I have an alignment with 7902 sequences. When I run the command AMAS.py summary -f fasta -i ./Clade.fasta -d dna -c 8 -o Clade_summary.txt
with more than 2 threads, it gives me the error below:
Traceback (most recent call last):
File "/home/cactus/miniconda3/bin/AMAS.py", line 10, in <module>
sys.exit(main())
File "/home/cactus/miniconda3/lib/python3.7/site-packages/amas/AMAS.py", line 1982, in main
meta_aln = MetaAlignment(**kwargs)
File "/home/cactus/miniconda3/lib/python3.7/site-packages/amas/AMAS.py", line 1039, in __init__
self.alignment_objects = self.get_alignment_objects()
File "/home/cactus/miniconda3/lib/python3.7/site-packages/amas/AMAS.py", line 1354, in get_alignment_objects
alignments = pool.map(self.get_alignment_object, self.in_files)
File "/home/cactus/miniconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/cactus/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<amas.AMAS.DNAAlignment object at 0x2b6421bdcba8>]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647")'
Please help!
Miao
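The 'i' format message comes from Python's struct module and says a number did not fit in a signed 32-bit integer. A likely explanation, stated here as an assumption, is that the parsed alignment object sent back from the worker process was larger than 2 GiB, which older Python versions (including the 3.7 shown in the traceback) cannot pass between processes. The sketch below only demonstrates that limit; `fits_in_int32_header` is a hypothetical name:

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound quoted in the error


def fits_in_int32_header(num_bytes):
    """Return True if a payload size can be packed as a signed 32-bit integer."""
    try:
        struct.pack("!i", num_bytes)
        return True
    except struct.error:
        return False


print(fits_in_int32_header(10 * 1024**2))  # a 10 MiB payload
print(fits_in_int32_header(3 * 1024**3))   # a 3 GiB payload
```

If this is the cause, running with a single core (so that nothing has to be sent between processes) or using a newer Python may avoid the error, though neither is verified here.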
Hi Marek --
Thanks so much for writing AMAS. It's been a useful and easy program to use. I've been using the concat function a lot, in particular to concatenate hundreds of UCE nexus alignments. I recognize that users can, and should, properly specify the input alignment format, but I noticed that AMAS still ran (returned no error) when the input format was misspecified. In this case I was able to concat interleaved nexus files despite incorrectly using the -f nexus, and not the -f nexus-int, flag. It ran without error, but it produced concatenated alignments of odd length (e.g., shorter than the input alignments combined). It was only after checking the size of the resulting concatenated files that I noticed something was off. Anyway, it might not be something to be concerned about, but I figured you might want to know.
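The manual check described here can be scripted. A minimal sketch, assuming the per-locus lengths and the concatenated length are already known (for example from the Alignment_length column of the summary output); `length_mismatch` is a hypothetical helper:

```python
def length_mismatch(input_lengths, concatenated_length):
    """Return sum(input_lengths) - concatenated_length; 0 means the lengths agree."""
    return sum(input_lengths) - concatenated_length


# Toy numbers: three loci that should add up to 446 columns.
missing_columns = length_mismatch([194, 164, 88], 400)
if missing_columns != 0:
    print(f"Concatenated alignment differs by {missing_columns} columns")
```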
Under certain conditions, AMAS does not work if the number of files to concatenate (or to summarize) exceeds a limit, e.g., 5,000: the system reports an 'Argument list too long' error if, e.g., '*.fas' is used to define the input files. Is there any chance of supplying the input file names in a file that lists them? Or do you have any other solution? Thanks, Tomas
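One workaround that avoids the shell's argument limit is to read the file names from a text file in Python and hand the list to AMAS directly. The sketch below assumes one path per line. The helper names are hypothetical, and the commented-out call only mirrors the MetaAlignment usage quoted in the first report above, so it is untested here:

```python
def parse_file_list(lines):
    """Return paths from an iterable of lines, skipping blanks and '#' comments."""
    paths = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def read_file_list(list_path):
    """Read alignment paths from a text file with one path per line."""
    with open(list_path) as handle:
        return parse_file_list(handle)


# Assumed usage, based on the call in the first report (not verified):
# from amas import AMAS
# meta_aln = AMAS.MetaAlignment(in_files=read_file_list("alignments.txt"),
#                               data_type="dna", in_format="fasta", cores=4)
```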
Good morning,
First of all, congratulations on AMAS, it is an extremely useful and powerful tool.
I am trying to concatenate 1065 fasta files. For this I wanted to use the command
C:>python3 AMAS.py summary -f fasta -d dna -i *.fasta -c 4
but I get the error attached below (I tried changing to *fasta and I get the same error). I don't know what I'm doing wrong.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "XXX\lib\multiprocessing\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Program FilesXXX\lib\multiprocessing\pool.py", line 48, in mapstar
return list(map(*args))
File "C:\AMAS-master\amas\AMAS.py", line 1352, in get_alignment_object
aln = DNAAlignment(alignment, self.in_format, self.data_type)
File "C:\AMAS-master\amas\AMAS.py", line 682, in __init__
self.parsed_aln = self.get_parsed_aln()
File "C:\AMAS-master\amas\AMAS.py", line 694, in get_parsed_aln
aln_input = self.get_aln_input()
File "C:\AMAS-master\amas\AMAS.py", line 689, in get_aln_input
aln_input = FileParser(self.in_file)
File "C:\AMAS-master\amas\AMAS.py", line 435, in __init__
with FileHandler(in_file) as handle:
File "C:\AMAS-master\amas\AMAS.py", line 418, in __enter__
self.in_file = open(self.file_name, "r")
OSError: [Errno 22] Invalid argument: '*.fasta'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\AMAS-master\amas\AMAS.py", line 2075, in <module>
main()
File "C:\AMAS-master\amas\AMAS.py", line 2040, in main
meta_aln = MetaAlignment(**kwargs)
File "C:\AMAS-master\amas\AMAS.py", line 1047, in __init__
self.alignment_objects = self.get_alignment_objects()
File "C:\AMAS-master\amas\AMAS.py", line 1362, in get_alignment_objects
alignments = pool.map(self.get_alignment_object, self.in_files)
File "C:\Program Files\XXX\lib\multiprocessing\pool.py", line 367, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\XXX\pool.py", line 774, in get
raise self._value
OSError: [Errno 22] Invalid argument: '*.fasta'
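The final OSError shows the literal string '*.fasta' reaching open(), which is consistent with the Windows command prompt not expanding wildcards the way Unix shells do. A minimal sketch of expanding the pattern in Python instead, using the standard glob module; `expand_patterns` is a hypothetical helper, not an AMAS function:

```python
import glob


def expand_patterns(patterns):
    """Expand wildcard patterns into a sorted list of matching file paths."""
    files = []
    for pattern in patterns:
        files.extend(sorted(glob.glob(pattern)))
    return files


fasta_files = expand_patterns(["*.fasta"])
if not fasta_files:
    print("No files matched; check the working directory and the pattern")
```

The resulting list could then be passed wherever a list of input files is expected. Running the original command from a shell that expands wildcards is another option.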
Hello,
I'm unable to generate summary statistics using the summary command for a fasta file of dna sequences. I've attached the fasta file and the output file "summary.txt".
Here's the command I used:
amas summary -i Noctuidae_Amphipyrinae_Manuscript_1_v001.fasta -f fasta -d dna --by-taxon
Best,
Kevin
fasta_summary.zip
Hello-
I am trying to concatenate genes with a partition file formatted by codon position. I am running the following:
python3 AMAS.py concat -f fasta -d dna -i renamed_for_mcmctree_* -n 12 -u phylip -t full_concat_align_for_mcmctree.out
But I get the message:
AMAS.py: error: unrecognized arguments: -n 12
I have tried using --codons instead of -n, and also -n 123, but I get the same message.
The line of code above works fine if I remove the "-n" flag.
Hi! Thanks for the great package, I really prefer it to Fasconcat.
Not really an issue, but I've added AMAS to bioconda, if you want to update the installation instructions:
conda install -c bioconda amas
Cheers!
Matt
Currently it doesn't seem to be possible to output partition files with codon positions when concatenating loci. Would it perhaps be possible to add an option for this, e.g., generating a partition file like this?
DNA, p1=1-60\3,2-60\3
DNA, p2=3-60\3
or
DNA, p1=1-60\3
DNA, p2=2-60\3
DNA, p3=3-60\3
It would also be great to be able to split based on such partition files.
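Until such an option exists, the requested lines can be generated from known locus coordinates. This is a minimal sketch, assuming 1-based inclusive coordinates and that every locus starts at codon position 1; `codon_partitions` and the partition naming scheme are hypothetical:

```python
def codon_partitions(loci, merge_first_two=False):
    """Build codon-position partition lines in the format shown above.

    loci: list of (name, start, end) tuples, 1-based and inclusive.
    merge_first_two: if True, put codon positions 1 and 2 in one partition.
    """
    lines = []
    for name, start, end in loci:
        if merge_first_two:
            lines.append(f"DNA, {name}_pos12={start}-{end}\\3,{start + 1}-{end}\\3")
        else:
            lines.append(f"DNA, {name}_pos1={start}-{end}\\3")
            lines.append(f"DNA, {name}_pos2={start + 1}-{end}\\3")
        lines.append(f"DNA, {name}_pos3={start + 2}-{end}\\3")
    return lines


for line in codon_partitions([("p", 1, 60)]):
    print(line)
```

Loci that do not start in frame would need a per-locus offset, which this sketch does not handle.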
After running ipyrad, my resulting dataset with full loci has an aligned length of 255,532 bp, with 72.6% missing site data and 7.8% variable sites. For the unlinked SNP selection from this set, I would expect 100% of the resulting sites to be variable. Instead, the uSNP alignment with 694 sites has only 28.2% variable sites. The only explanation I have is that ipyrad counts Ns (missing data) as candidate SNPs when selecting an unlinked SNP per locus. I think that would be unwanted behavior?
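To test that explanation independently of either tool, the variable sites can be recounted with missing data excluded. The sketch below treats N, ? and - as missing, which is an assumption about the data and not a statement of how ipyrad or AMAS count; the function names are hypothetical:

```python
MISSING = set("N?-")


def is_variable(column, missing=MISSING):
    """Return True if a column has at least two distinct non-missing states."""
    states = {char.upper() for char in column} - missing
    return len(states) > 1


def proportion_variable(sequences):
    """Return the fraction of alignment columns that are variable."""
    flags = [is_variable(column) for column in zip(*sequences)]
    return sum(flags) / len(flags) if flags else 0.0


# Toy alignment: only the second column varies once N is ignored.
print(proportion_variable(["ACN", "ATN", "ANA"]))
```

A column such as A/N would count as variable only if N were treated as a state, which is the situation suspected above.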
Hi,
I find that the AMAS summary command returns only zeros in the proportion columns (see example below). All other values (e.g. Parsimony_informative_sites) appear to be computed correctly.
Alignment_name No_of_taxa Alignment_length Total_matrix_cells Undetermined_characters Missing_percent No_variable_sites Proportion_variable_sites Parsimony_informative_sites Proportion_parsimony_informative .....
A1.aln 43 194 8342 354 0.0 186 0.0 173 0.0 .....
A2.aln 41 164 6724 292 0.0 152 0.0 140 0.0 .....
A3.aln 41 88 3608 48 0.0 64 0.0 53 0.0 .....
A4.aln 40 170 6800 157 0.0 144 0.0 126 0.0 .....
Thanks,
Tim.
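The counts in the table imply clearly nonzero proportions, as the short calculation below shows for the A1.aln row. One possible cause of the zeros, offered only as a guess, is integer division, which is what would happen if the script were run with Python 2:

```python
def proportion(count, total):
    """Return count / total as a float, or 0.0 when total is zero."""
    return count / total if total else 0.0


# Values taken from the A1.aln row quoted above.
missing_percent = 100 * proportion(354, 8342)   # undetermined / total cells
prop_variable = proportion(186, 194)            # variable sites / length
prop_pars_inform = proportion(173, 194)         # informative sites / length

print(round(missing_percent, 3), round(prop_variable, 3), round(prop_pars_inform, 3))
```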
Are ambiguous states or missing data counted in the number of parsimony informative characters? I would think not, but I am getting pretty high numbers, so I wanted to double check. I couldn't find this information in the manuscript.
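For comparison, the usual definition calls a site parsimony informative when at least two character states each occur in at least two taxa. The sketch below applies that definition to DNA while ignoring everything except A, C, G and T. It shows one convention for cross-checking the numbers and is not a description of what AMAS does:

```python
from collections import Counter

UNAMBIGUOUS_DNA = set("ACGT")


def is_parsimony_informative(column):
    """Return True if at least two unambiguous states each occur at least twice."""
    counts = Counter(
        char.upper() for char in column if char.upper() in UNAMBIGUOUS_DNA
    )
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_parsimony_informative(sequences):
    """Count informative columns in a list of equal-length sequences."""
    return sum(is_parsimony_informative(column) for column in zip(*sequences))
```

If this count comes out much lower than the reported Parsimony_informative_sites value for the same alignment, ambiguous or missing characters are probably being counted as states.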