Giter VIP home page Giter VIP logo

max-clade's Introduction

Max-Clade

Extracts maximal clades from a list of unrooted trees. In an unrooted context we define clades to be the tree on either half of any bipartition. Maximal clades are clades which do not contain duplications, however every clade that they are a subset of does.

Dependencies

Usage

Input: File containing list of multi-copy trees in newick format

Output: Maximal clades listed as trees in newick format

python max_clade.py -i <input_file> -o <ouput_file> -t -d <delimiter>

Arguments (for both scripts)

  • -i: Input tree list file
  • -o: (optional) Output max clade list file
  • -t: (optional) Include trivial clades in output (filtered by default).
  • -d: (optional) Delimiter separating species name from rest of leaf label (default None).

Example

python max_clade.py -i example/gtrees-mult.trees

Algorithm Description

For each tree in the input do the following:

  • If the tree is rooted, remove the root and connect the root's children with an edge. Then arbitrarily select internal vertex and call it the root. (The vertex chosen does not affect the result in any way; it is necessary, however, for defining destinct sides for the bipartitions).
  • Postorder traversal: label each edge with True/False indicating whether there's a duplication event under it (under defined with respect to the rooting). Store references to all clades without duplications.
  • Preorder traversal: label each edge with True/False indicating whether there's a duplication event above it. Store references to all clades without duplications.
  • Iterate through all duplication free clades. For each one, check all the edges which define a superset clade one layer up. If any of them are duplication free, then considered clade is not maximal, otherwise it is.

max-clade's People

Contributors

jsdoublel avatar

Watchers

 avatar

max-clade's Issues

Potential bug

Take the following tree:

(((((25,2),(3,40)),(((((33,30),((22,34),((((14,39),14),(39,39)),(14,14)))),(((38,43),39),((((((((7,28),9),32),(37,(18,29))),(41,4)),(1,20)),(45,8)),(((((7,28),29),29),((28,7),18)),(37,9))))),(((((48,43),(21,((((9,37),41),4),((1,((21,32),(20,(45,8)))),(28,(29,(18,7))))))),6),27),((42,(16,(24,13))),(((47,5),44),(35,((15,46),(11,26))))))),(((17,(((19,19),19),10)),(23,31)),(36,(((12,36),(36,6)),12))))),(((((((3,40),((42,42),16)),(24,13)),(40,((24,13),(16,(40,3))))),38),((((48,43),43),48),(((5,((44,33),35)),47),(((((4,((37,(1,(8,45))),1)),(9,18)),41),((29,(28,7)),(((9,37),(21,20)),(32,1)))),((((25,31),(22,34)),42),(36,6)))))),((((48,43),((14,39),27)),(10,(((17,(2,19)),23),(((15,46),((11,15),30)),(((33,(35,47)),44),26))))),(6,12)))),(((92,((((((90,93),(70,((54,98),(87,56)))),((54,(70,98)),(((56,54),((93,87),(70,98))),90))),90),(93,((70,98),(98,70)))),((55,(55,55)),((((81,77),(62,(99,83))),94),(56,87))))),(((((((97,(86,58)),80),72),64),(69,((89,50),(95,(74,71))))),(73,52)),((((60,67),(96,63)),(((91,78),(79,49)),((59,(61,66)),(76,57)))),(100,(85,75))))),((((77,((81,((83,(62,99)),99)),62)),((((68,((65,100),75)),(82,85)),((((65,(100,65)),53),100),(51,(84,88)))),((92,(92,92)),(((((82,85),75),(((51,(68,100)),(88,84)),(65,53))),94),((55,77),(92,((81,99),(83,62)))))))),((55,((87,70),((((90,(70,87)),(93,54)),(56,98)),((90,(93,(54,56))),98)))),(((94,94),(94,94)),(83,55)))),(((82,((88,84),51)),(68,(65,53))),((((81,77),55),(55,(62,(99,83)))),((((99,62),(68,(((84,88),(65,100)),94))),(((65,100),(((51,(82,85)),75),(53,(100,65)))),(92,83))),((90,93),(((54,56),70),(98,87)))))))));

The current version produces the following output:

((3,40),25,2);
((45,8),(((((7,28),9),32),(37,(18,29))),(41,4)),(1,20));
((((47,5),44),(35,((15,46),(11,26)))),42,(16,(24,13)));
(47,5,((44,33),35));
((((9,37),(21,20)),(32,1)),29,(28,7));
((36,6),((25,31),(22,34)),42);
(((14,39),27),48,43);
(23,17,(2,19));
(26,(33,(35,47)),44);
((70,((54,98),(87,56))),90,93);
((56,87),((81,77),(62,(99,83))),94);
(((((60,67),(96,63)),(((91,78),(79,49)),((59,(61,66)),(76,57)))),(100,(85,75))),(((((97,(86,58)),80),72),64),(69,((89,50),(95,(74,71))))),(73,52));
((82,85),68,((65,100),75));
(((55,77),(92,((81,99),(83,62)))),(((82,85),75),(((51,(68,100)),(88,84)),(65,53))),94);
((68,(65,53)),82,((88,84),51));
((68,(((84,88),(65,100)),94)),99,62);
((((54,56),70),(98,87)),90,93);

I think that there are trees that have clades that are not maximal (they are subsets of some other clade of one of the output trees)

((82, 85), 68, ((65, 100), 75)) has a clade that is smaller than (((55, 77), (92, ((81, 99), (83, 62)))), (((82, 85), 75), (((51, (68, 100)), (88, 84)), (65, 53))), 94)!
  by {65,68,75,82,85,100} being smaller than {51,53,55,62,65,68,75,77,81,82,83,84,85,88,92,94,99,100}
((68, (65, 53)), 82, ((88, 84), 51)) has a clade that is smaller than (((55, 77), (92, ((81, 99), (83, 62)))), (((82, 85), 75), (((51, (68, 100)), (88, 84)), (65, 53))), 94)!
  by {51,53,65,68,82,84,88} being smaller than {51,53,55,62,65,68,75,77,81,82,83,84,85,88,92,94,99,100}
((68, (((84, 88), (65, 100)), 94)), 99, 62) has a clade that is smaller than (((55, 77), (92, ((81, 99), (83, 62)))), (((82, 85), 75), (((51, (68, 100)), (88, 84)), (65, 53))), 94)!
  by {62,65,68,84,88,94,99,100} being smaller than {51,53,55,62,65,68,75,77,81,82,83,84,85,88,92,94,99,100}

As reference, here is my program's output (but I have not thoroughly debugged mine yet, just in case it is useful)

((25,2),(3,40));
((17,(2,19)),23);
((48,43),((14,39),27));
((5,((44,33),35)),47);
(((33,(35,47)),44),26);
((24,13),(16,(40,3)));
((90,93),(70,((54,98),(87,56))));
(((56,54),((93,87),(70,98))),90);
(((90,(70,87)),(93,54)),(56,98));
((((25,31),(22,34)),42),(36,6));
((90,93),(((54,56),70),(98,87)));
((((81,77),(62,(99,83))),94),(56,87));
((42,(16,(24,13))),(((47,5),44),(35,((15,46),(11,26)))));
((((9,37),41),4),((1,((21,32),(20,(45,8)))),(28,(29,(18,7)))));
(((((82,85),75),(((51,(68,100)),(88,84)),(65,53))),94),((55,77),(92,((81,99),(83,62)))));
(((((((97,(86,58)),80),72),64),(69,((89,50),(95,(74,71))))),(73,52)),((((60,67),(96,63)),(((91,78),(79,49)),((59,(61,66)),(76,57)))),(100,(85,75))));

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.