Rosalind

Rosalind is a platform for learning bioinformatics and programming through problem solving.

Problems (285 in total @20181125)

Python 3.5+ is recommended until now. It's the future.

PV - Python Village

If you are completely new to programming, try these initial problems to learn a few basics about the Python programming language. You'll get familiar with the operations needed to start solving bioinformatics challenges in the Stronghold.

6 Problems: INI1 Installing Python; INI2 Variables and Some Arithmetic; INI3 Strings and Lists; INI4 Conditions and Loops; INI5 Working with Files; INI6 Dictionaries;

The following Problems will be finished soon.

BS - Bioinformatics Stronghold

Discover the algorithms underlying a variety of bioinformatics topics: computational mass spectrometry, alignment, dynamic programming, genome assembly, genome rearrangements, phylogeny, probability, string algorithms and others.

105 Problems： DNA Counting DNA Nucleotides; RNA Transcribing DNA into RNA; REVC Complementing a Strand of DNA; FIB Rabbits and Recurrence Relations; GC Computing GC Content; HAMM Counting Point Mutations; IPRB Mendel's First Law; PROT Translating RNA into Protein; SUBS Finding a Motif in DNA; CONS Consensus and Profile; FIBD Mortal Fibonacci Rabbits👍; GRPH Overlap Graphs; IEV Calculating Expected Offspring; LCSM Finding a Shared Motif; LIA Independent Alleles; MPRT Finding a Protein Motif; MRNA Inferring mRNA from Protein; ORF Open Reading Frames; PERM Enumerating Gene Orders; PRTM Calculating Protein Mass; REVP Locating Restriction Sites; SPLC RNA Splicing; LEXF Enumerating k-mers Lexicographically; LGIS Longest Increasing Subsequence; LONG Genome Assembly as Shortest Superstring; PMCH Perfect Matchings and RNA Secondary Structures; PPER Partial Permutations; PROB Introduction to Random Strings; SIGN Enumerating Oriented Gene Orderings; SSEQ Finding a Spliced Motif; TRAN Transitions and Transversions; TREE Completing a Tree; CAT Catalan Numbers and RNA Secondary Structures; CORR Error Correction in Reads; INOD Counting Phylogenetic Ancestors; KMER k-Mer Composition; KMP Speeding Up Motif Finding; LCSQ Finding a Shared Spliced Motif; LEXV Ordering Strings of Varying Length Lexicographically; MMCH Maximum Matchings and RNA Secondary Structures; PDST Creating a Distance Matrix; REAR Reversal Distance; RSTR Matching Random Motifs; SSET Counting Subsets; ASPC Introduction to Alternative Splicing; EDIT Edit Distance; EVAL Expected Number of Restriction Sites; MOTZ Motzkin Numbers and RNA Secondary Structures; NWCK Distances in Trees; SCSP Interleaving Two Motifs; SETO Introduction to Set Operations; SORT Sorting by Reversals; SPEC Inferring Protein from Spectrum; TRIE Introduction to Pattern Matching; CONV Comparing Spectra with the Spectral Convolution; CTBL Creating a Character Table; DBRU Constructing a De Bruijn Graph; EDTA Edit Distance Alignment; FULL Inferring Peptide from Full Spectrum; INDC Independent Segregation of Chromosomes; ITWV Finding Disjoint Motifs in a Gene; LREP Finding the Longest Multiple Repeat; NKEW Newick Format with Edge Weights; RNAS Wobble Bonding and RNA Secondary Structures; AFRQ Counting Disease Carriers; CSTR Creating a Character Table from Genetic Strings; CTEA Counting Optimal Alignments; CUNR Counting Unrooted Binary Trees; GLOB Global Alignment with Scoring Matrix; PCOV Genome Assembly with Perfect Coverage; PRSM Matching a Spectrum to a Protein; QRT Quartets; SGRA Using the Spectrum Graph to Infer Peptides; SUFF Encoding Suffix Trees; CHBP Character-Based Phylogeny; CNTQ Counting Quartets; EUBT Enumerating Unrooted Binary Trees; GASM Genome Assembly Using Reads; GCON Global Alignment with Constant Gap Penalty; LING Linguistic Complexity of a Genome; LOCA Local Alignment with Scoring Matrix; MEND Inferring Genotype from a Pedigree; MGAP Maximizing the Gap Symbols of an Optimal Alignment; MREP Identifying Maximal Repeats; MULT Multiple Alignment; PDPL Creating a Restriction Map; ROOT Counting Rooted Binary Trees; SEXL Sex-Linked Inheritance; SPTD Phylogeny Comparison with Split Distance; WFMD The Wright-Fisher Model of Genetic Drift; ALPH Alignment-Based Phylogeny; ASMQ Assessing Assembly Quality with N50 and N75; CSET Fixing an Inconsistent Character Set; EBIN Wright-Fisher's Expected Behavior; FOUN The Founder Effect and Genetic Drift; GAFF Global Alignment with Scoring Matrix and Affine Gap Penalty; GREP Genome Assembly with Perfect Coverage and Repeats; OAP Overlap Alignment; QRTD Quartet Distance; SIMS Finding a Motif with Modifications; SMGB Semiglobal Alignment; KSIM Finding All Similar Motifs; LAFF Local Alignment with Affine Gap Penalty; OSYM Isolating Symbols in Alignments; RSUB Identifying Reversing Substitutions;

BA - Bioinformatics Armory

Ready-to-use software tools abound for bioinformatics analysis. Whereas in the Stronghold you implement algorithms on your own, in the Armory you solve similar problems by using existing tools.

16 Problems: INI Introduction to the Bioinformatics Armory; DBPR Introduction to Protein Databases; GBK GenBank Introduction; FRMT Data Formats; MEME New Motif Discovery; NEED Pairwise Global Alignment; TFSQ FASTQ format introduction; PHRE Read Quality Distribution; PTRA Protein Translation; FILT Read Filtration by Quality; RVCO Complementing a Strand of DNA; SUBO Suboptimal Local Alignment; BPHR Base Quality Distribution; CLUS Global Multiple Alignment; ORFR Finding Genes with ORFs; BFIL Base Filtration by Quality;

BT - Bioinformatics Textbook Track

A collection of exercises to accompany Bioinformatics Algorithms: An Active-Learning Approach by Phillip Compeau & Pavel Pevzner. A full version of this text is hosted on stepic.org

124 Problems: BA10A Compute the Probability of a Hidden Path; BA10B Compute the Probability of an Outcome Given a Hidden Path; BA10C Implement the Viterbi Algorithm; BA10D Compute the Probability of a String Emitted by an HMM; BA10E Construct a Profile HMM; BA10F Construct a Profile HMM with Pseudocounts; BA10G Perform a Multiple Sequence Alignment with a Profile HMM; BA10H Estimate the Parameters of an HMM; BA10I Implement Viterbi Learning; BA10J Solve the Soft Decoding Problem; BA10K Implement Baum-Welch Learning; BA11A Construct the Graph of a Spectrum; BA11B Implement DecodingIdealSpectrum; BA11C Convert a Peptide into a Peptide Vector; BA11D Convert a Peptide Vector into a Peptide; BA11E Sequence a Peptide; BA11F Find a Highest-Scoring Peptide in a Proteome against a Spectrum; BA11G Implement PSMSearch; BA11H Compute the Size of a Spectral Dictionary; BA11I Compute the Probability of a Spectral Dictionary; BA11J Find a Highest-Scoring Modified Peptide against a Spectrum; BA1A Compute the Number of Times a Pattern Appears in a Text; BA1B Find the Most Frequent Words in a String; BA1C Find the Reverse Complement of a String; BA1D Find All Occurrences of a Pattern in a String; BA1E Find Patterns Forming Clumps in a String; BA1F Find a Position in a Genome Minimizing the Skew; BA1G Compute the Hamming Distance Between Two Strings; BA1H Find All Approximate Occurrences of a Pattern in a String; BA1I Find the Most Frequent Words with Mismatches in a String; BA1J Find Frequent Words with Mismatches and Reverse Complements; BA1K Generate the Frequency Array of a String; BA1L Implement PatternToNumber; BA1M Implement NumberToPattern; BA1N Generate the d-Neighborhood of a String; BA2A Implement MotifEnumeration; BA2B Find a Median String; BA2C Find a Profile-most Probable k-mer in a String; BA2D Implement GreedyMotifSearch; BA2E Implement GreedyMotifSearch with Pseudocounts; BA2F Implement RandomizedMotifSearch; BA2G Implement GibbsSampler; BA2H Implement DistanceBetweenPatternAndStrings; BA3A Generate the k-mer Composition of a String; BA3B Reconstruct a String from its Genome Path; BA3C Construct the Overlap Graph of a Collection of k-mers; BA3D Construct the De Bruijn Graph of a String; BA3E Construct the De Bruijn Graph of a Collection of k-mers; BA3F Find an Eulerian Cycle in a Graph; BA3G Find an Eulerian Path in a Graph; BA3H Reconstruct a String from its k-mer Composition; BA3I Find a k-Universal Circular String; BA3J Reconstruct a String from its Paired Composition; BA3K Generate Contigs from a Collection of Reads; BA3L Construct a String Spelled by a Gapped Genome Path; BA3M Generate All Maximal Non-Branching Paths in a Graph; BA4A Translate an RNA String into an Amino Acid String; BA4B Find Substrings of a Genome Encoding a Given Amino Acid String; BA4C Generate the Theoretical Spectrum of a Cyclic Peptide; BA4D Compute the Number of Peptides of Given Total Mass; BA4E Find a Cyclic Peptide with Theoretical Spectrum Matching an Ideal Spectrum; BA4F Compute the Score of a Cyclic Peptide Against a Spectrum; BA4G Implement LeaderboardCyclopeptideSequencing; BA4H Generate the Convolution of a Spectrum; BA4I Implement ConvolutionCyclopeptideSequencing; BA4J Generate the Theoretical Spectrum of a Linear Peptide; BA4K Compute the Score of a Linear Peptide; BA4L Trim a Peptide Leaderboard; BA4M Solve the Turnpike Problem; BA5A Find the Minimum Number of Coins Needed to Make Change; BA5B Find the Length of a Longest Path in a Manhattan-like Grid; BA5C Find a Longest Common Subsequence of Two Strings; BA5D Find the Longest Path in a DAG; BA5E Find a Highest-Scoring Alignment of Two Strings; BA5F Find a Highest-Scoring Local Alignment of Two Strings; BA5G Compute the Edit Distance Between Two Strings; BA5H Find a Highest-Scoring Fitting Alignment of Two Strings; BA5I Find a Highest-Scoring Overlap Alignment of Two Strings; BA5J Align Two Strings Using Affine Gap Penalties; BA5K Find a Middle Edge in an Alignment Graph in Linear Space; BA5L Align Two Strings Using Linear Space; BA5M Find a Highest-Scoring Multiple Sequence Alignment; BA5N Find a Topological Ordering of a DAG; BA6A Implement GreedySorting to Sort a Permutation by Reversals; BA6B Compute the Number of Breakpoints in a Permutation; BA6C Compute the 2-Break Distance Between a Pair of Genomes; BA6D Find a Shortest Transformation of One Genome into Another by 2-Breaks; BA6E Find All Shared k-mers of a Pair of Strings; BA6F Implement ChromosomeToCycle; BA6G Implement CycleToChromosome; BA6H Implement ColoredEdges; BA6I Implement GraphToGenome; BA6J Implement 2-BreakOnGenomeGraph; BA6K Implement 2-BreakOnGenome; BA7A Compute Distances Between Leaves; BA7B Compute Limb Lengths in a Tree; BA7C Implement AdditivePhylogeny; BA7D Implement UPGMA; BA7E Implement the Neighbor Joining Algorithm; BA7F Implement SmallParsimony; BA7G Adapt SmallParsimony to Unrooted Trees; BA8A Implement FarthestFirstTraversal; BA8B Compute the Squared Error Distortion; BA8C Implement the Lloyd Algorithm for k-Means Clustering; BA8D Implement the Soft k-Means Clustering Algorithm; BA8E Implement Hierarchical Clustering; BA9A Construct a Trie from a Collection of Patterns; BA9B Implement TrieMatching; BA9C Construct the Suffix Tree of a String; BA9D Find the Longest Repeat in a String; BA9E Find the Longest Substring Shared by Two Strings; BA9F Find the Shortest Non-Shared Substring of Two Strings; BA9G Construct the Suffix Array of a String; BA9H Pattern Matching with the Suffix Array; BA9I Construct the Burrows-Wheeler Transform of a String; BA9J Reconstruct a String from its Burrows-Wheeler Transform; BA9K Generate the Last-to-First Mapping of a String; BA9L Implement BWMatching; BA9M Implement BetterBWMatching; BA9N Find All Occurrences of a Collection of Patterns in a String; BA9O Find All Approximate Occurrences of a Collection of Patterns in a String; BA9P Implement TreeColoring; BA9Q Construct the Partial Suffix Array of a String; BA9R Construct a Suffix Tree from a Suffix Array;

AH - Algorithmic Heights

A collection of exercises in introductory algorithms to accompany "Algorithms", the popular textbook by Dasgupta, Papadimitriou, and Vazirani.

34 Problems: FIBO Fibonacci Numbers; BINS Binary Search; DEG Degree Array; INS Insertion Sort; DDEG Double-Degree Array; MAJ Majority Element; MER Merge Two Sorted Arrays; 2SUM 2SUM; BFS Breadth-First Search; CC Connected Components; HEA Building a Heap; MS Merge Sort; PAR 2-Way Partition; 3SUM 3SUM; BIP Testing Bipartiteness; DAG Testing Acyclicity; DIJ Dijkstra's Algorithm; HS Heap Sort; INV Counting Inversions; PAR3 3-Way Partition; SQ Square in a Graph; BF Bellman-Ford Algorithm; CTE Shortest Cycle Through a Given Edge; MED Median; PS Partial Sort; TS Topological Sorting; HDAG Hamiltonian Path in DAG; NWC Negative Weight Cycle; QS Quick Sort; SCC Strongly Connected Components; 2SAT 2-Satisfiability; GS General Sink; SC Semi-Connected Graph; SDAG Shortest Paths in DAG;

verohan / rosalind Goto Github PK

rosalind's Introduction

Rosalind

Problems (285 in total @20181125)

PV - Python Village

BS - Bioinformatics Stronghold

BA - Bioinformatics Armory

BT - Bioinformatics Textbook Track

AH - Algorithmic Heights

rosalind's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent