Bioinformatics Group at TRON - Translational Oncology at the Medical Center of the Johannes Gutenberg-University Mainz gGmbH's Projects
ArtiFusion is a tool to simulate artificial fusion events by modifying a given reference genome. The tool copies parts of the exonic sequence of gene A within the reference genome FASTA sequence into the downstream region of gene B and replaces the copied regions of gene A with Ns. The breakpoints are defined by a size ratio between gene A and gene B and are always placed on exon-exon junctions. Intronic and intergenic regions remain unchanged. The approach can be used to benchmark fusion detection tools with realistic biological data. Unlike simulating NGS reads (for example with the ART package, https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm), this approach does not lose the biological relevance of the sequencing data.
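The copy-and-mask idea described for ArtiFusion can be illustrated with a minimal Python sketch. This is not ArtiFusion's code or interface: the function names, the 0-based half-open exon coordinates, the single forward-strand sequence, and the way the size ratio is turned into a breakpoint (the exon-exon junction of gene A closest to a given fraction of its exonic length) are all assumptions made for illustration. Inserting the copied sequence directly at the end of gene B, rather than overwriting bases, is likewise an assumption.

```python
def pick_breakpoint(exons, fraction):
    """Return the exon index k at which gene A is split (hypothetical helper).

    Exons [0:k] stay in place, exons [k:] are moved. The junction chosen is
    the one whose cumulative exonic length is closest to fraction * total.
    Assumes at least two exons, given as (start, end) half-open intervals.
    """
    lengths = [end - start for start, end in exons]
    target = fraction * sum(lengths)
    best_k, best_diff, cumulative = 1, None, 0
    for k in range(1, len(exons)):
        cumulative += lengths[k - 1]
        diff = abs(cumulative - target)
        if best_diff is None or diff < best_diff:
            best_k, best_diff = k, diff
    return best_k


def simulate_fusion(seq, exons_a, gene_b_end, fraction=0.5):
    """Copy the trailing exons of gene A downstream of gene B and mask them in A."""
    k = pick_breakpoint(exons_a, fraction)
    moved = exons_a[k:]
    # Concatenate exonic sequence only; introns are not copied.
    copied = "".join(seq[start:end] for start, end in moved)
    # Replace the copied exons of gene A with Ns (sequence length is preserved).
    chars = list(seq)
    for start, end in moved:
        chars[start:end] = "N" * (end - start)
    masked = "".join(chars)
    # Assumption: the copied sequence is inserted right after the end of gene B.
    return masked[:gene_b_end] + copied + masked[gene_b_end:]


# Toy genome: exon A1, intron, exon A2, intergenic, gene B, downstream region.
toy = "ACGT" + "tt" + "GGCC" + "tttt" + "CATG" + "tt"
print(simulate_fusion(toy, [(0, 4), (6, 10)], gene_b_end=18))
# → ACGTttNNNNttttCATGGGCCtt
```

In the toy output, the second exon of gene A is masked with Ns, and its sequence appears directly after gene B. The intron and the intergenic region are unchanged, as in the description above.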
Conda recipes for the bioconda channel.
CoVigator - Monitoring SARS-CoV-2 mutations
Some data analysis and prototypes from the CoVigator project
A Nextflow pipeline for NGS variant calling on SARS-CoV-2. It goes from FASTQ files to normalized and annotated VCF files produced by GATK, BCFtools, LoFreq and iVar.
EasyFuse is a pipeline for accurate fusion gene detection from RNA-seq data.
EasyFuse source code to build the Python package
Quantification of reads at defined positions to verify custom input sequences.
Python script for creating and editing Singularity images on HPC servers without sudo rights.
A test of GitHub Pages
Code related to the manuscript "Multiple instance learning to predict immune checkpoint blockade efficacy using neoantigen candidates"
Annotation of mutated peptide sequences with published or novel potential neoantigen descriptors
Repository to host tool-specific module files for the Nextflow DSL2 community!
Test data to be used for automated testing with the nf-core pipelines
Analysis code, data and figures on the omicron MHC binding paper
Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
In-silico method written in Python and R to determine HLA genotypes of a sample. seq2HLA takes standard RNA-Seq sequence reads in FASTQ format as input, uses a bowtie index comprising all HLA alleles and outputs the most likely HLA class I and class II genotypes (in 4-digit resolution), a p-value for each call, and the expression of each class.
R package to analyze aberrant splicing junctions in tumor samples to identify neoepitopes
Scripts related to the manuscript "Prediction of tumor-specific splicing from somatic mutations as a source of neoantigen candidates"
A study to compare structural variation (SV) predictions from 10X Genomics linked-read sequencing (10XWGS) and conventional Illumina short-read sequencing (cWGS).
An online cancer cell line catalogue integrating HLA type, predicted neo-epitopes, virus and gene expression
TronFlow documentation
Nextflow pipeline for BWA, BWA2 and STAR alignments
Nextflow pipeline for the preprocessing of BAM files based on GATK best practices. Marking duplicates, realignment around indels, base quality score recalibration (BQSR) and reporting of metrics are optional to maintain flexibility for different use cases.
A Nextflow workflow for copy number calling
A Nextflow pipeline implementing GATK's HaplotypeCaller best practices
A Nextflow workflow for HLA typing using HLA-HD