SplicingVariants_Beta

Developed and Maintained by Kaoru Ito (splicing.variant[at]gmail.com)

This document accompanies the paper "Identification of Pathogenic Mutations in LMNA and MYBPC3 That Alter RNA splicing" (Kaoru Ito and Parth Patel et al.), PNAS 2017.
In the paper, we chose possible splicing variants by in-silico prediction and tested them with a cell-based splicing assay. Several Perl and R scripts placed here were used to perform the analysis.

*** In the paper, we focused on splicing variants that create or lose a splice site. If you'd like to know about exon-skipping variants, please check SPANR (http://tools.genes.toronto.edu/) ***

What is a "Splicing Variant" (Splice-Altering Variant)?

In the interpretation of genetic variants, a synonymous mutation is considered benign. A missense mutation is on the fence in terms of its deleteriousness, whereas stop-gain/loss, frame-shift and splice-site (GT/AG-broken) mutations are considered damaging. These interpretations are NOT ALWAYS TRUE.
Supplementary Figure

Messenger RNA splicing, in which intron regions flanked by a splice donor site and a splice acceptor site are cut out, occurs before protein synthesis by mRNA translation.
During mRNA splicing, a nucleotide alteration that doesn't change the amino acid sequence can activate a cryptic splice site. This creates an aberrant splice donor/acceptor site in the middle of an exon or intron, leading to exon truncation or intron retention in the mature mRNA. Additionally, not only GT/AG-breaking changes but also nearby nucleotide changes (donor: -3 to +6 bp; acceptor: -20 to +3 bp from the splice junction) can disrupt the function of a canonical splice site.
Supplementary Figure
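The windows quoted above can be expressed as a small check. This is only an illustrative sketch, not part of the repository's scripts: the function name `in_splice_window` is hypothetical, and it assumes offsets are signed positions relative to the splice junction with no position 0.

```python
# Windows quoted in the text, relative to the splice junction (in bp).
# Assumption: offsets are signed integers and position 0 does not exist.
SPLICE_WINDOWS = {
    "donor": (-3, 6),      # donor: -3 to +6
    "acceptor": (-20, 3),  # acceptor: -20 to +3
}


def in_splice_window(site_type, offset):
    """Return True if `offset` lies inside the window for `site_type`."""
    if site_type not in SPLICE_WINDOWS:
        raise ValueError("site_type must be 'donor' or 'acceptor'")
    if offset == 0:
        raise ValueError("offset 0 is not used in this convention")
    low, high = SPLICE_WINDOWS[site_type]
    return low <= offset <= high


if __name__ == "__main__":
    for site, offset in [("donor", 5), ("donor", 7), ("acceptor", -20)]:
        print(site, offset, in_splice_window(site, offset))
```

A variant that passes this check is only a candidate for disrupting a canonical site; whether it actually does is what the scoring and the assay are for.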

Several papers have reported that such variants cause severe Mendelian disorders, such as progeria syndrome and dilated cardiomyopathy. However, because no method for detecting such splicing-altering variants in a high-throughput pipeline had been developed, these variants were overlooked in typical NGS settings.

Here we present a high-throughput-friendly method to detect candidate splicing-altering variants. Additionally, we developed a cell-based splicing assay utilizing NGS technology for multiplexed analysis, in which construct design and assessment of splicing alteration are automated.

Workflow

  1. Calculate scores to assess how likely each variant is to alter splicing, and choose candidates for the cell-based splicing assay
    -- Regress_Score.v0.##.R
  2. Design constructs for the candidates to perform cell-based splicing assay
    -- ConstructDesigner.v0.##.R
  3. Perform the cell-based splicing assay (transfect constructs into cells, extract RNA from the cells, prepare libraries for a MiSeq run)
  4. Count normal / aberrant splicing products and calculate p-values from raw FASTQ files (alignment not required)
    -- Make.inputframe.#.pl
    -- SpliceConstructSearchGrepV1.##.pl
    * # stands for the development version number. Please use the latest one.

To run each script, please read README_regress.score.md, README_construct.designer.md and README_spliceconstructsearch.md.
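Step 4 works directly on raw FASTQ reads without alignment. The sketch below illustrates one way such alignment-free counting could look, by searching each read for a junction sequence. It is an assumption for illustration and not the logic of SpliceConstructSearchGrepV1.##.pl: the function name `count_products`, the junction sequences and the toy reads are all hypothetical, and the p-value step is left out because its test is not described here.

```python
import io


def count_products(fastq_handle, normal_junction, aberrant_junction):
    """Count reads containing the normal or the aberrant junction sequence.

    Hypothetical illustration: a read is assigned to the first junction
    sequence it contains; everything else is counted as 'other'.
    """
    counts = {"normal": 0, "aberrant": 0, "other": 0}
    for index, line in enumerate(fastq_handle):
        # The sequence is the 2nd line of every 4-line FASTQ record.
        if index % 4 != 1:
            continue
        read = line.strip().upper()
        if normal_junction in read:
            counts["normal"] += 1
        elif aberrant_junction in read:
            counts["aberrant"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Toy FASTQ content (made up) standing in for a real file handle.
    toy_fastq = io.StringIO(
        "@r1\nAAACCCGGGTTT\n+\nIIIIIIIIIIII\n"
        "@r2\nAAACCCTTTGGG\n+\nIIIIIIIIIIII\n"
        "@r3\nTTTTTTTTTTTT\n+\nIIIIIIIIIIII\n"
    )
    print(count_products(toy_fastq, "CCCGGG", "CCCTTT"))
```

For the actual input format and options, the README files listed above remain the reference.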

Rationale for the cell-based splicing assay (p-values obtained from SpliceConstructSearchGrepV1.5.pl)

Because the p-value of the assay reflects the degree of aberrant splicing, which changes continuously, we investigated the relationship between the p-values and clinical diagnosis in variants from clinical databases (Supplementary Fig.). As expected, we found a significant correlation between the -log10 p-value of the assay and the severity of the clinical diagnosis (missense variants included: p=8.2e-06; missense variants excluded: p=1.2e-06; Kendall's rank correlation test). Since clinically likely-benign and benign variants never reached p<0.001, and only missense variants (which can be deleterious without affecting splicing) in the likely-pathogenic and pathogenic groups had p>0.001, we defined p=0.001 as the threshold to distinguish pathogenic from benign variants with respect to splicing.
Supplementary Figure
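Applying the threshold can be sketched as follows. This is a minimal illustration: the helper names and the example p-values are hypothetical and are not results from the paper.

```python
import math

# Threshold defined in the text above.
P_THRESHOLD = 0.001


def neg_log10(p_value):
    """Return -log10(p), the scale used for the correlation analysis."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value)


def is_splice_altering(p_value):
    """Return True if the assay p-value is below the 0.001 threshold."""
    return p_value < P_THRESHOLD


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    for name, p in [("variant_A", 2e-7), ("variant_B", 0.04)]:
        print(name, round(neg_log10(p), 2), is_splice_altering(p))
```

Note that a missense variant above the threshold can still be pathogenic through a mechanism other than splicing, so the call only concerns splicing.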

splicingvariants_beta's Issues

Locating the correct transcript failed?

I tried to use local data to assess the splicing potential of variants in patient samples. Here is my local input file 'mutfile' so that you can replicate the problem (a single line only):

chr11:47333236C>T ENST00000545968

Here is the command I ran:

```
R --slave --vanilla --args -mutfile test.mutfile -output mutscore.output \
-summarizeresult mutscore \
-sjdbout mutsjdbout \
-skipRegressScore -skipSRE -refdirectory ssfiles \
-skipcDNApos < Regress_Score.v0.97.R
```
But an error was thrown. Here is the error message:

Reading ISE sequences...
Reading a mutation file: test.mutfile
Reading Line # 1 : chr11:47333236C>T ENST00000545968 Accepted
Processing chr11:47333236C>T ENST00000545968 ...
Reverse strand gene found.
ChrPos information in a reverseStarnd gene. Therefore, change the allele info for cDNA.
Variant: chr11:47333236C>T is NOT on the indicated transcript: ENST00000545968 propably because it is on the downstream or UTR regions.
Writing splice site estimation result to: mutscore.output
Error in read.table(resultfilename, sep = "\t", header = T, stringsAsFactors = F) :
no lines available in input
Calls: TakeSigLines -> read.table

The script tried to find the annotation of a transcript on the forward strand (Transcript: MADD-203 ENST00000349238.7), not the specified ID 'ENST00000545968', which is on the reverse strand.
So I converted the variant to cDNA notation and tried again:
c.3288G>A ENST00000545968
With this format conversion, the processing seemed to go through.
I am not sure whether this is a bug.
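For reference, the allele part of that conversion can be sketched as below. This is an illustration written for this report, not code from Regress_Score: the names `parse_mutfile_line` and `alleles_on_transcript` are hypothetical, the accepted line format is inferred from the single example above, and the strand has to be supplied by the caller. The cDNA position (3288) cannot be derived without the transcript annotation, so only the alleles are converted.

```python
import re

# Base complements for single-nucleotide alleles.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed line format, inferred from the example: chrN:POSREF>ALT TRANSCRIPT
LINE_RE = re.compile(r"^(chr\w+):(\d+)([ACGT])>([ACGT])\s+(\S+)$")


def parse_mutfile_line(line):
    """Split a genomic-notation line into its fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized mutfile line: %r" % line)
    chrom, pos, ref, alt, transcript = match.groups()
    return chrom, int(pos), ref, alt, transcript


def alleles_on_transcript(ref, alt, strand):
    """Return (ref, alt) as read on the transcript for strand '+' or '-'."""
    if strand == "+":
        return ref, alt
    if strand == "-":
        return COMPLEMENT[ref], COMPLEMENT[alt]
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    chrom, pos, ref, alt, tx = parse_mutfile_line(
        "chr11:47333236C>T ENST00000545968"
    )
    # The strand is given by hand here: the transcript is on the reverse strand.
    print(tx, ">".join(alleles_on_transcript(ref, alt, "-")))
```

For the reverse strand the genomic C>T becomes G>A, which matches the alleles in the cDNA notation I used.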
