
nm_nanopore_drosophila's Introduction

Detection of Nm sites from Nanopore direct RNA sequencing datasets using Epinano-5mer analysis


General Description

This repository contains the scripts for analysing EpiNano results obtained from direct RNA sequencing of Drosophila rRNA upon knockdown of Fibrillarin (FBL). The script is customised for modifications whose signature in Nanopore sequencing is not limited to the modified position but spreads to the neighbouring positions (a 5-base interval, referred to here as a "kmer"), such as 2'-O-methylation (Nm).

Usage

For 5mer information, we use the EpiNano script TSV_to_Variants_Freq.py3 (see https://github.com/enovoa/EpiNano for this first step) to analyse the samtotsv output. This generates a .tsv.per.site.var.per_site_var.5mer.csv file for each sample and replicate.

If you have EpiNano output at the per-site level instead, you can extract information about the neighbouring positions at the 5mer level by running:

python Slide_Variants.py file.plus_strand.per.site.csv 5
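
The idea of moving from per-site values to 5mers can be sketched as a sliding window over consecutive positions. This is only an illustration of the concept, not the code of Slide_Variants.py, whose input columns and output layout are not described here; the function name `sliding_kmers` and the toy values are hypothetical.

```python
def sliding_kmers(values, k=5):
    """Return (center_index, window) pairs for every full window of k positions."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd number so each window has a center")
    half = k // 2
    return [
        (center, values[center - half:center + half + 1])
        for center in range(half, len(values) - half)
    ]


# Toy per-site values for 7 consecutive positions (made up for illustration).
per_site = [0.1, 0.2, 0.9, 0.3, 0.1, 0.0, 0.2]
windows = sliding_kmers(per_site, k=5)
for center, window in windows:
    print(center, window)
```

Positions closer than two bases to either end of the reference have no full 5mer and are skipped in this sketch.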

Once the EpiNano 5mer input is ready, run Epinano_fivemer_analysis.R interactively in RStudio.

!!! Before running the script, set your own input and output paths, file names and labels at the beginning of the script !!!

The script assumes 2 replicates of 2 conditions. If you have more replicates, you will have to edit the script accordingly. Each step is explained in the script itself.

Input

  • EpiNano 5mer table for each sample (.tsv.per.site.var.per_site_var.5mer.csv)

  • a bed file with predicted modified sites if available (Nm_pos.bed)

You can find examples of the input files in the "input" folder.
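
If you want to inspect a bed file of predicted sites before running the analysis, a minimal Python sketch such as the one below can help. It assumes only the standard BED layout (reference name, 0-based start, exclusive end, then optional columns); the function name `read_bed_positions` and the example coordinates are hypothetical and are not taken from Nm_pos.bed.

```python
def read_bed_positions(lines):
    """Parse BED lines into (reference, start, end) tuples, skipping headers and blanks."""
    sites = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        # BED coordinates: start is 0-based, end is exclusive.
        sites.append((fields[0], int(fields[1]), int(fields[2])))
    return sites


# Hypothetical content; for a real file, pass an open file handle instead.
example = ["# made-up example", "18S\t99\t100", "28S\t1499\t1500\tNm_example"]
sites = read_bed_positions(example)
print(sites)
```

For a file on disk, `with open("Nm_pos.bed") as handle: sites = read_bed_positions(handle)` would work the same way.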

Results

The script produces as output:

  • FBL_kmer_rep1_colors.bed and FBL_kmer_rep2_colors.bed contain a color code for the EpiNano score of the kmer centered on each position in the reference. These bed files let you visualise the scores in IGV (yellow to red = low to high score) for comparison.

  • FBL_kmer_rep1_all.txt and FBL_kmer_rep2_all.txt contain the EpiNano score of the kmer centered on each position in the reference. Columns 4 and 5 contain the sum of errors for each kmer in the non-treated and FBL KD samples respectively; column 6 contains the difference between these two sums, referred to as the EpiNano score. This file can be converted into a bedgraph simply by changing the extension from .txt to .bedgraph, and can then be loaded in IGV and displayed as a barplot or heatmap.

  • FBL_kmer_replicable_positions_18S.txt and FBL_kmer_replicable_positions_28S.txt list, for each long rRNA transcript, the sites whose score is higher than 3 times the median score of the other positions in both replicates.

  • a barplot of the scaled difference of the summed errors per 5mer (the score) along the transcript, for each reference and each comparison.
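
The scoring and filtering described above can be sketched in Python as follows. This is not the R script itself but one possible reading of the description: the direction of the difference (non-treated minus FBL KD) is an assumption, as is taking the median over all other scored positions. All function names and toy values are hypothetical.

```python
from statistics import median


def kmer_scores(untreated, knockdown, k=5):
    """Sum per-site errors over each k-wide window and return the difference per center."""
    half = k // 2
    scores = {}
    for center in range(half, len(untreated) - half):
        window = slice(center - half, center + half + 1)
        # Assumption: score = non-treated minus knockdown; the sign is not stated in the text.
        scores[center] = sum(untreated[window]) - sum(knockdown[window])
    return scores


def candidate_positions(scores, factor=3):
    """Keep positions whose score exceeds factor * median of all other positions' scores."""
    kept = set()
    for pos, score in scores.items():
        others = [s for p, s in scores.items() if p != pos]
        if others and score > factor * median(others):
            kept.add(pos)
    return kept


def replicable_positions(scores_rep1, scores_rep2, factor=3):
    """Return the sorted positions that pass the threshold in both replicates."""
    return sorted(
        candidate_positions(scores_rep1, factor) & candidate_positions(scores_rep2, factor)
    )


# Toy per-site errors (made up): the signal spreads around one site.
untreated = [0.0, 0.25, 0.5, 0.25, 0.0, 0.0, 0.0]
knockdown = [0.0] * 7
print(kmer_scores(untreated, knockdown))

# Toy scores per position for two replicates (made up).
rep1 = {10: 0.2, 11: 0.1, 12: 2.0, 13: 0.1, 14: 0.3}
rep2 = {10: 0.1, 11: 0.9, 12: 1.5, 13: 0.2, 14: 0.1}
print(replicable_positions(rep1, rep2))
```

In the toy data, position 11 passes the threshold only in the second replicate, so it is dropped; requiring both replicates is what makes a site "replicable" in this sketch.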

Dependencies and versions

Software    Version
RStudio     4.2
python      3

Citation

If you find this work useful, please cite:

Sklias A, Cruciani S, Marchand V, et al. Comprehensive map of ribosomal 2'-O-methylation and C/D box snoRNAs in Drosophila melanogaster. Nucleic Acids Res. Published online February 28, 2024. doi:10.1093/nar/gkae139

Contact

If you have any issues running this code, please check the previously raised issues first. If you still can't solve the problem based on those issues and their responses, please open a new issue. Thanks!
