pitviper's People

Contributors

camillelobry, paularthurm

pitviper's Issues

Save alignment statistics - Bowtie2

Alignment statistics should be saved in a results subdirectory, such as results/{token}/bowtie2/statistics.txt.

  • Save statistics in a text file
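A minimal sketch of how the statistics could be written to the suggested location. Bowtie2 prints its alignment summary to stderr, so the text is assumed to have been captured already; the function name `save_alignment_stats` and its parameters are hypothetical and not part of PitViper.

```python
from pathlib import Path


def save_alignment_stats(stats_text, token, results_dir="results"):
    """Write an alignment summary to <results_dir>/<token>/bowtie2/statistics.txt.

    Hypothetical helper: `stats_text` is assumed to be the summary that
    Bowtie2 printed to stderr, captured by the caller.
    """
    out_file = Path(results_dir) / token / "bowtie2" / "statistics.txt"
    # Create results/<token>/bowtie2/ if it does not exist yet.
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(stats_text)
    return out_file
```

In a Snakemake rule, the same result can usually be obtained by redirecting stderr of the bowtie2 command to the statistics file.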

Bowtie2

Add bowtie2=2.2.4 and samtools=1.14 to env.

To do:

  • Preset options:

--very-fast
Same as: -D 5 -R 1 -N 0 -L 25 -i S,1,2.00

--fast
Same as: -D 10 -R 2 -N 0 -L 22 -i S,1,1.75

--sensitive
Same as: -D 15 -R 2 -N 0 -L 20 -i S,1,0.75 (default in --local mode)

--very-sensitive
Same as: -D 20 -R 3 -N 0 -L 20 -i S,1,0.50
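One possible way to expose the presets is to validate the user's choice and build the Bowtie2 argument list from it. This is only a sketch: the function name `bowtie2_command`, its parameters, and the single-end input (`-U`) are assumptions made for illustration.

```python
# Preset names copied from the list above.
PRESETS = ("--very-fast", "--fast", "--sensitive", "--very-sensitive")


def bowtie2_command(index, reads, output, preset="--sensitive", threads=1, local=False):
    """Build a Bowtie2 argument list for unpaired reads (hypothetical helper)."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset {preset!r}, expected one of {PRESETS}")
    cmd = ["bowtie2", preset, "-p", str(threads)]
    if local:
        # Local alignment mode; assumed to be optional here.
        cmd.append("--local")
    # -x: index basename, -U: unpaired reads, -S: SAM output file.
    cmd += ["-x", index, "-U", reads, "-S", output]
    return cmd
```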

Hardcoded cut-off

The minimum read count cut-offs are currently hardcoded and should be set from the GUI.

Remove:

sgRNA_to_keep = sum_by_group[sum_by_group.value > 50].sgRNA.values

and move to the GUI:

cts['below_threshold'] = cts['value'] < 100

A refactoring of the previous script is necessary to make the purpose of the script clear.

  • Add docstrings
  • Add a main function
  • Remove unused functions
  • Threshold from GUI/config
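As a sketch of the last point, the two hardcoded values (50 and 100) could become parameters that the GUI or the configuration file supplies. The function name `apply_read_thresholds`, the parameter names, and the assumption that `cts` has an `sgRNA` column are hypothetical.

```python
import pandas as pd


def apply_read_thresholds(cts, sum_by_group, min_total_reads, min_reads):
    """Filter sgRNAs and flag low counts with caller-supplied thresholds.

    Assumes (not confirmed by the original script) that both tables have
    'sgRNA' and 'value' columns.
    """
    # Keep sgRNAs whose summed count exceeds the first threshold.
    sgrna_to_keep = sum_by_group[sum_by_group.value > min_total_reads].sgRNA.values
    kept = cts[cts.sgRNA.isin(sgrna_to_keep)].copy()
    # Flag individual counts that fall below the second threshold.
    kept["below_threshold"] = kept["value"] < min_reads
    return kept
```

Passing the thresholds as arguments keeps the function independent of where the values come from, so the same code works with a GUI widget or a configuration entry.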

Add Union button

Add a checkbox to select the union of results instead of the default intersection in the integration module.
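The selection logic behind such a checkbox could look like the sketch below. The function name `select_hits` and the dictionary layout (tool name mapped to a list of hit names) are assumptions; the actual integration module may store results differently.

```python
def select_hits(hits_per_tool, use_union=False):
    """Combine per-tool hit lists by union or, by default, intersection."""
    hit_sets = [set(hits) for hits in hits_per_tool.values()]
    if not hit_sets:
        return set()
    if use_union:
        # Elements reported by at least one tool.
        return set.union(*hit_sets)
    # Elements reported by every tool.
    return set.intersection(*hit_sets)
```

The state of the checkbox would then simply be passed as `use_union`.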

Run PitViper with subprocess

The function could run the command with Python's subprocess module instead of os.system. This gives access to the return code and output of the command, so the caller can act on them. Building the command with f-strings also makes it easier to read and debug. Here is an example of how the function could be improved:

import subprocess

def run_pitviper(configfile, dry_run, jobs, notebook):
    """Run the PitViper Snakemake workflow and return the completed process."""
    # Only pass --edit-notebook when a notebook path was given.
    nb_opt = f"--edit-notebook {notebook}" if notebook else ""
    if dry_run:
        cmd = f"snakemake -s workflow/Snakefile -n --configfile {configfile} --use-conda --cores {jobs}"
    else:
        cmd = f"snakemake -s workflow/Snakefile --configfile {configfile} --use-conda --cores {jobs} {nb_opt}"
    print("Command:", cmd)
    # capture_output exposes stdout/stderr; text=True decodes them as str.
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode == 0:
        print("Command succeeded")
    else:
        print("Command failed:", result.stderr)
    return result

  • Use subprocess instead of os.system

Normalization

To do:

  • Use raw counts for RRA and MLE
  • Add normalization option to RRA
  • Review the normalization process
  • Remove normalization option of MAGeCK counts

Lineplots of normalized counts

Improve visualization by:

  • Compute, display and link the mean normalized read counts of replicates at each time point
  • Display all individual points
  • Show normalization method
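The first point can be sketched without any plotting code: group the normalized counts by time point and average the replicates. The function name `mean_by_timepoint` and the record layout `(timepoint, replicate, normalized_count)` are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def mean_by_timepoint(records):
    """Average normalized counts over replicates for each time point.

    `records` is an iterable of (timepoint, replicate, normalized_count)
    tuples; this layout is hypothetical.
    """
    grouped = defaultdict(list)
    for timepoint, _replicate, count in records:
        grouped[timepoint].append(count)
    return {timepoint: mean(values) for timepoint, values in grouped.items()}
```

The resulting means can be drawn as a connected line, with the individual replicate values overlaid as points.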

Heatmap with zeros

Example: the SFPQ gene raises an error when using show_sgRNA_counts(token):

ValueError: The condensed distance matrix must contain only finite values.

A non-zero value (pseudocount) should be added to all cells before passing the matrix to Clustergrammer2.
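A minimal sketch of the proposed fix, assuming the counts are held in a pandas DataFrame and that the error comes from zero-only rows or columns. The function name `add_pseudocount` and the default value of 1 are hypothetical.

```python
import pandas as pd


def add_pseudocount(counts, pseudocount=1.0):
    """Return a copy of the count matrix with a constant added to every cell."""
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    # Arithmetic on a DataFrame returns a new object; the input is untouched.
    return counts + pseudocount
```

The shifted matrix would be used for clustering only; the original counts can still be shown in tables.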

Improve tool implementation in report

Extensibility was one of the primary goals of PitViper. However, the current implementation of the report makes it difficult and tedious to add new tools.

We should find a way to generalize how all functions are used.

Creating a tool-agnostic class as a common interface for all results could be a solution. This class should have several characteristics:

  • Generalization: should work with different results and metrics, such as FDR, Bayes factor or any other uncertainty measure.
  • Common interface: the API should be consistent across all methods.

To add a new method, two features are mandatory: the names of the elements and at least one numerical score to rank them (FDR, Bayes factor, etc.).
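One way such an interface could look is sketched below. The class name `ToolResult`, its fields, and the `ranked` method are hypothetical; the sketch only enforces the two mandatory features described above (element names and at least one numerical score).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolResult:
    """Tool-agnostic container for the results of one method (hypothetical)."""

    tool: str
    elements: List[str]
    scores: Dict[str, List[float]]
    # Per-score ranking direction; scores default to "lower is better" (e.g. FDR).
    lower_is_better: Dict[str, bool] = field(default_factory=dict)

    def __post_init__(self):
        if not self.scores:
            raise ValueError("at least one numerical score is required")
        for name, values in self.scores.items():
            if len(values) != len(self.elements):
                raise ValueError(f"score {name!r} does not match the number of elements")

    def ranked(self, score):
        """Return element names sorted from best to worst for one score."""
        ascending = self.lower_is_better.get(score, True)
        pairs = zip(self.elements, self.scores[score])
        ordered = sorted(pairs, key=lambda pair: pair[1], reverse=not ascending)
        return [name for name, _ in ordered]
```

With a container like this, report functions would only depend on `elements`, `scores` and `ranked`, so adding a tool would mean writing one parser that fills the class.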

Rename in-house method

Find a better and more representative name for "in-house method". Then rename it in all scripts... :(
