wisecondorff's Introduction

WisecondorFF

A whole-genome sequencing based copy number variation detection tool. WisecondorFF builds further upon WisecondorX by integrating fragment size inferred from paired-end reads and combining these data with the within-sample comparison based on genome read coverage.

wisecondorff's Issues

Add support for `_alt` chromosomes

Hi,

Would it be possible to add support for _alt chromosomes? This could be achieved by adding the read counts of the _alt contigs to their canonical counterparts. Without this step, you get a messed-up image when the reads are aligned to a full hg38 reference.

Matthias
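The idea in this request can be sketched as follows. This is only an illustration of folding `_alt` counts into their canonical chromosome, not code from WisecondorFF: the function name `fold_alt_counts`, the per-contig dictionary layout, and the example contig names are assumptions. It works on per-contig totals; merging per-bin counts would additionally need a mapping from alt coordinates to primary coordinates, which is not shown here.

```python
from collections import defaultdict


def fold_alt_counts(read_counts):
    """Add the read count of every *_alt contig to its canonical chromosome.

    read_counts: dict mapping contig name -> read count (hypothetical layout).
    hg38-style alt names look like 'chr1_KI270762v1_alt', so the canonical
    name is taken to be everything before the first underscore.
    """
    merged = defaultdict(int)
    for contig, count in read_counts.items():
        if contig.endswith("_alt"):
            canonical = contig.split("_", 1)[0]
        else:
            canonical = contig
        merged[canonical] += count
    return dict(merged)


# Example input with illustrative contig names and made-up counts.
counts = {
    "chr1": 1000,
    "chr1_KI270762v1_alt": 25,
    "chr2": 900,
    "chr2_KI270773v1_alt": 10,
}
folded = fold_alt_counts(counts)
print(folded)
```

With the made-up counts above, the alt reads end up in the `chr1` and `chr2` totals and no `_alt` keys remain.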

Error using detect command

I get a ValueError when using the detect command.

python3 main.py detect -i infile.npz -r reference.250kb.npz -o output

The reference was built using 19 NK samples (the log file for reference building is in #1).

Traceback (most recent call last):
  File "/projects/karsanlab/jbridgers_dev/PEGASUS/KARSANBIO-2955_Testing_WisecondorFF/WisecondorFF/src/main.py", line 1203, in <module>
    main()
  File "/projects/karsanlab/jbridgers_dev/PEGASUS/KARSANBIO-2955_Testing_WisecondorFF/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/projects/karsanlab/jbridgers_dev/PEGASUS/KARSANBIO-2955_Testing_WisecondorFF/WisecondorFF/src/main.py", line 1199, in main
    args.func(args)
  File "/projects/karsanlab/jbridgers_dev/PEGASUS/KARSANBIO-2955_Testing_WisecondorFF/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/projects/karsanlab/jbridgers_dev/PEGASUS/KARSANBIO-2955_Testing_WisecondorFF/WisecondorFF/src/main.py", line 900, in wcr_detect
    _rc_results["results_nr"] + (_fs_results["results_nr"])
ValueError: operands could not be broadcast together with shapes (761,19) (761,) 

I'm not sure what the second shape is or where the 761 value comes from, but the 19 probably corresponds to the 19 samples used to generate the reference. WisecondorX recommends at least 50 reference samples. Is the small reference set a problem for WisecondorFF?
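For readers unfamiliar with the message, the snippet below reproduces the same NumPy error with two placeholder arrays of the reported shapes. It only shows what the message means: a 2-D array of shape (761, 19) cannot be added to a 1-D array of shape (761,), because broadcasting aligns trailing dimensions (19 against 761). The variable names `rc` and `fs` are stand-ins, and the reshape at the end is shown purely to illustrate the broadcasting rule; whether it is the appropriate fix for WisecondorFF is not established here.

```python
import numpy as np

# Placeholder arrays with the shapes from the traceback (contents are arbitrary).
rc = np.zeros((761, 19))
fs = np.zeros(761)

try:
    rc + fs  # trailing dimensions 19 and 761 do not match
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Adding an explicit trailing axis makes the shapes (761, 19) and (761, 1),
# which NumPy can broadcast.
combined = rc + fs[:, np.newaxis]
print(combined.shape)
```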

Error creating WisecondorFF reference

I'm having some issues building the WisecondorFF reference, but no issues converting from BAM to NPZ. I ran it with the log level set to debug. For context, I'm creating the reference from 97 normal samples (97 NPZ files). A snippet is below and the log file is attached.

...
[INFO - 2022-03-17 15:46:23]: Removed 11512 bins that have < 500 observations
[INFO - 2022-03-17 15:46:23]: Scaling up by a factor of 50 from 5,000 to 250,000.
../WisecondorFF/src/main.py:343: RuntimeWarning: invalid value encountered in true_divide
  all_data = all_data / sum_per_sample
../WisecondorFF/src/main.py:346: RuntimeWarning: invalid value encountered in greater
  mask = sum_per_bin > 0
Traceback (most recent call last):
  File "../WisecondorFF/src/main.py", line 1171, in <module>
    main()
  File "../WisecondorFF/src/main.py", line 39, in wrap
    output = f(*args, **kwargs)
  File "../WisecondorFF/src/main.py", line 1167, in main
    args.func(args)
  File "../WisecondorFF/src/main.py", line 39, in wrap
    output = f(*args, **kwargs)
  File "../WisecondorFF/src/main.py", line 541, in wcr_reference
    args.binsize, args.refsize, fs_samples, fs_total_mask, fs_bins_per_chr, rc=False
  File "../WisecondorFF/src/main.py", line 389, in reference_prep
    pca_corrected_data, pca = train_pca(masked_data)
  File "../WisecondorFF/src/main.py", line 354, in train_pca
    pca.fit(t_data)
  File "/home/dlin/.conda/envs/pegasus2_pipeline/lib/python3.6/site-packages/scikit_learn-0.22.1-py3.6-linux-x86_64.egg/sklearn/decomposition/_pca.py", line 344, in fit
    self._fit(X)
  File "/home/dlin/.conda/envs/pegasus2_pipeline/lib/python3.6/site-packages/scikit_learn-0.22.1-py3.6-linux-x86_64.egg/sklearn/decomposition/_pca.py", line 391, in _fit
    copy=self.copy)
  File "/home/dlin/.conda/envs/pegasus2_pipeline/lib/python3.6/site-packages/scikit_learn-0.22.1-py3.6-linux-x86_64.egg/sklearn/utils/validation.py", line 594, in check_array
    context))
ValueError: Found array with 0 feature(s) (shape=(97, 0)) while a minimum of 1 is required.

peg2_ref-n519-29682026.txt
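The two warnings and the final `shape=(97, 0)` are consistent with one possible failure mode, sketched below under stated assumptions: if any single sample has a total of zero, dividing by the per-sample sum produces NaN for that sample, every per-bin sum then becomes NaN, and a `> 0` mask rejects all bins. The function `normalise_and_mask` and the bins-by-samples layout are hypothetical and only mimic the two lines quoted in the log; this is not a confirmed diagnosis of the attached data.

```python
import numpy as np


def normalise_and_mask(all_data):
    """Normalise a bins x samples matrix per sample, then mask empty bins.

    Hypothetical helper that mirrors the two log lines
    'all_data = all_data / sum_per_sample' and 'mask = sum_per_bin > 0'.
    """
    sum_per_sample = np.sum(all_data, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        normalised = all_data / sum_per_sample  # 0 / 0 -> NaN for empty samples
    sum_per_bin = np.sum(normalised, axis=1)    # NaN propagates to every bin
    mask = sum_per_bin > 0                      # NaN > 0 evaluates to False
    return normalised, mask, sum_per_sample


# Three bins, three samples; the third sample contains no data at all.
counts = np.array([
    [10.0, 5.0, 0.0],
    [20.0, 5.0, 0.0],
    [30.0, 0.0, 0.0],
])
normalised, mask, sum_per_sample = normalise_and_mask(counts)

features = normalised[mask, :].T             # samples x remaining bins
empty_samples = np.flatnonzero(sum_per_sample == 0)

print(features.shape)
print(empty_samples)
```

In this toy case a single empty sample is enough to leave zero features for the PCA step, so checking the per-sample totals of the input NPZ files may be a useful first diagnostic.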

Errors making detection and creating reference

Hello! I'm trying to run WisecondorFF with a reference NPZ that I made for WisecondorX, but I get an error:

$ python ~/bin/WisecondorFF/src/main.py detect -i npz/sample_1.npz -r ref/reference_60kb.npz -o results/sample_1
Traceback (most recent call last):
  File "/home/vray/bin/WisecondorFF/src/main.py", line 1203, in <module>
    main()
  File "/home/vray/bin/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 1199, in main
    args.func(args)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 864, in wcr_detect
    ref = ref_npz["reference"].item()
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/numpy/lib/npyio.py", line 259, in __getitem__
    raise KeyError("%s is not a file in the archive" % key)
KeyError: 'reference is not a file in the archive'
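The `KeyError` means the archive has no entry named `reference`, which is the key the traceback shows `wcr_detect` reading. A quick way to see what an NPZ file actually contains is to list its keys with NumPy, as in the sketch below. The helper `has_reference_key`, the temporary file, and the stand-in keys `binsize` and `mask` are assumptions for illustration and do not describe the real layout of a WisecondorX reference.

```python
import os
import tempfile

import numpy as np


def has_reference_key(npz_path, key="reference"):
    """Return (key present?, list of keys stored in the NPZ archive)."""
    with np.load(npz_path, allow_pickle=True) as archive:
        keys = list(archive.files)
    return key in keys, keys


# Build a small stand-in archive that lacks a 'reference' entry.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "other_tool_reference.npz")
np.savez(path, binsize=np.array(60000), mask=np.array([True, False]))

found, keys = has_reference_key(path)
print(found, keys)
```

Running the same check on the real reference file shows which keys it stores and therefore whether it was written in the layout that `detect` expects.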

I then tried to build a reference with the WisecondorFF reference command, but that fails too:

$ python ~/bin/WisecondorFF/src/main.py reference -i npz/*.npz -o ref_50kb.npz
Traceback (most recent call last):
  File "/home/vray/bin/WisecondorFF/src/main.py", line 1203, in <module>
    main()
  File "/home/vray/bin/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 1199, in main
    args.func(args)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 41, in wrap
    output = f(*args, **kwargs)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 537, in wcr_reference
    args.binsize, args.refsize, fs_samples, fs_total_mask, fs_bins_per_chr, rc=False
  File "/home/vray/bin/WisecondorFF/src/main.py", line 389, in reference_prep
    pca_corrected_data, pca = train_pca(masked_data)
  File "/home/vray/bin/WisecondorFF/src/main.py", line 354, in train_pca
    pca.fit(t_data)
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sklearn/decomposition/_pca.py", line 351, in fit
    self._fit(X)
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sklearn/decomposition/_pca.py", line 398, in _fit
    ensure_2d=True, copy=self.copy)
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sklearn/base.py", line 420, in _validate_data
    X = check_array(X, **check_params)
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/vray/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sklearn/utils/validation.py", line 661, in check_array
    context))
ValueError: Found array with 0 feature(s) (shape=(9, 0)) while a minimum of 1 is required.

I tried setting RC_CLIP_ABS lower, but it doesn't help.

This is one of my NPZ files:
sample_1.zip

For alignment I use bwa-mem2.
