christacaggiano / celfie
cfDNA cell type of origin estimation
License: GNU Affero General Public License v3.0
Hi Christa, our thesis student added a random seed flag for numpy in celfie so that the output is reproducible (https://github.com/rmvpaeme/celfie/blob/ad1ff869ad08cc382a84ff7addd0af59d74a1519/EM/em.py#L360). It might be worth adding this to the main branch.
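For readers who want to see what such a flag could look like, here is a minimal sketch. It is not the code from the linked fork: the `--seed` flag name and the `random_start` helper are hypothetical, and the sketch assumes the random initialisation draws from numpy's global random state.

```python
import argparse

import numpy as np


def parse_args(argv=None):
    """Parse a hypothetical --seed flag (illustrative name, not celfie's actual CLI)."""
    parser = argparse.ArgumentParser(description="Seeded random start (sketch)")
    parser.add_argument("--seed", type=int, default=None,
                        help="integer seed for numpy's global random state")
    return parser.parse_args(argv)


def random_start(n_samples, n_cell_types, seed=None):
    """Draw random starting proportions; each row is normalised to sum to 1."""
    if seed is not None:
        # Seeding numpy's global state makes every later np.random draw repeatable.
        np.random.seed(seed)
    alpha = np.random.uniform(size=(n_samples, n_cell_types))
    return alpha / alpha.sum(axis=1, keepdims=True)


args = parse_args(["--seed", "42"])
first = random_start(2, 4, seed=args.seed)
second = random_start(2, 4, seed=args.seed)
```

With the same seed, `first` and `second` are identical, so repeated runs would start the EM from the same point. Without a seed, each run starts somewhere else.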
Dear Christa,
Thank you for developing celfie.
I am trying to apply the analysis to a very controlled dilution experiment we conducted, but I was puzzled to see that celfie often returned negative fractions in the cell type estimation. Different runs also lead to very different results (numerical instability?). Is this behaviour expected? I checked the input files and they seem to be fine.
I am deconvolving 20 samples simultaneously using 3 known pure components + 1 unknown. Other, much simpler tools based on quadratic programming or regression on beta values seem to capture the correct fractions easily, to within a few percent of error.
Best
Hi Christa,
Thank you for developing CelFiE.
When I used tim.sh to generate my own TIMs, I found that the tim_summed.txt output file was for CpG sites rather than for regions. After debugging, I found an error in this line: "awk -v window="$window_size" '{print $1 "\t" $2-$window "\t" $3+$window}' $output_file"sites" > $output_file""$window_size". There should be no $ before window inside awk.
Hope this is helpful.
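Some background on why the stray $ matters: inside awk, `window` is the variable set with -v, while `$window` means "the field whose number is the value of window". With a three-column input that field does not exist, so it evaluates to 0 and the coordinates come out unchanged. The intended arithmetic can be sketched in Python as follows. The `expand_window` name is hypothetical, and the sketch assumes tab-separated lines of chromosome, start, end.

```python
def expand_window(line, window):
    """Widen one 'chrom<TAB>start<TAB>end' line by `window` bases on each side.

    Mirrors the corrected awk expression: $1, $2-window, $3+window.
    Like the awk line, it does not clamp negative start positions.
    """
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return f"{chrom}\t{int(start) - window}\t{int(end) + window}"


region = expand_window("chr1\t1000\t1001", 250)
```

Here a single CpG at chr1:1000-1001 becomes the region chr1:750-1251, which is what the summed TIM file is expected to contain.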
Congratulations on the manuscript: https://www.biorxiv.org/content/10.1101/2020.01.15.907022v1
I think this is a really interesting concept.
It looks like you plan to put notebooks here at some point: https://github.com/christacaggiano/celfie/tree/master/jupyter_notebooks
Please update this issue when you do, as it would be great to see.
Best, Evan Biederstedt
Hello,
I'm trying to run CelFiE from the celfie_env conda environment.
Command:
python /path/celfie/TIMs/tim.py input_path output 100 27 15 2
And I get the following output:
Traceback (most recent call last):
File "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages/numpy/core/__init__.py", line 22, in <module>
from . import multiarray
File "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages/numpy/core/multiarray.py", line 12, in <module>
from . import overrides
File "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages/numpy/core/overrides.py", line 7, in <module>
from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/cs/icore/ekushele/celfie/TIMs/tim.py", line 2, in <module>
import numpy as np
File "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages/numpy/__init__.py", line 150, in <module>
from . import core
File "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages/numpy/core/__init__.py", line 48, in <module>
raise ImportError(msg)
ImportError:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
* The Python version is: Python3.9 from "/cs/icore/ekushele/miniconda3/envs/celfie_env/bin/python"
* The NumPy version is: "1.21.3"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: No module named 'numpy.core._multiarray_umath'
My conda_env.yml file (I upgraded numpy to 1.21.3):
name: celfie
channels:
- defaults
dependencies:
- numpy=1.21.2
- pandas
- bottleneck
- seaborn
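The traceback above shows numpy being imported from the python-3.7 environment's site-packages while the interpreter is Python 3.9 from celfie_env. One possible cause, which is an assumption and not confirmed in this thread, is that PYTHONPATH or a similar setting puts another environment's packages on the import path. A small diagnostic sketch follows; the `foreign_paths` and `report` helpers are hypothetical names.

```python
import os
import sys


def foreign_paths(paths, prefix):
    """Return site-packages entries in `paths` that live outside the env at `prefix`."""
    root = os.path.normpath(prefix) + os.sep
    return [
        p for p in paths
        if "site-packages" in p and not os.path.normpath(p).startswith(root)
    ]


def report():
    """Print the interpreter, PYTHONPATH, and any import paths outside this env."""
    print("interpreter:", sys.executable)
    print("PYTHONPATH :", os.environ.get("PYTHONPATH", "(unset)"))
    for p in foreign_paths(sys.path, sys.prefix):
        print("outside env:", p)


example = foreign_paths(
    [
        "/usr/local/hurcs/miniconda3/envs/python-3.7/lib/python3.7/site-packages",
        "/cs/icore/ekushele/miniconda3/envs/celfie_env/lib/python3.9/site-packages",
    ],
    "/cs/icore/ekushele/miniconda3/envs/celfie_env",
)
```

Calling `report()` from inside the activated environment lists any site-packages directory that belongs to a different installation. A user-level site-packages directory would also be listed, so each entry still needs to be checked by hand.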
Hi,
When I use one sample, celfie runs well:
python celfie.py celfie_samples_combined_ref.txt dir 1
But when I try the same with two samples, it fails:
python celfie.py celfie_samples_combined_ref.txt dir 2
Here is the error message:
Traceback (most recent call last):
File "celfie.py", line 373, in <module>
alpha, gamma, ll = em(
File "celfie.py", line 182, in em
a, g = maximization(p0, p1, x, x_depths, y, y_depths)
File "celfie.py", line 131, in maximization
new_alpha[n, :] = np.dot(p1[:, :, n], x[n, :]) + np.matmul(
File "<__array_function__ internals>", line 200, in dot
TypeError: can't multiply sequence by non-int of type 'float'
The input is in the correct format. Do you have any idea what could be going wrong here? Thanks.
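For context, Python raises this exact message when a sequence such as a string or a list is multiplied by a float, so at least one value reaching np.dot may not have been parsed as a number. The sketch below reproduces the message and scans one input line for fields that do not parse as floats. The `non_numeric_fields` helper is hypothetical, and the layout (tab-separated, three leading coordinate columns) is an assumption about the input, not something stated in this thread.

```python
def non_numeric_fields(line, sep="\t", skip=3):
    """Return (column_index, value) pairs after the first `skip` columns
    whose value cannot be parsed as a float."""
    bad = []
    for idx, value in enumerate(line.rstrip("\n").split(sep)):
        if idx < skip:
            continue  # assumed coordinate columns: chrom, start, end
        try:
            float(value)
        except ValueError:
            bad.append((idx, value))
    return bad


# Reproduce the error message with plain Python: a string times a float.
try:
    "0.5" * 0.3
except TypeError as err:
    message = str(err)

suspect = non_numeric_fields("chr1\t10\t20\t5\t.\t7")
```

Running the helper over every line of the input file would point to any column that holds a placeholder or text instead of a count.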
Hi Christa,
Thank you for developing CelFiE. I am trying to run CelFiE on 2 samples (adult normal lung tissues-WGBS). After getting the coverage file from Bismark, I ran your prepare_bismark.sh script and created the input to CelFiE with the reference TIMs you provided. I ran CelFiE using the following command for both samples:
python em.py SRR3269863_reference_file_tims.txt 00-deconvolution_with_celfie 1 1000 0 1 0.001 1
python em.py SRR3274240.2_reference_file_tims.txt 00-deconvolution_with_celfie 1 1000 0 1 0.001 100
However, the cell_proportions output from CelFiE indicates >90% placenta for both samples, which should not be the case.
I am attaching the input files here for each sample as well as the output from CelFiE. I assumed that the output array from the pickle file is ordered in the same way as your reference_file_key.txt.
SRR3269863_reference_file_tims.txt
SRR3274240.2_reference_file_tims.txt
CelFiE_deconvolution_results.xlsx
Any help will be appreciated.
Thanks, Fayaz