monod2's Issues

Interpreting methylation haplotypes

Hi Dinh,

I'm running your tool to extract methylation haplotypes from bam files, and I'm currently comparing the reads displayed in IGV with the methylation haplotypes generated from your tool. I'm seeing discrepancies, and I'm wondering if you can help me interpret the results.

My BAM file was aligned with bsmap, and the following command was used to generate the haplotype file:
sh scripts/bam2cghap_v1.sh WGBS ng.3805/MHBS.txt /allcpg/hg19.fa.allcpgs.txt.gz test.bam test

I compared the output with IGV:

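[Screenshot: IGV bisulfite CG view of the region with genomic coordinates, followed by the bam2cghap_v1.sh output; image not preserved]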

The screenshot is from IGV's bisulfite CG mode, in which blue marks bases that are not methylated at CG motifs, corresponding to T's in your haplotype block output. A methylated base is colored red, corresponding to C's in your haplotype block output. I have added the genomic coordinates directly below the reads in IGV, and the output from bam2cghap_v1.sh is shown below the screenshot.

Something isn't matching up: we don't see any reads with C's in IGV, and the T's don't account for all the reads in the region in IGV. I'm not sure whether you have used IGV as a gold standard for checking your methylation haplotypes. Could the aligner (bsmap) be contributing to this?

A second example, from your data:
./scripts/make-mappable-bins.sh BAMfiles/CCT.bam 10
sh scripts/bam2cghap_v1.sh WGBS CCT.RD10_80up.genomecov.bed allcpg/hg19.fa.allcpgs.txt.gz BAMfiles/CCT.bam CCT

[Screenshot: IGV view of the region, taken 2018-01-19; image not preserved]

And here's the haplotype file in the region:

chr11:377279-377364 CCCCCCCCCCCCCC 1 377282,377288,377304,377312,377314,377316,377321,377324,377336,377340,377344,377346,377348,377356
chr11:377279-377364 CCCCCCCCCCTCCC 1 377282,377288,377304,377312,377314,377316,377321,377324,377336,377340,377344,377346,377348,377356
chr11:377279-377364 CCCCCCCCTTCCCC 1 377282,377288,377304,377312,377314,377316,377321,377324,377336,377340,377344,377346,377348,377356
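
For anyone cross-checking these lines against IGV, the following is a minimal sketch (in Python, not part of monod2) that tallies methylated calls per CpG position from haplotype lines like the three above. It assumes the four whitespace-separated columns are region, haplotype string, read count, and comma-separated CpG positions; the meaning of the third column is inferred from the example, not confirmed by the author. The function names are hypothetical.

```python
from collections import Counter

def parse_hapinfo_line(line):
    """Split one haplotype line into (region, haplotype, count, positions).

    Assumed layout (inferred from the example lines, not from monod2 docs):
    region, haplotype string, read count, comma-separated CpG positions.
    """
    region, haplotype, count, positions = line.split()
    coords = [int(p) for p in positions.split(",")]
    if len(coords) != len(haplotype):
        raise ValueError("haplotype length does not match number of positions")
    return region, haplotype, int(count), coords

def methylated_calls_per_cpg(lines):
    """Return {position: (methylated_calls, total_calls)}.

    C is counted as methylated and T as unmethylated, following the
    interpretation given earlier in this issue.
    """
    methylated = Counter()
    total = Counter()
    for line in lines:
        _, haplotype, count, coords = parse_hapinfo_line(line)
        for base, pos in zip(haplotype, coords):
            if base in "CT":
                total[pos] += count
            if base == "C":
                methylated[pos] += count
    return {pos: (methylated[pos], total[pos]) for pos in sorted(total)}

POSITIONS = ("377282,377288,377304,377312,377314,377316,377321,"
             "377324,377336,377340,377344,377346,377348,377356")
EXAMPLE = [
    "chr11:377279-377364 CCCCCCCCCCCCCC 1 " + POSITIONS,
    "chr11:377279-377364 CCCCCCCCCCTCCC 1 " + POSITIONS,
    "chr11:377279-377364 CCCCCCCCTTCCCC 1 " + POSITIONS,
]

for pos, (meth, tot) in methylated_calls_per_cpg(EXAMPLE).items():
    print(pos, meth, tot)
```

Under these assumptions, the three lines give 2 methylated calls out of 3 at positions 377336, 377340 and 377344, and 3 out of 3 at every other position, which can be compared with the red/blue counts at the same coordinates in IGV.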

Again, the haplotype file does not account for all the reads in IGV, and I have checked that none of the reads here are PCR duplicates.

Would be interested to hear what you think about this.

Thanks,
Chris

Extracting methylation haplotypes from sequence alignment files

Hi dinhdiep,

I'm very interested in your work and want to conduct methylation haplotype analysis on my bisulfite sequencing data. I am having trouble generating CpG haplotype files using your code. When I try the test example:
./scripts/bam2cghap.sh allcpg/cpg.small.txt.gz BAMfiles/Colon_primary_tumor_sept9_promoter.bam test

I get this error for many lines:
Use of uninitialized value $ref_seq in concatenation (.) or string at bin/getHaplo_PE_cgOnly.pl line 161, line 28.

Could you help me to get this to work? I would really appreciate it.

getHaplo_PE_cgOnly.pl

Hello,
Is there a quick way to adapt getHaplo_PE_cgOnly.pl to single-end read data?
Thanks,

mapping

Could I do the alignment with Bismark and use its BAM file for methylation haplotype analysis?

/bin/cgHap2MHBs_parallel.pl

Hi dinhdiep,
I want to use your method to generate methylation haplotype blocks.
When running your script for identifying the methylation haplotype blocks:
/bin/cgHap2MHBs_parallel.pl [hapInfo file] [min R2] [snpBed file] [outPrefix]
what type of file should be given for "[snpBed file]"? Is it the files in the "data" folder named snp138_part_.gz? I have tried using those snp138_part_.gz files but could not get the expected results. What could the problem be? Can you help me address this problem? Thank you very much!
Best wishes!

Missing allcpg directory in the repository

Hi dinhdiep,
I am very interested in your great work and want to use your method to generate MHL for liver cancer. When running your scripts, I found that some files are missing from the examples. For example, in this command:

./scripts/bam2cghap.sh allcpg/cpg.small.txt.gz BAMfiles/Colon_primary_tumor_sept9_promoter.bam test

I cannot find an allcpg directory in your repository. Without the file, I can't figure out the format or generate my own file that is compatible with the command.

And in this command,

./scripts/cghap2mhbs.sh results/HaploInfo/chr22.sub.hapInfo.txt example/N37_10_tissue_pooled.autosomes.RD10_80up.genomecov.bed 0.3 chr22

The file results/HaploInfo/chr22.sub.hapInfo.txt is missing. I assume that this file is generated by one step of your script.

Can you help me address this problem? Thanks!

Issue with readme and problem of setting working directory

Hi @dinhdiep

Thanks for sharing this nice software with the community.

I have reviewed the README file and found an incorrect instruction under "Identifying the methylation haplotype blocks":

cghap2mhbs.sh [haplotype file] [target bed] [minimum LD R2 cutoff] [output name prefix]

This should be changed to cghap2mhbs.sh [haplotype file] [target bed] [minimum LD R2 cutoff] [snp file] [output name prefix]

The source code in scripts/cghap2mhbs.sh says:

if [ -z "$4" ]
then
        echo "usage: $0 [haplotype file] [target bed] [minimum LD R2 cutoff] [snp file] [output name prefix]"
        exit 0
fi

In addition to the README suggestion above, there are problems with the path settings in some scripts.

For example, the Perl script bin/cgHap2MHBs_parallel.pl tries to run bin/hapInfo_maskSNPs2mld_block.pl, as shown below:

sub run_process{
#  my $hap_info_file = 
#  my $bin_file = 
#  my $min_r2 = 
#  my $cg_snp_bed_file =
  my $file = shift;
  my $cmd = "bin/hapInfo_maskSNPs2mld_block.pl hapinfo_$file.tmp $minR2 $snpBedFile > $outNamePrefix.$file";
  system($cmd);
  unlink("hapinfo_$file.tmp");
}

But when I execute bin/cgHap2MHBs_parallel.pl, the helper script bin/hapInfo_maskSNPs2mld_block.pl cannot be found.

Below is how my system reports the problem:

sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
sh: 1: bin/hapInfo_maskSNPs2mld_block.pl: not found
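
One likely cause, stated here as an assumption and not a confirmed diagnosis, is that the relative path bin/hapInfo_maskSNPs2mld_block.pl is resolved against the current working directory and not against the location of the calling script. The sketch below, written in Python purely for illustration, builds a throwaway directory that mimics the repository layout and checks from which working directories the relative path resolves. The function names are hypothetical and not part of monod2.

```python
import os
import tempfile

# Relative path exactly as it appears in the system() call quoted above.
HELPER = os.path.join("bin", "hapInfo_maskSNPs2mld_block.pl")

def helper_resolves_from(cwd, relative_path=HELPER):
    """Return True if relative_path names an existing file when resolved against cwd."""
    return os.path.isfile(os.path.join(cwd, relative_path))

def demo():
    """Mimic the repository layout in a temporary directory and test two working directories."""
    with tempfile.TemporaryDirectory() as repo_root:
        bin_dir = os.path.join(repo_root, "bin")
        os.makedirs(bin_dir)
        # Create an empty stand-in for the helper script.
        open(os.path.join(repo_root, HELPER), "w").close()
        from_root = helper_resolves_from(repo_root)  # run from the repository root
        from_bin = helper_resolves_from(bin_dir)     # run from inside bin/
        return from_root, from_bin

print(demo())  # prints (True, False)
```

If this is indeed the cause, running the scripts from the repository root should be a workaround until the paths are made independent of the working directory.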

This happens not only with this Perl script but also with some of the shell scripts.

It would be great if you could add a note about path settings or fix these issues.

Thanks for reading this long comment.

typo in dependency

"require(gplot2)" contains a typo; it should be "require(ggplot2)", which is already listed. So "require(gplot2)" should be removed.
