Giter VIP home page Giter VIP logo

Comments (21)

marcus1487 avatar marcus1487 commented on August 16, 2024 2

I can help on some of the megalodon options here. Megalodon has a lot of outputs in order to facilitate porting into tools such as methplotlib. Megalodon is designed with two main modified base output types 1) per-read and 2) aggregated.

Per-read outputs include the mod_mappings.bam above. Megalodon can also output cram directly (via --mappings-format cram) and sort the mappings (via --sort-mappings). The mod_mappings output takes the per-read modified base calls and anchors them to the reference sequence (so these reads will artificially contain exactly the reference sequence for each read). mod_basecalls are anchored only to the basecalls and not mapped, so not directly applicable to methplotlib (I think).

The per-read modified base text format is meant as a dump of the per-read scores database mostly for debugging purposes. I would not recommend this as a standard format that should used as input and is subject to change without notice.

Aggregated formats returns results aggregated across reads at each applicable reference position. For the aggregated formats, Megalodon provides 3 options 1) bedmethyl 2) modvcf and 3) wiggle (mostly like a bedgraph but more compact). The modVCF is not really a standard format, but there is no standard format (that I'm aware of) to store multiple modifications in a single file. VCF was the closest, but required adding some fields for strandedness.

@JWDebler that METEORE script does not compute the methylation frequencies, but instead just converts from the format from bedmethyl to a consistent text format for comparison to other algorithms within the METEORE framework. I believe this could essentially be accomplished with a simple awk script to move columns around.

I'm happy to help mediate the porting of megalodon outputs into methplotlib. Let me know if any adjustments or recommendations would be helpful.

from methplotlib.

marcus1487 avatar marcus1487 commented on August 16, 2024 1

I think the error is the same bug @wdecoster noted in #31. Hopefully a resolution can be found there.

The bedgraph input to methplotlib is slightly different than the mod_mappings. The mod_mappings contain per-read methylation calls, so can provide more information (phasing/linked methylation) while the bedgraph/bedmethyl is aggregated over reads (so links between calls between each read are lost). I'm not sure what outputs methplotlib has in terms of per-read modified base calls, so this may be a moot point here, but thought the distinction was important.

For converting from bedmethyl to bedgraph the following one-liner should work: awk 'BEGIN{OFS="\t"} {$4=$11; NF=4; print}' sample.bedmethyl > sample.bedgraph. You may need to add a header to load this file in some genome browsers.

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

Hi Paula,

Yes, that shouldn't be a problem. methplotlib doesn't really care about the type of modification it shows. Which file format did you get from megalodon?
It is possible that I need to add an additional file format as input type, but that shouldn't be too much trouble.

Cheers,
Wouter

from methplotlib.

PaulaRomeroLozano avatar PaulaRomeroLozano commented on August 16, 2024

Hi Wouter,

Thank you very much for the quick response.

That would be incredibly useful!

I am still getting familiar with all the megalodon outputs and which ones are required and "compatible" for downstream analysis.

Please find below the file formats I got when running megalodon:

  1. mod_mapping.bam: I was thinking in sort it and convert it to a bedgraph using bedtools?
    image
    2.modified_bases.5mc.bed: (0-based)
    image
    3.per_read_modified_bases: as a .txt and .bd file (natural log)
    image

I was thinking in trying to make those files "nanopolish like" so I could use your scripts, but if you could add an extra input type that would be very helpful.

Thank you so much in advance,
Best,

Paula

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

Please keep me posted how things go for any of these formats. It may not be necessary to make them too similar to nanopolish files.

  1. methplotlib can take cram files with nucleotide modifications in the Mm tag, so your bam file should not need too many changes. Converting to cram should be enough. That said, I could also add an option to use bam directly.

  2. That looks like a bedmethyl format. You can quite easily make a bedgraph out of that, which is currently one of the compatible input formats. I could in the future also add bedmethyl input directly.

  3. Could you share a larger example of that format? I'll add it as a compatible input too. You can find my email address in my profile.

I'm afraid I won't have time to add the code for these formats this week.

from methplotlib.

PaulaRomeroLozano avatar PaulaRomeroLozano commented on August 16, 2024

Thank you very much for your help and feedback.

  1. I will go with this option first then.

  2. I will send you a small subset of the data by email.

Thank you again and I will keep you posted for sure :)

from methplotlib.

JWDebler avatar JWDebler commented on August 16, 2024

I found an R script in the METEORE repo that takes the megalodon .bed file and computes methylation frequencies like the nanopolish calculate_methylation_frequency.py script does. In its original form the output doesn't work with methplotlib yet, but it shouldn't take much to get the output close enough to nanopolish format. I don't know enough R though to do it.

from methplotlib.

PaulaRomeroLozano avatar PaulaRomeroLozano commented on August 16, 2024

Hello Marcus,

Thank you so much for your help.

It would be super useful to be able to port megalodon outputs into methplotlib! :)
Since I got mod_mapping.bam as output, could it be possible to simply convert that bam to cram?
I tried:
samtools sort mod_mappings.bam -o mod_mappings.sorted.bam
samtools view -C -T genome.fa mod_mappings.bam > mod_mappings.cram
samtools index -c mod_mappings.sorted.cram

I did so, and then I tried to run methplotlib but I got an error:

methplotlib -m /mnt/fs01/RUNS/minion_run/DNA_meth/megalodon_results/subset_1_5G_norm/mod_mappings.cram -n calls -w chr17:17,722,291-17,725,186 -g /media/data/reference/Homo_sapiens.GRCh38.90.chrplus.gtf --simplify
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/a8499ca51d6fb77332c2d242923994eb": Protocol not supported
Traceback (most recent call last):
File "/home/paula/.local/bin/methplotlib", line 8, in
sys.exit(main())
File "/home/paula/.local/lib/python3.8/site-packages/methplotlib/methplotlib.py", line 26, in main
meth_browser(meth_data=meth_data,
File "/home/paula/.local/lib/python3.8/site-packages/methplotlib/methplotlib.py", line 52, in meth_browser
meth_traces = plots.methylation(meth_data, dotsize=dotsize, binary=binary)
File "/home/paula/.local/lib/python3.8/site-packages/methplotlib/plots.py", line 120, in methylation
make_per_read_meth_traces_phred(table=meth.table,
File "/home/paula/.local/lib/python3.8/site-packages/methplotlib/plots.py", line 148, in make_per_read_meth_traces_phred
table = table.join(df_heights, on="read_name")
File "/home/paula/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 7874, in join
return self._join_compat(
File "/home/paula/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 7890, in _join_compat
return merge(
File "/home/paula/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 74, in merge
op = _MergeOperation(
File "/home/paula/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 656, in init
self._maybe_coerce_merge_keys()
File "/home/paula/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 1165, in _maybe_coerce_merge_keys
raise ValueError(msg)
ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat

The methplotlib logs reported:
2021-04-08 14:51:00,334 methplotlib 0.17.0 started.
2021-04-08 14:51:00,334 Python version is: 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
2021-04-08 14:51:00,334 Arguments are: Namespace(bed=None, binary=False, dotsize=4, example=False, fasta=None, gtf='/media/data/reference/Homo_sapiens.GRCh38.90.chrplus.gtf', methylation=['/mnt/fs01/RUNS/minion_run/DNA_meth/megalodon_results/subset_1_5G_norm/mod_mappings.cram'], names=['calls'], outfile=None, qcfile=None, simplify=True, smooth=5, split=False, static=None, store=False, window='chr17:17,722,291-17,725,186')
2021-04-08 14:51:00,334 Processing chr17_17722291_17725186
2021-04-08 14:51:00,335 File /mnt/fs01/RUNS/minion_run/DNA_meth/megalodon_results/subset_1_5G_norm/mod_mappings.cram is of type ont-cram
2021-04-08 14:51:00,557 Collected methylation data for 1 datasets
2021-04-08 14:51:00,658 Created QC plots

I am not sure why is not working and how I can fix it...

Another option would be to convert the bedmethyl file into bedgraph but I haven't been able to succeed on this either.

Any feedback or recommendation would be very much appreciated.

Kind regards,
Paula

from methplotlib.

Fatihlrcfs avatar Fatihlrcfs commented on August 16, 2024

hi all,

I am working on methylation outputs to visualize that. I have megalodon(files: modified_bases.5mc.bed, per_read_modified_base_calls.db and per_read_modified_base_calls.txt) and ı am wondering to use these file in Methplotlib. Could you say me the steps basically because ı am a freshman in this field? Many thanks. Kind Regards.

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

The megalodon input is in BAM/CRAM format, but you don't have those?

from methplotlib.

Fatihlrcfs avatar Fatihlrcfs commented on August 16, 2024

Hi @wdecoster
ı am using comman below to run megalodon and these (fmodified_bases.5mc.bed, per_read_modified_base_calls.db and per_read_modified_base_calls.txt) files were generated. thanks
code: megalodon debarcodedmultifast5/barcode06/ --outputs mods --reference wholegenome.fasta --mod-motif m CG 0 --write-mods-text --processes 40 --guppy-server-path /cluster/lrcfs/2397405/bin/ont-guppy-cpu-6.0.1/bin/guppy_basecall_server --guppy-params "-d /cluster/lrcfs/2397405/bin/rerio/basecall_models/ --num_callers 40" --guppy-timeout 400 --guppy-config res_dna_r941_min_modbases_5mC_CpG_v001.cfg --overwrite --output-directory megalodonresultbarcode06

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

I think --outputs with mod_mappings or mod_basecalls is what you need, but I don't use megalodon myself

from methplotlib.

Fatihlrcfs avatar Fatihlrcfs commented on August 16, 2024

okey, @wdecoster ı will configure my code and run with a small data set to generate the required files. thanks

from methplotlib.

carolinehey avatar carolinehey commented on August 16, 2024

Hi @wdecoster

Earlier you wrote this:

  1. methplotlib can take cram files with nucleotide modifications in the Mm tag, so your bam file should not need too many changes. Converting to cram should be enough. That said, I could also add an option to use bam directly.
  2. That looks like a bedmethyl format. You can quite easily make a bedgraph out of that, which is currently one of the compatible input formats. I could in the future also add bedmethyl input directly.
  3. Could you share a larger example of that format? I'll add it as a compatible input too. You can find my email address in my profile.

As for 1. i have a cram file with modifications in the Mm tag and would really like to use methplotlib. However, i get an error regarding the -n argument.

code:
methplotlib -m merged_bam_sup_modbases_sed.cram -w chr15:23000000-25500000
usage: methplotlib [-h] [-v] -m METHYLATION [METHYLATION ...] -n NAMES [NAMES ...] -w WINDOW [-g GTF] [-b BED] [-f FASTA] [--simplify] [--split]
[--static STATIC] [--binary] [--smooth SMOOTH] [--dotsize DOTSIZE] [--minqual MINQUAL] [--example] [-o OUTFILE] [-q QCFILE]
methplotlib: error: the following arguments are required: -n/--names

How to deal with this?

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

Have you tried setting the -n argument?
methplotlib -m merged_bam_sup_modbases_sed.cram -w chr15:23000000-25500000 -n test

from methplotlib.

carolinehey avatar carolinehey commented on August 16, 2024

Sorry for my ignorance.
Yes i have but get an error. Thought it might be something with wrong arguments but now i see that my error is similar to what other people have reported earlier.

methplotlib -m merged_bam_sup_modbases_sed.cram -n test -w chr15:23000000-25500000

[E::easy_errno] Libcurl reported error 77 (Problem with the SSL CA cert (path? access rights?))
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/f036bd11158407596ca6bf3581454706": Input/output error
Traceback (most recent call last):
File "/home/prom/anaconda3/envs/methplotlib/bin/methplotlib", line 8, in
sys.exit(main())
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/methplotlib.py", line 26, in main
meth_browser(meth_data=meth_data,
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/methplotlib.py", line 60, in meth_browser
fig = utils.create_subplots(num_methrows,
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/utils.py", line 250, in create_subplots
return plotly.subplots.make_subplots(
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/_plotly_utils/importers.py", line 39, in getattr
raise AttributeError(
AttributeError: module 'plotly' has no attribute 'subplots'

2022-05-18 06:24:15,208 methplotlib 0.20.1 started.
2022-05-18 06:24:15,208 Python version is: 3.10.2 | packaged by conda-forge | (main, Jan 14 2022, 08:02:09) [GCC 9.4.0]
2022-05-18 06:24:15,208 Arguments are: Namespace(methylation=['merged_bam_sup_modbases_sed.cram'], names=['test'], window='chr15:23000000-25500000', gtf=None, bed=None, fasta=None, simplify=False, split=False, static=None, binary=False, smooth=5, dotsize=4, minqual=20, example=False, outfile=None, qcfile=None, store=False)
2022-05-18 06:24:15,208 Processing chr15_23000000_23833334
2022-05-18 06:24:15,350 File merged_bam_sup_modbases_sed.cram is of type cram
2022-05-18 06:24:22,103 Collected methylation data for 1 datasets
2022-05-18 06:24:22,628 Created QC plots
2022-05-18 06:24:36,421 Prepared methylation traces.
2022-05-18 06:24:36,421 Making browser in split mode, with 1 modification rows.

When i try to use the bam file it looks like this:

methplotlib -m merged_bam_sup_modbases_sed.bam -n test -w chr15:23000000-25500000
Traceback (most recent call last):
File "/home/prom/anaconda3/envs/methplotlib/bin/methplotlib", line 8, in
sys.exit(main())
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/methplotlib.py", line 26, in main
meth_browser(meth_data=meth_data,
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/methplotlib.py", line 60, in meth_browser
fig = utils.create_subplots(num_methrows,
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/methplotlib/utils.py", line 250, in create_subplots
return plotly.subplots.make_subplots(
File "/home/prom/anaconda3/envs/methplotlib/lib/python3.10/site-packages/_plotly_utils/importers.py", line 39, in getattr
raise AttributeError(
AttributeError: module 'plotly' has no attribute 'subplots'

Is my input file the problem or something else?

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

AttributeError: module 'plotly' has no attribute 'subplots' is not related to your input file. Could you check the version of your plotly module? How did you install methplotlib?

from methplotlib.

carolinehey avatar carolinehey commented on August 16, 2024

I used pip install methplotlib

My plotly module is 5.8.0

from methplotlib.

wdecoster avatar wdecoster commented on August 16, 2024

I just installed plotly 5.8.0, and it works here. Please try in an interactive python session if it works, and print the version as well, to make sure that not an older version is imported.

from methplotlib.

DiegoZavallo avatar DiegoZavallo commented on August 16, 2024

Hi, I have get the same error when I run methplotlib, but it output an .html results that I don´t understand. I don´t have gff ad gtf file for this organism

image

methplotlib -m modified_bases.5mC.bedgraph -n calls -f Paspalum_notatum_genome.fasta -w CM039683.1:231000-234000


Reading modified_bases.5mC.bedgraph would be faster with bgzip and 'tabix -p bed'.
Traceback (most recent call last):
  File "/home/diego/Documents/INTA/Nanopore/env/bin/methplotlib", line 8, in <module>
    sys.exit(main())
  File "/home/diego/Documents/INTA/Nanopore/env/lib/python3.8/site-packages/methplotlib/methplotlib.py", line 26, in main
    meth_browser(meth_data=meth_data,
  File "/home/diego/Documents/INTA/Nanopore/env/lib/python3.8/site-packages/methplotlib/methplotlib.py", line 89, in meth_browser
    fig = utils.create_subplots(num_methrows, split=False, annotation=bool(bed or gtf))
  File "/home/diego/Documents/INTA/Nanopore/env/lib/python3.8/site-packages/methplotlib/utils.py", line 261, in create_subplots
    return plotly.subplots.make_subplots(
  File "/home/diego/Documents/INTA/Nanopore/env/lib/python3.8/site-packages/_plotly_utils/importers.py", line 39, in __getattr__
    raise AttributeError(
AttributeError: module 'plotly' has no attribute 'subplots'

My plotly version is also 5.8.0.

And I run it on an env/ in which I installed methplotlib

Do you have any clue?

from methplotlib.

YingziZhang-github avatar YingziZhang-github commented on August 16, 2024

I think the error is the same bug @wdecoster noted in #31. Hopefully a resolution can be found there.

The bedgraph input to methplotlib is slightly different than the mod_mappings. The mod_mappings contain per-read methylation calls, so can provide more information (phasing/linked methylation) while the bedgraph/bedmethyl is aggregated over reads (so links between calls between each read are lost). I'm not sure what outputs methplotlib has in terms of per-read modified base calls, so this may be a moot point here, but thought the distinction was important.

For converting from bedmethyl to bedgraph the following one-liner should work: awk 'BEGIN{OFS="\t"} {$4=$11; NF=4; print}' sample.bedmethyl > sample.bedgraph. You may need to add a header to load this file in some genome browsers.

I think the error is the same bug @wdecoster noted in #31. Hopefully a resolution can be found there.

The bedgraph input to methplotlib is slightly different than the mod_mappings. The mod_mappings contain per-read methylation calls, so can provide more information (phasing/linked methylation) while the bedgraph/bedmethyl is aggregated over reads (so links between calls between each read are lost). I'm not sure what outputs methplotlib has in terms of per-read modified base calls, so this may be a moot point here, but thought the distinction was important.

For converting from bedmethyl to bedgraph the following one-liner should work: awk 'BEGIN{OFS="\t"} {$4=$11; NF=4; print}' sample.bedmethyl > sample.bedgraph. You may need to add a header to load this file in some genome browsers.

Hi, I want to convert my bedmethyl file to bedgraph file. I used the one-liner and got an empty output. Would you please help check my problem? Many thanks!

My bedmethyl format is like this:

chr15   17170182        17170183        5mC     1000    -       17170182        17170183        0,0,0   1       0.00    1       0       0
chr15   17170225        17170226        5mC     0       -       17170225        17170226        0,0,0   1       0.00    0       0       0
chr15   17170391        17170392        5mC     0       -       17170391        17170392        0,0,0   1       0.00    0       0       0
chr15   17170477        17170478        5mC     0       -       17170477        17170478        0,0,0   1       0.00    0       0       0
chr15   17170495        17170496        5mC     1000    -       17170495        17170496        0,0,0   1       0.00    1       0       0 

from methplotlib.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.