Comments (21)

pwwang avatar pwwang commented on June 3, 2024

Note that you need to upgrade vcfstats to v0.3.0:

  1. TiTv numbers

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title 'TiTv numbers' --figtype col \
    --formula 'COUNT(1) ~ TITV' \
    --ggs 'theme(axis_text_x=element_text(vjust=1))'

    titv-numbers col

  2. INFO_DP

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "INFO_DP" \
    --macro examples/mymacros.py \
    --formula 'INFO_DP ~ 1' 

    info-dp histogram

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "INFO_DP-Chroms" \
    --macro examples/mymacros.py \
    --formula 'INFO_DP ~ CONTIG' \
    --figtype boxplot

    info-dp-chroms boxplot

  3. Number of missing calls

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "N_missings" \
    --macro examples/mymacros.py \
    --formula 'SUM(MISSINGs) ~ SAMPLES'

    n-missings col

  4. Per locus missings

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "Per_locus_missings" \
    --macro examples/mymacros.py \
    --formula 'N_MISSING ~ 1'

    per-locus-missings histogram

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "Per_locus_missings" \
    --macro examples/mymacros.py \
    --formula 'N_MISSING ~ CONTIG' \
    --figtype boxplot

    per-locus-missings boxplot

  5. N_hets

    vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "N_hets-sample1" \
    --formula 'COUNT(1, GTTYPEs[HET]{0}) ~ CONTIG'

    n-hets-sample1 col

from vcfstats.

DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
Thank you so much!
I'll try it on my data.
I'm wondering whether it would be possible to plot the percentage of heterozygous calls per locus as a histogram, and as multiple boxplots (one per chromosome, as in "INFO_DP-Chroms"), in addition to the figure "5. N_hets"?
Thank you again.


pwwang avatar pwwang commented on June 3, 2024

% Het calls distribution:

vcfstats --vcf examples/sample.vcf \
    --outdir examples/ \
    --config examples/config.toml \
    --title "Percent_hets" \
    --formula 'Percent_HETs ~ 1' \
    --macro examples/mymacros.py

percent-hets histogram

% Het calls distribution against chromosomes:

vcfstats --vcf examples/sample.vcf \
      --outdir examples/ \
      --config examples/config.toml \
      --title "Percent_hets_chroms" \
      --formula 'Percent_HETs ~ CONTIG' \
      --macro examples/mymacros.py \
      --figtype boxplot

percent-hets-chroms boxplot


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
Thank you! The plots "Percent_hets" and "Percent_hets_chroms" don't work. I got these errors:
"Traceback (most recent call last):
File "/home/tkiy/miniconda3/envs/VCF_stat/bin/vcfstats", line 8, in <module>
sys.exit(main())
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/cli.py", line 195, in main
ones = get_instances(opts, samples)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/cli.py", line 91, in get_instances
Instance(
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/instance.py", line 142, in __init__
self.formula = Formula(formula, samples, passed, title)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 280, in __init__
self.Y, self.X = PARSER.parse(formula)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/lark.py", line 561, in parse
return self.parser.parse(text, start=start, on_error=on_error)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parser_frontends.py", line 107, in parse
return self.parser.parse(stream, start, **kw)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 41, in parse
return self.parser.parse(lexer, start)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 171, in parse
return self.parse_from_state(parser_state)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 179, in parse_from_state
state.feed_token(token)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 150, in feed_token
value = callbacks[rule](s)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parse_tree_builder.py", line 111, in __call__
return self.node_builder(filtered)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parse_tree_builder.py", line 309, in f
return wrapper(func, name, children, None)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/visitors.py", line 390, in _vargs_inline
return f(*children)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/visitors.py", line 374, in f
return _f(self, *args, **kwargs)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 20, in term
return Term(str(name), items, samples)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 72, in __init__
raise ValueError("Term {!r} has not been registered.".format(name))
ValueError: Term 'Percent_HETs' has not been registered.
Exception ignored in: <function Instance.__del__ at 0x7febe2fae560>
Traceback (most recent call last):
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/instance.py", line 176, in __del__
del self.data
AttributeError: data
"

"Traceback (most recent call last):
File "/home/tkiy/miniconda3/envs/VCF_stat/bin/vcfstats", line 8, in <module>
sys.exit(main())
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/cli.py", line 195, in main
ones = get_instances(opts, samples)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/cli.py", line 91, in get_instances
Instance(
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/instance.py", line 142, in __init__
self.formula = Formula(formula, samples, passed, title)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 280, in __init__
self.Y, self.X = PARSER.parse(formula)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/lark.py", line 561, in parse
return self.parser.parse(text, start=start, on_error=on_error)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parser_frontends.py", line 107, in parse
return self.parser.parse(stream, start, **kw)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 41, in parse
return self.parser.parse(lexer, start)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 171, in parse
return self.parse_from_state(parser_state)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 179, in parse_from_state
state.feed_token(token)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parsers/lalr_parser.py", line 150, in feed_token
value = callbacks[rule](s)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parse_tree_builder.py", line 111, in __call__
return self.node_builder(filtered)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/parse_tree_builder.py", line 309, in f
return wrapper(func, name, children, None)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/visitors.py", line 390, in _vargs_inline
return f(*children)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/lark/visitors.py", line 374, in f
return _f(self, *args, **kwargs)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 20, in term
return Term(str(name), items, samples)
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/formula.py", line 72, in __init__
raise ValueError("Term {!r} has not been registered.".format(name))
ValueError: Term 'Percent_HETs' has not been registered.
Exception ignored in: <function Instance.__del__ at 0x7fcd4e4c6680>
Traceback (most recent call last):
File "/home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/vcfstats/instance.py", line 176, in __del__
del self.data
AttributeError: data
"


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Additionally, there are two warnings, for "INFO_DP" and "Per_locus_missings":

"INFO_DP: (plotnine) /home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/plotnine/stats/stat_bin.py:95: PlotnineWarning: 'stat_bin()' using 'bins = 2546'. Pick better value with 'binwidth'.
"

"Per_locus_missings: (plotnine) /home/tkiy/miniconda3/envs/VCF_stat/lib/python3.10/site-packages/plotnine/stats/stat_bin.py:95: PlotnineWarning: 'stat_bin()' using 'bins = 270'. Pick better value with 'binwidth'.
"
The INFO_DP plot looks tiny relative to the scale of the X axis. How can I fix that?
If you need examples of the plots, could you please explain how I can attach them here?


pwwang avatar pwwang commented on June 3, 2024

Hi, Thank you! The plots "Percent_hets" and "Percent_hets_chroms" don't work. I got errors: [...] ValueError: Term 'Percent_HETs' has not been registered.

It says the Percent_HETs macro cannot be found. Have you tried the examples/mymacros.py file with the --macro argument?


pwwang avatar pwwang commented on June 3, 2024

The INFO_DP plot looks tiny relative to the scale of the X axis. How can I fix that?
If you need examples of the plots, could you please explain how I can attach them here?

There is currently no way to set the binwidth/nbins. You probably want to try density or freqpoly plots instead.


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
I've checked the command lines I used for the "Percent_hets" and "Percent_hets_chroms" plots, and I did use --macro for them. The other plots that use the --macro argument work fine. Maybe you haven't pushed the code for these two plots yet, as they were your latest changes? How can I check that?


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

There are no decorators for "Percent_hets" and "Percent_hets_chroms" plots in the mymacros.py file:

from vcfstats.macros import continuous


@continuous
def INFO_DP(variant):
    """DP from INFO"""
    return variant.INFO["DP"]


@continuous
def MISSINGs(variant):
    """Per-sample missing-call indicator (1 = missing, 0 = called)"""
    # gt_types == 3 marks a missing genotype (assumes gts012 = True);
    # adding 0 converts the boolean array to int
    return (variant.gt_types == 3) + 0


@continuous
def N_MISSING(variant):
    """Number of missing calls at this locus"""
    return variant.num_unknown


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

I've tried both the density and freqpoly options for the INFO_DP plot. It still looks weird:
info-dp freqpoly-1
info-dp density
Producing the INFO_DP plot for my VCF file (110K SNPs for 96 individuals) requires approximately 90 GB of RAM.
The INFO_DP-Chroms plot looks weird too:
info-dp-chroms boxplot

Maybe this is all because I'm working with GBS data.


pwwang avatar pwwang commented on June 3, 2024

Hi, I've checked the command lines I used for the "Percent_hets" and "Percent_hets_chroms" plots, and I did use --macro for them. The other plots that use the --macro argument work fine. Maybe you haven't pushed the code for these two plots yet, as they were your latest changes? How can I check that?

I forgot to merge the commit from the dev branch. It should be there now.
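
For anyone still working from a copy of examples/mymacros.py that lacks the macro, the idea can be sketched roughly as below. This is a hypothetical illustration, not the code merged from the dev branch: the name `FRAC_HETS` is made up, it returns a fraction (0 to 1) rather than a percentage, and it assumes the cyvcf2 `Variant` attributes `num_het` and `num_called`. To register it with vcfstats you would decorate it with `@continuous` (imported from `vcfstats.macros`, as in the macro file quoted earlier in this thread); the decorator is left out here so the snippet runs on its own.

```python
from types import SimpleNamespace


def FRAC_HETS(variant):
    """Fraction of called genotypes at this locus that are heterozygous."""
    # num_het and num_called are per-locus counts on a cyvcf2 Variant (assumed)
    if variant.num_called == 0:
        return 0.0  # arbitrary choice for loci with no called genotypes
    return variant.num_het / variant.num_called


# Stand-in object with made-up counts, used only to exercise the function
fake_variant = SimpleNamespace(num_het=12, num_called=96)
print(FRAC_HETS(fake_variant))
```

Returning 0.0 for loci without any called genotype is only one option; skipping such loci may suit some data sets better.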


pwwang avatar pwwang commented on June 3, 2024

I've tried both density and freqpoly options for the INFO_DP plot. It still looks weird:

That could also be due to the skewness of your depth distribution.
You can also try --savedata to see what the data for plotting looks like.
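
One way to check for that kind of skew, once the per-locus depths are available as plain numbers, is to compare the median with the upper tail. The following is a minimal sketch using only the Python standard library; the helper name `depth_summary` and the depth values are made up for illustration and do not come from vcfstats.

```python
from statistics import median, quantiles


def depth_summary(depths):
    """Return the median, 99th percentile and maximum of a list of depths."""
    # quantiles(..., n=100) yields 99 cut points; the last one is the 99th percentile
    cuts = quantiles(depths, n=100, method="inclusive")
    return {"median": median(depths), "p99": cuts[98], "max": max(depths)}


# Made-up depths: most loci are shallow, one locus is extremely deep
depths = [10] * 99 + [5000]
summary = depth_summary(depths)
print(summary)
```

If the maximum is far above the 99th percentile, a handful of very deep loci are stretching the X axis, and restricting the plotted range may give a more readable figure.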


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
Thank you for your response. The Percent_hets and Percent_hets_chroms plots are working now. Unfortunately, they are not very informative for my data. Is it possible to use the --savedata argument to export the statistics (e.g. Percent_hets_chroms for the whole data set) to a file, so that I can import and visualize them in R later?
proportion-hets-chroms boxplot

proportion-hets histogram


pwwang avatar pwwang commented on June 3, 2024

Sure you can. To save that for each locus, just use the formula Percent_HETs ~ 1 with --savedata.
You may also try to add filters for your plots using vcfstats. For example: Percent_HETs[0, 0.1] ~ CONTIG to plot loci with Percent_HETs between 0 and 0.1 only.
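
Once the data has been saved, summarizing it outside vcfstats is straightforward. The sketch below assumes the saved file is a tab-separated table with a header row and one column per formula term; that layout, the column names, and the helper name `per_group_median` are assumptions for illustration, so check the actual file before relying on them. The same summary is just as easy in R; Python is used here to match the macro file.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def per_group_median(handle, value_col, group_col):
    """Median of value_col for each distinct value of group_col."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row[group_col]].append(float(row[value_col]))
    return {name: median(values) for name, values in groups.items()}


# Made-up table standing in for a saved data file (layout is an assumption)
saved = (
    "Percent_HETs\tCONTIG\n"
    "0.02\tchr1\n"
    "0.05\tchr1\n"
    "0.10\tchr1\n"
    "0.50\tchr2\n"
)
medians = per_group_median(io.StringIO(saved), "Percent_HETs", "CONTIG")
print(medians)
```

With a real file, `io.StringIO(saved)` would be replaced by an opened file handle.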


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Thank you so much! These options are extremely helpful!


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
I'm wondering how I can cite vcfstats in publications?
Regards,
Denis


pwwang avatar pwwang commented on June 3, 2024

You can just link the url of the repo. It is not published yet.


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Thank you!


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024

Hi,
I'm wondering if it would be possible to make two additional plots?

  1. A barplot with just two bars: the number of diallelic variants and the number of multiallelic variants.
  2. A more detailed barplot showing the number of diallelic, tri-, tetra-, etc. allelic variants.

Regards,
Denis


pwwang avatar pwwang commented on June 3, 2024

By "diallelic" I assume you meant "biallelic"?
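
Whatever the terminology, macros for the two requested plots could classify each variant by its allele count. This is a hypothetical sketch, not something confirmed in this thread: the names `ALLELE_CLASS` and `N_ALLELES` are invented, the code assumes the cyvcf2 `Variant.ALT` attribute (a list of alternate alleles), and in a real macro file the functions would still need to be registered with vcfstats, in the same way the `@continuous` macros in examples/mymacros.py are, using whichever decorator vcfstats provides for categorical values.

```python
from types import SimpleNamespace


def N_ALLELES(variant):
    """Number of alleles at this site: REF plus every ALT allele."""
    return 1 + len(variant.ALT)


def ALLELE_CLASS(variant):
    """Coarse label for the two-bar plot."""
    n_alleles = N_ALLELES(variant)
    if n_alleles < 2:
        return "monomorphic"  # site with no ALT allele
    return "biallelic" if n_alleles == 2 else "multiallelic"


# Stand-in objects with made-up alleles, used only to exercise the functions
two_alleles = SimpleNamespace(ALT=["T"])
three_alleles = SimpleNamespace(ALT=["T", "G"])
print(ALLELE_CLASS(two_alleles), ALLELE_CLASS(three_alleles), N_ALLELES(three_alleles))
```

Counting how often each label occurs gives the two-bar plot, and counting by `N_ALLELES` gives the more detailed one.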


DenisGoryunov avatar DenisGoryunov commented on June 3, 2024
