nilesh-tawari / chronqc
Manuscript describing ChronQC is now available online in Bioinformatics
Home Page: https://doi.org/10.1093/bioinformatics/btx843
License: MIT License
Hi Nilesh, we've been using ChronQC in my diagnostic lab for a little while now and we noticed one problem: when a run fails in the pipeline and we need to re-run it, the command chronqc database --update adds the new values for that run on top of the previous data point for that same run, instead of replacing it. That makes the plot legend messy and hard to read. See screenshot attached.
Is there a way to "clean up" the database for a given run, before re-adding it?
Thanks!
Roxane
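One possible workaround for the question above is to delete the run's rows directly from the SQLite file before running chronqc database --update again. The sketch below is only an illustration under stated assumptions: the table name chronqc_stats_data is the one shown elsewhere on this page, but the column name Run and the helper delete_run are hypothetical. Check the real schema first (for example with .schema in the sqlite3 shell) and work on a backup copy of the database.

```python
import sqlite3


def delete_run(conn, run_id, table="chronqc_stats_data", run_column="Run"):
    """Delete every row belonging to one run and return the number removed.

    `run_column` is an ASSUMED column name; verify it against the real schema.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them explicitly.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_column = '"' + run_column.replace('"', '""') + '"'
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            f"DELETE FROM {quoted_table} WHERE {quoted_column} = ?", (run_id,)
        )
    return cur.rowcount


# Demo on an in-memory database with a made-up two-column schema.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE chronqc_stats_data ("Sample" TEXT, "Run" TEXT)')
conn.executemany(
    "INSERT INTO chronqc_stats_data VALUES (?, ?)",
    [("s1", "Run-1"), ("s2", "Run-1"), ("s3", "Run-2")],
)
removed = delete_run(conn, "Run-1")
remaining = conn.execute("SELECT COUNT(*) FROM chronqc_stats_data").fetchone()[0]
```

For a real database, the connection would be opened on the .sqlite file instead of :memory:, and the failed run's identifier passed as run_id before the run is added again.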
I'm also getting the following error when I try to create a plot:
$ chronqc plot -o chromqc/ chronqc_db/chronqc.stats.sqlite AshTrio chronqc_db/chronqc.default.json -f
Started ChronQC
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/pandas/core/window.py", line 211, in _prep_values
values = ensure_float64(values)
File "pandas/_libs/algos_common_helper.pxi", line 311, in pandas._libs.algos.ensure_float64
ValueError: could not convert string to float: 'data'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/chronqc", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/chronqc/chronqc.py", line 168, in main
args.func(args)
File "/usr/local/lib/python3.6/dist-packages/chronqc/chronqc.py", line 21, in run_plot
chronqc_plot.main(args)
File "/usr/local/lib/python3.6/dist-packages/chronqc/chronqc_plot.py", line 676, in main
df_chart = mean_and_stdev(df, column_name, win=win, per_sample=per_sample)
File "/usr/local/lib/python3.6/dist-packages/chronqc/chronqc_plot.py", line 218, in mean_and_stdev
df_dup_all = rolling_mean(df_dup_all, Duplicates, win)
File "/usr/local/lib/python3.6/dist-packages/chronqc/chronqc_plot.py", line 186, in rolling_mean
df_dup_all['mean'] = df_dup_all.rolling(win).mean().round(2)[Duplicates]
File "/usr/local/lib/python3.6/dist-packages/pandas/core/window.py", line 1728, in mean
return super(Rolling, self).mean(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/window.py", line 1072, in mean
return self._apply('roll_mean', 'mean', **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/window.py", line 841, in _apply
values = self._prep_values(b.values)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/window.py", line 214, in _prep_values
"".format(values.dtype))
TypeError: cannot handle this type -> object
My input files are:
$ cat chronqc_db/chronqc.default.json
[
{
"table_name": "chronqc_stats_data",
"chart_type": "time_series_with_mean_and_stdev",
"chart_properties": {
"y_value": "FastQC_mqc-generalstats-fastqc-avg_sequence_length"
}
},
{
"table_name": "chronqc_stats_data",
"chart_type": "time_series_with_mean_and_stdev",
"chart_properties": {
"y_value": "FastQC_mqc-generalstats-fastqc-percent_duplicates"
}
},
{
"table_name": "chronqc_stats_data",
"chart_type": "time_series_with_mean_and_stdev",
"chart_properties": {
"y_value": "FastQC_mqc-generalstats-fastqc-percent_fails"
}
},
{
"table_name": "chronqc_stats_data",
"chart_type": "time_series_with_mean_and_stdev",
"chart_properties": {
"y_value": "FastQC_mqc-generalstats-fastqc-percent_gc"
}
},
{
"table_name": "chronqc_stats_data",
"chart_type": "time_series_with_mean_and_stdev",
"chart_properties": {
"y_value": "FastQC_mqc-generalstats-fastqc-total_sequences"
}
}
]
$ sqlite3 -cmd 'SELECT * from chronqc_stats_data;' chronqc_db/chronqc.stats.sqlite
FastQC|all_sections|NIST7086_CGTACTAG_L002_R2_001|/mnt/data/TD01-GV1001_L2_R2/fastqc_data.txt|/mnt/data/TD01-GV1001_L2_R2/fastqc_data.txt|2019-01-22 00:00:00|data|95.229524961251|28.1154060602541|25.0|49.0|21523781.0|AshTrio
FastQC|all_sections|TD01-GV1001_L2.R1|/mnt/data/TD01-GV1001_L1_R1/fastqc_data.txt|/mnt/data/TD01-GV1001_L1_R1/fastqc_data.txt|2019-01-22 00:00:00|data|97.0165501126405|29.1116572397277|25.0|49.0|21523781.0|AshTrio
FastQC|all_sections|TD01-GV1001_L3.R2|/mnt/data/TD01-GV1001_L3_R2/fastqc_data.txt|/mnt/data/TD01-GV1001_L3_R2/fastqc_data.txt|2019-01-22 00:00:00|data|94.9493237371004|27.7428646507917|25.0|49.0|19865573.0|AshTrio
FastQC|all_sections|TD01-GV1001_L1.R2|/mnt/data/TD01-GV1001_L1_R2/fastqc_data.txt|/mnt/data/TD01-GV1001_L1_R2/fastqc_data.txt|2019-01-22 00:00:00|data|95.2768624146094|27.794053686974|25.0|49.0|21168890.0|AshTrio
FastQC|all_sections|TD01-GV1001_L3.R1|/mnt/data/TD01-GV1001_L3_R1/fastqc_data.txt|/mnt/data/TD01-GV1001_L3_R1/fastqc_data.txt|2019-01-22 00:00:00|data|96.7453750264339|28.8361458354103|25.0|49.0|19865573.0|AshTrio
FastQC|all_sections|TD06-GV1010_R2|/mnt/data/TD06-GV1010_R2_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1010_R2_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|22.5016825110702|16.6666666666667|48.0|73074989.0|AshTrio
FastQC|all_sections|TD06-GV1009_R2|/mnt/data/TD06-GV1009_R2_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1009_R2_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|21.3620029114608|16.6666666666667|48.0|64304319.0|AshTrio
FastQC|all_sections|TD06-GV1008_R2|/mnt/data/TD06-GV1008_R2_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1008_R2_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|23.5528852817332|16.6666666666667|48.0|75193388.0|AshTrio
FastQC|all_sections|TD06-GV1009_R1|/mnt/data/TD06-GV1009_R1_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1009_R1_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|21.1391622609802|16.6666666666667|48.0|64304319.0|AshTrio
FastQC|all_sections|TD06-GV1008_R1|/mnt/data/TD06-GV1008_R1_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1008_R1_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|23.5838138955758|16.6666666666667|48.0|75193388.0|AshTrio
FastQC|all_sections|TD06-GV1010_R1|/mnt/data/TD06-GV1010_R1_fastqc/fastqc_data.txt|/mnt/data/TD06-GV1010_R1_fastqc/fastqc_data.txt|2019-01-25 00:00:00|data|126.0|22.3713007497403|16.6666666666667|48.0|73074989.0|AshTrio
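The ValueError above mentions the string 'data', and the same string appears as a field in every row of the dump. One way to see which columns of the table hold text rather than numbers is to ask SQLite for the storage type of each value. This is a diagnostic sketch only, not a fix in ChronQC itself: the helper text_value_counts and the demo column names are hypothetical, while PRAGMA table_info and typeof() are standard SQLite features.

```python
import sqlite3


def text_value_counts(conn, table="chronqc_stats_data"):
    """Return {column_name: number of rows whose stored value is TEXT}."""
    quoted_table = '"' + table.replace('"', '""') + '"'
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted_table})")]
    counts = {}
    for col in columns:
        quoted_col = '"' + col.replace('"', '""') + '"'
        n = conn.execute(
            f"SELECT COUNT(*) FROM {quoted_table} "
            f"WHERE typeof({quoted_col}) = 'text'"
        ).fetchone()[0]
        if n:
            counts[col] = n
    return counts


# Demo with made-up column names; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE chronqc_stats_data ("Sample" TEXT, "metric_a", "metric_b")')
conn.executemany(
    "INSERT INTO chronqc_stats_data VALUES (?, ?, ?)",
    [("s1", "data", 95.2), ("s2", "data", 97.0)],
)
counts = text_value_counts(conn)
```

Any column reported here other than the expected label columns (sample names, paths, dates) is a candidate for the value that the rolling mean cannot convert to float.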
Hi ChronQC,
The following error appears when trying to plot a box-and-whisker chart if the current data does not already contain an obvious outlier value:
File "/home/polarisbioit/miniconda3/bin/chronqc", line 11, in <module>
load_entry_point('chronqc', 'console_scripts', 'chronqc')()
File "/mnt/projects/polarisbioit/POLARIS.PIPELINE/POLARIS_PD_v2.5/STATSPLOT/v6.2/Scripts/ChronQC_plots_wConf/ChronQC-master/chronqc/chronqc.py", line 152, in main
args.func(args)
File "/mnt/projects/polarisbioit/POLARIS.PIPELINE/POLARIS_PD_v2.5/STATSPLOT/v6.2/Scripts/ChronQC_plots_wConf/ChronQC-master/chronqc/chronqc.py", line 19, in run_plot
chronqc_plot.main(args)
File "/mnt/projects/polarisbioit/POLARIS.PIPELINE/POLARIS_PD_v2.5/STATSPLOT/v6.2/Scripts/ChronQC_plots_wConf/ChronQC-master/chronqc/chronqc_plot.py", line 700, in main
df_chart = box_whisker_plot(df, column_name, lower_threshold=lower_threshold, upper_threshold=upper_threshold)
File "/mnt/projects/polarisbioit/POLARIS.PIPELINE/POLARIS_PD_v2.5/STATSPLOT/v6.2/Scripts/ChronQC_plots_wConf/ChronQC-master/chronqc/chronqc_plot.py", line 318, in box_whisker_plot
outlier_df = pd.DataFrame(df_bp['outliers'].apply(pd.Series).stack())
File "/home/polarisbioit/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 6238, in stack
return stack(self, level, dropna=dropna)
File "/home/polarisbioit/miniconda3/lib/python3.6/site-packages/pandas/core/reshape/reshape.py", line 542, in stack
dtype = dtypes[0]
IndexError: list index out of range
I have run the following command, and I'm getting the following output:
$ chronqc database --create --run-date-info chromqc/date_info.csv -o . multiqc/multiqc_data/multiqc_general_stats.txt AshTrio
Running ChronQC |############--------------------------------------| 25.0%
usage: chronqc [-h] {database,plot,annotation,chrongen} ...
positional arguments:
{database,plot,annotation,chrongen}
database Generate ChronQC database for ChronQC plots. Type
"chronqc database -h" for help on generating/updating
ChronQC database.
plot Generate ChronQC plots. Type "chronqc plot -h" for
help on generating ChronQC plots.
annotation Start connectivity for annotating plots. Type "chronqc
annotation -h" for help on starting ChronQC annotation
server.
chrongen Use this option for automating ChronQC plot
generation. Type "chronqc chrongen -h" for details on
required arguments.
optional arguments:
-h, --help show this help message and exit
I believe I'm running it in the exact same way as described in the documentation. My input files are:
multiqc_general_stats.txt:
Sample FastQC_mqc-generalstats-fastqc-percent_duplicates FastQC_mqc-generalstats-fastqc-percent_gc FastQC_mqc-generalstats-fastqc-avg_sequence_length FastQC_mqc-generalstats-fastqc-percent_fails FastQC_mqc-generalstats-fastqc-total_sequences
TD06-GV1008_R1 23.583813895575773 48.0 126.0 16.666666666666664 75193388.0
TD06-GV1008_R2 23.55288528173324 48.0 126.0 16.666666666666664 75193388.0
TD06-GV1009_R1 21.13916226098023 48.0 126.0 16.666666666666664 64304319.0
TD06-GV1009_R2 21.362002911460763 48.0 126.0 16.666666666666664 64304319.0
TD06-GV1010_R1 22.371300749740257 48.0 126.0 16.666666666666664 73074989.0
TD06-GV1010_R2 22.501682511070243 48.0 126.0 16.666666666666664 73074989.0
chromqc/date_info.csv:
TD06-GV1008_R1,Run-1,01/01/2018
TD06-GV1008_R2,Run-1,01/01/2018
TD06-GV1009_R1,Run-1,01/02/2018
TD06-GV1009_R2,Run-1,01/02/2018
TD06-GV1010_R1,Run-1,01/03/2018
TD06-GV1010_R2,Run-1,01/03/2018
I also note that I don't have this issue if I use --multiqc-sources instead of --run-date-info.
Hi Nilesh, I have noticed an issue with ChronQC reports being stuck at the "Loading Report" stage. At first I thought it was a bug in my script or a proxy issue, but then I realised that even older reports that used to open completely fine are stuck now at that stage - and that behaviour is seen on any machine, any browser, within and outside proxies.
See the attached screenshot, as well as the HTML file that used to open fine.
Can you please have a look and see what the problem is?
Many thanks,
Roxane
Pathology_hyb_PHCP_2.Sample_NA12878.chronqc.17_Dec_2018.html.zip