Hi Daniel,
Thank you for your question. It looks like some commands were run more than once, so your aGDS file already contains the apc_protein_function channel, which caused the error. Could you paste the summary of your chromosome 22 GDS file as it was before running "Step 3: Generate the annotated GDS (aGDS) file"?
Best,
Xihao
from staarpipeline-tutorial.
Thanks for getting back to me. I remade the chr22 GDS file and uploaded it here https://drive.google.com/file/d/19YOYDN7A7Fodyrce_IkH2e6uG5ByKQKX/view?usp=share_link (this is a fresh file, not the chr22 GDS file that I had already run step 3 on). This is the command and output from remaking the GDS file:
Rscript /project/ritchie07/personal/daniel/tools/STAARpipeline/convertVCF2GDS.R NULL vcf chr22_mac1_GDS 1 /project/ritchie07/personal/daniel/A6K/chr22_mac1.vcf.gz
[1] "NULL"
[2] "vcf"
[3] "chr22_mac1_GDS"
[4] "1"
[5] "/project/ritchie07/personal/daniel/A6K/chr22_mac1.vcf.gz"
[1] "/project/ritchie07/personal/daniel/A6K/chr22_mac1.vcf.gz"
Loading required package: gdsfmt
Running with 28 thread(s).
converting VCF
Tue Dec 6 11:52:49 2022
Variant Call Format (VCF) Import:
file(s):
chr22_mac1.vcf.gz (442.8M)
file format: VCFv4.2
the number of sets of chromosomes (ploidy): 2
the number of samples: 6,280
genotype storage: bit2
compression method: LZMA_RA
# of samples: 6280
calculating the total number of variants ...
the total number of variants for import: 1,641,932
Writing to 28 files:
chr22_mac1_GDS_tmp01_79b76e0809c [1..58,640]
chr22_mac1_GDS_tmp02_79b743259777 [58,641..117,282]
chr22_mac1_GDS_tmp03_79b739469439 [117,283..175,922]
chr22_mac1_GDS_tmp04_79b773c1248 [175,923..234,564]
chr22_mac1_GDS_tmp05_79b760d4a4e4 [234,565..293,204]
chr22_mac1_GDS_tmp06_79b718ae12bc [293,205..351,846]
chr22_mac1_GDS_tmp07_79b7250da3bb [351,847..410,486]
chr22_mac1_GDS_tmp08_79b7393bbfa8 [410,487..469,126]
chr22_mac1_GDS_tmp09_79b76daf5ccb [469,127..527,768]
chr22_mac1_GDS_tmp10_79b7320d43c1 [527,769..586,408]
chr22_mac1_GDS_tmp11_79b732eec028 [586,409..645,050]
chr22_mac1_GDS_tmp12_79b72971da6b [645,051..703,690]
chr22_mac1_GDS_tmp13_79b762beae63 [703,691..762,332]
chr22_mac1_GDS_tmp14_79b76832b9ca [762,333..820,972]
chr22_mac1_GDS_tmp15_79b712c800a5 [820,973..879,612]
chr22_mac1_GDS_tmp16_79b7670dd1a3 [879,613..938,254]
chr22_mac1_GDS_tmp17_79b713ae2e68 [938,255..996,894]
chr22_mac1_GDS_tmp18_79b74ffdc65c [996,895..1,055,536]
chr22_mac1_GDS_tmp19_79b749b31d96 [1,055,537..1,114,176]
chr22_mac1_GDS_tmp20_79b73144c505 [1,114,177..1,172,818]
chr22_mac1_GDS_tmp21_79b743cb1ae1 [1,172,819..1,231,458]
chr22_mac1_GDS_tmp22_79b72aa7f3ff [1,231,459..1,290,098]
chr22_mac1_GDS_tmp23_79b77a12170c [1,290,099..1,348,740]
chr22_mac1_GDS_tmp24_79b751c949b4 [1,348,741..1,407,380]
chr22_mac1_GDS_tmp25_79b72d0e378c [1,407,381..1,466,022]
chr22_mac1_GDS_tmp26_79b79b35265 [1,466,023..1,524,662]
chr22_mac1_GDS_tmp27_79b717fa32a2 [1,524,663..1,583,304]
chr22_mac1_GDS_tmp28_79b77536717a [1,583,305..1,641,932]
Done (Tue Dec 6 11:55:49 2022).
Output:
chr22_mac1_GDS.gds
Merging:
opening 'chr22_mac1_GDS_tmp01_79b76e0809c' ... [done]
opening 'chr22_mac1_GDS_tmp02_79b743259777' ... [done]
opening 'chr22_mac1_GDS_tmp03_79b739469439' ... [done]
opening 'chr22_mac1_GDS_tmp04_79b773c1248' ... [done]
opening 'chr22_mac1_GDS_tmp05_79b760d4a4e4' ... [done]
opening 'chr22_mac1_GDS_tmp06_79b718ae12bc' ... [done]
opening 'chr22_mac1_GDS_tmp07_79b7250da3bb' ... [done]
opening 'chr22_mac1_GDS_tmp08_79b7393bbfa8' ... [done]
opening 'chr22_mac1_GDS_tmp09_79b76daf5ccb' ... [done]
opening 'chr22_mac1_GDS_tmp10_79b7320d43c1' ... [done]
opening 'chr22_mac1_GDS_tmp11_79b732eec028' ... [done]
opening 'chr22_mac1_GDS_tmp12_79b72971da6b' ... [done]
opening 'chr22_mac1_GDS_tmp13_79b762beae63' ... [done]
opening 'chr22_mac1_GDS_tmp14_79b76832b9ca' ... [done]
opening 'chr22_mac1_GDS_tmp15_79b712c800a5' ... [done]
opening 'chr22_mac1_GDS_tmp16_79b7670dd1a3' ... [done]
opening 'chr22_mac1_GDS_tmp17_79b713ae2e68' ... [done]
opening 'chr22_mac1_GDS_tmp18_79b74ffdc65c' ... [done]
opening 'chr22_mac1_GDS_tmp19_79b749b31d96' ... [done]
opening 'chr22_mac1_GDS_tmp20_79b73144c505' ... [done]
opening 'chr22_mac1_GDS_tmp21_79b743cb1ae1' ... [done]
opening 'chr22_mac1_GDS_tmp22_79b72aa7f3ff' ... [done]
opening 'chr22_mac1_GDS_tmp23_79b77a12170c' ... [done]
opening 'chr22_mac1_GDS_tmp24_79b751c949b4' ... [done]
opening 'chr22_mac1_GDS_tmp25_79b72d0e378c' ... [done]
opening 'chr22_mac1_GDS_tmp26_79b79b35265' ... [done]
opening 'chr22_mac1_GDS_tmp27_79b717fa32a2' ... [done]
opening 'chr22_mac1_GDS_tmp28_79b77536717a' ... [done]
Digests:
sample.id [md5: a761962496b6b317bf251960be9c76b7]
variant.id [md5: 819a750296c70995fba8b9748ceec990]
position [md5: 950041008e64c71f6f9187d2c86da0e0]
chromosome [md5: b78a494dc5be8a12482aaacfa00b65c0]
allele [md5: 495a3512d3c6c197209ad91c86564c2e]
genotype [md5: 507c9f68d3039161f84c086de22588c3]
phase [md5: 13706a839e623a3b95e55afef017faec]
annotation/id [md5: 47b0eafc0f027da5320cfdc0a7efd78d]
annotation/qual [md5: 9d8f45b58e47bd77724a8b8cfde5a0a6]
annotation/filter [md5: 518197a19b03713e21a5fc174926226d]
annotation/info/PR [md5: b63f542998b4e725f47060b84b2cb3e8]
Done.
Tue Dec 6 11:56:56 2022
Optimize the access efficiency ...
Clean up the fragments of GDS file:
open the file 'chr22_mac1_GDS.gds' (114.5M)
# of fragments: 269
save to 'chr22_mac1_GDS.gds.tmp'
rename 'chr22_mac1_GDS.gds.tmp' (114.5M, reduced: 2.5K)
# of fragments: 56
Tue Dec 6 11:56:58 2022
File: /project/ritchie07/personal/daniel/A6K/STAARpipeline/chr22_mac1_GDS.gds
Format Version: v1.0
Reference: unknown
Ploidy: 2
Number of samples: 6,280
Number of variants: 1,641,932
Chromosomes:
Chr22: 1641932
Contigs:
22, 50808250
Alleles:
ALT: <None>
tabulation: 2, 1641932(100.0%)
Annotation, Quality:
Min: NA, 1st Qu: NA, Median: NA, Mean: NaN, 3rd Qu: NA, Max: NA, NA's: 1641932
Annotation, FILTER:
<None>
Annotation, INFO variable(s):
PR, 0, Flag, Provisional reference allele, may not be based on real reference genome
Annotation, FORMAT variable(s):
GT, 1, String, Genotype
Annotation, sample variable(s):
<None>
Hi Daniel,
Thanks for including the output log from generating the GDS files. These GDS files should be the Step 1 input of the FAVORannotator program. Now that you have run Step 1 and Step 2 of FAVORannotator successfully, could you please make a copy of these GDS files and rerun Step 3 of FAVORannotator on the copy?
Please let us know if you encounter the same issue (i.e., The GDS node "apc_protein_function" exists) again.
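The copy step can be sketched as a small shell helper. This is only an illustration: copy_gds is a hypothetical function (not part of FAVORannotator or STAARpipeline), and the destination name in the usage line is a placeholder. It refuses to overwrite an existing file so that an earlier, possibly already-annotated copy is not silently reused.

```shell
#!/bin/sh
# copy_gds SRC DEST
# Hypothetical helper: copy a GDS file so Step 3 can be rerun on the copy
# while the original stays untouched.
copy_gds() {
    src=$1
    dest=$2
    # The source must exist and be a regular file.
    if [ ! -f "$src" ]; then
        echo "copy_gds: missing source: $src" >&2
        return 1
    fi
    # Never overwrite an existing destination.
    if [ -e "$dest" ]; then
        echo "copy_gds: refusing to overwrite: $dest" >&2
        return 1
    fi
    # -p preserves the modification time and permissions of the original.
    cp -p "$src" "$dest"
}

# Usage (destination name is a placeholder):
#   copy_gds chr22_mac1_GDS.gds chr22_mac1_GDS.step3.gds
```

Step 3 would then be pointed at the copy rather than at chr22_mac1_GDS.gds.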
Best,
Xihao
I just tried re-running Step 3 using the new chr22 GDS file but unfortunately hit the same The GDS node "apc_protein_function" exists error.
Hi Daniel,
Thanks for letting me know. In this case, could you please paste the output of head(FunctionalAnnotation), dim(FunctionalAnnotation), and colnames(FunctionalAnnotation) when running through this line of the Step 3 script?
Best,
Xihao
Thanks again for the help -- below are the commands and their outputs:
Hi Daniel,
This is very helpful. You seem to be using the FAVOR Full Database to annotate the GDS file. However, you should use the FAVOR Essential Database to annotate the GDS file in Step 2 of FAVORannotator.
Hope this helps, and please let me know how it goes. Thank you.
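As a rough sanity check before Step 3, the header of the annotation table can be inspected from the shell to see how many columns it has and whether any column name is repeated. This is a minimal sketch under stated assumptions: it assumes the Step 2 output is a comma-separated file with a header row and no commas inside quoted column names, the helper names are hypothetical, and the file name in the usage lines is a placeholder. It does not by itself tell you which FAVOR database was used.

```shell
#!/bin/sh
# csv_columns FILE
# Hypothetical helper: print the header column names, one per line.
# Assumes a simple comma-separated header without embedded commas.
csv_columns() {
    head -n 1 "$1" | tr -d '\r"' | tr ',' '\n'
}

# csv_duplicate_columns FILE
# Hypothetical helper: print each column name that occurs more than once.
csv_duplicate_columns() {
    csv_columns "$1" | sort | uniq -d
}

# Usage (file name is a placeholder):
#   csv_columns Anno_chr22.csv | wc -l      # number of columns
#   csv_duplicate_columns Anno_chr22.csv    # repeated names, if any
```

An unexpectedly large column count, or any repeated name, is a hint that the annotation input is not what the Step 3 script expects.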
Best,
Xihao
Hi Xihao,
Thanks a lot, it seems to be working now. I'll check back if I run into other issues.
Hi Daniel,
Thanks so much for letting me know.
Best,
Xihao