Atlas investigators have requested chrX support in the pipeline. This is not too diffi

chrX support (plco-analysis, open issue)

nci-cgr commented on September 27, 2024

chrX support

from plco-analysis.

Comments (3)

lightning-auriga commented on September 27, 2024

for whoever inherits this project: here are the locations of chrX imputations for PLCO as I've been informed by email:

/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Oncoarray/IMPUTATION_1000G
/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Oncoarray/IMPUTATION_TOPMED

/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/OmniX/IMPUTATION_TOPMED

/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Omni25M/IMPUTATION_1000G
/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Omni25M/IMPUTATION_TOPMED

/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Omni5/IMPUTATION_1000G
/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/Omni5/IMPUTATION_TOPMED

/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/GSA/IMPUTATION_1000G/batch1 (batch2,batch3,batch4,batch5)
/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation/GSA/IMPUTATION_TOPMED/batch1 (batch2,batch3,batch4,batch5)
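
For anyone wiring these into a config, here is a minimal sketch that rebuilds the directory list above from its parts. The platform names, panel directory names, and batch names are taken from the paths as listed; the function name `chrx_dirs` and the dictionary layout are only illustrative and are not part of the pipeline.

```python
from pathlib import PurePosixPath

# Root of the chrX imputations, as given in the list above.
ROOT = PurePosixPath(
    "/DCEG/CGF/Bioinformatics/Production/Shilpa/Projects/PLCO_chrX_Imputation"
)

# Panel directories listed per platform (OmniX only has a TOPMED entry above).
PANELS = {
    "Oncoarray": ["IMPUTATION_1000G", "IMPUTATION_TOPMED"],
    "OmniX": ["IMPUTATION_TOPMED"],
    "Omni25M": ["IMPUTATION_1000G", "IMPUTATION_TOPMED"],
    "Omni5": ["IMPUTATION_1000G", "IMPUTATION_TOPMED"],
    "GSA": ["IMPUTATION_1000G", "IMPUTATION_TOPMED"],
}

# GSA is the only platform with per-batch subdirectories in the list.
GSA_BATCHES = ["batch1", "batch2", "batch3", "batch4", "batch5"]


def chrx_dirs(panel):
    """Return every listed chrX imputation directory for one panel name."""
    dirs = []
    for platform, panels in PANELS.items():
        if panel not in panels:
            continue
        base = ROOT / platform / panel
        if platform == "GSA":
            dirs.extend(base / batch for batch in GSA_BATCHES)
        else:
            dirs.append(base)
    return dirs


if __name__ == "__main__":
    for d in chrx_dirs("IMPUTATION_TOPMED"):
        print(d)
```

`PurePosixPath` is used so the paths can be built and printed without the filesystem being mounted.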

lightning-auriga commented on September 27, 2024

assorted comments:

The 1KG (1000 Genomes) versions of these imputations were made for other testing purposes and should probably just be ignored for this project.
The TOPMed chrX imputation was performed using the open MIS variant, which seems to be against TOPMed v8, whereas the autosomes (in imputation freeze 2) were imputed against the private server/TOPMed v5b. So they are not in sync, and eventually the autosomes need to catch up to X.
The TOPMed chrX imputation was performed using chip data prepared for the first imputation pass, which is now deprecated. The biggest issues with this are (1) the ancestries were not split out before imputation, so the chips weren't appropriately cleaned; and (2) the batch assignments for freeze 2 were reshuffled for GSA/Europeans to fit them into 4 batches instead of 5. Since that reshuffle was not applied to chrX, there is one more chrX/GSA/European batch than there are autosomal batches.

All of the above eventually needs to be synchronized by reimputing everything against the public server's TOPMed panel with the better input prep. However, at least for the moment, I think the batch count discrepancy isn't much of an issue. There may be some step that assumes all chromosomes are present, but in general the pipeline merely processes whatever is present. So it shouldn't be too hard to force it to use these files as-is.
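
To find out which chromosomes a given batch directory actually holds before pointing the pipeline at it, something like the sketch below may help. It is not the pipeline's own logic: the assumption that file names start with `chr<N>.` or `chrX.` is a guess at the naming scheme and should be adjusted to the real files, and both function names are hypothetical.

```python
import re
from pathlib import Path

# ASSUMPTION: imputed files are named like "chr22.<something>" or "chrX.<something>".
CHROM_RE = re.compile(r"^chr(\d{1,2}|X)\.")

# Chromosomes a complete batch would contain: autosomes 1-22 plus X.
EXPECTED = [str(i) for i in range(1, 23)] + ["X"]


def chromosomes_present(directory):
    """List the expected chromosomes that have at least one file in directory."""
    found = set()
    for path in Path(directory).iterdir():
        match = CHROM_RE.match(path.name)
        if match:
            found.add(match.group(1))
    # Return in karyotype order and ignore anything outside 1-22 and X.
    return [chrom for chrom in EXPECTED if chrom in found]


def missing_chromosomes(directory):
    """List the expected chromosomes that have no file in directory."""
    present = set(chromosomes_present(directory))
    return [chrom for chrom in EXPECTED if chrom not in present]
```

Running `missing_chromosomes` over each autosome and chrX batch directory would show directly where a step that assumes all chromosomes are present could break.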

shukwong commented on September 27, 2024

The pipeline needs a .sample file linked to each chromosome, because the samples can differ slightly between chromosome X and the autosomes (which is the case in PLCO).
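
A quick way to see how far the chrX and autosome sample sets diverge is to compare the two .sample files directly. The sketch below assumes the usual Oxford-style layout (a header row, then a column-type row, then one whitespace-delimited row per sample with the two ID columns first); the function names are hypothetical and nothing here is taken from the pipeline itself.

```python
def read_sample_ids(sample_path):
    """Return (ID_1, ID_2) pairs from an Oxford-style .sample file, in file order."""
    with open(sample_path) as handle:
        rows = [line.split() for line in handle if line.strip()]
    # ASSUMPTION: the first two non-empty rows are the header and the
    # column-type row, so the sample rows start at the third row.
    return [(fields[0], fields[1]) for fields in rows[2:]]


def compare_samples(autosome_sample, chrx_sample):
    """Summarize which samples appear in only one of the two files."""
    auto = set(read_sample_ids(autosome_sample))
    chrx = set(read_sample_ids(chrx_sample))
    return {
        "shared": len(auto & chrx),
        "autosome_only": sorted(auto - chrx),
        "chrx_only": sorted(chrx - auto),
    }
```

If either of the `*_only` lists is non-empty for a batch, reusing the autosome .sample file for chrX would mislabel or drop samples, which is the reason for keeping one .sample file per chromosome.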
