Comments (4)
https://github.com/alexpenson/scripts/blob/master/chr_ncbi_to_ucsc.sed
But there must be a better source (where we originally got them from?).
from mutalyzer2.
This is really a mess. It looks like we can more or less get what we want by first going to the UCSC Genome Browser database and selecting from ctgPos2 (GRC Map Contigs) everything that starts at 0 (i.e., is the same as the complete chromosome it is located on):
$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19
mysql> select * from ctgPos2 where chromStart = 0;
+---------------------+--------+----------------------+------------+----------+------+
| contig              | size   | chrom                | chromStart | chromEnd | type |
+---------------------+--------+----------------------+------------+----------+------+
| HSCHR17_CTG1        | 296626 | chr17                |          0 |   296626 | F    |
| HSCHR1_RANDOM_CTG5  | 106433 | chr1_gl000191_random |          0 |   106433 | F    |
| HSCHR1_RANDOM_CTG12 | 547496 | chr1_gl000192_random |          0 |   547496 | F    |
...
Now go to the GRC website, click on the assembly accession you want, and click "Download the full sequence report".
In that file, look up the contig we got from the UCSC database (e.g., HSCHR1_RANDOM_CTG5) and note the corresponding RefSeq accession number (e.g., NT_113878.1).
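The lookup described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the downloaded sequence report is tab-separated, that its last `#` comment line holds the column names, and that the columns are called `Sequence-Name` and `RefSeq-Accn`; check these against the actual file. The function names `read_report` and `couple` are made up for illustration and are not part of Mutalyzer.

```python
def read_report(lines):
    """Map contig name -> RefSeq accession from a sequence report.

    Assumes tab-separated rows, with the column names on the last
    '#' comment line before the data (an assumption about the file).
    """
    header = None
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Remember the most recent comment line; the final one
            # before the data rows is taken to be the column header.
            header = line.lstrip("# ").split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before data rows")
        row = dict(zip(header, line.split("\t")))
        mapping[row["Sequence-Name"]] = row["RefSeq-Accn"]
    return mapping


def couple(ctg_rows, report):
    """Couple UCSC chromosome names to RefSeq accessions.

    ctg_rows: iterable of (contig, chrom) pairs, e.g. taken from the
    ctgPos2 query result. Contigs missing from the report are skipped.
    """
    return {chrom: report[contig]
            for contig, chrom in ctg_rows
            if contig in report}


if __name__ == "__main__":
    # Illustrative two-column report; a real report has more columns.
    sample = [
        "# Sequence-Name\tRefSeq-Accn\n",
        "HSCHR1_RANDOM_CTG5\tNT_113878.1\n",
    ]
    rows = [("HSCHR1_RANDOM_CTG5", "chr1_gl000191_random")]
    print(couple(rows, read_report(sample)))
```

Reading the column names from the header instead of hardcoding positions keeps the sketch working if the report gains or reorders columns.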
@p.e.m.taschner What do you think? Is there an easier way to get the RefSeq accession numbers and UCSC chromosome name couplings?
Here's an example of what we already have.
After discussing this:
The mappings on these contigs that have no chromosome definition are bogus anyway, since there is no way to use them in the position converter (Mutalyzer will always first try to get the corresponding chromosome definition). So we decided to discard them for now.
At some point we should decide what to do with all contigs that are not part of the primary assembly. But currently, our user interface is not really suited for using them, so that needs some more thinking. It may really confuse users if we start adding more mappings on these non-primary assembly contigs.
By the way, the only transcript mappings we have on the non-primary assembly contigs come from our old UCSC imports. The current NCBI mapview import discards them completely. We won't fix this until we have decided how to handle them and present them to the user.