
Comments (5)

saketkc commented on June 15, 2024

Thanks for the bug report @dweemx. From a first look, I can confirm this is indeed a bug. I will get back to you with a possible solution/explanation shortly.

from pysradb.

saketkc commented on June 15, 2024

Hi @dweemx, it looks like this bug originates in NCBI's search interface. Looking up SRP125768 on https://www.ncbi.nlm.nih.gov/sra returns only 128 hits, while the total should clearly be 136 (the total number of runs). These are the missing run IDs:

'SRR6327103', 'SRR6327106', 'SRR6327114', 'SRR6327120', 'SRR6327118', 'SRR6327122', 'SRR6327135', 'SRR6327116'

I will have to look for a way to ensure such runs are not missed. Thanks once again for reporting this.


dweemx commented on June 15, 2024

Hi,
I contacted the SRA team and they told me that there was an issue with the SRA file pairing system when the data was ported from GEO to the SRA database. This issue should be fixed now.

However, some samples are still missing when I'm using SRAweb: 'SRR6327106', 'SRR6327114', 'SRR6327120', 'SRR6327118', 'SRR6327122', 'SRR6327116'
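For reference, comparing the two lists of missing run IDs posted in this thread shows which runs came back after the SRA-side fix. This is only a small sketch using plain Python sets; the variable names are illustrative and nothing here queries SRA.

```python
# Run IDs reported missing earlier in this thread (before the SRA-side fix).
missing_before = [
    "SRR6327103", "SRR6327106", "SRR6327114", "SRR6327120",
    "SRR6327118", "SRR6327122", "SRR6327135", "SRR6327116",
]

# Run IDs still missing when using SRAweb (after the SRA-side fix).
missing_after = [
    "SRR6327106", "SRR6327114", "SRR6327120",
    "SRR6327118", "SRR6327122", "SRR6327116",
]

# Runs that were missing before but are no longer missing.
recovered = sorted(set(missing_before) - set(missing_after))

# Runs that are missing in both lists.
still_missing = sorted(set(missing_before) & set(missing_after))

print("recovered:", recovered)
print("still missing:", len(still_missing))
```

So two of the eight runs (SRR6327103 and SRR6327135) are now returned, and six are still absent.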


saketkc commented on June 15, 2024

Thanks for the update @dweemx. It seems https://www.ncbi.nlm.nih.gov/sra/?term=SRP125768 still returns only 128 results. I should have time to work on a fix in the coming few weeks. Thanks for your patience, and sorry for the trouble this has been causing you.


saketkc commented on June 15, 2024

Hi @dweemx,
Thanks for your patience. I was finally able to fix this in v0.9.9.
See this notebook for an example with this ID: https://colab.research.google.com/drive/1C60V-jkcNZiaCra_V5iEyFs318jgVoUR

The web mode's default --detailed output gives all the metadata you see on SRA's run table.

> pysradb metadata SRP125768 --detailed | head
study_accession experiment_accession experiment_title                                                                                    experiment_desc                                                                                     organism_taxid  organism_name            library_strategy library_source  library_selection sample_accession sample_title instrument           total_spots total_size   run_accession run_total_spots run_total_bases run_alias      experiment_alias source_name                                      age        genotype/variation          tissue genotype 
 SRP125768       SRX4084637           GSM3142622: w1118_1d_WholeBrain_Unstranded_RNA-seq; Drosophila melanogaster; RNA-Seq                GSM3142622: w1118_1d_WholeBrain_Unstranded_RNA-seq; Drosophila melanogaster; RNA-Seq                7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301695                    NextSeq 500          3552575     79516196     SRR7166639    3552575         176271295       GSM3142622_r1  GSM3142622       w1118_1d_WholeBrain_Unstranded_RNA-seq           1 Day      W[1118]                     brain  NaN     
 SRP125768       SRX4084636           GSM3142621: w1118_1d_WholeBrain_Stranded_RNA-seq; Drosophila melanogaster; RNA-Seq                  GSM3142621: w1118_1d_WholeBrain_Stranded_RNA-seq; Drosophila melanogaster; RNA-Seq                  7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301693                    NextSeq 500          4513696     100655283    SRR7166638    4513696         220693988       GSM3142621_r1  GSM3142621       w1118_1d_WholeBrain_Stranded_RNA-seq             1 Day      W[1118]                     brain  NaN     
 SRP125768       SRX4084635           GSM3142620: DGRP-551_1d_WholeBrain_Unstranded_RNA-seq; Drosophila melanogaster; RNA-Seq             GSM3142620: DGRP-551_1d_WholeBrain_Unstranded_RNA-seq; Drosophila melanogaster; RNA-Seq             7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301694                    NextSeq 500          19374029    433332434    SRR7166637    19374029        961111968       GSM3142620_r1  GSM3142620       DGRP-551_1d_WholeBrain_Unstranded_RNA-seq        1 Day      DGRP-551                    brain  NaN     
 SRP125768       SRX4084634           GSM3142619: DGRP-551_1d_WholeBrain_Stranded_RNA-seq; Drosophila melanogaster; RNA-Seq               GSM3142619: DGRP-551_1d_WholeBrain_Stranded_RNA-seq; Drosophila melanogaster; RNA-Seq               7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301692                    NextSeq 500          2936449     65552609     SRR7166636    2936449         145074237       GSM3142619_r1  GSM3142619       DGRP-551_1d_WholeBrain_Stranded_RNA-seq          1 Day      DGRP-551                    brain  NaN     
 SRP125768       SRX4084633           GSM3142618: DGRP-551_1d_WholeBrainNuclei_Unstranded_Rep2_RNA-seq; Drosophila melanogaster; RNA-Seq  GSM3142618: DGRP-551_1d_WholeBrainNuclei_Unstranded_Rep2_RNA-seq; Drosophila melanogaster; RNA-Seq  7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301691                    NextSeq 500          24342212    458751469    SRR7166635    24342212        1207043823      GSM3142618_r1  GSM3142618       DGRP-551_1d_WholeBrainNuclei_Unstranded_RNA-seq  1 Day      DGRP-551                    brain  NaN     
 SRP125768       SRX4084632           GSM3142617: DGRP-551_1d_WholeBrainNuclei_Unstranded_Rep1_RNA-seq; Drosophila melanogaster; RNA-Seq  GSM3142617: DGRP-551_1d_WholeBrainNuclei_Unstranded_Rep1_RNA-seq; Drosophila melanogaster; RNA-Seq  7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301696                    Illumina HiSeq 4000  7398351     236600904    SRR7166634    7398351         551705108       GSM3142617_r1  GSM3142617       DGRP-551_1d_WholeBrainNuclei_Unstranded_RNA-seq  1 Day      DGRP-551                    brain  NaN     
 SRP125768       SRX4084631           GSM3142616: Adapted_SMART_seq2_R23E10_Cell_9; Drosophila melanogaster; RNA-Seq                      GSM3142616: Adapted_SMART_seq2_R23E10_Cell_9; Drosophila melanogaster; RNA-Seq                      7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301688                    NextSeq 500          267487      6409898      SRR7166633    267487          13266487        GSM3142616_r1  GSM3142616       Adapted_SMART_seq2_R23E10_Cell                   0-7 Days   R23E10-Gal4 x UAS-CD8::GFP  brain  NaN     
 SRP125768       SRX4084630           GSM3142615: Adapted_SMART_seq2_R23E10_Cell_8; Drosophila melanogaster; RNA-Seq                      GSM3142615: Adapted_SMART_seq2_R23E10_Cell_8; Drosophila melanogaster; RNA-Seq                      7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301690                    NextSeq 500          192550      4678011      SRR7166632    192550          9548043         GSM3142615_r1  GSM3142615       Adapted_SMART_seq2_R23E10_Cell                   0-7 Days   R23E10-Gal4 x UAS-CD8::GFP  brain  NaN     
 SRP125768       SRX4084629           GSM3142614: Adapted_SMART_seq2_R23E10_Cell_7; Drosophila melanogaster; RNA-Seq                      GSM3142614: Adapted_SMART_seq2_R23E10_Cell_7; Drosophila melanogaster; RNA-Seq                      7227            Drosophila melanogaster  RNA-Seq          TRANSCRIPTOMIC  cDNA              SRS3301689                    NextSeq 500          199223      4833365      SRR7166631    199223          9885888         GSM3142614_r1  GSM3142614       Adapted_SMART_seq2_R23E10_Cell                   0-7 Days   R23E10-Gal4 x UAS-CD8::GFP  brain  NaN   

Please let me know if you run into any issues.
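If you prefer the Python API over the command line, a rough equivalent might look like the sketch below. It assumes pysradb v0.9.9 or later is installed and that the detailed output includes a `run_accession` column, as in the table above. The helper names `fetch_run_accessions` and `count_unique_runs` are made up for this example and are not part of pysradb.

```python
def count_unique_runs(run_accessions):
    """Count distinct run accessions, ignoring any repeated rows."""
    return len(set(run_accessions))


def fetch_run_accessions(study="SRP125768"):
    """Fetch run accessions for a study via pysradb's web mode.

    Needs network access and pysradb installed; the import is kept inside
    the function so count_unique_runs can be used without either.
    """
    from pysradb.sraweb import SRAweb

    df = SRAweb().sra_metadata(study, detailed=True)
    return df["run_accession"].tolist()


# Example usage (performs a live query, so it is left commented out):
# runs = fetch_run_accessions("SRP125768")
# print(count_unique_runs(runs))  # compare against the 136 runs expected for this study
```

Comparing the count against the expected 136 runs is a quick way to check whether any runs are still being dropped.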

