
rawrr's People

Contributors

cpanse, jasenfinch, jwokaty, nturaga, tobiasko

rawrr's Issues

Low performance when reading a lot of spectra

rawrr::readSpectrum is very slow, which makes it unusable for files with tens of thousands of spectra.

By slow I mean that reading a single spectrum takes ~1 second on my one-year-old MacBook Pro.
(I call the function once, with a list of spectrum IDs.)

At that rate it would take 3 hours just to read a single file, which makes the package too slow for this use by some two orders of magnitude.

I will investigate to find the culprit. It might be necessary to add switches that remove some "advanced" functionality from spectrum reads to get the performance back (?).

Installation fails

Hi,

Very nice tool!

At the moment, sadly:

install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawR_0.1.0.tar.gz')
Warning in install.packages :
package ‘http://fgcz-ms.uzh.ch/~cpanse/rawR_0.1.0.tar.gz’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

R.version
_
platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out

RAW file still being acquired

Hi developers,

Thanks for the really great package! Saves me so much file conversion time.

I have a question about viewing RAW files during acquisition. When I load such a file (also after making a copy of the file to 'fix' it), I get the error 'RAW file still being acquired', which is of course the case.

Is there any possibility to view it anyway, as you would in the vendor software for some quick checks? Or is critical information saved at the end of the run?

> RAW_chrom <- readChromatogram(rawfile = rawfile, tol = 3, mass = masses_of_intest)
RAW file still being acquired

Thanks!

the function “sample()” should be renamed

Hi, thank you for your great work supporting these useful R packages. However, sample() is an important function in base R, and many R scripts and packages rely on it. Could you rename the example data function to something like "example()"? That would help researchers use the package more fluently.

Thanks for your great work on the R package “rawR”.
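
Until a rename happens, base R's sampler stays reachable through its namespace even when another function called sample() masks it. The sketch below uses a stand-in masking function defined on the spot (hypothetical, not the package's own code) to show the effect:

```r
# A stand-in for a package-level function that happens to be called sample().
# (Hypothetical; it only illustrates the masking problem.)
sample <- function() "path/to/example.raw"

# The unqualified name now resolves to the masking function ...
sample()

# ... but the base R sampler remains reachable through its namespace.
set.seed(1)
draw <- base::sample(1:10, size = 3)

# Remove the masking definition to restore the usual lookup.
rm(sample)
```

Calling base::sample() explicitly in scripts keeps them independent of the order in which packages are attached.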

Spectrum scan centroid mZ, intensity and noises values do not match

Hi,
I'm using "rawrr" to read and plot 2 spectrum scans. With the first scan I had no problems using the plot function with the "spectrum.scan$centroid.mZ" values on the x axis and "spectrum.scan$centroid.intensity" and "spectrum.scan$noises" on the y axis: the plotted spectrum matches the one I obtained with other software. With the second scan, however, the plot differs from the expected one. I think this is because the mZ, intensity and noises vectors differ in length for the second scan. How can I deal with this situation? How is it possible that these vector lengths differ? This wasn't the case for the first scan.
Thank you for your time.
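
A quick way to confirm the suspicion before plotting is to compare the lengths of the per-peak vectors. The helper below is a minimal sketch; the field names are taken from the report above and the mock scan is invented for illustration, so treat both as assumptions:

```r
# Report the lengths of the per-peak vectors of one scan and whether they agree.
# The field names are taken from the report above; treat them as assumptions.
check_peak_vectors <- function(scan,
                               fields = c("centroid.mZ",
                                          "centroid.intensity",
                                          "noises")) {
  lens <- vapply(fields, function(f) length(scan[[f]]), integer(1))
  list(lengths = lens, consistent = length(unique(lens)) == 1L)
}

# Mock scan with a noise vector that is one element short.
mock_scan <- list(centroid.mZ = c(100.1, 200.2, 300.3),
                  centroid.intensity = c(10, 50, 20),
                  noises = c(1.5, 2.0))
res <- check_peak_vectors(mock_scan)
res$lengths
res$consistent
```

If the check reports inconsistent lengths, the vectors cannot be paired index by index and should not be plotted against each other.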

Functional test using raw files (autoQC01) from different LC-MS systems @FGCZ

determine input

last two files of each available instrument

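# pfiles.txt is semicolon-separated; field 4 holds the file path
PFILE=/srv/www/htdocs/Data2San/sync_LOGS/pfiles.txt
# list the instrument directories (third path component of field 4)
cut -f4 -d";" "${PFILE}" \
  | cut -d"/" -f3 \
  | sort -u \
  | while read -r i
    do
        # last two autoQC01 raw files of this instrument
        grep -E "/${i}/.*autoQC01.*raw$" "${PFILE}" \
          | tail -n 2
    done \
  | cut -d";" -f4 \
  | while read -r raw
    do
        # keep only files that still exist on disk
        [[ -f "/srv/www/htdocs/${raw}" ]] && echo "${raw}"
    done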

readChromatogram does not return 0 values

Hi,

Thanks for making a great tool! I have found it quite useful so far 👍
I have an issue with XIC values when I want to plot certain peptides.

Firstly, I can successfully extract and plot the XICs using your inbuilt functions, but cannot figure out how to constrain the retention times plotted/extracted.

I did manage to access the S3 elements of the chromatogram object and plot them myself with ggplot, but then ran into the issue that rawR does not report 0 values for m/z values at certain times. These would be useful to see the shape of the eluting peptide, though I acknowledge that they would likely increase the object size...

Could you clarify (1) how to constrain the XIC to a certain retention time range, and (2) how to access (or at least impute from the RTs of the MS1 scans) the 0 values of XICs?

Thanks again for making this tool,
Tara
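
Both points can be approximated after extraction. The sketch below assumes the XIC is available as two plain numeric vectors and that the retention times of all MS1 scans are known; every name in it is hypothetical. Note that match() compares doubles exactly, so the XIC times and the MS1 times need to come from the same source and use the same unit.

```r
# Zero-fill an XIC on the grid of MS1 scan times and restrict it to an RT window.
# times/intensities: the reported XIC points; ms1_rt: RT of every MS1 scan.
# All names are hypothetical; times must use the same unit as ms1_rt.
zero_fill_xic <- function(times, intensities, ms1_rt, rt_range = c(-Inf, Inf)) {
  filled <- numeric(length(ms1_rt))          # start with 0 everywhere
  idx <- match(times, ms1_rt)                # position of each XIC point on the grid
  ok <- !is.na(idx)
  filled[idx[ok]] <- intensities[ok]
  keep <- ms1_rt >= rt_range[1] & ms1_rt <= rt_range[2]
  data.frame(rt = ms1_rt[keep], intensity = filled[keep])
}

ms1_rt <- c(10.0, 10.1, 10.2, 10.3, 10.4)
xic <- zero_fill_xic(times = c(10.1, 10.2), intensities = c(500, 900),
                     ms1_rt = ms1_rt, rt_range = c(10.0, 10.3))
xic
```

The resulting data.frame can be passed to ggplot directly, with the zeros outlining the flanks of the eluting peak.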

readSpectrum(...)

refactor rawDiag::readScans()

arguments

rawfile : path to raw file
scans : numeric vector for selection based on scan index
filter : scan filter for logical selection of scans (e.g. MS, MS2, +, HCD, ...)

returns

S3 object, nested list of type Spectrum (not peaklist)

for FTMS scans the list should contain vectors for: mz, intensity, resolution, noise, charge

In addition header information is needed:

  • raw file
  • scan#
  • RT (Scan Start Time)
  • NL (normalized level = base peak intensity)
  • TIC (Total Ion Current)
  • Base Peak Mass
  • Scan Mode (e.g. FTMS + c NSI Full ms [350.0000-1800.000])
  • Scan Low Mass
  • Scan High Mass
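
The structure outlined above could look roughly like the following S3 sketch. The constructor name, the class name and the header field names are hypothetical and are not the package's actual implementation:

```r
# Hypothetical constructor for the structure outlined above: per-peak vectors
# plus header fields in a list that carries an S3 class attribute.
new_spectrum_sketch <- function(mz, intensity, header = list()) {
  stopifnot(is.numeric(mz), is.numeric(intensity),
            length(mz) == length(intensity), is.list(header))
  structure(c(list(mz = mz, intensity = intensity), header),
            class = "spectrum_sketch")
}

# A print method keeps the console output short.
print.spectrum_sketch <- function(x, ...) {
  cat("Scan", x$scan, "with", length(x$mz), "peaks\n")
  invisible(x)
}

s <- new_spectrum_sketch(mz = c(445.12, 519.14), intensity = c(1e5, 4e4),
                         header = list(scan = 1L, rt = 0.02, tic = 1.4e5))
print(s)
```

Validating the vector lengths in the constructor means downstream plot and summary methods can rely on paired mz and intensity values.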

consider building the rawrr assemblies with msbuild

benefit

no binary code would be contained in the source package.

requires

all RawFileReader Assemblies need to be installed

    if (Sys.which("msbuild") == "" && Sys.which("xbuild") == "")
    {
        warning ("could not find msbuild or xbuild in path; will not be able to use rDotNet unless corrected and rebuilt")
        return()
    }

see also

Expanded readIndex table

Hi,
I worked a lot with rawDiag in the past and really love this package, many thanks! For certain reasons I need to switch to rawrr, and I miss the information that the rawDiag index table provided. In particular, the ScanDescription is missing, and many of the other values were of great help for QC. Do you think there is a way to expand the exported column set to something similar to rawDiag's? It would speed up my code tremendously. My current solution is to read each spectrum and take the information from there, but that is very time-consuming.
All the best
Henrik

manuscript preparation

https://axial.acs.org/2020/08/04/call-for-papers-second-biennial-special-issue-on-software-tools-and-resources/

The Journal of Proteome Research is preparing to publish its Second Biennial Special Issue on Software Tools and Resources in February 2021.

Software tools and data resources are essential to research in all omics domains, including proteomics and metabolomics. The goal of this recurring special issue is to highlight the latest novel and significantly updated software tools, web applications, and databases scientists can use for data analysis and visualization in proteomics and related research.

For readers, this provides an easily identifiable source of tools specifically reviewed for their applicability and ease of adoption. For authors, this provides visibility for and wider adoption of their tools in the proteomics community through dissemination and documentation.

The same team that led the first Special Issue on Software Tools and Resources will also lead this one:

  • Journal of Proteome Research Associate Editor Susan Weintraub, The University of Texas Health Science Center at San Antonio
  • Michael Hoopmann, Institute for Systems Biology
  • Magnus Palmblad, Leids Universitair Medisch Centrum

They invite you to submit a manuscript by October 31, 2020.

What to Submit—Deadline: October 31, 2020
For inclusion in this special issue, authors must present either a complete description of a relevant novel tool, library, web application, or database (article) or a substantial and meaningful update of a previously published tool or resource (technical note). The full working tool or database must be available free-of-charge to editors and reviewers for evaluation at the time of manuscript submission.

Tools with a graphical or web browser interface are preferred, but the editors will also consider well-documented web service APIs or libraries of functional building blocks for custom data analysis pipelines.

Manuscript Requirements
Manuscripts must be submitted electronically through the ACS Paragon Plus Environment online submission system by October 31, 2020, and conform to the Journal of Proteome Research Author Guidelines.

Authors, please:

  • Indicate in your cover letter that the manuscript is for the Special Issue on Software Tools and Resources.
  • Remember, the full working tool or database must be available free-of-charge to editors and reviewers for evaluation at the time of manuscript submission.
  • Be concise and focus the manuscript on the unique or novel functionality of the tool. It should be clear to any reader what problem the system addresses and how it is used.
  • For tools and libraries, use the table form to describe the input, operations, and output of each tool or function. A screenshot of the interface may be included if this has novel or unusual features.
LEARN MORE: Read the first Special Issue on Software Tools and Resources, including the Editorial by Susan Weintraub, Michael Hoopmann, and Magnus Palmblad.

rawR function that checks for required .NET framework

What about a rawR function that checks for the installed .NET framework? The C# code could be based on

 
const string subkey = @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full\";

using (var ndpKey = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine, RegistryView.Registry32).OpenSubKey(subkey))
{
    if (ndpKey != null && ndpKey.GetValue("Release") != null)
    {
        Console.WriteLine($".NET Framework Version: {CheckFor45PlusVersion((int)ndpKey.GetValue("Release"))}");
    }
    else
    {
        Console.WriteLine(".NET Framework Version 4.5 or later is not detected.");
    }
}

// Checking the version using >= enables forward compatibility.
static string CheckFor45PlusVersion(int releaseKey)
{
    if (releaseKey >= 528040)
        return "4.8 or later";
    if (releaseKey >= 461808)
        return "4.7.2";
    if (releaseKey >= 461308)
        return "4.7.1";
    if (releaseKey >= 460798)
        return "4.7";
    if (releaseKey >= 394802)
        return "4.6.2";
    if (releaseKey >= 394254)
        return "4.6.1";
    if (releaseKey >= 393295)
        return "4.6";
    if (releaseKey >= 379893)
        return "4.5.2";
    if (releaseKey >= 378675)
        return "4.5.1";
    if (releaseKey >= 378389)
        return "4.5";
    // This code should never execute. A non-null release key should mean
    // that 4.5 or later is installed.
    return "No 4.5 or later version detected";
}

as shown here by Microsoft.

We would just need to write to a temporary file instead of the console. The R function could be something like dotNetInfo(), aligned with the sessionInfo() output. Alternatively, we could check directly for >= 4.5.1 and return a logical; in that case the function should be named is.NET() and take a parameter named release, defaulting to 4.5.1. This would also be very useful for CI, since we probably can't request a specific .NET version on the test infrastructure.

What do you think, @cpanse ?
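
On the R side, the mapping from release key to version label could be sketched as below. The thresholds are copied from the C# snippet above; the function names and the default are hypothetical, and how the release value reaches R (for example via a temporary file) is left open here.

```r
# Map a .NET Framework "Release" registry value to a version label.
# Thresholds are copied from the C# snippet above.
dotnet_version <- function(release) {
  if (is.na(release)) return(NA_character_)
  keys <- c(378389, 378675, 379893, 393295, 394254, 394802,
            460798, 461308, 461808, 528040)
  labels <- c("4.5", "4.5.1", "4.5.2", "4.6", "4.6.1", "4.6.2",
              "4.7", "4.7.1", "4.7.2", "4.8 or later")
  i <- findInterval(release, keys)   # 0 if below the first threshold
  if (i == 0) NA_character_ else labels[i]
}

# Logical check in the spirit of the proposed is.NET(); the name and the
# default are hypothetical.
has_dotnet <- function(release, minimum = 378675) {  # 378675 corresponds to 4.5.1
  !is.na(release) && release >= minimum
}

dotnet_version(461808)
has_dotnet(378389)
```

Comparing the numeric release keys rather than version strings avoids string-ordering pitfalls and keeps the check forward compatible, as in the C# original.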

package structure

Just came across this when running R CMD check:

checking for executable files ...
   Found the following executable files:
     exec/ThermoFisher.CommonCore.BackgroundSubtraction.dll
     exec/ThermoFisher.CommonCore.Data.dll
     exec/ThermoFisher.CommonCore.MassPrecisionEstimator.dll
     exec/ThermoFisher.CommonCore.RawFileReader.dll
     exec/rawR.exe
   Source packages should not contain undeclared executable files.
   See section ‘Package structure’ in the ‘Writing R Extensions’ manual.

Did that and found:

1.1.7 Non-R scripts in packages

Code which needs to be compiled (C, C++, Fortran …) is included in the src subdirectory and discussed elsewhere in this document.

Subdirectory exec could be used for scripts for interpreters such as the shell, BUGS, JavaScript, Matlab, Perl, php (amap), Python or Tcl (Simile), or even R. However, it seems more common to use the inst directory, for example WriteXLS/inst/Perl, NMF/inst/m-files, RnavGraph/inst/tcl, RProtoBuf/inst/python and emdbook/inst/BUGS and gridSVG/inst/js.

So shouldn't we put the rawR.exe and the dlls in src instead of exec @cpanse ?

readFileHeader(...)

The function reads the file header information. In Freestyle this is shown as key:value pairs:

Sample Name	autoQC01	
Comment		
Seq Row	10	
Sample Type	Unknown	
Path	D:\Data2San\p2469\Proteomics\QEXACTIVEHF_2\bpfister_20200714	
Cal Level		
Cal File		
Inj Volume	2	
Sample Weight	0	
Sample Volume	0	
Sample Id	NA	
Istd Amount	0	
CD Factor	0	
Bar Code		
Bar Code Status	0	
Inst Method	C:\Xcalibur\methods\__autoQC\trap\autoQC01.meth	
Proc Method		
User Text1	2469	
User Text2		
User Text3	FGCZ	
User Text4		
User Text5		
Tray Index	80	
Tray Name	ANSI-48Vial2mLHolder/ANSI-48Vial2mLHolder	
Tray Shape	Rectangular	
Vial Index	48	
Vials Per Tray	48	
Vials Per TrayX	8	
Vials Per TrayY	6	
Instrument Name	Q Exactive HF Orbitrap	
Instrument Model	Q Exactive HF Orbitrap	
Instrument Number	Exactive Series slot #2496	
Instrument SoftWare	2.9-290204/2.9.3.2948	
Instrument Hardware	rev. 1	
Flags		
Mass Tolerance	0.5 amu	
Created by	Administrator	

returns S3 object, (nested) list
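
Turning such key:value lines into a named list could be sketched as follows, assuming tab-separated input as in the listing above. The parser name is hypothetical and this is an illustration, not the package's own parser:

```r
# Parse tab-separated "key<TAB>value" lines into a named list.
# Empty values become NA. This is an illustration, not the package's parser.
parse_header_lines <- function(lines) {
  parts <- strsplit(lines, "\t", fixed = TRUE)
  keys <- vapply(parts, function(p) trimws(p[1]), character(1))
  vals <- vapply(parts, function(p) {
    if (length(p) < 2 || !nzchar(trimws(p[2]))) NA_character_ else trimws(p[2])
  }, character(1))
  as.list(setNames(vals, keys))
}

hdr <- parse_header_lines(c("Sample Name\tautoQC01\t",
                            "Comment\t\t",
                            "Inj Volume\t2\t"))
hdr[["Sample Name"]]
hdr[["Inj Volume"]]
```

All values stay character strings in this sketch; converting numeric fields such as the injection volume would be a separate step.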

ReadChromatogram intensity or area under the curve

Hi, I really appreciate your efforts in making this package.
I have a few questions about the readChromatogram function. I extracted the XIC for an analyte, for example m/z 836.07492 with tol = 100.
The XIC contains equal numbers of retention times and intensities. When I look at the details, for example at RT 33.03 min, the intensity returned by the function is 23020686, but in the raw data the NL is 7.11E6. How is the value 23020686 calculated?

A second question: is there any way to get the area under the curve of the XIC? I want to do a quantitative analysis.

Thank you.
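
For the second question, one common approach is trapezoidal integration of the extracted trace. The sketch below works on plain numeric vectors (assumed inputs, with invented example values) and is not a function of the package:

```r
# Trapezoidal area under a chromatographic trace.
# rt and intensity are plain numeric vectors of equal length (assumed inputs);
# the result has units of intensity x time.
xic_area <- function(rt, intensity, rt_range = range(rt)) {
  stopifnot(length(rt) == length(intensity))
  keep <- rt >= rt_range[1] & rt <= rt_range[2]
  rt <- rt[keep]; intensity <- intensity[keep]
  o <- order(rt)
  rt <- rt[o]; intensity <- intensity[o]
  if (length(rt) < 2) return(0)
  sum(diff(rt) * (head(intensity, -1) + tail(intensity, -1)) / 2)
}

rt <- c(33.0, 33.1, 33.2, 33.3)
intensity <- c(0, 10, 10, 0)
xic_area(rt, intensity)
```

Restricting rt_range to the peak boundaries keeps background signal outside the peak from inflating the area.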

How to access S/N values

Dear all,
is there a way to access the S/N values from a readSpectrum object? I can't seem to find them in the list.
Thanks

merge scan index and file header into a single `rawRindex` object?

Hi @cpanse,

I had a look at the return values of readIndex() and readFileHeader() and I think it would make sense to combine them into a single object. The object would be structured into a data portion which is the data.frame returned by readIndex. All items in the list returned by readFileHeader would become attributes of the object. The object class could be something like rawRindex.
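
A rough sketch of the idea, with hypothetical names and mock data, could look like this:

```r
# Combine a scan index (data.frame) and a file header (named list) into one
# object. The header entries are stored as attributes. Names are hypothetical.
new_index_sketch <- function(index, header) {
  stopifnot(is.data.frame(index), is.list(header))
  # Prefix attribute names so they cannot clash with names/row.names/class.
  for (key in names(header)) attr(index, paste0("header.", key)) <- header[[key]]
  class(index) <- c("index_sketch", class(index))
  index
}

idx <- data.frame(scan = 1:3, MSOrder = c("Ms", "Ms2", "Ms2"))
obj <- new_index_sketch(idx, list(`Sample Name` = "autoQC01",
                                  `Instrument Model` = "Q Exactive HF Orbitrap"))
attr(obj, "header.Sample Name")
nrow(obj)
```

One caveat of this design is that custom attributes may not survive every data.frame operation, so subsetting methods for the new class would probably be needed.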

Extracted Ion Chromatogram

Hi everyone,
I have a problem plotting an XIC. Here is the code:

iRT.mZ <- c(487.2571, 547.2984, 622.8539, 636.8695, 644.8230, 669.8384,
683.8282, 683.8541, 699.3388, 726.8361, 776.9301)
c<- rawrr::readChromatogram(rawfile, mass = iRT.mZ, tol = 10, type = 'xic', filter = 'ms')
#Extracted Ion Chromatogram

plot(c, diagnostic = TRUE)

The problem is --> Error in xy.coords(x, y) : 'x' and 'y' lengths differ


Can anyone help me?
Thanks all!

Noise value for individual mass peaks

Hello, and thank you for your package!

I was wondering if it is somehow possible to get the Noise value that is reported for every single mass peak in Thermo's .raw files.

Thanks!

citation()

We should update the package so that `citation("rawrr")` returns the desired information. The current state is:

> citation(package = "rawrr")

To cite package ‘rawrr’ in publications use:

  Christian Panse and Tobias Kockmann (NA). rawrr: Access to Thermo Fisher Scientific raw
  files from R. R package version 0.1.7. https://github.com/fgcz/rawR/

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {rawrr: Access to Thermo Fisher Scientific raw files from R},
    author = {Christian Panse and Tobias Kockmann},
    note = {R package version 0.1.7},
    url = {https://github.com/fgcz/rawR/},
  }

Warning messages:
1: In citation(package = "rawrr") :
  no date field in DESCRIPTION file of package ‘rawrr’
2: In citation(package = "rawrr") :
  could not determine year for ‘rawrr’ from package DESCRIPTION file

I would suggest referencing our bioRxiv manuscript for now.

package naming

solve the naming conflict with

https://CRAN.R-project.org/package=rawr

R CMD check rawR_0.1.1.tar.gz

produces

* package encoding: UTF-8
* checking CRAN incoming feasibility ... ERROR
Maintainer: 'Christian Panse <[email protected]>'
New submission
Conflicting package names (submitted: rawR, existing: rawr [https://CRAN.R-project.org])
Conflicting package names (submitted: rawR, existing: rawr [CRAN archive])
The Title field should be in title case. Current version is:
'Access to Thermo Fisher Scientific raw files from R'
In title case that is:
R> BiocCheck("rawR_0.1.1.tar.gz")
This is BiocCheck version 1.26.0. BiocCheck is a work in
progress. Output and severity of issues may change. Installing
package...
* Checking Package Dependencies...
* Checking if other packages can import this one...
* Checking to see if we understand object initialization...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking version number...
* Checking for version number mismatch...
* Checking version number validity...
    Package version 0.1.1; pre-release
* Checking R Version dependency...
* Checking package size...
* Checking individual file sizes...
    * WARNING: The following files are over 5MB in size:
      'rawRcolor.tif'
* Checking biocViews...
* Checking that biocViews are present...
    * ERROR: No biocViews terms found.
See http://bioconductor.org/developers/how-to/biocViews/
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for proper Description: field...
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball
  name...
* Checking for Version field...
* Checking for valid maintainer...
* Checking DESCRIPTION/NAMESPACE consistency...
    * WARNING: Import grDevices, graphics, utils in
      DESCRIPTION as well as NAMESPACE.
* Checking vignette directory...
    This is an unknown type of package
    * ERROR: No 'vignettes' directory.
* Checking library calls...
* Checking for library/require of rawR...
* Checking coding practice...
    * NOTE: Avoid sapply(); use vapply()
      Found in files:
        rawR.R (line 1011, column 29)
    * NOTE: Avoid 1:...; use seq_len() or seq_along()
      Found in files:
        rawR.R (line 600, column 36)
        rawR.R (line 745, column 70)
Warning in readLines(infile) :
  incomplete final line found on '/tmp/RtmpLJg2l6/filedebd713f5408/rawR/tests/testthat/test-header.R'
    * WARNING: Avoid class() == or class() != ; use is() or
      !is()
      Found in files:
        R/rawR.R (line 68)
* Checking parsed R code in R directory, examples,
  vignettes...
* Checking function lengths..........
    * NOTE: Recommended function length <= 50 lines.
      There are 5 functions > 50 lines.
      The longest 5 functions are:
        plot.rawRspectrum() (R/rawR.R, line 711): 108 lines
        readChromatogram() (R/rawR.R, line 429): 105 lines
        print.rawRspectrum() (R/rawR.R, line 839): 84 lines
        readFileHeader() (R/rawR.R, line 106): 61 lines
        validate_rawRspectrum() (R/rawR.R, line 635): 52 lines
* Checking man page documentation...
    * WARNING: Add non-empty \value sections to the following
      man pages: man/plot.rawRchromatogram.Rd,
      man/plot.rawRchromatogramSet.Rd,
      man/plot.rawRspectrum.Rd, man/print.rawRspectrum.Rd,
      man/summary.rawRspectrum.Rd
    * ERROR: At least 80% of man pages documenting exported
      objects must have runnable examples. The following pages
      do not:
      new_rawRspectrum.Rd, plot.rawRchromatogramSet.Rd,
  validate_rawRspectrum.Rd
    * NOTE: Usage of dontrun{} / donttest{} found in man page
      examples.
      14% of man pages use one of these cases.
      Found in the following files:
        readChromatogram.Rd
        readSpectrum.Rd
    * NOTE: Use donttest{} instead of dontrun{}.
      Found in the following files:
        readChromatogram.Rd
        readSpectrum.Rd
* Checking package NEWS...
    * NOTE: Consider adding a NEWS file, so your package news
      will be included in Bioconductor release announcements.
* Checking unit tests...
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R
  source, and vignette source...
    * NOTE: Consider shorter lines; 32 lines (2%) are > 80
      characters long.
    First 6 lines:
      R/rawR.R:7 .writeRData <- function(rawfile, outputfile=paste0...
      R/rawR.R:14         list(scanType=rv$scanType, mZ=rv$mZ, inte...
      R/rawR.R:28                 warning("Can not find Mono JIT co...
      R/rawR.R:41             rvs <- system2(Sys.which('mono'), arg...
      R/rawR.R:64 #' pathToRawFile <- file.path(path.package(packag...
      R/rawR.R:154                 e$info$`Instrument method` <- ba...
    * NOTE: Consider 4 spaces instead of tabs; 5 lines (0%)
      contain tabs.
    First 5 lines:
      R/zzz.R:5 	if(interactive()){
      R/zzz.R:6 		version <- packageVersion('rawR')
      R/zzz.R:7 		packageStartupMessage("Package 'rawR' version ", ...
      R/zzz.R:8 	  invisible()
      R/zzz.R:9 	}
    * NOTE: Consider multiples of 4 spaces for line indents,
      233 lines(14%) are not.
    First 6 lines:
      R/rawR.R:107    mono = if(Sys.info()['sysname'] %in% c("Darwi...
      R/rawR.R:108    exe = system.file('exec/rawR.exe',package = '...
      R/rawR.R:109    mono_path = "",
      R/rawR.R:110    argv = "infoR",
      R/rawR.R:111    system2_call = TRUE,
      R/rawR.R:112                            method = "thermo"){
    See
      http://bioconductor.org/developers/how-to/coding-style/
    See styler package:
      https://cran.r-project.org/package=styler as described
      in the BiocCheck vignette.
* Checking if package already exists in CRAN...
    * ERROR: Package must be removed from CRAN.
* Checking for bioc-devel mailing list subscription...
    * NOTE: Cannot determine whether maintainer is subscribed
      to the bioc-devel mailing list (requires admin
      credentials). Subscribe here:
      https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
    Maintainer is registered at support site.
Summary:
ERROR count: 4
WARNING count: 4
NOTE count: 10
For detailed information about these checks, see the BiocCheck
vignette, available at
https://bioconductor.org/packages/3.12/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output
BiocCheck FAILED.
$error
[1] "No biocViews terms found."                                                                                      
[2] "No 'vignettes' directory."                                                                                      
[3] "At least 80% of man pages documenting exported objects must have runnable examples. The following pages do not:"
[4] "Package must be removed from CRAN."                                                                             
$warning
[1] "The following files are over 5MB in size: 'rawRcolor.tif'"                                                                                                                                                 
[2] "Import grDevices, graphics, utils in DESCRIPTION as well as NAMESPACE."                                                                                                                                    
[3] " Avoid class() == or class() != ; use is() or !is()"                                                                                                                                                       
[4] "Add non-empty \\value sections to the following man pages: man/plot.rawRchromatogram.Rd, man/plot.rawRchromatogramSet.Rd, man/plot.rawRspectrum.Rd, man/print.rawRspectrum.Rd, man/summary.rawRspectrum.Rd"
$note
 [1] " Avoid sapply(); use vapply()"                                                                                                                                                     
 [2] " Avoid 1:...; use seq_len() or seq_along()"                                                                                                                                        
 [3] "Recommended function length <= 50 lines."                                                                                                                                          
 [4] "Usage of dontrun{} / donttest{} found in man page examples."                                                                                                                       
 [5] "Use donttest{} instead of dontrun{}."                                                                                                                                              
 [6] "Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements."                                                                         
 [7] "Consider shorter lines; 32 lines (2%) are > 80 characters long."                                                                                                                   
 [8] "Consider 4 spaces instead of tabs; 5 lines (0%) contain tabs."                                                                                                                     
 [9] "Consider multiples of 4 spaces for line indents, 233 lines(14%) are not."                                                                                                          
R> 
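
Three of the coding-practice notes above (seq_len() instead of 1:..., vapply() instead of sapply(), and is()/inherits() instead of class() ==) can be illustrated with base R alone. The objects below are mock data, not code from the package:

```r
# 1:n counts down when n is 0, so a loop over it would run for i = 1 and i = 0;
# seq_len(n) yields an empty sequence instead.
n <- 0
length(1:n)
length(seq_len(n))

# sapply() changes its return type with the input; vapply() enforces it.
peaks <- list(c(100.1, 200.2), c(150.5), numeric(0))
n_peaks <- vapply(peaks, length, integer(1))
n_peaks

# inherits() instead of comparing class() with ==, which breaks as soon
# as an object carries more than one class.
x <- structure(list(), class = c("rawRspectrum", "list"))
inherits(x, "rawRspectrum")
```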

Error in Example: Length of "x" and "y" are not matching

Hi everyone,
to test this package I wanted to load the .raw file and follow the provided example code.
However, R gives me an error message saying that the "x" and "y" coordinates do not match:
[Screenshot 2022-07-21 at 22.46.14]

Now when I run I get:

> plot(S[[1]], centroid=TRUE)
Error in xy.coords(x, y, xlabel, ylabel, log) : 
  Length of 'x' and 'y' do not match

I have absolutely no idea what I'm doing wrong and am completely lost.
I would appreciate some help here!

Also, I am new to R and to working with bioinformatics data, so if anyone could explain how to come up with the numbers in the scan vector (the paper just mentions some database search?), that would be awesome as well.
Thanks in advance!

Error if too many scans are selected

Hello again :-)

Unfortunately I have a little problem, which I don't know how to solve...

readSpectrum() gives the following error message:
Error in source(tfo, local = TRUE) : negative length vectors are not allowed

A little example how my code looks:

library(rawrr)
library(tidyverse)

#reading Index and selecting scans with ms_order = "Ms"
ms_order <- "Ms"
IDX <- as_tibble(readIndex(path))
scans <- IDX %>% filter(MSOrder == ms_order) %>% pull(scan)

SPC <- readSpectrum(path, scan = scans)

I have about 12000 scans in total and about 2500 MS1 scans in my raw file. It works fine as long as I only read about 2000 scans. After that I receive the error message. I don't know if it due to memory limits on my machine.

Thanks for your help in advance!
kaempfro
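
One thing that might be worth trying is to read the scans in smaller chunks and concatenate the results. Whether this avoids the error is an untested assumption. The sketch below takes the reader as a function argument, so that a thin wrapper around readSpectrum() could be passed in; the helper name and the mock reader are hypothetical.

```r
# Read scans in chunks and concatenate the results.
# `reader` is any function(path, scans) returning a list with one element per
# scan; in practice a thin wrapper around the package reader could be passed.
read_in_chunks <- function(path, scans, reader, chunk_size = 500L) {
  if (length(scans) == 0) return(list())
  groups <- split(scans, ceiling(seq_along(scans) / chunk_size))
  out <- lapply(groups, function(s) reader(path, s))
  do.call(c, unname(out))
}

# Stand-in reader for demonstration only; it returns the scan number per scan.
mock_reader <- function(path, scans) lapply(scans, function(i) list(scan = i))

res <- read_in_chunks("example.raw", scans = 1:1201, reader = mock_reader)
length(res)
```

With the real reader, something like function(path, scans) readSpectrum(path, scan = scans) would take the place of mock_reader.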

Peak charges for MS1 spectra

Dear all,
thanks for this very useful package!
Is there a way to extract the peak "charges" of MS1 spectra with rawrr::readSpectrum? It seems to work only for MS2 spectra at the moment.
Thanks a lot.

Enhancement - Complete readIndex() function

Hi,

First, thank you for developing such a nice package. I've been using it for a few days and have been really amazed by it so far! For some context, I'm a big fan of MSnbase, but it requires data conversion, which can be inconvenient... One handy function of MSnbase that is partly missing in rawrr is header, which gives access to a myriad of useful information in a data.frame, as follows:

print(names(header(msfile)))

 [1] "seqNum"                     "acquisitionNum"            
 [3] "msLevel"                    "polarity"                  
 [5] "peaksCount"                 "totIonCurrent"             
 [7] "retentionTime"              "basePeakMZ"                
 [9] "basePeakIntensity"          "collisionEnergy"           
[11] "ionisationEnergy"           "lowMZ"                     
[13] "highMZ"                     "precursorScanNum"          
[15] "precursorMZ"                "precursorCharge"           
[17] "precursorIntensity"         "mergedScan"                
[19] "mergedResultScanNum"        "mergedResultStartScanNum"  
[21] "mergedResultEndScanNum"     "injectionTime"             
[23] "filterString"               "spectrumId"                
[25] "centroided"                 "ionMobilityDriftTime"      
[27] "isolationWindowTargetMZ"    "isolationWindowLowerOffset"
[29] "isolationWindowUpperOffset" "scanWindowLowerLimit"      
[31] "scanWindowUpperLimit"      

I don't know if this information is available but readIndex() could ideally contain (some of) these data, which would allow broader application of the package (for example, "isolationWindowLowerOffset" and "isolationWindowUpperOffset" give critical information for DIA applications).

Thanks again,
Vivian

generate function that transforms centroided spectrum into sparse matrix

Basic idea

Spectra are 2D data items (x, y data):

x : position (m/z)
y : intensity

All other information can be assumed to be meta data for the moment. The most basic idea to represent this data in R is to use two numeric vectors and pair according to the vector indices, so (xi, yi) are corresponding values in the 2D space generated by the vectors.

Collections of scans that are connected by a further dimension, for instance RT, could be handled as lists of vector tuples.

L 
   | - (xi, yi)
   | - (xi, yi)
   | - (xi, yi)

But there are some problems with this: if we use the RT to generate an index for L (let's call it j here), then we can only select scans by index position, but this Lj may not be equal to the original scan#, nor does it allow RT-based access.

But we could add RT as data dimension directly and arrive at:

x : position (m/z)
y : intensity
z : RT

A 3D data type for numerics is simply an array. Arrays have nice properties, since they can be sliced along all dimensions as needed. The only problem left to solve is that centroided data generates vectors of unequal length, but transforming these into sparse vectors/matrices would solve the problem.

Two steps:

  1. Generate a function that takes the rawRspectrum object as input and returns a sparse matrix.
  2. Concatenate the sparse matrices into a sparse array.

The first would also be handy if one wanted to compute dot products or the like.
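
Step 1 could be sketched in base R as below, using a simple index/value representation instead of a dedicated sparse class. The grid limits, the bin width and all function names are assumptions for illustration; the Matrix package offers proper sparse classes that could replace the plain list.

```r
# Bin a centroided spectrum onto a fixed m/z grid and keep only occupied bins.
# Returns bin indices (i), summed intensities (x), and the grid length (n).
spectrum_to_sparse <- function(mz, intensity, mz_min = 0, mz_max = 2000,
                               bin_width = 0.01) {
  stopifnot(length(mz) == length(intensity))
  keep <- mz >= mz_min & mz < mz_max
  i <- as.integer(floor((mz[keep] - mz_min) / bin_width)) + 1L
  x <- tapply(intensity[keep], i, sum)      # merge peaks sharing a bin
  list(i = as.integer(names(x)), x = as.numeric(x),
       n = as.integer(ceiling((mz_max - mz_min) / bin_width)))
}

# Dot product of two sparse vectors defined on the same grid.
sparse_dot <- function(a, b) {
  stopifnot(a$n == b$n)
  common <- intersect(a$i, b$i)
  sum(a$x[match(common, a$i)] * b$x[match(common, b$i)])
}

a <- spectrum_to_sparse(c(100.004, 250.503), c(10, 5))
b <- spectrum_to_sparse(c(100.006, 400.000), c(2, 7))
sparse_dot(a, b)
```

Because every spectrum is mapped onto the same grid, vectors from different scans become directly comparable, which is the property needed for stacking them along RT.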

Issue with readChromatogram when type = "xic"

I have been using this package for ~1 year now, specifically the readChromatogram function for extracting "tic" and "base peak". It works nicely and has been very useful. However, I have just for the first time tried to extract an "xic" for some masses of interest, and here is what I get:

XICs <- lapply(Raws, function(raw) { readChromatogram(raw, masses, tols) })
Error in .rawrrSystem2Source(rawfile, input = mass, rawrrArgs = sprintf("xic %f %s",  : 
  **Rcode file to parse does not exist. 'C:\Users\MyUserName\AppData\Local/R/cache/R/rawrr/rawrrassembly/rawrr.exe' failed for an unknown reason.
Please check the debug files:
	C:\Users\MyUserName\AppData\Local\Temp\2\RtmpeQSRv0\file62986016698d.stderr
	C:\Users\MyUserName\AppData\Local\Temp\2\RtmpeQSRv0\file629849cf6634.stdout
and the System Requirements
Called from: .rawrrSystem2Source(rawfile, input = mass, rawrrArgs = sprintf("xic %f %s", 
    tol, shQuote(filter)))**

This is on a Windows 2019 Server machine, using R version 4.1.0 (2021-05-18) in RStudio 1.4.1717.
The file "C:/Users/MyUserName/AppData/Local/R/cache/R/rawrr/rawrrassembly/rawrr.exe" does exist, but maybe this is a path separator issue on Windows, since the error message mixes backward (Windows) and forward (Linux) slashes? In that case, including normalizePath(..., winslash = "/") would probably be enough to fix it?
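For illustration, this is what the suggested call does with a path. It is only a sketch of the proposal above: whether the mixed slashes actually cause the error is not confirmed, and the helper name `toForwardSlashes` is hypothetical.

```r
# Hypothetical helper: return a path that uses forward slashes only.
# On Windows, normalizePath() rewrites the separators; on other platforms
# the winslash argument has no effect.
toForwardSlashes <- function(path) {
  normalizePath(path, winslash = "/", mustWork = FALSE)
}

exe <- file.path(tempdir(), "rawrr.exe")  # stand-in for the cached rawrr.exe
toForwardSlashes(exe)
```

`mustWork = FALSE` keeps the call from failing when the file has not been created yet.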

Missing Data readSpectrum(...)

Hi cpanse and tobiasko

I've noticed that some parameters are missing from the rawRspectrum object after importing a raw file.
Not all of these parameters may have been set in our experiment, but I also get wrong readings for Base Peak Intensity and Base Peak Mass (both are -1, see below).

Thanks for helping!

RawRspectrum:

> Total Ion Current:	 4870947
> Scan Low Mass:	 50
> Scan High Mass:	 250
> Scan Start Time (Min):	 0
> Scan Number:	 1
> Base Peak Intensity:	 -1
> Base Peak Mass:	 -1
> Scan Mode:	 FTMS + p NSI Full ms [50.00-250.00]
> ======= Instrument data =====   : 	
> 
> Multiple Injection: 	
> 
> Multi Inject Info: 	
> 
> AGC:	On
> Micro Scan Count:	1
> Scan Segment:	0
> Scan Event:	0
> Master Index:	0
> Charge State:	1
> Monoisotopic M/Z:	78.0468
> Ion Injection Time (ms):	100.000
> Max. Ion Time (ms): 	
> 
> FT Resolution:	30000
> MS2 Isolation Width:	0.0
> MS2 Isolation Offset: 	
> 
> AGC Target: 	
> 
> HCD Energy: 	
> 
> Analyzer Temperature: 	
> 
> === Mass Calibration: 	
> 
> Conversion Parameter B:	47557789.235
> Conversion Parameter C:	-2547049.695
> Temperature Comp. (ppm): 	
> 
> RF Comp. (ppm): 	
> 
> Space Charge Comp. (ppm): 	
> 
> Resolution Comp. (ppm): 	
> 
> Number of Lock Masses: 	
> 
> Lock Mass #1 (m/z): 	
> 
> Lock Mass #2 (m/z): 	
> 
> Lock Mass #3 (m/z): 	
> 
> LM Search Window (ppm): 	
> 
> LM Search Window (mmu): 	
> 
> Number of LM Found: 	
> 
> Last Locking (sec): 	
> 
> LM m/z-Correction (ppm): 	
> 
> === Ion Optics Settings: 	
> 
> S-Lens RF Level: 	
> 
> S-Lens Voltage (V): 	
> 
> Skimmer Voltage (V): 	
> 
> Inject Flatapole Offset (V): 	
> 
> Bent Flatapole DC (V): 	
> 
> MP2 and MP3 RF (V): 	
> 
> Gate Lens Voltage (V): 	
> 
> C-Trap RF (V): 	
> 
> ====  Diagnostic Data: 	
> 
> Dynamic RT Shift (min): 	
> 
> Intens Comp Factor: 	
> 
> Res. Dep. Intens: 	
> 
> CTCD NumF: 	
> 
> CTCD Comp: 	
> 
> CTCD ScScr: 	
> 
> RawOvFtT: 	
> 
> LC FWHM parameter: 	
> 
> Rod: 	
> 
> PS Inj. Time (ms): 	
> 
> AGC PS Mode: 	
> 
> AGC PS Diag: 	
> 
> HCD Energy eV: 	
> 
> AGC Fill: 	
> 
> Injection t0: 	
> 
> t0 FLP: 	
> 
> Access Id: 	
> 
> Analog Input 1 (V): 	
> 
> Analog Input 2 (V): 	

Get information on gradient

Hello,

really great package! I was wondering if it was possible to also get information on the chromatography via your package? For example, getting the LC gradient or the LC pressure curve would be great!

Thank you in advance!
Yasin

install test cases

runs on fgcz-c-073

TEST CASE 1 - no mono runtime

docker run -a stdin -a stdout -i -t rocker/verse:4.0.5 R
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repos=NULL)


rawfile <- rawrr::sampleFilePath()

h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)

TEST CASE 2 - runtime installed

docker run -a stdin -a stdout -i -t c95c10872a5d
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repos=NULL)

rawfile <- rawrr::sampleFilePath()

h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)

Listing of the Dockerfile

FROM rocker/verse:4.0.5
 
RUN apt-get update \
&& sudo apt-get install mono-runtime -y

CMD ["R"]

TEST CASE 3 - msbuild is installed

docker run -a stdin -a stdout -i -t f53000645fca
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repos=NULL)

rawfile <- rawrr::sampleFilePath()


h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)

Listing of the Dockerfile

FROM rocker/verse:4.0.5
 
RUN apt-get update \
&& sudo apt-get install mono-mcs mono-xbuild -y

CMD ["R"]

TEST CASE 4 - msbuild is installed and MONO_PATH set

docker run -a stdin -a stdout -i -t -v /usr/local/lib/RawFileReader/:/usr/local/lib/RawFileReader/ d6cec6026a70
docker run -i -v /usr/local/lib/RawFileReader/:/usr/local/lib/RawFileReader/ d6cec6026a70 R --no-save << EOF

install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawrr_0.99.13_19.tar.gz', repos=NULL)
Sys.getenv("MONO_PATH")


rawfile <- rawrr::sampleFilePath()

h <- rawrr::readFileHeader(rawfile)
i <- rawrr::readIndex(rawfile)
x <- rawrr::readChromatogram(rawfile=rawfile, type="tic")
s <- rawrr::readSpectrum(rawfile, 1:9)

EOF

Listing of the Dockerfile

FROM rocker/verse:4.0.5
 
RUN apt-get update \
&& sudo apt-get install mono-mcs mono-xbuild -y

CMD ["R"]

readDetectorList(...) and details

Each raw file header contains a detector list. The relevant C# methods are:

int GetInstrumentCountOfType (Device type)

Device GetInstrumentType(int index);

int InstrumentCount { get; }

See page 17 of UsingRawFileReader.

Details are available through InstrumentData GetInstrumentData();

readChromatogram

Hi,

I really like the package. Thank you for that! I was just going through the vignette, and when I call readChromatogram it gives me the following error. I guess I will not be the only one seeing this. How can it be solved?

plot(rawR::readChromatogram(rawfile = rawfile, type = "tic"))
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed

Also:

C <- rawR::readChromatogram(rawfile, mass = iRTmZ, tol = 10, type = "xic", filter = "ms")

plot(C, diagnostic = TRUE)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x, na.rm = na.rm) :
no non-missing arguments to min; returning Inf
2: In max(x, na.rm = na.rm) :
no non-missing arguments to max; returning -Inf
3: In min(x, na.rm = na.rm) :
no non-missing arguments to min; returning Inf
4: In max(x, na.rm = na.rm) :
no non-missing arguments to max; returning -Inf

Cheers!

Titles in .plotChromatogramAndFit?

Hi guys,

When plotting multiple raw files with iRT peptides, I'm using the function .plotChromatogramAndFit that you showed.
I want to add the name of each raw file as the plot title, but I haven't been successful so far. Any ideas?

plot(x, main=???); legend("topright", legend=i, title='Instrument Model', bty = "n", cex=0.75)
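One possible approach, sketched with base graphics only: derive the title from the raw file path and add it with `title()` after plotting. The file paths and the helper name `rawTitle` are made up for the example, and the `plot()` call is a stand-in for the chromatogram plot. Whether the package's plot methods forward `main` is not assumed here.

```r
# Hypothetical helper: turn a raw file path into a plot title
rawTitle <- function(path) tools::file_path_sans_ext(basename(path))

# Made-up file paths
rawfiles <- c("/data/20210101_iRT_runA.raw", "/data/20210102_iRT_runB.raw")

op <- par(mfrow = c(1, length(rawfiles)))
for (f in rawfiles) {
  # stand-in for the chromatogram plot of file f
  plot(1:10, (1:10)^2, type = "l", xlab = "RT", ylab = "Intensity")
  # add the file name as title after the plot has been drawn
  title(main = rawTitle(f))
}
par(op)
```

If the plot method already draws its own title, the two would overlap, so this only fits plots that leave the title empty.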

Thanks a lot for the great library :)

readChromatogram(...)

refactor rawDiag::readXICs(rawfile, masses=unique(RAW$PrecursorMass), tol=1000)

returns a nested S3 list

[[22]]
$mass
[1] 554.2606

$times
[1] 0.1216408 0.1516450 0.4810452 0.5409557 0.7801059

$intensities
[1] 3005.061 4328.104 3658.515 3862.011 4992.357

$filename
[1] "sample.raw"

attr(,"class")
[1] "list" "XIC" 

attr(,"class")
[1] "list" "XICs"
> X[[20]]
$mass
[1] 653.3617

$times
 [1] 0.001619751 0.031642766 0.061663615 0.091651065 0.121640750 0.151644970
 [7] 0.181667770 0.211526280 0.241284530 0.271307600 0.301222200 0.331145000
[13] 0.361147270 0.391168050 0.421057700 0.450970250 0.481045180 0.510989030
[19] 0.540955750 0.570893130 0.600724580 0.630620300 0.660428770 0.690318400
[25] 0.720320350 0.750197620 0.780105920

$intensities
 [1] 374171.6 405717.2 350914.7 373948.4 328768.2 425965.4 360327.9 453483.1
 [9] 445894.1 430538.9 422901.3 545305.5 433117.8 357588.1 435593.4 351018.2
[17] 407768.5 406468.8 446027.8 385148.5 579871.8 461409.0 390769.3 458988.6
[25] 378339.6 480078.4 467780.0

$filename
[1] "sample.raw"

attr(,"class")

with additional attributes:

input:

  • scan filter
  • type : XIC, BPC, TIC

The types BPC and TIC need no additional parameters. XIC additionally requires mz and tolerance.

output:

  • type, e.g, XIC, TIC, BPC (base peak chromatogram)
  • tolerance (in ppm)
  • mz
  • scan filter
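The input/output description above could be sketched as an S3 constructor that stores the query as attributes. This is an illustration only, not the package implementation; the names `newChromatogram` and `chromatogramSketch` and the attribute names are assumptions.

```r
# Hypothetical constructor (not the package implementation): a chromatogram
# is a list of times and intensities, with the query stored as attributes.
newChromatogram <- function(times, intensities,
                            type = c("tic", "bpc", "xic"),
                            filter = "ms", mass = NULL, tol = NULL) {
  type <- match.arg(type)
  stopifnot(is.numeric(times), is.numeric(intensities),
            length(times) == length(intensities))
  if (type == "xic") {
    # an XIC additionally needs a target m/z and a tolerance (in ppm)
    stopifnot(is.numeric(mass), length(mass) == 1,
              is.numeric(tol), length(tol) == 1)
  }
  structure(list(times = times, intensities = intensities),
            type = type, filter = filter, mass = mass, tol = tol,
            class = "chromatogramSketch")
}

# Values taken from the printed example above; the tolerance is made up
xic <- newChromatogram(times = c(0.1216408, 0.1516450, 0.4810452),
                       intensities = c(3005.061, 4328.104, 3658.515),
                       type = "xic", mass = 554.2606, tol = 10)
attr(xic, "type")
```

Keeping type, tolerance, mz, and scan filter as attributes leaves the list itself with only the numeric traces, so existing code that accesses `$times` and `$intensities` keeps working.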

Usage of `@importFrom`

"If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun(). Alternatively, though no longer recommended due to its poorer readability, use @importFrom, e.g., @importFrom pgk fun, and call the function(s) without ::."

taken from https://roxygen2.r-lib.org/articles/namespace.html#imports

Example found in rawrr.R:

#' Plot \code{rawrrChromatogramSet} objects
#'
#' @param x A \code{rawrrChromatogramSet} object to be plotted.
#' @param ... Passes additional arguments.
#' @param diagnostic Show diagnostic legend?
#' @author Tobias Kockmann, 2020.
#' @export
#' @importFrom grDevices hcl.colors
#' @importFrom graphics lines text

and many many more!
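For comparison, a sketch of the recommended style: the packages are listed in the Imports: field of DESCRIPTION, the `@importFrom` tags are dropped, and each function is called with `::`. The function `plotTraces` and its input format are hypothetical and only serve to show the calling convention.

```r
#' Plot a list of traces (hypothetical example, not package code)
#'
#' @param x A list whose elements have numeric \code{times} and
#'   \code{intensities}.
#' @param ... Passes additional arguments to the initial plot call.
#' @return The colours used, invisibly.
#' @export
plotTraces <- function(x, ...) {
  # explicit namespaces instead of @importFrom grDevices / graphics
  cols <- grDevices::hcl.colors(length(x), palette = "Dark 3")
  xr <- range(unlist(lapply(x, `[[`, "times")))
  yr <- range(unlist(lapply(x, `[[`, "intensities")))
  graphics::plot(xr, yr, type = "n", xlab = "RT [min]",
                 ylab = "Intensity", ...)
  for (i in seq_along(x)) {
    graphics::lines(x[[i]]$times, x[[i]]$intensities, col = cols[i])
  }
  invisible(cols)
}
```

The call sites get longer, but a reader sees immediately which package each function comes from.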

include query functions for ProteomicsDB / Prosit

The goal would be that Spectra could not only be read from local raw files, but also public repositories like ProteomicsDB and prediction services like Prosit. A REST endpoint is already available and used by USE. This REST interface should also work for queries from R.
